But as GPT-2 spreads online and is appropriated by more people like disumbrationist – amateur makers who are using the tool to create everything from Reddit threads, to short stories and poems, to restaurant reviews – the team at OpenAI are also grappling with how their powerful tool might flood the internet with fake text, making it harder to know the origins of anything we read online.
For Clark, convincing machine text of the kind GPT-2 can produce poses a similar threat to “deepfakes” – machine-learning-generated fake images and videos that can be used to make people appear to do things they never did or say things they never said (like this video of former president Barack Obama). “They are essentially the same,” Clark told me. “You have technology that makes it cheaper and easier to fake something, which means that it will just get harder to offer guarantees about the truth of information in the future.”
Clark and the team at OpenAI take this threat so seriously that when they unveiled GPT-2 in February this year, they released a blogpost alongside it stating that they weren’t releasing the full version of the tool due to “concerns about malicious applications”. (They have since released a larger version of the model, which is being used to create the fake Reddit threads, poems and so on.)
While such digital forensics are useful, Britt Paris, a researcher at New York-based institute Data & Society, worries that such solutions misleadingly frame fake news as a technological problem when, in fact, most misinformation is created and spread online without the help of sophisticated technologies. “We already have a ton of ways for generating false information and people do a pretty good job of circulating this stuff without the help of machines,” she said.
In recent testimony at a House intelligence committee hearing about the threat of AI-generated fake media, Clark said he foresees fake text being used “for the production of [literal] ‘fake news’, or to potentially impersonate people who had produced a lot of text online, or simply to generate troll-grade propaganda for social networks”.
For Zack Lipton, professor of business technologies at Carnegie Mellon University, the assessment of the risk of the technology was disingenuous. “Of all the bad uses of AI – from recommender systems that lead to filter bubbles and the racial consequences that emerge from automated categorization – I would put the threat of language modeling at the bottom of the list,” he said. “What OpenAI have done is commandeered the discourse and fear about AI and used it to generate hype around their product.”
This simulated forum was created by a Reddit user called disumbrationist using a tool called GPT-2, a machine learning language generator that was unveiled in February by OpenAI, one of the world’s leading AI labs.
While a system like GPT-2 can produce semi-coherent articles at scale, it is a long way from being able to replicate this type of psychological manipulation. “The simple ability to generate false text at scale is not likely to affect most forms of disinformation,” he told me.
Could ‘fake text’ be the next global political threat? https://t.co/xpcZo1HO0V (Guardian Tech, @guardiantech, July 4, 2019)