Research recently published in Nature suggests that intelligent machines planning to replicate themselves for world domination are doomed to failure. This means that the Terminator genre of science fiction stands discredited. But we mustn't despair, because a fun hack has shown that, following the loosening of norms in scientific publishing and the entry of digital entities into academia, even a house cat can pose as a renowned scientist and get away with it.
The present learned cat is Larry Richardson. He is technically the uncle of researcher Reese Richardson, since he is his grandmother's cat. He was enrolled in the research rat race to overawe the pioneering Siamese cat academic FDC Willard, whose initials stand for Felis Domesticus Chester. FDC Willard is not a fraud. He studied helium. He entered STEM when his human, Jack Hetherington, professor emeritus of physics and astronomy at Michigan State University, wrote a paper solo using the royal 'we'. Instead of laboriously replacing 'we' with 'I' throughout, he promoted his cat to researcher and shared the credit with him.
Hetherington & Willard produced real work, which was really cited. The fact that FDC Willard became a popular meme cat does not detract from the physics he lent his awesome name to. In contrast, Larry signed off on completely fake papers generated using SCIgen and Mathgen, services that write up garbage papers in computer science and math using compelling technical jargon. These prank services help show how easily rubbish can be published to increase academic ratings, cutting through safeguards and bypassing peer review. A journal discredited itself by accepting the first Mathgen garbage paper in 2012, and 12 years later, Larry has used the same service to breach the ceiling for learned cats.
Reese Richardson generated 12 fake papers by Larry, complete with references and citations, and used them to give the cat a fine profile on Google Scholar, whose ratings universities consult when hiring. The hack is to upload garbage papers to ResearchGate, wait for Google to pick them up, then delete them. They linger on Google Scholar, and the process costs nothing.
This issue is important because the last decade has seen a wave of fake academic journals and papers. In the Covid years, as researchers rushed to bring vaccines to market in record time, standards were relaxed, unvetted preprints became commonplace, and the wave swelled into a tsunami. Authorship is now sold as a commodity, both freelance and via 'paper mills': pay up, and you can be named lead researcher of a real or garbage paper. The number of journals has grown faster than the number of papers, indicating that some of them are garbage bins.
The world's most populous country has contributed to the problem since 2010, when the University Grants Commission (UGC) made it compulsory for Indian college teachers to publish research to advance their careers. Since India has thousands of institutions without the trained staff, funds or infrastructure fit to conduct research, plagiarism and fakery have boomed. Larry's academic career began when Reese Richardson's attention was drawn to a service that offered to improve profiles on Google Scholar. All but two of the scholars it claimed to have helped were Indians.
With the rapid entry of AI into almost all aspects of life, fake content is no longer just an academic concern. For financial gain, AI is being used to fill millions of sites with garbage. Since AIs are trained on terabytes of content scraped from the internet, it is only a matter of time before they start using AI-generated garbage as input.
A paper in Nature has now investigated the garbage-in-garbage-out phenomenon by training generations of AIs on the output of prior generations, instead of human-produced material. The effect resembled inbreeding in biology, which explains the prevalence of haemophilia among European royalty. Starved of human input, the cannibal AIs became incoherent, and by the ninth generation they produced only bizarre garbage. AIs are not independently creative but only emulate human behaviour, so this was inevitable once human input was cut off.
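The mechanism can be sketched in miniature. The toy below is not the Nature experiment, which trained language models; it just fits a Gaussian to samples drawn from the previous generation's fitted Gaussian, so each 'model' learns only from its predecessor's output. The function name and parameters are illustrative, not from any published code. Because every refit on a finite sample discards information about the tails, the fitted distribution drifts away from the original one over the generations.

```python
import random
import statistics

def collapse_demo(generations=9, sample_size=50, seed=42):
    """Toy model-collapse loop: each generation fits a Gaussian to
    samples drawn from the previous generation's fitted Gaussian,
    i.e. it trains only on its predecessor's output.
    Returns the fitted standard deviation after each round."""
    rng = random.Random(seed)
    mean, std = 0.0, 1.0          # generation 0: the "human-made" distribution
    history = [std]
    for _ in range(generations):
        data = [rng.gauss(mean, std) for _ in range(sample_size)]
        mean = statistics.fmean(data)   # refit on the model's own output
        std = statistics.pstdev(data)   # finite samples lose tail information
        history.append(std)
    return history

history = collapse_demo()
print(f"fitted std, generation 0: {history[0]:.3f}; generation 9: {history[-1]:.3f}")
```

Run long enough, the chain's spread collapses toward a point: each generation is a lossy copy of a lossy copy, which is the arithmetic behind the inbreeding analogy above.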
The experiment used only textual material. It remains to be seen what sort of images and sounds result from digital inbreeding. Will the art be like Jackson Pollock or Joan Miró? Will degraded music be like chewed-up Kraftwerk cassettes or like Cacofonix at the feast the moment before Fulliautomatix censors him?
It is clear that machine self-replication, which earlier attracted the attention of scientists like John von Neumann and Freeman Dyson, will fail without human input. Sadly, that destroys the credibility of the most stirring dystopian fiction. We need a new genre in which machines do not hunt humans but farm them, to harvest their culture as input.
The fake content menace is scarier than fiction. Space-fillers that the internet has been fattened on can poison the well for AI projects, but fake academic material is even more dangerous. Governments make decisions on the basis of academic work, affecting millions of lives. Institutions, including those of academia, use it to make projections and create material that can affect almost anything—market conditions, space projects, defence planning, drug development, and teaching material for schools. Early internet marketing gurus used to say that “content is king”, and the dictum was generally right. But in the age of alternative facts, it sounds ominous.
(Views are personal)
(On X @pratik_k)
Pratik Kanjilal | For years, the author has been speaking easy to a surprisingly tolerant public