Across the globe, scientific reputations are being damaged as inexperienced researchers come under pressure to publish research papers. Research fraud is committed by fabricating or falsifying data and reporting incorrect findings, and this trend is eroding the integrity of science.
A recent Finnish study indicated that between 2010 and 2014, the number of articles published by “predatory” journals rose from 53,000 to about half a million. Predatory journals are publications that charge fees for fake peer review and publication. Reportedly, the Chinese government cracked down on more than 400 authors for damaging China’s scientific reputation with fraudulent research papers.
In a bid to build up their resumes, young researchers are pushed to publish scientific research with fabricated data, which hurts the community and makes it difficult for journals to sort through the noise. Reportedly, almost two percent of scientists admitted to falsifying data, and almost 34% admitted to using questionable research practices; the same survey indicated that 14.2% of respondents had observed falsification by colleagues. Beyond that, tools like SCIgen are being used to generate valid-looking articles. In 2013, the IEEE reportedly pulled 120 papers from its publications after they were found to be computer-generated.
Interestingly, a decade ago, researchers Jeremy Stribling, Dan Aguayo and Max Krohn of MIT’s Computer Science and Artificial Intelligence Lab (CSAIL) built a computer science paper generator that could stitch together nonsense papers, complete with impressive-looking graphs. Papers drummed up by the SCIgen software were accepted at major conferences and even by reputed journals. Reportedly, SCIgen has been leveraged by scores of academics to publish journal articles and conference proceedings with reputed publishers such as Springer and the Institute of Electrical and Electronics Engineers (IEEE).
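SCIgen stitches papers together by repeatedly expanding a context-free grammar. The toy grammar below is purely illustrative (it is not SCIgen's actual rule set), but it shows the mechanism: each placeholder symbol expands into a randomly chosen template until only plain words remain.

```python
import random

# Toy grammar in the spirit of SCIgen (illustrative only, not its real rules).
# Keys are placeholder symbols; each expands to one randomly chosen template.
GRAMMAR = {
    "<title>": ["<adj> <noun> for <noun>", "Towards the <adj> <noun>"],
    "<adj>": ["scalable", "probabilistic", "decentralized"],
    "<noun>": ["epistemologies", "checksums", "hash tables"],
}

def expand(symbol, rng):
    """Recursively expand a grammar symbol into plain text."""
    if symbol not in GRAMMAR:
        return symbol  # terminal word: emit as-is
    template = rng.choice(GRAMMAR[symbol])
    return " ".join(expand(token, rng) for token in template.split())

print(expand("<title>", random.Random(0)))  # a plausible-looking nonsense title
```

Because the output is grammatical but meaning-free, it can pass a superficial skim, which is exactly why such papers slipped into conference proceedings.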
However, French computer scientist Cyril Labbé has spent a considerable amount of time flagging these SCIgen papers with machine learning tools.
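Labbé's detection work is based on measuring how lexically close a document is to known generated text. As a minimal sketch (not his actual algorithm), one can compare relative word frequencies: a submission whose vocabulary profile sits very close to a reference corpus of generated papers gets flagged.

```python
from collections import Counter

def freq_vector(text):
    """Relative word frequencies of a text."""
    words = text.lower().split()
    return {w: c / len(words) for w, c in Counter(words).items()}

def intertextual_distance(a, b):
    """Normalized Manhattan distance between word-frequency profiles.
    0.0 means identical vocabulary usage; values near 1.0 mean disjoint."""
    fa, fb = freq_vector(a), freq_vector(b)
    vocab = set(fa) | set(fb)
    return sum(abs(fa.get(w, 0.0) - fb.get(w, 0.0)) for w in vocab) / 2

# Hypothetical examples: a known generated snippet, a suspect, a genuine one.
generated_ref = "we use flexible symmetries to harness the partition table"
suspect = "we harness flexible symmetries to use the partition table"
genuine = "the patients were randomized into two treatment groups"

print(intertextual_distance(generated_ref, suspect))  # small: same vocabulary
print(intertextual_distance(generated_ref, genuine))  # large: little overlap
```

A real detector would use much larger reference corpora and more robust distance measures, but the principle is the same: generated text reuses a narrow, characteristic vocabulary.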
Now, AI and machine learning techniques are being used to dramatically improve how research is conducted and published. Some of the top ways AI is benefiting the scientific community are:
- NLP tools are being used to fight plagiarism and identify sections that have been reworded
- AI and machine learning techniques are being deployed to find flawed or misreported data. AI can detect how statistics were applied to arrive at a certain outcome, and whether data was manipulated to get a desired result
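Reworded-text detection can be sketched with a simple character n-gram comparison: a paraphrased passage still shares most of its short character sequences with the original, while unrelated text shares almost none. This is a minimal illustration, not a production plagiarism checker.

```python
def char_ngrams(text, n=4):
    """Set of character n-grams; tolerant of minor rewording and reordering."""
    t = " ".join(text.lower().split())  # normalize whitespace and case
    return {t[i:i + n] for i in range(len(t) - n + 1)}

def similarity(a, b, n=4):
    """Jaccard similarity of character n-gram sets (1.0 = identical)."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    return len(ga & gb) / len(ga | gb) if ga | gb else 1.0

# Hypothetical examples to show the contrast:
original = "the data were fabricated to support the hypothesis"
reworded = "the data was fabricated in order to support the hypothesis"
unrelated = "neural machine translation improves peer review workflows"

print(similarity(original, reworded))   # high: mostly shared n-grams
print(similarity(original, unrelated))  # low: almost no shared n-grams
```

Real tools layer in stemming, synonym dictionaries and sentence embeddings to catch deeper paraphrasing, but n-gram overlap remains a common first-pass filter.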
Other Tools
Machine Box: To counter the menace of fake research, Machine Box packages ML capabilities into Docker containers so that developers can quickly incorporate NLP, facial detection, object recognition and more into their apps. It is also cheaper than comparable cloud services, and the data never leaves the organisation’s or individual’s infrastructure.
Algorithmic tool to mine research papers: In a similar vein, Daniel Acuna, Assistant Professor at Syracuse University’s School of Information Studies, and his team revealed how they successfully implemented an algorithmic approach to mine nearly 800,000 biomedical papers and 2 million images for duplication. According to Acuna, ML can be used to detect duplicate images even when they have been rotated or skewed in some way.
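One common building block for duplicate-image detection is a perceptual hash: a compact fingerprint that stays stable under global edits such as brightness changes. The sketch below implements a simple difference hash (dHash) over a grayscale pixel grid; Acuna's actual pipeline is more sophisticated, and handling rotation or skew (as the article notes) requires additional ML techniques on top.

```python
def dhash(pixels):
    """Difference hash of a grayscale image given as a 2D list of intensities.
    Real pipelines first resize the image (e.g. to 9x8 pixels); the hash encodes
    only left-vs-right gradients, so uniform edits leave it unchanged."""
    bits = []
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits.append(1 if left < right else 0)
    return bits

def hamming(a, b):
    """Number of differing bits; small values suggest duplicate images."""
    return sum(x != y for x, y in zip(a, b))

# Tiny hypothetical 'image' and a brightened copy (a global edit):
img = [[10, 20, 30],
       [30, 20, 10]]
brightened = [[p + 5 for p in row] for row in img]

print(hamming(dhash(img), dhash(brightened)))  # → 0 (same fingerprint)
```

Comparing hashes across millions of figures is cheap, which is what makes screening a corpus of 2 million images tractable in the first place.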
How AI Helps In Automating Peer Review Process
Besides combating research fraud, AI can also automate parts of the peer review process. Machine translation tools such as Google Translate, which now uses neural machine translation, can help improve the review of manuscripts written by non-native speakers. While computers may play an increasingly useful role in editorial and peer review processes, a human-in-the-loop element will still be required.
However, peer reviewers cannot be replaced by machines: humans are still needed to judge language competence. Since machines are only as good as the people who program them, machine intelligence cannot yet keep pace with the frontier of scientific research. Humans are therefore needed to evaluate and provide feedback on manuscripts, and to feed information back to computers to help them improve. Likewise, administrators handling research manuscripts will continue to be necessary for dealing with the unexpected: answering questions and managing projects.