AI Tools To Mine COVID-19 Literature

Parvathy Hariharan

The age of face masks and social distancing, an unimaginable concept even several months ago, is here for the foreseeable future, thanks to the novel coronavirus. While the public is learning to homeschool with virtual learning and to work over Zoom meetings, scientists are tackling a new challenge: an unprecedented flood of research data on the novel coronavirus. Around 23,000 scientific papers on COVID-19 were released in just the first five months of this year, and that number is expected to increase exponentially in the days to come.

The sheer volume of research being published each day is overwhelming an already broken system of scientific communication, one that lacks a centralized, comprehensive database. This issue existed even before COVID-19. The pandemic has brought to light the critical importance of evidence synthesis and data analysis, and research institutions, publications, and scientists are now sharing their data and resources openly. Now more than ever, a critical piece of research that came out yesterday can literally save lives, enable better use of funding by avoiding redundant research, and build upon the existing knowledge base on one of the largest scales in modern history.

A number of new artificial-intelligence tools have been designed to help scientists tackle the COVID-19 literature. Many of them are freely available to researchers. These tools are also important for fighting what some scientists are calling a pandemic of misinformation and for tracking retractions of scientific publications. Here is a quick list summarizing a few of these tools. It is not a comprehensive summary of all available COVID-19 data-mining tools.

*COVID-19 Data Visualization Nest by Nested Knowledge: Our database presents a comprehensive, updateable literature review whose results can be narrowed down to subpopulations and outcomes of interest, all represented by interactive visuals. Read more on our blog post.

*SciSight by the Allen Institute for AI: This interactive visual tool helps explore the COVID-19 Open Research Dataset (CORD-19) to identify associations between diseases and biological factors such as genes, proteins, enzymes, and other chemicals. It also shows the network of international scientific teams, affiliations, and topics as research clusters.

*The CORD-19 AI challenge hosted by Kaggle: Kaggle, a platform for data scientists and machine learning experts, hosts a research challenge on the most important questions about COVID-19, together with several other institutions such as the National Institutes of Health and the White House. Data scientists can participate in this competition to contribute innovative findings and win cash awards.

*The COVID-19 Research Explorer by Google: This technology uses natural language processing tools to explore the CORD-19 dataset. It encourages users to ask specific, context-based questions that further narrow down the search results.

*The AI deep search tool by IBM Research: This cloud-based tool helps researchers perform deep searches on important research topics in the CORD-19 database and in several other sources, including GenBank and DrugBank. IBM has also developed an IBM Watson Insights for Medical Literature COVID-19 Navigator to extract data from published literature.

*COVID-19 Primer by Primer AI: This tool uses natural language processing to provide daily updates on research findings, news, and social media related to the novel coronavirus and the COVID-19 pandemic.