The unknown knowns: a graph-based approach for temporal COVID-19 literature mining

Bayram U., Roy R., Assalil A., Benhiba L.

ONLINE INFORMATION REVIEW, vol.45, no.4, pp.687-708, 2021 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 45 Issue: 4
  • Publication Date: 2021
  • Doi Number: 10.1108/oir-12-2020-0562
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Social Sciences Citation Index (SSCI), Scopus, Academic Search Premier, FRANCIS, ABI/INFORM, Aerospace Database, Applied Science & Technology Source, CINAHL, Communication Abstracts, Compendex, Computer & Applied Sciences, EBSCO Education Source, Education Abstracts, Information Science and Technology Abstracts, INSPEC, Library and Information Science Abstracts, Library Literature and Information Science, Library, Information Science & Technology Abstracts (LISTA), Metadex, vLex, DIALNET, Civil Engineering Abstracts
  • Page Numbers: pp.687-708
  • Keywords: COVID-19, Semantic graphs, Natural language processing, Link prediction, Machine learning
  • Çanakkale Onsekiz Mart University Affiliated: Yes


The COVID-19 pandemic has sparked a remarkable volume of research literature, and scientists increasingly need intelligent tools to cut through the noise and uncover relevant research directions. In response, we propose a novel framework built on a weighted semantic graph model that compresses the research studies efficiently. We also present two analyses on this graph that offer alternative ways to uncover additional aspects of COVID-19 research.
We construct the semantic graph by applying state-of-the-art Natural Language Processing (NLP) techniques to more than 100,000 COVID-19 publication texts. Next, we conduct an evolutionary analysis to capture the changes in COVID-19 research over time. Finally, we conduct a link prediction study to detect novel COVID-19 research directions that are so far undiscovered.
Findings reveal the success of the semantic graph in capturing scientific knowledge and its evolution. Meanwhile, the prediction experiments achieve 79% accuracy in returning intelligible links, showing the reliability of the methods for predicting novel connections that could help scientists discover potential new directions.
To our knowledge, this is the first study to propose a holistic framework that encodes scientific knowledge in a semantic graph, demonstrates an evolutionary examination of past and ongoing research, and offers scientists tools to generate new hypotheses and research directions through predictive modeling and deep machine learning techniques.
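
To make the pipeline described above more concrete, the following is a minimal sketch of a weighted term graph and a link prediction step. It is not the authors' implementation: the abstract does not specify how nodes, weights, or predictions are computed, so the co-occurrence weighting, the Adamic-Adar score, the function names, and the toy term lists below are all assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations
import math


def build_graph(docs):
    """Build {term: {neighbour: weight}}; weight counts how many
    documents mention both terms (an assumed weighting scheme)."""
    graph = defaultdict(lambda: defaultdict(int))
    for terms in docs:
        for a, b in combinations(sorted(set(terms)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def adamic_adar(graph, a, b):
    """Adamic-Adar score: shared neighbours count more when they are rare.
    A shared neighbour has degree >= 2, so log(degree) is never zero."""
    common = set(graph[a]) & set(graph[b])
    return sum(1.0 / math.log(len(graph[z])) for z in common)


def predict_links(graph, top_k=3):
    """Rank term pairs that are not yet linked by their Adamic-Adar score."""
    scored = []
    for a, b in combinations(sorted(graph), 2):
        if b not in graph[a]:
            score = adamic_adar(graph, a, b)
            if score > 0:
                scored.append((a, b, score))
    scored.sort(key=lambda t: (-t[2], t[0], t[1]))
    return scored[:top_k]


# Toy, hypothetical input: each list stands for the key terms of one text.
docs = [
    ["covid-19", "ace2", "spike protein"],
    ["covid-19", "remdesivir"],
    ["ace2", "hypertension"],
    ["remdesivir", "ebola"],
]

graph = build_graph(docs)
ranked = predict_links(graph)
for a, b, score in ranked:
    print(f"{a} -- {b}: {score:.3f}")
```

An evolutionary analysis in this spirit could rebuild the graph from the texts published up to successive dates and compare the resulting edge weights between snapshots; that too is an assumption about one possible approach rather than a description of the paper's method.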

Creative Commons Attribution Non-commercial International Licence 4.0 (CC BY-NC 4.0)