Wikidata as a knowledge graph for the life sciences

Version 2 2021-02-11, 23:49
Version 1 2020-12-10, 02:48
journal contribution
posted on 2021-02-11, 23:49 authored by Andra Waagmeester, Gregory Stupp, Sebastian Burgstaller-Muehlbacher, Benjamin M Good, Malachi Griffith, Obi L Griffith, Kristina Hanspers, Henning Hermjakob, Toby S Hudson, Kevin Hybiske, Sarah M Keating, Magnus Manske, Michael Mayers, Daniel Mietchen, Elvira Mitraka, Alexander R Pico, Timothy Putman, Anders Riutta, Nuria Queralt-Rosinach, Lynn M Schriml, Thomas Shafee, Denise Slenter, Ralf Stephan, Katherine Thornton, Ginger Tsueng, Roger Tu, Sabah Ul-Hasan, Egon Willighagen, Chunlei Wu
© Waagmeester et al.

Wikidata is a community-maintained knowledge base that has been assembled from repositories in the fields of genomics, proteomics, genetic variants, pathways, chemical compounds, and diseases, and that adheres to the FAIR principles of findability, accessibility, interoperability and reusability. Here we describe the breadth and depth of the biomedical knowledge contained within Wikidata, and discuss the open-source tools we have built to add information to Wikidata and to synchronize it with source databases. We also demonstrate several use cases for Wikidata, including the crowdsourced curation of biomedical ontologies, phenotype-based diagnosis of disease, and drug repurposing.

Funding

National Institute of General Medical Sciences, R01 GM089820 (Andrew I Su); National Institute of General Medical Sciences, U54 GM114833 (Henning Hermjakob, Andrew I Su); National Institute of General Medical Sciences, R01 GM100039 (Alexander R Pico); National Human Genome Research Institute, R00HG007940 (Malachi Griffith); National Cancer Institute, U24CA237719 (Malachi Griffith); V Foundation for Cancer Research, V2018-007 (Malachi Griffith); National Institute of Allergy and Infectious Diseases, R01 AI126785 (Kevin Hybiske); National Center for Advancing Translational Sciences, UL1 TR002550 (Andrew I Su).

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

National Institute of General Medical Sciences | R01 GM089820

National Institute of General Medical Sciences | U54 GM114833

National Institute of General Medical Sciences | R01 GM100039

National Human Genome Research Institute | R00HG007940

National Cancer Institute | U24CA237719

V Foundation for Cancer Research | V2018-007

National Institute of Allergy and Infectious Diseases | R01 AI126785

National Center for Advancing Translational Sciences | UL1 TR002550

History

Publication Date

2020-03-17

Journal

eLife

Volume

9

Article Number

e52614

Pagination

15p. (p. 1-15)

Publisher

eLife Sciences Publications

ISSN

2050-084X

Rights Statement

The Author reserves all moral rights over the deposited text and must be credited if any re-use occurs. Documents deposited in OPAL are the Open Access versions of outputs published elsewhere. Changes resulting from the publishing process may therefore not be reflected in this document. The final published version may be obtained via the publisher’s DOI. Please note that additional copyright and access restrictions may apply to the published version.