Monday, December 15, 2014

DisGeNET - a database of gene-disease associations

One of the most challenging problems in biomedical research is to understand the underlying mechanisms of complex diseases. Great effort has been spent on finding the genes associated with diseases (Botstein and Risch, 2003; Kann, 2009). However, growing evidence indicates that most human diseases cannot be attributed to a single gene but arise from complex interactions among multiple genetic variants and environmental risk factors (Hirschhorn and Daly, 2005). Several databases that store associations between genes and diseases have been developed, such as CTD™ (Davis et al., 2014), OMIM® (Hamosh et al., 2005) and GAD (Becker et al., 2004). Each of these databases focuses on different aspects of the phenotype-genotype relationship and, due to the nature of the database curation process, none of them is complete. Hence, integrating different databases with information extracted from the literature is needed to provide a comprehensive view of the state-of-the-art knowledge in this research field. With this need in mind, we have created DisGeNET.
DisGeNET is a discovery platform integrating information on gene-disease associations from several public data sources and the literature. The current version contains 381056 associations between 16666 genes and 13172 diseases. Given the large number of gene-disease associations compiled in DisGeNET, we have also developed a score to rank the associations based on the supporting evidence. Tools have also been created to explore and analyze the data contained in DisGeNET. DisGeNET can be queried through the Search and Browse functionalities available from this web interface, or through a plugin created for Cytoscape to query and analyze a network representation of the data. Moreover, DisGeNET data can be queried locally by downloading the SQLite database. Furthermore, an RDF (Resource Description Framework) representation of the DisGeNET database is also available. It can be queried using our SPARQL endpoint and a Faceted Browser. Follow the link for more information.
The DisGeNET database has been cited in several papers. Some of them can be reviewed here.
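As a rough illustration of querying a local SQLite copy and ranking associations by score, here is a minimal Python sketch. The table name `gene_disease`, its columns, and the placeholder rows are hypothetical and only serve to make the example runnable; they are not the actual schema or content of the DisGeNET SQLite file, which should be checked before adapting the query.

```python
import sqlite3

# Hypothetical schema: the table and column names below are illustrative
# assumptions, not the actual layout of the DisGeNET SQLite file.
SCHEMA = """
CREATE TABLE gene_disease (
    gene_symbol TEXT NOT NULL,
    disease_name TEXT NOT NULL,
    score REAL NOT NULL
);
"""

# Placeholder rows with made-up scores, used only to make the sketch runnable.
TOY_ROWS = [
    ("GENE_A", "disease_1", 0.12),
    ("GENE_B", "disease_1", 0.87),
    ("GENE_C", "disease_2", 0.45),
    ("GENE_D", "disease_1", 0.30),
]


def build_toy_database():
    """Create an in-memory database that stands in for the downloaded file."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO gene_disease VALUES (?, ?, ?)", TOY_ROWS)
    conn.commit()
    return conn


def top_genes_for_disease(conn, disease_name, limit=10):
    """Return (gene_symbol, score) pairs for one disease, highest score first."""
    cursor = conn.execute(
        "SELECT gene_symbol, score FROM gene_disease "
        "WHERE disease_name = ? ORDER BY score DESC LIMIT ?",
        (disease_name, limit),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    conn = build_toy_database()
    for gene, score in top_genes_for_disease(conn, "disease_1", limit=2):
        print(gene, score)
    conn.close()
```

With a real download, `build_toy_database()` would be replaced by `sqlite3.connect()` on the path of the local file, and the query would be adapted to the tables that file actually contains.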
The DisGeNET database is made available under the Open Database License. Any rights in individual contents of the database are licensed under the Database Contents License.



NOTE to R users: If you are using our automatically generated R scripts to perform your queries, some R versions may have issues with the "rawToChar" function. To solve them, remove that function call and run the request as follows:

dataTsv <- getURLContent(url, readfunction = charToRaw(oql), upload = TRUE, customrequest = "POST")

News

November 26-27, 2014: Our PostDoc, Núria Queralt Rosinach, was invited as a Linked Data expert to the RDConnect/Elixir/BioMedBridges BYOD workshop named 'Rare Disease Registries (and biobanks)'. DisGeNET was brought to the event as a linkable dataset to show the value of Linked Data. Our involvement in the Open PHACTS project was highlighted during the workshop. See more at BYOD Workshop.
October 23, 2014: We have published DisGeNET as nanopublications, a new Linked Dataset that uses the nanopublication approach and the Trusty URI technique. For more information, see the DisGeNET Nanopublications section.
October 15, 2014: DisGeNET appears for the first time in the LOD cloud diagram (2014-08-30 update). This diagram shows datasets published in Linked Data format. It is built from their metadata descriptions on the DataHub and from metadata extracted from a crawl of the Linked Data web (DisGeNET DataHub site here).
October 2, 2014: Sequence variant information is now available in DisGeNET! More than 8,000 SNPs associated with disease have been annotated using text mining. Check some examples here. More information coming soon...
September 16, 2014: We have updated the metadata description of the DisGeNET Linked dataset, see it at DisGeNET VoID.
August 27, 2014: An update of the DisGeNET Cytoscape tutorial has been made available.
August 8, 2014: A tutorial to illustrate the functionalities of the web interface has been made available. Follow the link for more information.
July 1, 2014: The update of DisGeNET RDF has been released (version 2.1.0). It links 13172 diseases and 16666 genes through 381056 gene-disease associations in the Semantic Web, represented by a new data model with more annotations and linkouts. All the new data and related information can be found at http://rdf.disgenet.org/
June 16, 2014: A bug in the DisGeNET plugin concerning the coloring of nodes by disease class has been fixed. We recommend that users download the plugin and the database again. For more information, contact support(at)disgenet(dot)org
May 28, 2014: The DisGeNET domain name has been moved to www.disgenet.org. Our faceted browser is now at http://rdf.disgenet.org/fct/ and our SPARQL endpoint at http://rdf.disgenet.org/sparql/.
May 5, 2014: A new version of DisGeNET has been released (version 2.1). We have new data from text mining using our BeFree system.
April 26-27, 2014: Janet Piñero and Núria Queralt participated in the Network of Biothings Hackathon.
April 24, 2014: DisGeNET RDF, integrated in the Web of Linked Data, can now be navigated via a newly implemented faceted browser.
April 16, 2014: The paper describing the Biomedical Named Entity Recognition (BioNER) used to extract and identify genes/proteins and diseases in the DisGeNET BeFree dataset is out! Check it here.
April 11, 2014: The paper introducing the Semanticscience Integrated Ontology (SIO), into which the DisGeNET association type ontology has been integrated, is out! Check it here.
February 5, 2014: New release of DisGeNET available, with updated info and data from two new resources: Text mining (TEXTM) and Rat Genome Database (RGD).
December 10, 2013: "DisGeNET RDF: a gene-disease association Linked Open Data resource" was presented at the 2013 edition of SWAT4LS in Edinburgh. Poster. Poster Abstract.
September, 2013: An RDF (Resource Description Framework) representation of the DisGeNET database has been created that can be queried using our SPARQL endpoint. Making DisGeNET data available as RDF Linked Open Data promotes integration with other RDF representations of resources in the Semantic Web. Follow the link for more information.
February, 2013: The DisGeNET association type ontology developed in our group has been integrated in the Semanticscience Integrated Ontology (SIO), an integrated ontology of types and relations for rich description of objects, processes and their attributes. Thanks to Dr. Michel Dumontier for accepting this collaboration and helping us with the integration.
January, 2013: DisGeNET registered with the Neuroscience Lexicon NeuroLex.
November 30, 2012: “DisGeNET: from MySQL to Nanopublication, modelling gene-disease associations for the Semantic Web” will be presented at the SWAT4LS in Paris.
November, 2012: DisGeNET announced in the November issue of the International Society for Biocuration Newsletter.
October 8-9, 2012: DisGeNET presented at the SME Bioinformatics Forum, in Barcelona.
July 20, 2012: New release of DisGeNET available, with updated info and data from two new resources: Genetic Association Database (GAD) and Mouse Genome Database (MGD).
