What is GIN?
GIN (Gene Interaction Network) is a system for browsing articles and molecule interaction information. What makes GIN stand out from other similar systems is that it uses automated methods (such as dependency parsing) to mine the text for relevant information (such as protein interactions) and computes statistics for the interaction network. The user can browse articles with highlighted summary sentences, citing sentences (sentences from other articles that cite the article in question), and interaction sentences. The user can also browse molecules to view their interactions, neighborhood, and other network statistics.
GIN uses text mining and network analysis methods to predict gene-disease associations. A disease-specific interaction network is built from an initial list of seed genes that are known to be related to the disease. All interactions among the seed genes and the genes that interact with them are extracted automatically from the literature. Centrality metrics are then used to rank the genes in the resulting network. The hypothesis is that the central genes in these networks (the inferred disease genes) are likely to be related to the disease.
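The network construction and ranking steps described above can be illustrated with a minimal sketch. This is not GIN's actual implementation: the function names, the placeholder gene names (GENE_A, GENE_B, ...), and the toy interaction list are all hypothetical, and degree centrality is assumed here only as one simple example of a centrality metric.

```python
# Illustrative sketch only; names and data are hypothetical, not taken from GIN.

def build_disease_network(interactions, seed_genes):
    """Build an undirected network over the seed genes and their direct interactors."""
    seeds = set(seed_genes)

    # Node set: the seed genes plus every gene that interacts with a seed gene.
    nodes = set(seeds)
    for a, b in interactions:
        if a in seeds:
            nodes.add(b)
        if b in seeds:
            nodes.add(a)

    # Edge set: every interaction whose two endpoints are both in the node set.
    graph = {gene: set() for gene in nodes}
    for a, b in interactions:
        if a != b and a in nodes and b in nodes:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def rank_by_degree_centrality(graph):
    """Rank genes by degree centrality (assumed metric): degree / (n - 1)."""
    n = len(graph)
    if n < 2:
        return [(gene, 0.0) for gene in sorted(graph)]
    scores = {gene: len(neighbors) / (n - 1) for gene, neighbors in graph.items()}
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy interaction pairs, standing in for interactions mined from the literature.
toy_interactions = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
    ("GENE_C", "GENE_D"),
    ("GENE_D", "GENE_E"),
    ("GENE_E", "GENE_F"),
]
toy_seeds = ["GENE_A", "GENE_B"]

network = build_disease_network(toy_interactions, toy_seeds)
ranking = rank_by_degree_centrality(network)

for gene, score in ranking:
    label = "seed" if gene in toy_seeds else "inferred"
    print(f"{gene}\t{score:.2f}\t{label}")
```

In this toy example, GENE_C is not a seed gene but ranks as highly as the seed gene GENE_A, which is the kind of candidate the hypothesis would flag as an inferred disease gene. GENE_E and GENE_F are left out because neither interacts directly with a seed gene.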
The gene names in GIN are normalized to their official HGNC (HUGO Gene Nomenclature Committee) symbols.
Corpus and Network Statistics
GIN uses articles from PubMed Central Open Access. It currently has access to 48,245 articles and 43,956 citations between these articles. So far, 9,291 interaction sentences covering 2,422 molecules have been extracted from 2,888 of these articles.
The statistics for the disease-specific networks built so far are:
- Prostate Cancer: 226 molecules and 2,767 interaction sentences

