• Extraction of Drug-Gene Relationships from Literature Using Gene Functional Contexts

      Llamas, Eduardo; Kann, Maricel G. (2013)
      Drug side effects and toxicity are often the result of off–target drug–gene interactions that affect biological processes unrelated to the focus of treatment. Drugs with multiple drug targets often fail in clinical trials due to limited knowledge of their inherent polypharmacology. The biomedical literature is rich in drug–gene relationship data, but much of it remains inaccessible, further hindering comprehensive knowledge of drug–gene interactions. Thus, accessible, high quality drug information databases providing comprehensive drug–gene information are critical for research in pharmacology, toxicology, and pharmacogenomics. Current resources for drug–gene information rely on manual curation, which can be inefficient and expensive. Automatic methods that accelerate the identification of drug–gene relationships in the biomedical literature therefore have great potential for increasing the coverage and efficiency of drug–gene annotations. The text–mining method Literature Extraction of Drug–Gene relationships (LEx–DG), described here, identifies drug–gene relationships in biomedical abstracts when the target genes can be grouped into biological function–relevant clusters. These clusters identify GO terms, pathways, protein domains, sequence motifs, and protein–protein interaction networks that may be affected or targeted by a drug, thereby increasing the precision of the method and leading to new hypotheses about possible functional relationships among a drug's multiple gene targets. In comparison with the PGx pipeline, a well–established drug–gene relationship prediction method, LEx–DG achieves significantly higher precision and F–score in identifying known drug–gene relationships. Manual curation of LEx–DG results for gemcitabine led to the identification of 46 relationships that were not previously annotated in PharmGKB or the Comparative Toxicogenomics Database. The results demonstrate the feasible application of LEx–DG for large–scale annotation of drug–gene relationships to facilitate updates of drug–gene interaction resources.