• Active site prediction of phosphorylated SARS-CoV-2 N-Protein using molecular simulation.

      Sankararaman, Sreenidhi; Hamre, John; Almsned, Fahad; Aljouie, Abdulrhman; Bokhari, Yahya; Alawwad, Mohammed; Alomair, Lamya; Jafri, M Saleet (Elsevier, 2022-02-21)
      The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) nucleocapsid protein (N-protein) is responsible for viral replication by assisting in viral RNA synthesis and attaching the viral genome to the replicase-transcriptase complex (RTC). Numerous studies suggested the N-protein as a drug target. However, the specific N-protein active sites for SARS-CoV-2 drug treatments are yet to be discovered. The purpose of this study was to determine active sites of the SARS-CoV-2 N-protein by identifying torsion angle classifiers for N-protein structural changes that correlated with the respective angle differences between the active and inactive N-protein. In the study, classifiers with a minimum accuracy of 80% determined from molecular simulation data were analyzed by Principal Component Analysis and cross-validated by Logistic Regression, Support Vector Machine, and Random Forest Classification. The ability of torsion angles ψ252 and φ375 to differentiate between phosphorylated and unphosphorylated structures suggested that residues 252 and 375 in the RNA binding domain might be important in N-protein activation. Furthermore, the φ and ψ angles of residue S189 correlated to a 90.7% structural determination accuracy. The key residues involved in the structural changes identified here might suggest possible important functional sites on the N-protein that could be the focus of further study to understand their potential as drug targets. © 2022 The Authors
    • Artificial intelligence for classification of temporal lobe epilepsy with ROI-level MRI data: A worldwide ENIGMA-Epilepsy study

      Gleichgerrcht, Ezequiel; Munsell, Brent C; Alhusaini, Saud; Alvim, Marina K M; Bargalló, Núria; Bender, Benjamin; Bernasconi, Andrea; Bernasconi, Neda; Bernhardt, Boris; Blackmon, Karen; et al. (Elsevier Inc., 2021-07-24)
      Artificial intelligence has recently gained popularity across different medical fields to aid in the detection of diseases based on pathology samples or medical imaging findings. Brain magnetic resonance imaging (MRI) is a key assessment tool for patients with temporal lobe epilepsy (TLE). The role of machine learning and artificial intelligence to increase detection of brain abnormalities in TLE remains inconclusive. We used support vector machine (SV) and deep learning (DL) models based on region of interest (ROI-based) structural (n = 336) and diffusion (n = 863) brain MRI data from patients with TLE with ("lesional") and without ("non-lesional") radiographic features suggestive of underlying hippocampal sclerosis from the multinational (multi-center) ENIGMA-Epilepsy consortium. Our data showed that models to identify TLE performed better or similar (68-75%) compared to models to lateralize the side of TLE (56-73%, except structural-based) based on diffusion data with the opposite pattern seen for structural data (67-75% to diagnose vs. 83% to lateralize). In other aspects, structural and diffusion-based models showed similar classification accuracies. Our classification models for patients with hippocampal sclerosis were more accurate (68-76%) than models that stratified non-lesional patients (53-62%). Overall, SV and DL models performed similarly with several instances in which SV mildly outperformed DL. We discuss the relative performance of these models with ROI-level data and the implications for future applications of machine learning and artificial intelligence in epilepsy care.
    • Bioinformatic and machine learning applications in melanoma risk assessment and prognosis: A literature review

      Ma, Emily Z.; Hoegler, Karl M.; Zhou, Albert E. (MDPI AG, 2021-10-30)
      Over 100,000 people are diagnosed with cutaneous melanoma each year in the United States. Despite recent advancements in metastatic melanoma treatment, such as immunotherapy, there are still over 7,000 melanoma-related deaths each year. Melanoma is a highly heterogeneous disease, and many underlying genetic drivers have been identified since the introduction of next-generation sequencing. Despite clinical staging guidelines, the prognosis of metastatic melanoma is variable and difficult to predict. Bioinformatic and machine learning analyses relying on genetic, clinical, and histopathologic inputs have been increasingly used to risk stratify melanoma patients with high accuracy. This literature review summarizes the key genetic drivers of melanoma and recent applications of bioinformatic and machine learning models in the risk stratification of melanoma patients. A robustly validated risk stratification tool can potentially guide the physician management of melanoma patients and ultimately improve patient outcomes. © 2021 by the authors.
    • A Brief History of AI: How to Prevent Another Winter (A Critical Review)

      Toosi, Amirhosein; Bottino, Andrea G; Saboury, Babak; Siegel, Eliot; Rahmim, Arman (Elsevier Inc., 2021-09-09)
    • Current state of and future opportunities for prediction in microbiome research: Report from the Mid-Atlantic Microbiome Meet-up in Baltimore on 9 January 2019

      Sakowski, E.; Mongodin, E.F.; Regan, M.J. (American Society for Microbiology, 2019)
      Accurate predictions across multiple fields of microbiome research have far-reaching benefits to society, but there are few widely accepted quantitative tools to make accurate predictions about microbial communities and their functions. More discussion is needed about the current state of microbiome analysis and the tools required to overcome the hurdles preventing development and implementation of predictive analyses. We summarize the ideas generated by participants of the Mid-Atlantic Microbiome Meet-up in January 2019. While it was clear from the presentations that most fields have advanced beyond simple associative and descriptive analyses, most fields lack essential elements needed for the development and application of accurate microbiome predictions. Participants stressed the need for standardization, reproducibility, and accessibility of quantitative tools as key to advancing predictions in microbiome analysis. We highlight hurdles that participants identified and propose directions for future efforts that will advance the use of prediction in microbiome research. Copyright 2019 Sakowski et al.
    • A data-driven travel mode share estimation framework based on mobile device location data

      Yang, Mofeng; Pan, Yixuan; Darzi, Aref; Ghader, Sepehr; Xiong, Chenfeng; Zhang, Lei (Springer Nature, 2021-08-12)
      Mobile device location data (MDLD) contains abundant travel behavior information to support travel demand analysis. Compared to traditional travel surveys, MDLD has larger spatiotemporal coverage of the population and its mobility. However, ground truth information such as trip origins and destinations, travel modes, and trip purposes is not included by default. Such important attributes must be imputed to maximize the usefulness of the data. This paper studies the capability of MDLD to estimate travel mode share at aggregated levels. A data-driven framework is proposed to extract travel behavior information from MDLD. The proposed framework first identifies trip ends with a modified Spatiotemporal Density-based Spatial Clustering of Applications with Noise algorithm. Then three types of features are extracted for each trip to impute travel modes using machine learning models. A labeled MDLD dataset with ground truth information is used to train the proposed models, resulting in a 95% recall rate in identifying trip ends and over 93% tenfold cross-validation accuracy in imputing the five travel modes (drive, rail, bus, bike and walk) with a random forest (RF) classifier. The proposed framework is then applied to two large-scale MDLD datasets, covering the Baltimore-Washington metropolitan area and the United States, respectively. The estimated trip distance, trip time, trip rate distribution, and travel mode share are compared against travel surveys at different geographies. The results suggest that the proposed framework can be readily applied in different states and metropolitan regions at low cost in order to study multimodal travel demand, understand mobility trends, and support decision making. © 2021, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
    • Deep learning model for accurate automatic determination of phakic status in pediatric and adult ultrasound biomicroscopy images

      Le, C.; Baroni, M.; Vinnett, A.; Levin, M.R.; Martinez, C.; Jaafar, M.; Madigan, W.P.; Alexander, J.L. (Association for Research in Vision and Ophthalmology Inc., 2020-12-23)
      Purpose: Ultrasound biomicroscopy (UBM) is a noninvasive method for assessing anterior segment anatomy. Previous studies were prone to intergrader variability, lacked assessment of the lens-iris diaphragm, and excluded pediatric subjects. Lens status classification is an objective task applicable in pediatric and adult populations. We developed and validated a neural network to classify lens status from UBM images. Methods: Two hundred eighty-five UBM images were collected in the Pediatric Anterior Segment Imaging Innovation Study (PASIIS) from 80 eyes of 51 pediatric and adult subjects (median age = 4.6 years, range = 3 weeks to 90 years) with lens status phakic, aphakic, or pseudophakic (n = 33, 7, and 21 subjects, respectively). Following transfer learning, a pretrained Densenet-121 model was fine-tuned on these images. Metrics were calculated for testing dataset results aggregated from fivefold cross-validation. For each fold, 20% of total subjects were partitioned for testing and the remaining subjects were used for training and validation (80:20 split). Results: Our neural network trained across 60 epochs achieved recall 96.15%, precision 96.14%, F1-score 96.14%, false positive rate 3.74%, and area under the curve (AUC) 0.992. Feature saliency heatmaps consistently involved the lens. Algorithm performance was compared using 2 image sets, 1 from subjects of all ages, and the second from only subjects under age 10 years, with similar performance under both circumstances. Conclusions: A neural network trained on a relatively small UBM image set classified lens status with satisfactory recall and precision. Adult and pediatric image sets offered roughly equivalent performance. Future studies will explore automated UBM image classification for complex anterior segment pathology. Translational Relevance: Deep learning models can evaluate lens status from UBM images in adult and pediatric subjects using a limited image set. Copyright 2020 The Authors.
    • Limited generalizability of deep learning algorithm for pediatric pneumonia classification on external data

      Xin, Kevin Z; Li, David; Yi, Paul H (Springer Nature, 2021-10-14)
      Purpose: (1) Develop a deep learning system (DLS) to identify pneumonia in pediatric chest radiographs, and (2) evaluate its generalizability by comparing its performance on internal versus external test datasets. Methods: Radiographs of patients between 1 and 5 years old from the Guangzhou Women and Children's Medical Center (Guangzhou dataset) and NIH ChestXray14 dataset were included. We utilized 5232 radiographs from the Guangzhou dataset to train a ResNet-50 deep convolutional neural network (DCNN) to identify pediatric pneumonia. DCNN testing was performed on a holdout set of 624 radiographs from the Guangzhou dataset (internal test set) and 383 radiographs from the NIH ChestXray14 dataset (external test set). Receiver operating characteristic curves were generated, and area under the curve (AUC) was compared via DeLong parametric method. Colored heatmaps were generated using class activation mapping (CAM) to identify important image pixels for DCNN decision-making. Results: The DCNN achieved AUC of 0.95 and 0.54 for identifying pneumonia on internal and external test sets, respectively (p < 0.0001). Heatmaps generated by the DCNN showed the algorithm focused on clinically relevant features for images from the internal test set, but not for images from the external test set. Conclusion: Our model had high performance when tested on an internal dataset but significantly lower accuracy when tested on an external dataset. Likewise, marked differences existed in the clinical relevance of features highlighted by heatmaps generated from internal versus external datasets. This study underscores potential limitations in the generalizability of such DLS models.
    • Machine learning-based prediction of drug and ligand binding in BCL-2 variants through molecular dynamics

      Hamre, John R.; Klimov, Dmitri K.; McCoy, Matthew D.; Jafri, M. Saleet (Elsevier BV, 2022-01)
      Venetoclax is a BH3 (BCL-2 Homology 3) mimetic used to treat leukemia and lymphoma by inhibiting the anti-apoptotic BCL-2 protein thereby promoting apoptosis of cancerous cells. Acquired resistance to Venetoclax via specific variants in BCL-2 is a major problem for the successful treatment of cancer patients. Replica exchange molecular dynamics (REMD) simulations combined with machine learning were used to define the average structure of variants in aqueous solution to predict changes in drug and ligand binding in BCL-2 variants. The variant structures all show shifts in residue positions that occlude the binding groove, and these are the primary contributors to drug resistance. Correspondingly, we established a method that can predict the severity of a variant as measured by the inhibitory constant (Ki) of Venetoclax by measuring the structure deviations to the binding cleft. In addition, we also applied machine learning to the phi and psi angles of the amino acid backbone to the ensemble of conformations that demonstrated a generalizable method for drug resistant predictions of BCL-2 proteins that elucidates changes where detailed understanding of the structure-function relationship is less clear. © 2021 The Authors
    • Optimizing peptide inhibitors of SARS-Cov-2 nsp10/nsp16 methyltransferase predicted through molecular simulation and machine learning.

      Hamre, John R; Jafri, M Saleet (Elsevier, 2022-02-28)
      Coronaviruses, including the recent pandemic strain SARS-Cov-2, use a multifunctional 2'-O-methyltransferase (2'-O-MTase) to restrict the host defense mechanism and to methylate RNA. The nonstructural protein 16 2'-O-MTase (nsp16) becomes active when nonstructural protein 10 (nsp10) and nsp16 interact. Novel peptide drugs have shown promise in the treatment of numerous diseases and new research has established that nsp10 derived peptides can disrupt viral methyltransferase activity via interaction of nsp16. This study had the goal of optimizing new analogous nsp10 peptides that have the ability to bind nsp16 with equal to or higher affinity than those naturally occurring. The following research demonstrates that in silico molecular simulations can shed light on peptide structures and predict the potential of new peptides to interrupt methyltransferase activity via the nsp10/nsp16 interface. The simulations suggest that misalignments at residues F68, H80, I81, D94, and Y96 or rotation at H80 abrogate MTase function. We develop a new set of peptides based on conserved regions of the nsp10 protein in the Coronaviridae species and test these to known MTase variant values. This results in the prediction that the H80R variant is a solid new candidate for potential new testing. We envision that this new lead is the beginning of a reputable foundation of a new computational method that combats coronaviruses and that is beneficial for new peptide drug development.
    • Pancreatic ductal adenocarcinoma: Machine learning-based quantitative computed tomography texture analysis for prediction of histopathological grade

      Qiu, W.; Wang, Z.; Chen, R. (Dove Medical Press Ltd, 2019)
      Purpose: To assess the performance of combining computed tomography (CT) texture analysis with machine learning for discriminating different histopathological grades of pancreatic ductal adenocarcinoma (PDAC). Methods: From July 2012 to August 2017, this retrospective study comprised 56 patients with confirmed histopathological PDAC (32 men, 24 women, mean age 64.04±7.82 years) who had undergone preoperative contrast-enhanced CT imaging within 1 month before surgery. Two radiologists blinded to the histopathological outcome independently segmented lesions for quantitative texture analysis. Histogram features, co-occurrence, and run-length texture were calculated. A support-vector machine was constructed to predict the pathological grade of PDAC based on preoperative texture features. Results: Pathological analysis confirmed 37 low-grade PDAC (five well-differentiated/grade I and 32 moderately differentiated/grade II) and 19 high-grade PDAC (19 poorly differentiated/grade III) tumors. There were no significant differences in clinical or biological characteristics between patients with high-grade and low-grade tumors (P>0.05). There were significant differences between low-grade PDAC and high-grade PDAC on nine histogram features, seven run-length features, and two co-occurrence features. Cluster shade was the most important predictor (sensitivity 0.315). Using these texture features, the support-vector machine achieved 86% accuracy, 78% sensitivity, and 95% specificity. Conclusion: Machine learning-based CT texture analysis accurately predicted histopathological differentiation grade of PDAC based on preoperative texture features, which may help maximize patient survival and achieve personalized precision treatment. Copyright 2019 Qiu et al.
    • Predictors of bovine Schistosoma japonicum infection in rural Sichuan, China.

      Grover, Elise; Paull, Sara; Kechris, Katerina; Buchwald, Andrea; James, Katherine; Liu, Yang; Carlton, Elizabeth J (Elsevier, 2022-05-26)
      In China, bovines are believed to be the most common animal source of human schistosomiasis infections, though little is known about what factors promote bovine infections. The current body of literature features inconsistent, and sometimes contradictory results, and to date, few studies have looked beyond physical characteristics to identify the broader environmental conditions that predict bovine schistosomiasis. Because schistosomiasis is a sanitation-related, water-borne disease transmitted by many animals, we hypothesised that several environmental factors - such as the lack of improved sanitation systems, or participation in agricultural production that is water-intensive - could promote schistosomiasis infection in bovines. Using data collected as part of a repeat cross-sectional study conducted in rural villages in Sichuan, China from 2007 to 2016, we used a Random Forests machine learning approach to identify the best physical and environmental predictors of bovine Schistosoma japonicum infection. Candidate predictors included: (i) physical/biological characteristics of bovines, (ii) human sources of environmental schistosomes, (iii) socio-economic indicators, (iv) animal reservoirs, and (v) agricultural practices. The density of bovines in a village and agricultural practices such as the area of rice and dry summer crops planted, and the use of night soil as an agricultural fertilizer, were among the top predictors of bovine S. japonicum infection in all collection years. Additionally, human infection prevalence, pig ownership and bovine age were found to be strong predictors of bovine infection in at least 1 year. Our findings highlight that presumptively treating bovines in villages with high bovine density or human infection prevalence may help to interrupt transmission. Furthermore, village-level predictors were stronger predictors of bovine infection than household-level predictors, suggesting future investigations may need to apply a broad ecological lens to identify potential underlying sources of persistent transmission.