Identification of immune correlates of protection in Shigella infection by application of machine learning
Journal: Journal of Biomedical Informatics
Publisher: Academic Press Inc.
Abstract

Background: Immunologic correlates of protection are important in vaccine development because they give insight into mechanisms of protection, assist in the identification of promising vaccine candidates, and serve as endpoints in bridging clinical vaccine studies. Our goal is the development of a methodology to identify immunologic correlates of protection using the Shigella challenge as a model.

Methods: The proposed methodology utilizes the Random Forests (RF) machine learning algorithm as well as Classification and Regression Trees (CART) to detect immune markers that predict protection, identify interactions between variables, and define optimal cutoffs. Logistic regression modeling is applied to estimate the probability of protection, and the confidence interval (CI) for this probability is computed by bootstrapping the logistic regression models.

Results: The results demonstrate that the combination of Classification and Regression Trees and Random Forests complements the standard logistic regression and uncovers subtle immune interactions. Specific levels of immunoglobulin IgG antibody in blood on the day of challenge predicted protection in 75% (95% CI 67-86). Of those subjects that did not have blood IgG at or above a defined threshold, 100% were protected if they had IgA antibody secreting cells above a defined threshold. Comparison with the results obtained by applying only logistic regression modeling with the standard Akaike Information Criterion for model selection shows the usefulness of the proposed method.

Conclusion: Given the complexity of the immune system, the use of machine learning methods may enhance traditional statistical approaches. When applied together, they offer a novel way to quantify important immune correlates of protection that may help the development of vaccines.
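The Methods summary combines two ideas that a small example may make more concrete: a tree-style search for an optimal cutoff on an immune marker, and a bootstrap confidence interval for a logistic-regression probability of protection. The following is only a minimal sketch under stated assumptions, not the authors' implementation. It uses the Python standard library only, a single hypothetical standardized marker, synthetic data, one Gini-impurity split in place of a full CART tree, and plain gradient ascent in place of a statistics package. All function names (`gini_split_cutoff`, `fit_logistic`, `bootstrap_ci`, `demo`) are illustrative.

```python
import math
import random


def gini_split_cutoff(values, labels):
    """Return the cutoff minimizing weighted Gini impurity (one CART-style split)."""
    def gini(group):
        if not group:
            return 0.0
        p = sum(group) / len(group)
        return 2.0 * p * (1.0 - p)

    pairs = sorted(zip(values, labels))
    best_cutoff, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no boundary between identical marker values
        cutoff = (pairs[i][0] + pairs[i - 1][0]) / 2.0
        left = [y for x, y in pairs if x < cutoff]
        right = [y for x, y in pairs if x >= cutoff]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_cutoff, best_score = cutoff, score
    return best_cutoff


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, steps=200, lr=0.5):
    """Fit P(protected | x) = sigmoid(b0 + b1 * x) by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def bootstrap_ci(xs, ys, x_new, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the predicted probability of protection at x_new."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample subjects with replacement
        b0, b1 = fit_logistic([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(sigmoid(b0 + b1 * x_new))
    preds.sort()
    lo = preds[int((alpha / 2.0) * n_boot)]
    hi = preds[max(int((1.0 - alpha / 2.0) * n_boot) - 1, 0)]
    return lo, hi


def demo():
    # Synthetic data (an assumption, not study data): x is a standardized marker
    # level, y is 1 if the simulated subject was protected.
    rng = random.Random(42)
    xs = [rng.gauss(0.0, 1.0) for _ in range(60)]
    ys = [1 if rng.random() < sigmoid(1.5 * x) else 0 for x in xs]
    cutoff = gini_split_cutoff(xs, ys)
    b0, b1 = fit_logistic(xs, ys)
    lo, hi = bootstrap_ci(xs, ys, x_new=cutoff)
    print(f"cutoff={cutoff:.2f} p(protected at cutoff)={sigmoid(b0 + b1 * cutoff):.2f} "
          f"95% CI=({lo:.2f}, {hi:.2f})")


if __name__ == "__main__":
    demo()
```

The percentile bootstrap is used here because it is the simplest variant to show. The abstract does not say which bootstrap interval the authors computed, and a Random Forests variable-importance step, which the study also relies on, is omitted from this sketch.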
Sponsors: This work was supported by the National Institute of Allergy and Infectious Diseases, National Institutes of Health (Cooperative Center for Translational Research in Human Immunology and Biodefense; CCHI; M.B.S) [grant U19 AI082655]; and the Career Development Award, CDA J.K.S [grant K23-AI065759].
Keywords: Classification and Regression Trees; Correlate of protection; Random Forests algorithm
Identifier to cite or link to this item: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85028443503&doi=10.1016%2fj.jbi.2017.08.005&partnerID=40&md5=8074152c0d769e0035de5d3de96e8d3f; http://hdl.handle.net/10713/11264