Khoury professor uses machine learning and data science to unlock secrets of the human genome
Professor Predrag Radivojac uses machine learning and data science to unlock secrets of the human genome
Author: Ysabelle Kempe
Date: 03.03.21
Photo by Liz Linder
Predrag Radivojac works with data so small, it’s almost invisible: molecules. Despite his subjects’ unassuming size, the secrets they hold could revolutionize the ways doctors diagnose and treat patients with issues ranging from cancer to autism.
Radivojac, a Khoury professor and director of the data science master’s program, works at the intersection of computational biology, precision medicine, and machine learning. (Precision medicine is a healthcare approach in which treatment is customized for each individual patient through the use of higher-resolution data, such as gene expression and genome sequence.) Born in Serbia, Radivojac came to the United States to earn his PhD before spending 14 years as a faculty member at Indiana University. In 2018, he joined Khoury College. His current research focuses on how genetic differences among humans impact diseases and disorders.
Even in state-of-the-art medicine, it is not always known whether and why gene mutations cause disease, except in a relatively small number of well-known cases such as sickle cell disease or cystic fibrosis. Currently, more than 100,000 mutations are known to cause disease, but the reasons are not fully understood. This unknown is what motivates Radivojac’s work.
“We study the influence of genetic variants, what they do, and how humans all differ at the molecular level due to genetic differences,” Radivojac said. “We then look at how our genetic makeup translates to the phenotypic level, which is how humans behave, whether we are susceptible to certain diseases, things like that.”
A person’s genes contain the instructions to make protein molecules, which serve various important functions throughout the body. Radivojac and his team are working on machine learning models that, based on genetic and molecular data, can measure and predict what functional alteration is happening in the individual to cause a disease.
These models can help clinicians who see a patient with observable symptoms but do not know which gene to act on. Multiple genes can cause the same symptoms, and the genes responsible for many disorders have not yet been fully identified. This uncertainty arises in 70% or more of clinical cases, Radivojac said. By using algorithmic predictors, clinicians can be more certain which gene mutation is causing the patient’s issues, potentially leading to a treatment or avoiding ineffective treatment based on an incorrect molecular diagnosis.
Radivojac’s work is also used in research to determine the exact molecular events that go wrong due to certain genetic mutations. “There are multiple different causes for how you could arrive at the same symptoms,” Radivojac said. “We want to pinpoint what happened at the molecular level.”
Radivojac’s work shows that, for the same protein, there could be different processes disrupted at the molecular level. Here, human tumor suppressor p53 is used to illustrate the numerous possible effects of amino acid substitutions on protein structure and function. Protein Data Bank IDs for the structures shown are 1TUP, 1YCS, 2J1W, and 2YBG. (Nature Communications, 2020)
In children, these predictive methods could be helpful in diagnosing and treating autism early on, which is the subject of a recent publication in Nature Communications and another paper Radivojac recently submitted. His team pinpointed a few genetic mutations that potentially explain what is happening at a molecular level to cause autism. By keeping an eye on children with these mutations, it is possible to catch and treat autism earlier. Certain therapies could even raise the patient’s cognitive ability, if implemented at a young enough age.
Understanding genetic variation is fundamental in treating cancer as well. The genomes of a patient’s normal cells and tumor cells differ, and understanding those differences can help doctors prevent the tumor’s genome from having an advantage over the healthy, normal cells. Genetics is pertinent in drug prescription too: people with slower metabolisms need lower doses, and those with faster metabolisms need higher doses, as their bodies process the drug more quickly.
Radivojac is hopeful about the way machine learning can transform the medical field, but he also believes there should be guidelines for its use. That’s why, in 2019, he joined a national effort to provide standards for how and when machine learning predictors can be used to support clinical decisions. The computational subgroup he is working with to develop these guidelines is a part of ClinGen’s Sequence Variant Interpretation effort. He also chairs a Center of Critical Assessment of Genome Interpretation.
“Right now, the way people use computational tools in the clinic is highly suboptimal and sometimes wrong,” Radivojac said. “Action will be taken, and sometimes this action can be based on incorrect predictions. At the same time, better models exist that are capable of making correct predictions in a high reliability range, but medicine does not yet capitalize on these enough.”
Machine learning and data science concepts are already hitting clinics, according to Radivojac. He predicts computational tools in medicine will only become more numerous. The models will become more refined and will likely draw on more data than just the human genome, such as electronic health records and a patient’s RNA expression profile. One concept he is sure will gain more traction is whole-genome prediction, which uses a patient’s entire genome to predict health issues they may face. The earlier these predictions can be made, the more effective the treatment for genetic diseases can be.
“All of these techniques, like machine learning, are not yet good enough to cure people,” Radivojac said. “Still, there is enough there that we can do things that would improve human life right now.”