Research Projects


Applying frontal cortex metabolites quantified by 7-Tesla 1H-MRS to predict multiple sclerosis subtype through recursive partitioning and conditional inference trees

MR SCIENCE Lab | Columbia University Medical Center
June 2017 - August 2018


Multiple sclerosis (MS) is an autoimmune disease that impairs the central nervous system by attacking the myelin sheaths of neurons. It affects more than 2.3 million people worldwide. Investigating the relationship between metabolite levels and MS subtype may be key to understanding the metabolic differences between the brains of patients with relapsing-remitting MS (RR-MS) and progressive MS (P-MS), and potentially to treating the disease. Data on the metabolite levels of patients with RR-MS, patients with P-MS, and controls were previously obtained from the frontal cortex of adults using 1H-MRS at 7 Tesla. We created classification trees and performed tree boosting on these data to find the metabolites that most accurately categorize the MS subtypes. Our conditional inference trees showed 44% accuracy in predicting whether an individual had RR-MS, P-MS, or no MS; they also showed that glycine and glutamine were the strongest factors in this classification. Our data highlight the importance of continued investigation into the potential applications of machine learning in analyzing clinical data.


Multiple sclerosis (MS) is an autoimmune disease that impairs the nervous system by attacking the myelin sheaths of nerve cells [1]. Although it affects more than 2.3 million people worldwide, it is diagnosed only through the McDonald Criteria or through differential diagnosis. The McDonald Criteria fall short in that they cannot be applied to patients who do not present symptoms during an MR scan. A potential supplement is to use the levels of metabolites potentially implicated in MS pathogenesis as predictors of the MS subtype a patient has. In this study, we used machine learning to analyze the metabolite levels measured in both RR-MS and P-MS patients. Machine learning can find patterns in data that are not easily found by humans. Furthermore, it applies the same decision rules to every case and is able to learn from previous iterations of the algorithms it uses.


The accuracy of the recursive partitioning tree was 50.32% after cross-validation on the training set and 38.89% after being applied to the test set. The accuracy of the conditional inference tree was 51.02% after cross-validation on the training set and 44.44% after being applied to the test set. The generalized boosted regression model reported a 45.95% accuracy after cross-validation on the training set and a 27.78% accuracy after being applied to the test set.

Furthermore, when both functions were used to classify the cases only as healthy controls or patients with MS (rather than differentiating between RR-MS and P-MS), the recursive partitioning tree had a 64% accuracy after cross-validation on the training set and a 72% accuracy after being applied to the test set. The conditional inference tree had a 64% accuracy after cross-validation on the training set and a 50% accuracy after being applied to the test set.
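The evaluation protocol described above (cross-validated accuracy on a training set, followed by accuracy on a held-out test set) can be sketched in a few lines. This is only an illustration under stated assumptions: it is plain Python rather than the study's own code, a single Gini-impurity split (a decision stump) stands in for the full tree models, the data are synthetic, and every function name is hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y):
    """Return the (feature, threshold, left_label, right_label) split
    that minimizes the weighted Gini impurity of the two children."""
    best = None
    n = len(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, j, t, majority(left), majority(right))
    if best is None:  # no feature varies: always predict the majority class
        label = majority(y)
        return (0, float("inf"), label, label)
    return best[1:]

def predict(stump, row):
    j, t, left_label, right_label = stump
    return left_label if row[j] <= t else right_label

def accuracy(stump, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(stump, r) == label for r, label in zip(X, y)) / len(y)

def cross_val_accuracy(X, y, k=5, seed=0):
    """Mean accuracy over k folds, refitting the stump on each training part."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        stump = fit_stump([X[i] for i in train], [y[i] for i in train])
        scores.append(accuracy(stump, [X[i] for i in fold], [y[i] for i in fold]))
    return sum(scores) / k

# Synthetic two-class data (NOT the study's measurements): feature 0 carries
# a weak signal, feature 1 is pure noise.
rng = random.Random(42)
X, y = [], []
for _ in range(60):
    label = rng.choice(["control", "MS"])
    X.append([rng.gauss(1.0 if label == "MS" else 0.0, 0.5), rng.gauss(0.0, 1.0)])
    y.append(label)

X_train, y_train, X_test, y_test = X[:42], y[:42], X[42:], y[42:]
cv_acc = cross_val_accuracy(X_train, y_train)
test_acc = accuracy(fit_stump(X_train, y_train), X_test, y_test)
print(f"cross-validated accuracy: {cv_acc:.2%}, test accuracy: {test_acc:.2%}")
```

Reporting both numbers, as the results above do, matters because the cross-validated figure is computed on data the model family was tuned on, while the test figure comes from cases never seen during fitting.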

The gradient boosting machine showed that glycine had the most influence on the classification of patient types, with a relative influence of 41.46%. Of the ten metabolites tested, total creatine and scyllo-inositol had no influence on the trees created.
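For readers unfamiliar with relative influence, the sketch below shows one common way such percentages arise: each predictor's accumulated split improvement is scaled so that all predictors sum to 100, and a predictor never used in a split receives 0. This assumes the reported values were normalized in that way, which the text does not state; the predictor names and raw numbers are hypothetical and are not the study's data.

```python
def relative_influence(raw_improvement):
    """Scale per-predictor split improvements so they sum to 100."""
    total = sum(raw_improvement.values())
    if total == 0:
        # No predictor was ever used in a split.
        return {name: 0.0 for name in raw_improvement}
    return {name: 100.0 * value / total for name, value in raw_improvement.items()}

# Hypothetical accumulated improvements for four placeholder predictors.
raw = {"metabolite_a": 5.0, "metabolite_b": 3.0, "metabolite_c": 2.0, "metabolite_d": 0.0}
influence = relative_influence(raw)
for name, pct in sorted(influence.items(), key=lambda item: -item[1]):
    print(f"{name}: {pct:.2f}%")
```

Under this reading, a relative influence of 0 means only that the boosted trees never split on that predictor, not that the metabolite is biologically irrelevant.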

Identification of Novel Phage from New York City for Potential Use as Phage Therapy

tenOever Lab | Icahn School of Medicine at Mount Sinai
April 2016 - May 2017


Viruses abound in nature and show greater diversity than any other nucleic acid-based entity. To sample this diversity, we obtained common environmental samples (e.g., soil, decaying food, and plants) and filtered the material to exclude all but viruses. Filtered eluates were applied to mammalian, yeast, and bacterial cultures, and samples that slowed or stopped growth were further evaluated by next-generation sequencing. We identified a single sample capable of slowing the division of S. aureus and identified the causative agent as a novel phage most closely related to Bacillus AR9. These data illustrate viral complexity and suggest that the vast majority of viruses have yet to be discovered. The slowed division of S. aureus could also point towards the potential of the novel virus to be used against harmful bacteria in lieu of antibiotics.


Viruses are considered to be among the most diverse entities on Earth, yet less than 0.01% of them have been identified and well characterized. Viruses can be either detrimental or beneficial to human life: they can cause widespread lethal epidemics, but they can also be used as natural alternatives to antibiotics. Researchers applied the latter idea at a hospital in Paris to treat a patient infected with a multidrug-resistant strain of the bacterium Acinetobacter baumannii. The use of bacteriophages (viruses that infect bacteria) in that case provided an organic, non-invasive solution to antibiotic resistance. Furthermore, other research, such as a study of Staphylococcus bacteria and related viruses, has consistently reproduced these results and the benefits of using viruses.

Hoping to expand on this research, we surveyed a wide range of samples taken from different New York City environments. We aimed to find a novel virus with the potential to be used as a vector (a DNA molecule used as a vehicle to artificially carry foreign genetic material into another cell) or as a therapeutic (a pharmaceutical drug taken from a biological source). We hypothesized that we would find a novel virus within our samples that would slow or stop the growth of one or more of three model systems: bacterial, yeast, or mammalian cultured cells.

Specific Aims:

Aim 1: Environmental Sample Collection. We propose to collect and catalog diverse environmental samples including soil, rotting vegetation, decaying produce, moldy bread, and other sources of potential interest.  

Aim 2: Sample Processing and Screening. For this aim, we propose to suspend each sample in phosphate-buffered saline (PBS) and filter the resulting solutions to eliminate particles larger than viral size. Eluate will then be applied to host cells of either bacterial, yeast, or mammalian origin.

Aim 3: Sample Characterization. Samples found to cause plaque formation in S. aureus, slowed growth of Pichia, or death of mammalian cells will be expanded and sequenced using the Illumina platform and characterized bioinformatically.


The slowed division of the S. aureus culture exposed to the A(21-24) sample shows that there is organic material on Earth that could be used to combat diseases caused by harmful bacteria, suggesting that antibiotics may not always be necessary. Furthermore, the discovery of the novel phage closely related to Bacillus AR9 supports the existence of beneficial viruses that could limit the ability of harmful bacteria to cause medical problems in humans.

As for the yeast and Vero cells, the failure to identify a virus that could stunt the growth of either culture may be due to the types of samples and where they came from. In addition, the Vero cells were most likely unaffected by the genetic material present in our samples because the samples did not contain substances capable of harming mammalian cells; finding viruses capable of harming such cells would have been hazardous to the lab and researchers.

Future studies should gather larger sample sets from various cities in order to capture a wider diversity of possible microbe-containing samples. They should also test multiple types of bacteria, which would help determine whether viruses can stop the growth of most known bacteria.

