Reimagining Georgetown University Systems Medicine Graduate Program through Biomedical Data Science Research


With the sequencing of the human genome and the availability of high-power computational methods and various high-throughput technologies, the biomedical sciences and medicine are undergoing a revolutionary change. The new field of systems medicine is the application of systems biology approaches and tools to biomedical problems.

In medicine and biomedical research, complex computational tools will become essential for deriving personalized assessments of disease risk and management, including individualized diagnosis, prognosis, and treatment options. This change, which involves analyzing enormous quantities of varied data, will require a new type of physician and biomedical researcher: one with a grasp of modern computational sciences, “-omic” technologies (genomics, proteomics, metabolomics, transcriptomics, etc.), and a systems approach to medicine.

Machine Learning for Biomedical Data 2018

One important and emerging area missing from the Systems Medicine curriculum at Georgetown University was machine learning. In September 2018, Pine Biotech, USA, in collaboration with Dr. Sona Vasudevan, Program Director of Systems Medicine, started a pilot course on Machine Learning for Biomedical Data at the Georgetown University Medical Center, offered to students in the Master's Systems Medicine Program. This enhancement to the curriculum prepares students to work on high-impact projects from industry and research partners. A limited number of students were accepted to join this project-based educational experience.



Here is a student project that was carried out as part of completing the program:

Unsupervised clustering suggests multiple classes of opioid use disorder patients

Opioid use disorder (OUD) is a chronic physical and psychological dependence on opiates, leading to clinically significant impairment or distress. Currently, there is a huge unmet need to better understand the impacts of OUD on the brain.

Yeonjoo Hwang conducted a research project to examine the transcriptional changes defining different subsets of opioid use disorder patients using differential gene expression analysis and unsupervised clustering.


Her findings suggested that inflammatory pathways are enriched in OUD, with a subset of OUD patients experiencing other neurological changes that may require different interventions. This not only calls for analysis of additional data types that may provide a more comprehensive view of OUD but also suggests that those with OUD may benefit from patient stratification and precision therapy.
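To illustrate the general idea of unsupervised clustering on expression data, here is a minimal sketch. It is not the project's actual pipeline: the source does not name the clustering algorithm, so k-means from scikit-learn is an assumption, and the expression matrix is synthetic data generated only for this example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for an expression matrix (samples x genes).
# Two hypothetical patient subsets with different mean expression levels.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=(20, 50))
group_b = rng.normal(loc=3.0, scale=1.0, size=(20, 50))
expression = np.vstack([group_a, group_b])

# Standardize each gene so that no single gene dominates the distances.
scaled = StandardScaler().fit_transform(expression)

# k-means is an assumed choice; the number of clusters is also an assumption.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(scaled)

# Count how many samples fall into each cluster.
print(np.bincount(cluster_ids))
```

On real data the number of clusters is not known in advance, so it would typically be chosen by comparing several values rather than fixed at two.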


Project Link 

Machine Learning for Biomedical Data (Python) 2022

As Python continues to be the preferred language for data science and machine learning, students showed growing interest in carrying out machine learning projects in Python. As a result, in September 2022, Pine Biotech offered another course, “Machine Learning for Biomedical Data (Python)”. In line with the program director's interests, the lectures addressed challenges relevant to clinical application, biomedical research, and industry, including big pharma and biotech firms. The training ended with students submitting their research projects on the OmicsLogic Learn Portal.

Georgetown University Student Projects 2022

Let us take a look at the various research projects completed by students from Georgetown University.

A Behavioral Risk Factor Surveillance System-Based Diabetes Predictive Model

Diabetes is a chronic disease that causes serious complications and can lead to death when left unmanaged. A diabetes predictive model is highly valuable in the clinical setting for identifying individuals at high risk of developing the disease.

Rima AlHamad worked on creating a machine learning model that predicts diabetes and classifies patients into one of three categories to help them manage their health. As a future step, the dataset will be converted from three classes to two, with non-diabetic as one class and pre-diabetic/diabetic as the other.
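The three-to-two class conversion described as a future step could look roughly like the sketch below. The numeric encoding (0 = non-diabetic, 1 = pre-diabetic, 2 = diabetic) and the names THREE_TO_TWO and collapse_labels are hypothetical, chosen only for illustration; the project's actual label coding is not given here.

```python
from collections import Counter

# Hypothetical three-class encoding (an assumption, not taken from the project):
# 0 = non-diabetic, 1 = pre-diabetic, 2 = diabetic
THREE_TO_TWO = {0: 0, 1: 1, 2: 1}


def collapse_labels(labels):
    """Map three-class labels to two: 0 = non-diabetic, 1 = pre-diabetic or diabetic."""
    return [THREE_TO_TWO[label] for label in labels]


# Toy labels, made up for this example.
three_class = [0, 2, 1, 0, 0, 2, 1, 0]
two_class = collapse_labels(three_class)

print(Counter(three_class))  # class counts before merging
print(Counter(two_class))    # class counts after merging
```

Merging the two smaller classes changes the class balance, so it is worth comparing the counts before and after, as the two print statements do.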


Project Link 


To know more, watch her presentation recording: 



Arthritis Tissue Classification Using Random Forest 

Osteoarthritis and rheumatoid arthritis are chronic conditions that cause pain and stiffness in the joints. Early diagnosis of either condition can be very important for preventing severe outcomes such as deformity or organ failure.

Anna-Danielle Bashorun worked on creating a machine learning model to accurately classify tissue samples into the categories of osteoarthritis, regular, and rheumatoid arthritis. Through this work, the features that are most important to the prognosis of rheumatoid arthritis or osteoarthritis can be discovered.
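As a rough illustration of how a random forest can both classify samples into three categories and rank features by importance, here is a minimal sketch using scikit-learn. The data is synthetic (generated with make_classification), and the sample count, feature count, and hyperparameters are assumptions for this example rather than the project's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic three-class data standing in for tissue samples (rows) and
# expression features (columns); this is not the project's dataset.
X, y = make_classification(
    n_samples=300,
    n_features=20,
    n_informative=5,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)

# Hold out a stratified test set so every class appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Mean accuracy on the held-out samples.
accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")

# Rank features by impurity-based importance and keep the top five.
ranked = sorted(
    enumerate(model.feature_importances_), key=lambda pair: pair[1], reverse=True
)
top_features = ranked[:5]
for index, importance in top_features:
    print(f"feature {index}: {importance:.3f}")
```

The importances here are impurity-based scores that sum to 1 across features; they indicate which features the model relied on, not that a feature is causally linked to the disease.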


Project Link 


To know more, watch her presentation recording: 



Potential Predictor & Diagnostic Blood Biomarkers For Post-Traumatic Stress Disorder

Post-traumatic stress disorder (PTSD) is a psychiatric disorder characterized by failure to recover after experiencing or witnessing a terrifying event. The pathophysiology of PTSD is yet to be elucidated as it is a complex psychiatric disorder that has many of the same symptoms as other neurological diseases.

Michelle Biete worked on identifying gene expression patterns among PTSD patients. Through this work, molecular biomarkers and therapeutic targets can be identified to improve the effectiveness of treatment.


Project Link 


To know more, watch her presentation recording: 



A Predictive Model to Guide Monkeypox Testing Based on Clinical Presentation

Monkeypox is considered an emerging infectious disease. Only 70 labs exist in the United States, and in the first month of the outbreak fewer than 2,000 Americans were allowed to be tested.

Rohan Harris worked on developing a predictive model for guiding monkeypox testing based on clinical presentation. This model can be useful in identifying and quarantining patients during the early stages of an outbreak, particularly when testing resources are limited. 


Project Link 


To know more, watch his presentation recording: 



Exploration Towards A Predictive Model for Post-Chemotherapy Neurodegeneration based on RNA-Seq Data

Brain fog is one of the most commonly reported side effects of chemotherapy treatments for cancers. However, there is limited research into the gene expression changes that occur during the condition.

Jordan Henry worked on identifying genes related to the cognitive side effects commonly seen in chemotherapy patients. These markers can then be used to develop tests that predict the risk of developing neurodegenerative diseases without resorting to invasive or imprecise measures.


Project Link 


To know more, watch his presentation recording: 



Acute Myeloid Leukemia Gene Expression 

Acute Myeloid Leukemia (AML) is a severe form of blood cancer that begins in the bone marrow and affects the development of myeloid cells resulting in an accumulation of these cells in the bone marrow and the peripheral blood.

Sanjna Prasad worked on exploring machine learning classification models to accurately classify gene expression data by tissue type, either cancerous (AML) or normal. Her project identified potential biomarkers, as the genes selected by the models have important functions in cell cycle and immune system regulation.


Project Link 


To know more, watch her presentation recording: 



Exploring Breast Cancer in Stromal vs Epithelial Cells

After skin cancer, breast cancer is the most common type of cancer among women. Evaluating the environment that surrounds the body's cells is important to furthering cancer research and saving patients.

Khushbu Shah worked on identifying significant stromal and epithelial cells in normal and cancerous breast tissue using a variety of machine learning techniques to create a model. Her findings will be helpful for scientific research, as these techniques can be used for other cancers and for evaluating the effect of stromal cells on disease progression.


Project Link 


To know more, watch her presentation recording: 



Brain Tumor Subtype Exploration For Candidate Biomarkers

Pediatric brain tumors are the leading cause of death among childhood cancers. There is limited research on biomarker targets in pediatric brain cancer.

Cutler Simpson worked on investigating the gene expression differences among brain cancer subtypes using supervised classification. The findings indicate the potential for unique biomarkers for each of the ependymoma, pilocytic astrocytoma, medulloblastoma, and glioblastoma brain cancer subtypes.


Project Link 


To know more, watch his presentation recording: 



OmicsLogic Research Fellowship Program (Machine Learning Specialization Track)




If you too are interested in entering the field of machine learning and carrying out a research project, enroll in the OmicsLogic Research Fellowship Program. The program has been designed to help young researchers and students use bioinformatics resources to analyze complex life science data and become well versed in bioinformatics. It offers a combination of online resources and mentor guidance to prepare you for, and help you complete, a bioinformatics project.


To know more, visit: 


If you have any queries about the program details or registration process, please reach out to us at

