Thanh M. Nguyen

BioInformatics, HealthTech, ML, NLP

LinkedIn
GitHub
Email

Welcome to my pages

Data Science

Osteoporosis prediction using machine learning

In this project, I performed exploratory data analysis (EDA) on a case-control dataset for osteoporosis, controlling for demographic variables such as age, gender, and ethnicity. I then built predictive models, including logistic regression (LR), random forest (RF), support vector machine (SVM), and XGBoost (XGB). The Gradient Boosting Classifier and Logistic Regression achieved the highest ROC AUC scores. Although the dataset is balanced, the predictions tend to be biased towards the negative class (no osteoporosis): the True Negative and False Negative counts are always higher than the True Positive and False Positive counts. This trend is consistent across the confusion matrices of all 6 ML models.
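As a rough illustration of the negative-class bias described above, the sketch below counts confusion-matrix cells for a balanced toy sample. The labels and predictions are hypothetical values made up for the example, not results from the project, and the helper name `confusion_counts` is an assumption rather than code from the project itself.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = osteoporosis, 0 = no osteoporosis)."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == 1:
            counts["TP" if truth == 1 else "FP"] += 1
        else:
            counts["TN" if truth == 0 else "FN"] += 1
    return {key: counts[key] for key in ("TP", "FP", "TN", "FN")}


# Hypothetical toy data: 5 cases and 5 controls (balanced classes).
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical predictions that lean towards the negative class.
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

cm = confusion_counts(y_true, y_pred)
predicted_negative = cm["TN"] + cm["FN"]  # everything the model called negative
predicted_positive = cm["TP"] + cm["FP"]  # everything the model called positive
print(cm, predicted_negative, predicted_positive)
```

When the classes are balanced but `predicted_negative` is clearly larger than `predicted_positive`, the model is calling "no osteoporosis" more often than the data warrants, which is the pattern described in the confusion matrices.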




COVID-19 Vaccine adverse symptoms in VAERS with Association Rule Mining

To address vaccine hesitancy, I studied adverse symptoms reported after COVID-19 vaccination using VAERS data. VAERS (Vaccine Adverse Event Reporting System) is the public surveillance system co-managed by the CDC and FDA to detect rare vaccine adverse events. By applying association rule mining, I identified the top adverse symptoms for COVID-19 vaccines and compared the differences in adverse symptoms between the Moderna and Pfizer vaccines.
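For readers unfamiliar with association rule mining, here is a minimal sketch of the core idea, restricted to pairs of symptoms. It computes support, confidence, and lift by brute force. The reports are hypothetical toy data, not VAERS records, and `pair_rules` is an illustrative helper, not the pipeline used in the project (which would more likely rely on an Apriori-style library over full itemsets).

```python
from itertools import combinations


def pair_rules(reports, min_support=0.3):
    """Return (antecedent, consequent, support, confidence, lift) for symptom pairs."""
    n = len(reports)
    sets = [set(r) for r in reports]
    items = sorted(set().union(*sets))
    # Support of one symptom = fraction of reports that mention it.
    support = {item: sum(item in s for s in sets) / n for item in items}
    rules = []
    for a, b in combinations(items, 2):
        # Support of the pair = fraction of reports that mention both symptoms.
        both = sum(a in s and b in s for s in sets) / n
        if both < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = both / support[x]  # estimate of P(y | x)
            lift = confidence / support[y]  # > 1 means x and y co-occur more than by chance
            rules.append((x, y, both, confidence, lift))
    # Strongest associations first.
    return sorted(rules, key=lambda r: r[4], reverse=True)


# Hypothetical toy reports; each list holds the symptoms named in one report.
reports = [
    ["headache", "fever", "fatigue"],
    ["headache", "fever"],
    ["fatigue", "chills"],
    ["headache", "fever", "chills"],
    ["fatigue"],
]

rules = pair_rules(reports, min_support=0.4)
for antecedent, consequent, sup, conf, lift in rules:
    print(f"{antecedent} -> {consequent}: support={sup:.2f}, confidence={conf:.2f}, lift={lift:.2f}")
```

Running the same function separately on reports for each vaccine brand and comparing the resulting rule lists is one simple way to contrast symptom patterns between two groups.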



Visualization app in RShiny using MAUDE dataset

MAUDE (Manufacturer and User Facility Device Experience) is the FDA's passive surveillance dataset for medical devices. I built a visualization app that shows the temporal trends in harm levels using the 2016 MAUDE data.


Natural Language Processing

Coming soon.