Developed regression-based age-adjusted telomere length (WGS-TL) percentiles using cleaned “control” cohorts from the TOPMed program
Applied these percentiles to dichotomize the PFF-PR Interstitial Lung Disease cohort into high vs. low age-adjusted TL groups
Evaluated group differences via Kaplan-Meier survival curves and Cox proportional hazards models; benchmarked against continuous residuals from a TL ~ Age regression
January 2024 – Present
Conducted rigorous quality control and normalization of whole blood RNA-Seq data from the Pulmonary Fibrosis Foundation Patient Registry (PFF-PR)
Performed unsupervised hierarchical clustering and characterized resulting clusters based on clinical covariates and phenotypes
Executed Gene Set Enrichment Analysis with survival analysis to identify biologically relevant pathways
Projected clusters onto validation cohorts using PCA and Support Vector Machines to distinguish transcriptomic endotypes
August 2022 – May 2023
Conducted a prospective longitudinal clinical trial of 75 patients following catheter ablation to assess novel biomarkers for predicting atrial fibrillation recurrence in a high-dimensional setting (p >> n)
Used correlations, sure independence screening, and random forest models for initial biomarker pre-screening
Applied cross-validated generalized linear models and tree-based methods for final model development
Evaluated improvement in predictive performance compared to baseline clinical characteristics alone
June 2022 – October 2022
Investigated associations between socioeconomic factors and comorbidities in prostate cancer patients receiving androgen deprivation therapy
Conducted retrospective cohort analyses using generalized linear regression, Kaplan-Meier survival analysis, and Cox proportional hazards modeling
Detected increased diabetes risk among prostate cancer patients receiving anti-androgen therapies compared with those receiving GnRH agonists or combined androgen blockade
June 2020 – October 2020
The project, headed by Dr. Nathan Tintle, created a mathematical framework for conducting computationally efficient, exact, multimarker omnibus tests by leveraging individual marker summary statistics from large biobanks
Mixed-methods approach used in this research:
Using matrix algebra, we proved that these omnibus tests could be conducted using only summary statistics provided by biobanks
Using simulations, we demonstrated that our methods were accurate, with the small remaining variance attributable primarily to randomization
Using omega-3 fatty acid data from several cohorts of the Framingham Heart Study, we verified that the methods were highly accurate with low variance in all cases
Communicated results through oral presentations and a written manuscript published in the Annals of Human Genetics