Research

Methodology Interests

Ordered by experience:

High-dimensional regression and visualization
- General regularization
- Graph penalization
Statistical learning
Batch effect correction and differential expression analysis
Tensor regression

Application Interests

Ordered by experience:

Metabolomics
High-dimensional data
- Longitudinal
- General tensor data
Multi-omics
- Particularly microbiomic and proteomic data
High-frequency wearable data

My research thus far has largely involved developing methods for specific types of data and research questions brought by clinical collaborators. That is, I work data-first, applying existing methods or developing new ones to answer the specific questions posed by domain experts. Though my educational training has been in statistical theory and general applications, my experience with collaborative research has taught me how to listen to both my collaborators and the data itself.

As an undergraduate, I worked with Dr. Irina Gaynanova to develop the R package iglu for continuous glucose monitor (CGM) data, a convenient tool that lets both clinical and methodological researchers calculate important metrics and produce illuminating visualizations. Through this project I learned good practices in R package development and the value of tools like Shiny apps in lowering the computational barrier to entry for non-statisticians.

As a graduate student, I have worked with my Cornell advisors, Dr. Martin T. Wells and Dr. Sumanta Basu, and my Weill Cornell Medicine advisor, Dr. Myung Hee Lee, on a wide range of projects with applications to diabetes and tuberculosis (TB). We work closely with domain experts at Weill, and this experience has developed my communication skills and my ability to balance statistical instinct with domain knowledge. In addition to developing the model PROLONG and its corresponding software, I have worked on batch effect correction and differential expression analysis for protein data associated with diabetes and for metabolite and microbiome data associated with tuberculosis. Recently I have become very interested in tensor regression and tensor decompositions because of their potential for the TB data I primarily work on, as well as for a wide range of other health data. By keeping tensor data in its native form instead of vectorizing it and imposing a different structure, we may find both computational and methodological advances.