Experience


Principal Clinical Data Scientist/Engineer

FORMATION BIO (Formerly TrialSpark), CLINICAL DATA ANALYTICS AND PROGRAMMING, NEW YORK, NY

MAY 2023 - PRESENT

Formation Bio is a technology company that helps bring treatments to patients faster. Today, clinical trials are the bottleneck to bringing life-saving treatments to patients. Trials are slow, inefficient, and expensive. Formation Bio is using technology to accelerate the pace of clinical trials and bridge the gap between medical research and patients who need treatment.


Senior Data Scientist/Engineer

LEVELS HEALTH, REMOTE

AUGUST 2021 - FEBRUARY 2023

  • Developed algorithms to personalize the in-app user experience by leveraging machine learning models such as SVM and Random Forest, as well as NLP to build text classification infrastructure.

  • Built and managed data pipelines using AWS, Python, SQL, and DBT that integrate and reconstruct data from wearables, such as continuous glucose monitors, smartwatches, and smart scales, into Snowflake.

  • Worked cross-functionally with teams across departments to roadmap data/infrastructure needs, define best analytical practices, prototype features, inform metabolic disease and diabetes research strategy, contribute to company publications and media, and translate business problems into actionable data science initiatives.

Levels makes it easy for people to see how their diet affects both their health and their lifestyle in a quantifiable way by measuring biomarkers in real time. Levels is expanding access to continuous glucose monitoring and making it mainstream, focused on people looking to find their optimal diet and improve their metabolic fitness.


Data Scientist (Computational Genomics)

ICAHN SCHOOL OF MEDICINE AT MOUNT SINAI, MULTISCALE NETWORK MODELING LAB, NEW YORK, NY

AUGUST 2019 - AUGUST 2021

  • Used machine learning techniques, such as Random Forest and Support Vector Machine, to assess various polygenic risks generated from Genome-Wide Association Studies (GWAS) and their association with the development of late-stage Alzheimer's disease (AD).

  • Used maximal information-based nonparametric exploration (MINE) statistics to analyze relationships between RNA-seq, protein expression, and clinical covariate data in AD subjects.

  • Analyzed the correlation between RNA-seq and protein expression data using differential expression and gene set enrichment analyses (GSEA) to gain more insight into AD pathogenesis, diagnosis, and therapeutics.


Data Scientist

CELGENE CORPORATION (Acq. Bristol Myers Squibb), GLOBAL DRUG SAFETY AND RISK MANAGEMENT, SUMMIT, NJ

JUNE 2017 - AUGUST 2019

  • Analyzed 470K+ Individual Case Safety Reports (ICSRs) generated from clinical trials and postmarket drugs, comprising nearly a billion individual data points, using R, Python, and data visualization tools.

  • Used various statistical techniques, such as Principal Component Analysis (PCA), to perform feature reduction and classification of adverse event (AE) case processing data.

  • Used machine learning techniques like Random Forest and Logistic Regression to optimize and automate the AE case processing workflow.


Data Analyst

GOVERNOR ANDREW M. CUOMO, NEW YORK, NY

JULY 2013 - JUNE 2015

  • Analyzed constituent and fundraising data using statistical analysis techniques to inform campaign resource allocation.

  • Led weekly briefings for senior officials on campaign financials, events, and fundraisers.

  • Supported coordination of Governor Cuomo’s campaign events by identifying and solving logistical problems to optimize donations.


Technical and Business Skills

  • Python, pandas, PyTorch, TensorFlow, scikit-learn, UNIX/Linux, R, R Shiny, SQL, Java, MATLAB, Git

  • Machine Learning, Deep Learning, Data Engineering, Natural Language Processing, Time Series Analysis

  • AWS (S3, EC2, Lambda), Snowflake, Redshift, PostgreSQL, Data Build Tool (DBT), GitHub Actions, CircleCI, High-Performance Computing (HPC), Fivetran, Docker, Posit

  • Strong leadership skills with a focus on cross-functional teamwork, curiosity, and technical excellence

  • Experience working with Sponsors and Contract Research Organizations (CROs); HIPAA, GDPR, and GxP training

Awards and Organizations

  • Mount Sinai Alumni Network, mentor for Biomedical Data Science students

  • Barnard College Alumni Network, mentor for women in STEM

  • On Deck Data Science (ODDS) Fellow

  • Barnard Quantitative Society

  • NYS Department of Education Scholarship of Academic Excellence

  • New York State School Music Association (NYSSMA) (levels 4, 5, 6 Piano)

  • Japanese National Honor Society

Volunteer Work

  • January 2016 - Columbia University Global Brigades, Public Health Brigade, El Retiro, Honduras

  • June 2015 - Putnam Hospital Center, Clinical Volunteer, Carmel, NY

  • June 2013 - Global Leadership Adventures, Public Health and Sustainable Development, Kilimanjaro, Tanzania

  • June 2012 - Habitat For Humanity, Yonkers, NY

Languages

  • Russian (Conversational)

  • Japanese (Intermediate)

  • German (Introductory)

  • Swahili (Introductory)