Experience
Principal Clinical Data Scientist/Engineer
FORMATION BIO (Formerly TrialSpark), CLINICAL DATA ANALYTICS AND PROGRAMMING, NEW YORK, NY
MAY 2023 - PRESENT
Formation Bio is a technology company that helps bring treatments to patients faster. Today, clinical trials are the bottleneck to bringing life-saving treatments to patients. Trials are slow, inefficient, and expensive. Formation Bio is using technology to accelerate the pace of clinical trials and bridge the gap between medical research and patients who need treatment.
Senior Data Scientist/Engineer
LEVELS HEALTH, REMOTE
AUGUST 2021 - FEBRUARY 2023
Developed algorithms to personalize the in-app user experience by leveraging machine learning models such as SVM and Random Forest, as well as NLP to build text classification infrastructure.
Built and managed data pipelines using AWS, Python, SQL, and DBT that integrate and reconstruct data from various wearables, such as continuous glucose monitors, smartwatches, and smart scales, into Snowflake.
Worked cross-functionally with various department teams to roadmap data/infrastructure needs, define best analytical practices, prototype features, inform metabolic disease and diabetes research strategy, contribute to company publications and media, and translate business problems into actionable data science initiatives.
Levels makes it easy for people to see how their diet affects both their health and their lifestyle in a quantifiable way by measuring biomarkers in real-time. Levels is expanding access to continuous glucose monitoring and making it mainstream, focused on people looking to find their optimal diet and improve their metabolic fitness.
Data Scientist (Computational Genomics)
ICAHN SCHOOL OF MEDICINE AT MOUNT SINAI, MULTISCALE NETWORK MODELING LAB, NEW YORK, NY
AUGUST 2019 - AUGUST 2021
Used machine learning techniques, such as Random Forest and Support Vector Machine, to assess various polygenic risks generated from Genome-Wide Association Studies (GWAS) and their association with the development of late-stage Alzheimer's disease (AD).
Used maximal information-based nonparametric exploration (MINE) statistics to analyze relationships between RNA-seq, protein expression, and clinical covariate data in AD subjects.
Analyzed the correlation between RNA-seq and protein expression data using differential expression and gene set enrichment analyses (GSEA) to gain more insight into AD pathogenesis, diagnosis, and therapeutics.
Data Scientist
CELGENE CORPORATION (Acq. Bristol Myers Squibb), GLOBAL DRUG SAFETY AND RISK MANAGEMENT, SUMMIT, NJ
JUNE 2017 - AUGUST 2019
Analyzed 470K+ Individual Case Safety Reports (ICSRs) generated from clinical trials and postmarket drugs, comprising just short of a billion individual data points, using R, Python, and data visualization tools.
Used various statistical techniques, such as Principal Component Analysis (PCA), to perform feature reduction and classification of adverse event (AE) case processing data.
Used machine learning techniques like Random Forest and Logistic Regression to optimize and automate the AE case processing workflow.
Data Analyst
GOVERNOR ANDREW M. CUOMO, NEW YORK, NY
JULY 2013 - JUNE 2015
Analyzed constituent and fundraising data using statistical analysis techniques to inform campaign resource allocation.
Led weekly briefs to senior officials about campaign financials, events, and fundraisers.
Supported coordination of Governor Cuomo’s campaign events by identifying and solving logistical problems to optimize donations.
Technical and Business Skills
Python, pandas, PyTorch, TensorFlow, scikit-learn, UNIX/Linux, R, R Shiny, SQL, Java, MATLAB, Git
Machine Learning, Deep Learning, Data Engineering, Natural Language Processing, Time Series Analysis
AWS (S3, EC2, Lambda), Snowflake, Redshift, PostgreSQL, Data Build Tool (DBT), GitHub Actions, CircleCI, High-Performance Computing (HPC), Fivetran, Docker, Posit
Strong leadership skills with a focus on cross-functional teamwork, curiosity, and technical excellence
Experience working with Sponsors and Contract Research Organizations (CROs)
HIPAA, GDPR, GxP Training
Awards and Organizations
Mount Sinai Alumni Network, mentor for Biomedical Data Science students
Barnard College Alumni Network, mentor for women in STEM
On Deck Data Science (ODDS) Fellow
Barnard Quantitative Society
NYS Department of Education Scholarship of Academic Excellence
New York State School Music Association (NYSSMA) (levels 4, 5, 6 Piano)
Japanese National Honor Society
Volunteer Work
January 2016 - Columbia University Global Brigades, Public Health Brigade, El Retiro, Honduras
June 2015 - Putnam Hospital Center, Clinical Volunteer, Carmel, NY
June 2013 - Global Leadership Adventures, Public Health and Sustainable Development, Kilimanjaro, Tanzania
June 2012 - Habitat For Humanity, Yonkers, NY
Languages
Russian (Conversational)
Japanese (Intermediate)
German (Introductory)
Swahili (Introductory)