Engineering Manager / Staff Research Scientist
Instagram · Full-time
Mar 2022 - Present
• 3 yrs 5 mos
Supporting a team of 20+ engineers working on Instagram Relevance Integrity.
My team protects 100% of Instagram users from harmful and unwanted experiences on Instagram's largest surfaces:
- Instagram Connected Surfaces (Feed, Stories)
- Instagram Recommendation Surfaces (Reels, Explore, Feed Recommendations)
- Threads
Trust & Safety, Machine Learning, Recommendation Systems, LLM
Staff Research Scientist
Facebook
Jun 2020 - Present
• 5 yrs 2 mos
Marketplace Integrity
- Trust and safety on Facebook Marketplace
- Detection of and protection against scams, spam, and fraud.
ServiceNow
Mar 2016 - Jun 2020
Staff Data Scientist
Mar 2019 - Jun 2020
• 1 yr 4 mos
Anomaly Detection, Time Series Analysis, Natural Language Processing, Text Mining
Senior Data Scientist
Jul 2017 - Feb 2019
• 1 yr 8 mos
★★ Winner of Excellent in Execution Award (Q4 2018) ★★
Data Scientist
Mar 2016 - Jun 2017
• 1 yr 4 mos
Feb 2015 - Jan 2016
• 1 yr
First data scientist hired; built everything from scratch.
The product I was responsible for was:
-- Jazz Crowd. Jazz Crowd is the premier big data solution for the recruiting and performance management industries. Jazz Crowd will help companies improve processes and results related to hiring peak-performing employees in the human capital management industry.
Some of the problems I tackled were:
-- Customer Conversion Analysis
-- Customer Churn Analysis
-- Resume and Candidate Analysis
-- Large-scale Job Title Classification
-- Large-scale Skills Extraction and Parsing
-- Academic Institution Name Entity Normalization
Some of the models I used were: Tree-based Models (Decision Tree, Random Forest, Gradient Boosting), Support Vector Machine, Logistic Regression, KNN, Latent Dirichlet Allocation, Naive Bayes, K-Means Clustering, TF-IDF, Vector Space Model, Topic Modeling.
Some of the tools I used were: Python (NumPy, SciPy, scikit-learn, Matplotlib, Django), Java, R, JavaScript, D3.js, Highcharts.js, MySQL, PHP.
Intern - Statistician
Equifax
Aug 2014 - Dec 2014
• 5 mos
As a statistician in the analytics group at Equifax, my major research project was building a suite of firmographics inference models using machine learning approaches, under the supervision of Dr. Yin. These models were tested and validated on 500M+ records.
My major accomplishments were:
• Researched information retrieval and text mining techniques and developed a text feature extraction algorithm for messy textual data.
• Built an ensemble classification system combining multiple machine learning techniques, including Logistic Regression, Support Vector Machine, Naive Bayes, KNN, Random Forests, and Gradient Boosting.
• Achieved an accuracy of more than 87% on first-digit SIC code prediction and 79% on first-two-digit SIC code prediction.
Tools and techniques used: Python, SAS, R, Bag-of-words, TF-IDF, Ensemble Learning, Logistic Regression, Support Vector Machine, Naive Bayes, KNN, Random Forests, Gradient Boosting, Cross-validation.
Research Assistant
Prof. Yao Xie's group, Georgia Institute of Technology
Sep 2013 - Dec 2014
• 1 yr 4 mos
• Conducted research on online dimensionality reduction.
• Proposed a new algorithm, online sufficient dimension reduction.
• Analyzed various high-dimensional time series data sets.
• Coded a Python module for real-time multi-gigabyte seismic sensor data processing, cleaning and parsing.
• Supervised by Prof. Yao Xie, School of Industrial & Systems Engineering (ISyE)