Aside

Shun Xie

Contact Info

Skills

Disclaimer

This resume was made with the R package pagedown.

Last updated on 2023-10-03.

Main

Shun Xie

Education

Imperial College London

MSci (Bachelor's and Master's) in Mathematics

London, UK

2018

Undergraduate GPA: 3.7/4.0, Graduate GPA: 3.83/4.0, First-Class Honours Degree (UK)

Thesis: Correlation between unemployment and earnings using Distance Correlation

Columbia University

M.S. in Biostatistics

New York, US

2022

First Year GPA: 4.0/4.3

Professional Experience

Data Analyst Intern

Yum China

Shanghai, China

May 2023 - Aug 2023

  • Identified holiday-driven anomalies in the data by applying a CNN to time series data
  • Rebuilt the weekly report in HiveSQL after consulting the operations team; improved SOP ability by helping the operations team maintain the weekly report dashboard
  • Based on an A/B test of the new AI recommendation strategy on cltv1 (low-frequency users), discovered an improvement in ARPU but a decrease in transaction frequency; proposed splitting cltv1 into new and sleeping member groups to address the problem
  • Improved customer value by constructing a user churn model using LightGBM/MLP (AUC: 0.86) and defined the KPI for churn model evaluation via customer return rate
  • Offered a data-driven solution using Shapley values and WOE during the weekly meeting; presented the solution to the operations team using the Altair/Seaborn visualization packages
  • Summarized the key attributes of churned users through K-means clustering on PySpark

Data Analyst, Intern

Caitong Securities Asset Management Co. Ltd.

Shanghai, China

Jun 2020 - Aug 2020

  • Improved accuracy by imputing missing values using a self-developed formula
  • Analyzed data extracted from Wind and generated an analysis report on the features of public equity funds
  • Investigated the idiosyncrasies of mutual funds with more than 5 billion subscriptions and produced a research report

Data Analyst, Intern

Hycon Research Co. Ltd. 

Shanghai, China

Jul 2019 - Sep 2019

  • Contributed to a market research project to optimize the design of a new towelette product and boost sales
  • Conducted conjoint analysis to identify the weights of the product properties that attract consumers’ attention
  • Generated 8 random profiles comprising the four factors and gathered 60 responses to the profiles
  • Implemented the algorithm using R and determined that the production location was the major concern

Research Experience

Study of Life Expectancy

Group Leader, Columbia University

New York

Nov 2022 - Dec 2022

  • Built the main page of the website using R and HTML and published the website on GitHub Pages
  • Imputed the dataset using k-means imputation with 4 groups based on countries’ income level and verified it using k-means clustering
  • Based on Pearson and distance correlation, confirmed that linear regression is sufficient for life expectancy analysis and achieved an adjusted R-squared of 0.78
  • Selected as the exemplary project and displayed on the lecture’s webpage

Correlation between Unemployment and Wages

Master's Thesis, Imperial College London

London, UK

Oct 2021 - Jun 2022

  • Applied Distance Correlation in a new field and captured an additional 10% of non-linear correlations
  • Implemented spatial regression with the approximate profile-likelihood estimator (APLE) to address nonzero spatial correlation
  • Iterated over confounders and concluded that the correlation arises from confounders under different time lags

Depression Status Predicted by COVID-19 Associated Behavior Change

Group Leader, Yale University

Remote

Jun 2021 - Sep 2021

  • Increased accuracy by an average of 6% after replacing selected models (Logistic Regression, Mixed Logistic Regression, Random Forest, k-NN) with Linear SVM
  • Improved sensitivity by 0.6 in k-NN and LR using under-sampling and lower-sampling to tackle imbalanced data
  • Identified that Random Forest achieved the optimal F-score (0.787) and sensitivity (0.750) using 4-fold cross-validation

Evaluation of Supervised Classification Models for Image Recognition

Academic, Imperial College London

London, UK

Jan 2021 - Mar 2021

  • Built and optimized a Multilayer Perceptron (MLP) model to prevent overtraining, based on its performance on the CIFAR-10 dataset
  • Compared the MLP and CNN models; concluded that the CNN model achieved 20% higher accuracy at epoch 40 and saved 7 megabytes of data

Comparison of Different Word Embedding Models based on IMDb Reviews

Group Leader, Imperial College London

London, UK

Jun 2020

  • Compared LSA, GloVe, and Word2vec models via theoretical analysis using intrinsic and extrinsic evaluation
  • Discovered no global difference among the models after applying sentiment analysis to IMDb movie reviews

Activities

Buddhism

In charge of organizing Zoom meetings and activities.

Shanghai, China

Sep 2023 - Present

QunYao Consulting

Taught algebra and differential equations to A-level students.

Shanghai, China

Jun 2018 - Jul 2018

Selected Publications

Comparison of Models’ Performance for Predicting Depression under COVID-19.

Accepted by the 2021 International Conference on Statistics, Applied Mathematics and Computing Science (CSAMCS 2021)

Shanghai, China

2021

Shun Xie