Michelle Zheng - Portfolio | Regis University

About Me

I plan to graduate in May of this year and would like to move into a systems report engineering role or a Business Intelligence developer role. In my free time, I like to listen to true crime podcasts and go hiking with family and friends.

Practicum Projects

MSDS 692

Predictive Analytics for Workforce Stability: Identifying Attrition Factors with Supervised Machine Learning

High employee turnover poses significant financial and organizational challenges across industries. This practicum project applies an end-to-end data science approach to predict employee attrition using supervised machine learning models. The project encompasses data acquisition, preprocessing, exploratory analysis, model development, and evaluation using Python. Multiple algorithms are compared to identify the most effective predictive model and to uncover key drivers of attrition such as job satisfaction, compensation, and work-life balance.
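As a rough illustration of the model comparison step described above, the sketch below ranks two candidate models by AUC-ROC on a held-out set. The metric choice, the labels, the scores, and the names `model_a` and `model_b` are all hypothetical; the project's actual algorithms and evaluation metrics are not specified here.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the probability that a random positive example
    is scored above a random negative one (ties count as half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical held-out labels (1 = employee left, 0 = employee stayed).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]

# Hypothetical predicted attrition probabilities from two made-up models.
candidate_scores = {
    "model_a": [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.3, 0.65],
    "model_b": [0.6, 0.5, 0.4, 0.7, 0.2, 0.8, 0.3, 0.1],
}

# Score every candidate on the same held-out set and keep the best one.
results = {name: auc_roc(y_true, s) for name, s in candidate_scores.items()}
best_model = max(results, key=results.get)

for name, auc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: AUC-ROC = {auc:.3f}")
print("best:", best_model)
```

Scoring every candidate on the same held-out examples keeps the comparison fair; in practice the scores would come from fitted models rather than hand-written lists.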

GitHub | Report | Presentation

Regis University | MSDS 692 | Spring 2026

MSDS 696

Wisconsin Circuit Court Recidivism Prediction

Recidivism prediction tools have played a role in the American criminal justice system, but the dominant commercial instrument, the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS), is proprietary and unauditable. This paper presents an open-source, reproducible recidivism prediction model trained on the Wisconsin Circuit Court dataset (WCLD), covering 1,357,746 adjudicated cases filed between 2000 and 2018. The model integrates census-derived neighborhood-level socioeconomic features, including median household income, food stamp eligibility rate, educational attainment, and population density, alongside individual criminal history variables. A feature-engineered Socio-Economic Deprivation Index (SEDI) composite score is introduced to address the well-documented limitation that existing risk tools treat defendants as if they exist in a geographic and socioeconomic vacuum. The SEDI combines multiple census tract measures that individually have low bivariate correlation with recidivism but collectively capture a meaningful dimension of neighborhood disadvantage that linear models cannot represent adequately. Using XGBoost and SHapley Additive exPlanations (SHAP), the study quantifies the marginal contribution of neighborhood variables to recidivism risk. The best-performing model achieves an AUC-ROC of 0.7043, meeting published COMPAS benchmarks. Neighborhood features collectively account for 18.7% of the total SHAP explanation mass. All code, data pipelines, trained models, and SHAP artefacts are released publicly so that researchers can inspect, reproduce, and improve every component of the system.
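To make the composite score and the SHAP share more concrete, here is a minimal sketch under stated assumptions. It builds a deprivation index as the mean of sign-adjusted z-scores across census measures, then computes the fraction of total mean absolute SHAP value that falls on a group of features. The equal weighting, the sign conventions, the feature names, and every number below are hypothetical; the actual SEDI construction and SHAP aggregation in the project may differ. Population density is left out because its direction of deprivation is not stated in the abstract.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a list of numbers to zero mean and unit variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def deprivation_index(tracts, directions):
    """Equal-weight mean of sign-adjusted z-scores per tract.

    directions maps each measure to +1 (higher value = more deprived)
    or -1 (higher value = less deprived). Higher index = more deprived.
    """
    adjusted = {}
    for name, sign in directions.items():
        zs = z_scores([tract[name] for tract in tracts])
        adjusted[name] = [sign * z for z in zs]
    return [mean(adjusted[name][i] for name in directions)
            for i in range(len(tracts))]


def group_share(mean_abs_shap, group):
    """Fraction of total mean |SHAP| attributed to the features in group."""
    total = sum(mean_abs_shap.values())
    return sum(v for k, v in mean_abs_shap.items() if k in group) / total


# Hypothetical census tract measures (not taken from the project data).
tracts = [
    {"median_income": 30000, "food_stamp_rate": 0.30, "pct_bachelors": 0.10},
    {"median_income": 60000, "food_stamp_rate": 0.10, "pct_bachelors": 0.30},
    {"median_income": 90000, "food_stamp_rate": 0.05, "pct_bachelors": 0.50},
]
# Assumed sign conventions for each measure.
directions = {"median_income": -1, "food_stamp_rate": 1, "pct_bachelors": -1}
sedi = deprivation_index(tracts, directions)

# Hypothetical mean |SHAP| per feature (illustrative values only).
mean_abs_shap = {
    "prior_convictions": 0.45,
    "age_at_filing": 0.30,
    "sedi": 0.15,
    "median_income": 0.10,
}
neighborhood = {"sedi", "median_income"}
share = group_share(mean_abs_shap, neighborhood)

print([round(s, 3) for s in sedi])
print(f"neighborhood share of SHAP mass: {share:.1%}")
```

Standardizing each measure before averaging puts income in dollars and rates in proportions on a common scale, so no single measure dominates the composite simply because of its units.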

GitHub | Live Demo | Report | Presentation

Regis University | MSDS 696 | Spring 2026

Get In Touch

Let's Connect

Feel free to reach out for collaboration opportunities, questions about my projects, or just to connect!

mzheng001@regis.edu | github.com/studnt001 | studnt001.github.io

Send a Message

I'm always open to discussing new projects, creative ideas, or opportunities to be part of your vision.

Email Me | Download CV