Master of Science in Data Science
Regis University
Expected Graduation: Spring 2026
I am a graduate student from India at Regis University, graduating in May 2026, with a little over two years of experience as a data analyst. My skills include Python, SQL, R, AWS, Tableau, and Power BI. I am passionate about data analytics, cloud engineering, and machine learning.
This project builds an ML pipeline that predicts EC2 instance costs, forecasts daily cloud spending, detects pricing anomalies, and quantifies how much money organizations leave on the table by sticking with on-demand pricing. It uses two datasets from two cloud providers: AWS data for regression and cost optimization, and Azure data for time series forecasting and anomaly detection.
Retail forecasting pipelines often break when moved from one dataset to another because of schema-level inconsistencies, such as varying column names and missing explicit sales fields. RetailMind is a single, schema-agnostic retail analytics pipeline I built for my Data Science Practicum 2 at Regis University to address this problem. The system automates the end-to-end workflow: it maps the incoming data schema, reconstructs missing measures, selects an appropriate temporal aggregation level, and generates demand forecasts using a LightGBM model. It also performs anomaly detection, runs regression-based driver analysis to identify key performance factors, and produces inventory reorder recommendations to support data-driven decisions. To keep forecasts reliable, the pipeline automatically falls back to a naïve baseline when the trained model underperforms. I validated RetailMind on three public retail datasets: Rossmann Store Sales (about one million transactions across 1,115 German drugstores), Walmart retail data (8,399 rows across three product categories), and Iowa Liquor Sales (roughly 900,000 wholesale transactions from January to June 2024, obtained via the Iowa Socrata Open Data API). With stores as the entity level, the model reduced RMSE by 26.6% relative to a seasonal naïve baseline across walk-forward folds, and when switched to the product-category level with FLAML-based driver analysis, it achieved an R² of 0.747. The final system is deployed as a publicly accessible Streamlit web application.
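As a rough illustration of the fallback idea described above, here is a minimal Python sketch. It is not the RetailMind code itself: the function names (`rmse`, `seasonal_naive`, `choose_forecast`), the weekly season length of 7, and the rule "keep the model only if its RMSE beats the baseline" are all assumptions made for the example.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def seasonal_naive(history, horizon, season_length=7):
    """Forecast by repeating the last observed season (assumed weekly here)."""
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


def choose_forecast(history, actual, model_forecast, season_length=7):
    """Compare a model forecast with the seasonal naive baseline on a
    validation window and report which one has the lower RMSE.

    Hypothetical helper: ties go to the baseline, the simpler option.
    """
    baseline = seasonal_naive(history, len(actual), season_length)
    model_rmse = rmse(actual, model_forecast)
    baseline_rmse = rmse(actual, baseline)
    winner = "model" if model_rmse < baseline_rmse else "seasonal_naive"
    return winner, model_rmse, baseline_rmse


# Two weeks of made-up daily sales, then a three-day validation window.
history = [10, 12, 14, 16, 18, 20, 22] * 2
actual = [11, 12, 14]
winner, model_rmse, baseline_rmse = choose_forecast(
    history, actual, model_forecast=[11, 12, 14]
)
print(winner, round(model_rmse, 3), round(baseline_rmse, 3))
```

In a walk-forward setup, a check like this would presumably run once per fold, so that the baseline takes over only for the folds or entities where the trained model does worse.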
Feel free to reach out for collaboration opportunities, questions about my projects, or just to connect!
I'm always open to discussing new projects, creative ideas, or opportunities to be part of your vision.