NYC Subway Demand Forecasting and Resource Optimization

Project Description

Built an end-to-end time-series forecasting pipeline using real NYC subway ridership data to support operational staffing decisions. The project compares baseline and machine learning models and translates forecast accuracy into staffing requirements and cost impact, supported by an interactive Power BI dashboard.

Key Highlights

Built hourly demand forecasts for Times Sq–42 St and Grand Central–42 St using NYC MTA open data
Compared naive and seasonal-naive baselines against a gradient boosting model to check whether the added complexity improved accuracy
Translated forecast error into staffing requirements and weekly cost impact
Designed an interactive Power BI dashboard for operational planning
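
The baseline comparison can be illustrated with a minimal sketch. It assumes the naive forecast repeats the previous hour and the seasonal-naive forecast repeats the same hour one week (168 hours) earlier; the project does not state which lags were used, so both are assumptions. The series is a hypothetical toy generated in the code, not MTA data, and all function names are illustrative.

```python
def synthetic_hourly_ridership(weeks=4):
    """Hypothetical toy series: weekday peaks, quieter weekends (not MTA data)."""
    series = []
    for hour in range(weeks * 168):
        hour_of_day = hour % 24
        day_of_week = (hour // 24) % 7
        riders = 200
        # Assumed peak windows, mirroring the 8-10 AM and 4-7 PM pattern.
        if 8 <= hour_of_day < 10 or 16 <= hour_of_day < 19:
            riders = 1000
        if day_of_week >= 5:
            riders //= 2
        series.append(riders)
    return series


def lag_forecast(series, lag):
    """Forecast each point with the value observed `lag` steps earlier."""
    return [series[t - lag] for t in range(lag, len(series))]


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def compare_baselines(series, season=168):
    """Score both baselines on the same hours (those with a full season of history)."""
    actual = series[season:]
    naive = lag_forecast(series, 1)[season - 1:]  # drop early hours to align
    seasonal = lag_forecast(series, season)
    return {
        "naive_mae": mae(actual, naive),
        "seasonal_naive_mae": mae(actual, seasonal),
    }


scores = compare_baselines(synthetic_hourly_ridership())
print(scores)
```

Because the toy series repeats exactly every week, the seasonal-naive error is zero by construction; real ridership is noisier, so the gap between the two baselines will be smaller than this sketch suggests.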

Results

Seasonal-naive forecasting reduced MAE by ~40% compared to a naive baseline across both stations
Simple baseline models performed competitively with gradient boosting for strongly seasonal demand
Translating forecasts into staffing needs showed an estimated $30K–$70K weekly cost reduction versus naive planning
Peak staffing demand was consistently identified during 8–10 AM and 4–7 PM windows
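
One way to turn forecast error into staffing and cost figures is sketched below. The exact rule used in the project is not documented here, so this assumes each hourly forecast is padded by the model's MAE as a safety buffer and then divided by a fixed riders-per-staff ratio. The ratio, the wage, the MAE values, and the forecast numbers are hypothetical placeholders chosen only to make the arithmetic easy to follow.

```python
import math


def staff_required(forecast, mae_buffer, riders_per_staff):
    """Staff needed per hour when each forecast is padded by the model's MAE."""
    return [math.ceil((f + mae_buffer) / riders_per_staff) for f in forecast]


def staffing_cost(staff_per_hour, hourly_wage):
    """Total cost of the scheduled staff-hours."""
    return sum(staff_per_hour) * hourly_wage


# Hypothetical inputs: three forecast hours, 200 riders per staff member, $30/hour.
forecast = [400, 1000, 250]
naive_plan = staff_required(forecast, mae_buffer=300, riders_per_staff=200)
seasonal_plan = staff_required(forecast, mae_buffer=180, riders_per_staff=200)

# A smaller error buffer means fewer padded staff-hours and lower cost.
saving = staffing_cost(naive_plan, 30) - staffing_cost(seasonal_plan, 30)
print(naive_plan, seasonal_plan, saving)
```

Under these assumptions a more accurate model lowers cost only through the smaller buffer, which is why a reduction in MAE shows up as a reduction in scheduled staff-hours.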