NYC Subway Demand Forecasting and Resource Optimization

Project Description

Built an end-to-end time-series forecasting pipeline using real NYC subway ridership data to support operational staffing decisions. The project compares baseline and machine learning models and translates forecast accuracy into staffing requirements and cost impact, supported by an interactive Power BI dashboard.

Key Highlights

  • Built hourly demand forecasts for Times Sq–42 St and Grand Central–42 St using NYC MTA open data
  • Compared naive and seasonal-naive baselines against a gradient boosting model to establish strong reference points
  • Translated forecast error into staffing requirements and weekly cost impact
  • Designed an interactive Power BI dashboard for operational planning
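The baseline comparison can be illustrated with a minimal sketch. It assumes the naive forecast repeats the previous hour and the seasonal-naive forecast repeats the same hour one week earlier (a lag of 168 hours); the project's actual season length and evaluation setup are not stated here, so treat both as assumptions. The series below is synthetic, not MTA ridership, and the function names are hypothetical.

```python
import math

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def compare_baselines(series, season=168):
    """Score naive and seasonal-naive forecasts on the same evaluation window.

    season=168 (hours per week) is an assumed season length.
    """
    # Evaluate only where both forecasts are defined, so the MAEs are comparable.
    window = range(season, len(series))
    actual = [series[t] for t in window]
    naive = [series[t - 1] for t in window]            # previous hour
    seasonal = [series[t - season] for t in window]    # same hour last week
    return {
        "naive_mae": mae(actual, naive),
        "seasonal_naive_mae": mae(actual, seasonal),
    }

# Synthetic hourly series: a daily cycle plus a weekday uplift (illustrative only).
synthetic = [
    1000
    + 800 * math.sin(2 * math.pi * (h % 24) / 24)
    + (300 if (h // 24) % 7 < 5 else 0)
    for h in range(24 * 7 * 4)
]

scores = compare_baselines(synthetic)
print(scores)
```

Because the synthetic series repeats exactly every week, the seasonal-naive error here is zero by construction; real ridership would show a nonzero gap between the two baselines.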

Results

  • Seasonal-naive forecasting reduced MAE by ~40% compared to a naive baseline across both stations
  • Simple baseline models performed competitively with gradient boosting for strongly seasonal demand
  • Translating forecasts into staffing needs showed an estimated $30K–$70K weekly cost reduction versus naive planning
  • Peak staffing demand was consistently identified during 8–10 AM and 4–7 PM windows
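One way to turn forecast error into a staffing cost figure is sketched below. Every parameter is a hypothetical placeholder (riders handled per staff member, hourly wage, and a penalty multiplier for understaffed hours); none of them come from the project, and the cost rule itself is an assumed simplification rather than the method used to produce the figures above.

```python
import math

def required_staff(riders, riders_per_staff=500, min_staff=1):
    """Staff needed for an hour of ridership (capacity ratio is hypothetical)."""
    return max(min_staff, math.ceil(riders / riders_per_staff))

def staffing_cost_gap(actual, forecast, hourly_wage=35.0,
                      shortfall_penalty=2.0, riders_per_staff=500):
    """Cost of planning staff from the forecast instead of actual demand.

    Overstaffed hours cost the idle wages; understaffed hours are charged
    at wage * shortfall_penalty (an assumed stand-in for overtime or
    service impact).
    """
    cost = 0.0
    for a, f in zip(actual, forecast):
        needed = required_staff(a, riders_per_staff)
        planned = required_staff(f, riders_per_staff)
        if planned >= needed:
            cost += (planned - needed) * hourly_wage
        else:
            cost += (needed - planned) * hourly_wage * shortfall_penalty
    return cost

# Illustrative hours only: one over-forecast, one under-forecast.
print(staffing_cost_gap([1000, 2000], [2000, 1000]))
```

Running the same function once with the naive forecast and once with the seasonal-naive forecast, then summing over a week, gives a weekly cost difference between the two planning approaches under these assumed parameters.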