I am working with the "Bus Breakdowns and Delays" dataset provided by NYC Open Data. You can check it out
here.
The original dataset had many inconsistent values in the columns for bus delay times
and bus company names, so I cleaned it using Python to standardize those values. Cleaning
reduced the dataset from 198,746 rows to 156,083 rows. You can see exactly how I chose to clean my dataset
here. Since the dataset is quite big, my code may take a while to load, so please be patient! :)
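To give a rough idea of what this kind of standardization can look like, here is a minimal sketch. It is not the actual cleaning code linked above: the function names, the sample values, and the choice to take the midpoint of a delay range are all assumptions made up for illustration.

```python
import re

def normalize_company(name):
    """Collapse repeated whitespace, trim stray punctuation, and upper-case a name."""
    cleaned = re.sub(r"\s+", " ", name).strip(" .,")
    return cleaned.upper()

def delay_to_minutes(text):
    """Turn a free-text delay in minutes into a number.

    A range such as '15-20 min' becomes its midpoint (an assumption, not
    necessarily how the real cleaning handled ranges). Values with no digits,
    or values given in hours, return None so they can be reviewed separately.
    """
    if re.search(r"h(ou)?r", text, re.IGNORECASE):
        return None  # hour-based values are deliberately left unhandled here
    numbers = [int(n) for n in re.findall(r"\d+", text)]
    if not numbers:
        return None
    return sum(numbers) / len(numbers)
```

With helpers like these, differently typed spellings of the same company collapse to one value, and delay entries become numbers that can be compared and plotted.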
Make sure to hover your mouse over the elements in the visualizations below to see the values behind them! ↓↓↓