Using machine learning to predict delays and build passenger trust

Zipabout meets UK Rail Minister to discuss the role of data in modernising the transport network.

June 22, 2020

For years, DARWIN has been used by the UK rail industry as its default information platform, responsible for providing arrival and departure predictions, platform numbers, delay estimates, schedule changes and cancellations, all in real time. However, because the system relies purely on operational inputs and doesn’t account for other data sources that may influence delays, inaccurate predictions have led to a level of distrust between passengers and the operators providing information.

I caught up with Rail Minister Chris Heaton-Harris to discuss how Zipabout is working with partners Birmingham University, the Rail Safety and Standards Board (RSSB) and Risk Solutions to improve the industry’s capability in delivering accurate delay and disruption information to passengers.

As part of the virtual ministerial briefing, we demonstrated how our machine learning (ML) prediction model combines a range of real-time and historical data sources to create a series of 200+ ‘features’ that could influence delays and disruption.

Using these large quantities of data (including train movement, operational data, weather data, passenger demand and cascading disruption), we are then able to generate more accurate delay predictions on a huge scale. We do this through our unique real-time data processing platform – provided by our technology partner Kx.
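To make the idea of a ‘feature’ more concrete, here is a minimal Python sketch of how separate inputs such as train movement, weather, passenger demand and cascading disruption might be merged into one flat set of model inputs. It is an illustration only: the class names, field names and the handful of features shown are assumptions made for this example, not the 200+ features or the Kx-based pipeline described above.

```python
from dataclasses import dataclass


@dataclass
class TrainMovement:
    # Hypothetical fields, chosen for illustration only.
    service_id: str
    current_delay_min: float  # delay at the last reported location
    stops_remaining: int


@dataclass
class Weather:
    # Hypothetical fields, chosen for illustration only.
    rainfall_mm_per_hr: float
    wind_speed_mph: float


def build_features(movement, weather, expected_passengers, upstream_delays_min):
    """Merge several data sources into one flat dictionary of numeric features."""
    if upstream_delays_min:
        mean_upstream = sum(upstream_delays_min) / len(upstream_delays_min)
    else:
        mean_upstream = 0.0
    return {
        "current_delay_min": movement.current_delay_min,
        "stops_remaining": float(movement.stops_remaining),
        "rainfall_mm_per_hr": weather.rainfall_mm_per_hr,
        "wind_speed_mph": weather.wind_speed_mph,
        "expected_passengers": float(expected_passengers),
        # Cascading disruption, summarised as delays on services ahead.
        "max_upstream_delay_min": max(upstream_delays_min, default=0.0),
        "mean_upstream_delay_min": mean_upstream,
    }


# Invented example values, not real operational data.
features = build_features(
    TrainMovement(service_id="EXAMPLE-1", current_delay_min=4.0, stops_remaining=5),
    Weather(rainfall_mm_per_hr=1.5, wind_speed_mph=20.0),
    expected_passengers=300,
    upstream_delays_min=[10.0, 2.0],
)
```

A dictionary like this would then be passed to whichever prediction model is in use; the choice of model is outside the scope of this sketch.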

This new ML approach enables us to create predictions that account for many of the internal and external factors that impact delays. Results to date suggest that our predictions are 50% more accurate than DARWIN when it comes to anticipating delays between two and four and a half hours in advance. This is largely because DARWIN bases its information on the existing schedule, and only begins to make disruption/delay predictions an hour before the scheduled arrival time. This can often lead to passengers finding out their train is delayed when they are already at the station.
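For readers wondering how a comparison like this could be calculated, the Python sketch below shows one possible approach. It assumes accuracy is measured by mean absolute error in minutes, and that "more accurate" means a relative reduction in that error; neither is stated above, so treat both as assumptions. The records are invented toy values, not trial results.

```python
def in_window(records, min_hours=2.0, max_hours=4.5):
    """Keep predictions made between min_hours and max_hours before arrival."""
    return [r for r in records if min_hours <= r[0] <= max_hours]


def mean_absolute_error(predicted, actual):
    """Average size of the gap between predicted and actual delay, in minutes."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def relative_improvement(baseline_error, model_error):
    """Fractional reduction in error compared with the baseline (0.5 means 50%)."""
    return (baseline_error - model_error) / baseline_error


# Toy records: (hours ahead, baseline prediction, model prediction, actual delay).
# All values are invented for illustration.
records = [
    (0.5, 10.0, 11.0, 12.0),  # outside the window, ignored
    (2.0, 0.0, 8.0, 12.0),
    (3.0, 0.0, 1.0, 0.0),
    (4.0, 0.0, 20.0, 25.0),
    (4.5, 0.0, 6.0, 8.0),
    (6.0, 0.0, 0.0, 5.0),     # outside the window, ignored
]

window = in_window(records)
actual = [r[3] for r in window]
baseline_mae = mean_absolute_error([r[1] for r in window], actual)
model_mae = mean_absolute_error([r[2] for r in window], actual)
improvement = relative_improvement(baseline_mae, model_mae)
```

In the toy data the baseline predicts no delay at all inside the window, mirroring a schedule-based system that has not yet started forecasting disruption.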

However, more accurate delay predictions are only part of the challenge facing the rail industry; it’s also important to consider how this information is communicated to passengers.

We therefore looked at how our personalised information service, which delivers journey updates to passengers, can influence passenger behaviour and improve the customer experience. This includes how the timing and wording of an update can help set rail passengers’ expectations, and even improve operational efficiency across the network.

To explore this further, we are currently undertaking a real-world customer trial with LNER and will be collating direct customer feedback at scale to help refine and develop the solution further – so watch this space.

In response to the briefing, Rail Minister Chris Heaton-Harris said:

“Harnessing data and new technology is crucial to modernise and improve our transport network, and we are determined to drive innovation through collaborations like this. These projects will help the industry tackle bottlenecks, delays and improve accessibility, and I look forward to seeing the crucial role they can play in improving journeys for passengers.”

Find out more about the Data Sandbox+ project