The Power of Big Data within the Transport Industry

Read the article in Railway Strategies magazine here

Data – or so-called ‘big data’ – is critical to the successful implementation of any sort of intelligent transport system. According to the International Transport Forum, global thinktank for transport policy, big data has “a vital role to play in informing decisions around sustainable infrastructure planning by providing better estimates of future trends in transport demand through enhanced data linkage and modelling.” [ITF, 2015]

From operations and planning, through to safety and enhanced customer experience, the potential applications of big data, when done properly, are as numerous as they are exciting. In 2012, the Department for Transport set up the Transport Sector Transparency Board to oversee the opening up of transport data, hoping to encourage the transport industry to embrace the ‘Open Data’ revolution. Five years later, there are currently 777 data sets on, both published and unpublished, of which nearly half (364) relate to transport. But while this proliferation in open data, combined with increasingly advanced algorithms, availability of real-time processing capability, and advances in data storage, is revolutionising sectors such as online advertising and e-commerce, is the transport industry really keeping up?

Operational transport data (e.g. real-time train/bus arrival times, route planning etc.) is becoming increasingly commoditised, with a relatively mature ecosystem of suppliers and developers creating a huge range of useful apps and solutions to improve the customer experience. However, the data may not always be easily accessible, and can often be incomplete, inaccurate and unreliable, thanks to a fragmented industry approach to the sharing of data across multiple operators and service providers. Additionally, it is clear that there remains huge potential in the sharing of additional datasets around detailed historical and predicted performance, more accurate live positioning, vehicle loading and ticket barrier data, especially if this data can be made available in real time.

Despite this relative maturity in terms of the availability of data, the transport industry has valuable lessons to learn from other sectors around maximising the value of both the data itself and the arguably even more valuable ‘metadata’ generated through the deployment and use of big data. I have a background in online advertising technology, an industry with a reputation for wringing every last drop of value from its data assets, with organisations such as Facebook and leading the way. The transport industry is heavily focused on identifying ways to extract this value from its existing operational big data, through the ubiquitous ‘hackathons’ and ‘innovation competitions’ that are announced on a seemingly daily basis, but many of these challenges have already been solved in other industries!

Transport data provides its own set of unique challenges (and opportunities). However, the incumbent technology providers in the transport sector, the ones behind the aforementioned hackathons and competitions, do not have the right skills to exploit many of these opportunities. We have had conversations with large global engineering firms about how often we have to delete our data, or with legacy technology providers about how we structure our data in a way that allows systems to scale. The idea of deleting data or using traditional relational database technology (such as SQL Server or Oracle) would be laughable in other sectors working with ‘big data’. With a lack of experience and understanding of cloud computing, open-source NoSQL databases and real-time stream processing technology, these large, slow-moving suppliers are simply incapable of creating the technology platforms required to deliver a modern big data processing capability.

There is currently an enormous amount of excitement around the use of Machine Learning (ML) and Artificial Intelligence (AI), both within the transport sector and across industry as a whole. Leading-edge technology companies, such as Google DeepMind, are creating some incredibly exciting technologies in this space. However, even some of the most sophisticated tools utilising ML and AI are becoming increasingly commoditised. For example, Amazon Web Services (AWS) provides advanced machine learning and AI toolsets ‘off-the-shelf’, facilitating predictive capability and automated ‘chatbot’ technology for just a few dollars a day.

For the transport sector, ML and AI are already being used to identify patterns in existing operational data sets – predicting when trains will be delayed, the impact of disruption, or even infrastructure failure. However, the real impact of these technologies is being felt in other sectors through their application to human interactions and behavioural data. Personalisation, customer sentiment analysis and targeted messaging are just some of the areas in which giants such as and Facebook are leading the way.

In order to deliver these sorts of services and products, companies such as do one thing extremely well – they understand the value of human behavioural ‘big-data’. The transport sector has historically been relatively unsophisticated in their (digital) interactions with their customers – with the communication of operational data such as delays, cancellations and other disruption being seen as an obligation rather than an opportunity. The data generated through how customers interact with information, when they interact, and even how long they take to make a decision based on the information presented to them, is incredibly valuable, if somewhat unstructured, data. This sort of data has been used in online advertising technology for many years (we are all familiar with those adverts that follow us around the Internet!), and underpins the ability to both understand and engage more meaningfully with your customer.

For the transport sector, this human behavioural ‘big-data’ has incredible value. For example, data on when individuals plan a journey, or check a train/bus time prior to travelling, can be used to predict aggregated demand on services over the next few hours. How customers interact with your mobile app can provide a unique insight into how they use your network – do they check for parking availability or accessibility information before they make their journey? At an individual level, these sorts of interactions can help to build a unique profile for each and every customer, and ultimately deliver an enhanced, personalised customer experience. More importantly, aggregating this data at a network level, with potentially hundreds of millions of interactions every day, provides a unique opportunity for the transport sector to optimise capacity and to influence behaviour across the network as a whole.
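
To make the demand example concrete, here is a minimal sketch in Python of how journey-planner lookups might be aggregated into a rough demand signal for the next few hours. Everything in it is an assumption for illustration only: the event layout, the service identifiers (SVC-A, SVC-B), the function name and the three-hour horizon are all hypothetical, and a real system would feed counts like these into a proper forecasting model rather than use them directly.

```python
from collections import Counter
from datetime import datetime

# Hypothetical lookup events: (time of lookup, service id, planned departure time).
lookups = [
    (datetime(2017, 6, 1, 7, 5), "SVC-A", datetime(2017, 6, 1, 8, 15)),
    (datetime(2017, 6, 1, 7, 10), "SVC-A", datetime(2017, 6, 1, 8, 45)),
    (datetime(2017, 6, 1, 7, 20), "SVC-B", datetime(2017, 6, 1, 9, 30)),
    (datetime(2017, 6, 1, 7, 25), "SVC-A", datetime(2017, 6, 1, 12, 0)),
    (datetime(2017, 6, 1, 7, 40), "SVC-B", datetime(2017, 6, 1, 9, 10)),
]


def demand_by_departure_hour(events, now, horizon_hours=3):
    """Count lookups per (service, departure hour) for departures within the horizon."""
    counts = Counter()
    for looked_up_at, service, departure in events:
        if looked_up_at > now:
            continue  # skip lookups that have not happened yet at time `now`
        hours_ahead = (departure - now).total_seconds() / 3600
        if 0 <= hours_ahead <= horizon_hours:
            # Bucket the planned departure into its clock hour.
            hour = departure.replace(minute=0, second=0, microsecond=0)
            counts[(service, hour)] += 1
    return counts


now = datetime(2017, 6, 1, 7, 30)
counts = demand_by_departure_hour(lookups, now)
for (service, hour), n in sorted(counts.items()):
    print(service, hour.strftime("%H:%M"), n)
```

The counts are only a proxy for interest in each service, not a forecast of passenger numbers, but they show how individual interactions can be rolled up into a network-level signal.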

So, if this human behavioural ‘big-data’ is the key to optimising capacity, how can the industry take advantage of the opportunity? As previously discussed, much of the technology and underlying infrastructure required to collect and process this sort of data has been available for some time now. However, the transport industry urgently needs to embrace these ‘outside’ technologies, and the innovative companies working in this space, to provide the support and investment required to deliver a truly 21st century customer experience, and an intelligent transport network optimised to the needs of the passenger!