How LaLiga championed big data to transform data analytics in sport

LaLiga Tech's data analytics product in action
(Image credit: LaLiga Tech)

We all know data is the new currency, and the applications for data analytics are seemingly endless. Industries across the economy have been keen to adopt big data and analytics, but have done so with varying degrees of success.

Football is an area in which data is everywhere. Indeed, the use of data analytics is rife – whether it’s quantifying different phases of play throughout a match or measuring player performance, or even identifying transfer targets.

There are countless data points available to coaches, presenters, commentators, pundits, and executives throughout the industry, who comb through the numbers and translate them into insights. 

But the football industry has had less success translating off-pitch metrics into real insights. Seeking to capitalize on the opportunities of both on-pitch and off-pitch data is LaLiga, the premier division in Spain.

Tapping into rivers of data

To make the most of the untapped rivers of data flowing through football, the league created LaLiga Tech as a subsidiary that strives to capture and process the data generated throughout each facet of the annual competition, with the organization providing detailed analytics to the groups that make up the league.

“We know now that data is power – but if data is isolated, and you have silos of data that are unconnected, then you are not able to provide value for the clubs, or the fans,” says Fermin Martinez, senior data engineer for business intelligence and analytics at LaLiga Tech.

LaLiga Tech senior data engineer Fermin Martinez
Fermin Martinez

Martinez is senior data engineer for business intelligence and analytics at LaLiga Tech. He joined the league's subsidiary organization in 2018 and has been instrumental in its efforts to maximize big data in football.

“But it's not only about having the data, you have to be able to join all the data together. In order to get real insights, you need to put a lot of different sources together and mix them.

“The real motivation is to be able to give value to clubs and fans. In order to do something with this data, we need a platform. We need the data lake we are using from Databricks, but we also need to apply data processing techniques.”

Collaborating with data off the pitch

Before the inception of LaLiga Tech five years ago, clubs were processing their own data and creating analysis in isolation. While this was useful for each individual club, it couldn’t provide the insights that a larger data model would be capable of, Martinez says.

With 20 clubs all providing up to 25 data points a second throughout matches, this volume of data needed a place to be stored and processed. In the case of LaLiga Tech, this came in the form of Databricks’ data lake where, as Martinez puts it, the data originating from different sources can be “mixed” and insights can be generated.

“Before we started with Databricks, most departments inside LaLiga worked with different providers; these providers were the only ones able to access the data. As a result, the data was there but there was no connection between one source and the other because there was no connection between teams.

“We talk about data and how data is important. It's obvious that we need to get all this data together somewhere. The data lake is the place where all the data goes. Using Databricks, we are able to separate the data we receive into several states. We have a staging area inside this data lake, a prepared area where we start to prepare the data, and we have a final consumer area where the data is presented.”
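The three areas Martinez describes can be illustrated with a minimal sketch in plain Python. This is not LaLiga Tech's code and does not use any Databricks API; the record fields, sample values, and function names (stage, prepare, consume) are all hypothetical, chosen only to show data moving from a staging area, through a prepared area, to a consumer area where two sources are joined.

```python
# Hypothetical raw records from two unconnected sources.
# Field names and values are illustrative assumptions, not real LaLiga data.
RAW_TRACKING = [
    {"match_id": "m1", "player_id": "p7", "t": "0.00", "x": "10.5", "y": "20.0"},
    {"match_id": "m1", "player_id": "p7", "t": "0.04", "x": "10.9", "y": "20.1"},
    {"match_id": "m1", "player_id": "p7", "t": "0.08", "x": None, "y": "20.3"},  # incomplete sample
]
RAW_EVENTS = [
    {"match_id": "m1", "player_id": "p7", "event": "goal"},
]


def stage(records, source):
    """Staging area: keep records as received, tagged with their source."""
    return [{**record, "_source": source} for record in records]


def prepare(staged_tracking):
    """Prepared area: drop incomplete samples and cast strings to numbers."""
    prepared = []
    for record in staged_tracking:
        if record["x"] is None or record["y"] is None:
            continue
        prepared.append({
            "match_id": record["match_id"],
            "player_id": record["player_id"],
            "t": float(record["t"]),
            "x": float(record["x"]),
            "y": float(record["y"]),
        })
    return prepared


def consume(prepared_tracking, staged_events):
    """Consumer area: join both sources into one summary per match and player."""
    summary = {}
    for record in prepared_tracking:
        key = (record["match_id"], record["player_id"])
        entry = summary.setdefault(key, {"samples": 0, "events": []})
        entry["samples"] += 1
    for event in staged_events:
        key = (event["match_id"], event["player_id"])
        entry = summary.setdefault(key, {"samples": 0, "events": []})
        entry["events"].append(event["event"])
    return summary


staged_tracking = stage(RAW_TRACKING, "tracking")
staged_events = stage(RAW_EVENTS, "events")
summary = consume(prepare(staged_tracking), staged_events)
print(summary)
```

Keeping the staged records untouched means the prepared and consumer areas can be rebuilt if a cleaning rule changes, which is one common reason for separating a data lake into layers like these.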

Powering the sporting world with data

With these analytics, LaLiga Tech is able to provide insights to players and management around the probability of a player getting injured, the style of play when a goal was scored and, during the pandemic, how likely it was for players to transmit COVID-19 to each other during a match.

For the clubs and TV broadcasters, the data that’s generated indicates the best time to start promoting season tickets for the next year. The insights also reveal which events during a match – like yellow cards and goals – might cause people to tune in to a game.

According to Martinez, these insights are not just reserved for football and can be applied to any sport, from rugby to tennis to basketball. “We’ll need to use some different data sources,” says Martinez, “but the process is going to be more or less the same. We are now able to use what we have built to apply it to different leagues around the world.

"We want to package what we have built and to be able to apply it to a tennis competition, or the Olympic Games. If we talk about tracking the position of a player 25 times a second, this is something that a lot of sports could use. The conclusions we get may be different but the treatment of the information stays the same.”

Elliot Mulley-Goodbarne

Elliot Mulley-Goodbarne is a freelance journalist and content writer with six years of experience writing for B2B technology publications, notably Mobile News and Comms Business. He specialises in mobile, business strategy, and cloud technologies, with interests in environmental impacts, innovation, and competition. You can follow Elliot on Twitter and Instagram.