The world’s largest football video database is constantly learning new tricks

Artificial intelligence (AI) and machine learning now allow users of the DFL Media Hub to search more efficiently through the 175,000-hour inventory of video recordings

6 May 2021 – Over the past 15 years, DFL has created a comprehensive archive of video recordings of German professional football matches – the world’s largest digital archive of its kind. This initiative gives all of DFL’s national and international media partners fast and easy access to these materials. In close cooperation with DFB, all existing historical film and video recordings from Bundesliga, Bundesliga 2, Liga 3, Women’s Bundesliga and DFB-Pokal, as well as A and U men’s and women’s international fixtures, have been collected from formerly distributed storage sites, digitised at the best possible quality, categorised and preserved for long-term archiving. This includes full-length matches, highlights and live broadcast recordings. The Bundesliga and Bundesliga 2 clubs, all national and international DFL media partners, and other media, agencies and sponsors with corresponding enquiries have since been able to access this content.

But the DFL Media Hub does far more than simply archive historic match footage. It is the central source for the numerous media products of the DFL, the Bundesliga and Bundesliga 2 clubs and many partners, for example video content for the Official Bundesliga App, for clubs’ digital offerings or for the coverage of many broadcasters. The infrastructure is accessible to licensees at any time and is an integral part of the DFL’s value chain, which extends from the creation and distribution of the live image signal to the creation of highlight clips and other editorial content. The DFL Media Hub is fully integrated into the corresponding production processes, especially on matchdays.

Additional metadata transform more than 11 petabytes of data into a treasure trove of content

The DFL Media Hub uses editorial metadata, i.e. attributes that describe the video content and serve as keywords for it. The more comprehensive and detailed these are, the more easily, quickly and precisely users can search the video content and filter out the right material for media productions.

2.75 billion smartphone photos – this corresponds to the amount of video data in the DFL Media Hub

Three different kinds of metadata are applied in the DFL Media Hub to maximise the value of content: the official match data, live-logging data, and data generated by artificial intelligence. Taken together, this metadata turns the huge video collection of more than 11 petabytes into a well-structured, multi-use content repository – a real treasure chest. One petabyte is 1 million gigabytes and corresponds to the size of approximately 250 million photos taken with a smartphone, so 11 petabytes equate to roughly 2.75 billion such photos. The three types of metadata are explained below.

  • The official match data is added to the video archive automatically to create the underlying master data framework for each fixture. This includes items such as the proper description of the match, the team line-ups, and match statistics including the score, the goal scorers and any yellow or red cards. This official match information is collected by the DFL subsidiary Sportec Solutions AG and assigned to each match in an automated import process that runs on the DFL Media Hub. A video on match data collection is available on the DFL’s YouTube channel (video in German, English subtitles available).
  • Live logging adds another layer: once the ball starts rolling, the specialists at the DFL subsidiary Sportcast, located at the Cologne Broadcasting Center (CBC), add so-called Live Logs (descriptive metadata in the form of keywords) to the basic signal. For every match, a ‘Live Logger’ follows the events on screen and documents them by assigning one of 72 editorially relevant action types to each activity, such as “Move”, “Trick”, “Close-up”, “Save”, “Free Kick”, “Defensive Action” or “Fan Images”. Even pre-match and post-match events are tagged in this way. On average, more than 100 different editorially relevant scenes accumulate per fixture, which makes it easier to search for specific scenes.
  • Machine learning and artificial intelligence automatically create an extra layer of highly detailed, content-related metadata which further refines the searchable details associated with video content on the DFL Media Hub to make specific scenes easier to find. Over the past few years, DFL has integrated various innovative technologies, collaborating with the leading provider of cloud computing services, Amazon Web Services (AWS), and Quantiphi, a developer of artificial intelligence solutions. AWS has been the Official Technology Provider of the DFL since January 2020.
Face recognition used in the DFL Media Hub.
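
As a rough illustration of how the three metadata layers could combine in a search, the following Python sketch tags each scene with all three layers and filters on their union. It is not the DFL Media Hub's actual data model: the `Scene` class, its field names, the `search` function and the placeholder clubs and players are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    """One logged scene. All field names are hypothetical."""
    match_id: str
    start_seconds: int
    match_data: set = field(default_factory=set)  # layer 1: official match data
    live_logs: set = field(default_factory=set)   # layer 2: live-logging action types
    ai_tags: set = field(default_factory=set)     # layer 3: AI-generated tags

    def all_tags(self):
        """Union of the three layers, lower-cased for case-insensitive matching."""
        return {t.lower() for t in self.match_data | self.live_logs | self.ai_tags}


def search(scenes, *terms):
    """Return the scenes whose combined tags contain every search term."""
    wanted = {t.lower() for t in terms}
    return [s for s in scenes if wanted <= s.all_tags()]


# Placeholder data, not real matches or players.
scenes = [
    Scene("match-001", 754, {"Club A", "Club B"}, {"Save"}, {"Player X", "joy"}),
    Scene("match-001", 2310, {"Club A", "Club B"}, {"Free Kick"}, {"Player Y"}),
    Scene("match-002", 95, {"Club A", "Club C"}, {"Close-up"}, {"Player X", "joy"}),
]

# A club from layer 1 combined with a player and an emotion from layer 3.
hits = search(scenes, "Club A", "Player X", "joy")
```

In this sketch a query can mix terms from any layer, which mirrors the idea that each additional layer of metadata narrows a search further.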

Machine learning and AI technology add further information, enabling faster and more accurate search results

A feature called “Face and Emotion Recognition” now automatically recognises specific players and their emotions. AI-based software algorithms developed specifically for this purpose analyse the video footage, using player images that have been uploaded to the system and processed by the custom software to identify individual players. This software can recognise, categorise and store specific players in a matter of seconds based on their facial features. The system continues to learn as more players and their facial features are added. Another software function called “Logo Detection” operates on similar logic: it uses logo information stored in the system, such as club logos, to identify and tag matching video sequences. As a result, the roughly 1,200 regular users can search for clubs or individual players, also in combination with emotions.

Another innovation that helps ensure rapid delivery of specific video content relies on ‘fingerprinting’ technology, which was developed jointly by the DFL and ivitec GmbH, a provider of video and audio content recognition solutions.

12,000 hours of video material are added to the DFL Media Hub every year

The fingerprinting approach incorporates the existing intellectual property rights agreed between DFL and the individual media partners, providing legal information about who produced the respective video footage and who is allowed to use it. It increases legal certainty for users of video materials while improving search efficiency: about 95 percent of searches are checked automatically against a pre-defined rights matrix, and the results are delivered to the licensees.
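
The idea of an automatic check against a rights matrix can be sketched in a few lines of Python. This is only a simplified illustration under stated assumptions: the structure of `RIGHTS_MATRIX`, the `check_request` function, the placeholder partners and the fallback to manual review are all hypothetical and are not taken from the DFL Media Hub.

```python
# Hypothetical rights matrix: (licensee, competition) -> permitted content types.
RIGHTS_MATRIX = {
    ("Partner A", "Bundesliga"): {"highlights", "full_match"},
    ("Partner B", "Bundesliga"): {"highlights"},
}


def check_request(licensee, competition, content_type):
    """Return 'deliver' when the matrix permits the request, else 'manual_review'.

    Routing everything else to manual review is an assumption of this sketch;
    the article only states that about 95 percent of searches are checked
    automatically.
    """
    permitted = RIGHTS_MATRIX.get((licensee, competition), set())
    return "deliver" if content_type in permitted else "manual_review"


# Placeholder requests, not real licensees.
requests = [
    ("Partner A", "Bundesliga", "full_match"),
    ("Partner B", "Bundesliga", "full_match"),
    ("Partner C", "Bundesliga", "highlights"),
]
decisions = [check_request(*r) for r in requests]
```

Keeping the rules in a lookup table rather than in code means that a change to a licence agreement would only require a change to the data in this sketch.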

Matchday-specific archiving and availability of editorial broadcasts

The DFL Media Hub includes broadcast recordings from the national media partners, raw footage of all matches, as well as all related extra video coverage. This content aggregation routine is followed for every match, adding around 12,000 hours of extra video material to the archive every year. With 175,000 hours of available content in total, the DFL Media Hub is the world’s largest and smartest digital football video database. Working closely with AWS and its partner network, the DFL Media Hub will continue to develop both its technology and its content.
