We extract the players' coordinates from the game's broadcast footage. First, we localize the area of the pitch that the camera is focused on. We then use AI models to detect and identify the on-screen players and the ball, and map their coordinates from the broadcast footage onto a flat 2D projection of the pitch.
For players not visible to the broadcast camera, we use an AI imputation system. This system, trained on millions of data points from GPS tracking, predicts their locations. The figure below visualizes this process.
This process runs five times per second, allowing us to track players' movements throughout an entire match. Footage that is irrelevant to data collection (such as replays, crowd shots, and zoom-ins) is automatically discarded. For these moments, the AI imputation system still provides estimated coordinates for the players. As a result, the tracking data offers a granular and extensible foundation from which you can derive insights and define custom metrics.
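As a small illustration of what the fixed sampling rate makes possible, the sketch below estimates a player's average speed between two samples. It assumes the samples are evenly spaced at five per second, and it reports speed in coordinate units per second, because the unit of the pitch coordinates is not stated here. The name `speed_between` is hypothetical.

```python
import math

SAMPLE_RATE_HZ = 5  # From the text: positions are produced five times per second


def speed_between(p0, p1, frames_apart=1, sample_rate_hz=SAMPLE_RATE_HZ):
    """Average speed between two (x, y) samples, in coordinate units per second.

    Assumes evenly spaced samples; frames_apart is the number of sampling
    intervals that separate the two positions.
    """
    distance = math.dist(p0, p1)  # Euclidean distance in coordinate units
    return distance * sample_rate_hz / frames_apart


# Made-up positions: 5 coordinate units covered in one sampling interval
example_speed = speed_between((0.0, 0.0), (3.0, 4.0))
```

Samples taken during replays or other discarded footage are imputed estimates, so speeds derived from them may be less reliable than speeds derived from directly observed positions.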
The data is provided as a JSONL file, where each line in the file is itself a valid JSON object.
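Because every line is an independent JSON object, the file can be parsed one line at a time without loading it whole. The following is a minimal sketch using only Python's standard `json` module. The helper name `iter_jsonl` and the sample lines are hypothetical and only serve to illustrate the idea.

```python
import json


def iter_jsonl(lines):
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:  # Skip blank lines, such as a trailing newline at end of file
            yield json.loads(line)


# Made-up stand-in for the lines of a tracking file
sample_lines = [
    '{"match_data": {"match_id": 553262}}\n',
    '{"frame": 0, "period": 1}\n',
    '{"frame": 1, "period": 1}\n',
]

records = iter_jsonl(sample_lines)
first = next(records)  # The first line of the file
rest = list(records)   # All subsequent lines
```

An open file handle can be passed in place of `sample_lines`, since iterating over a text file also yields one line at a time.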
The first line contains information about the game and players. It serves as a look-up dictionary to map player IDs to player information. This approach avoids unnecessary repetition and reduces file size. An example is shown below:
{
  "match_data": {
    "date": "2024-03-01",
    "match_id": 553262,
    "result": {
      "home": 0,
      "away": 1
    },
    "season_data": {
      "id": 816545,
      "name": "POL I 2023"
    }
  },
  "players_data": {
    "team0_id": {
      "player0_id": {
        "name": "Name Surname",
        "number": 10,
        "position": "DC"
      },
      {...}
    },
    "team1_id": {
      "player0_id": {
        "name": "Name Surname",
        "number": 10,
        "position": "DC"
      },
      {...}
    }
  }
}
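One way to use this first line is to flatten `players_data` into a single look-up table. The sketch below keys the table by `(team_id, player_id)`, so it does not need to assume that player IDs are unique across teams. The function name `build_player_lookup` and the IDs `"101"` and `"202"` are made up for illustration.

```python
def build_player_lookup(first_line):
    """Map (team_id, player_id) to the player's info from the first line."""
    lookup = {}
    for team_id, players in first_line["players_data"].items():
        for player_id, info in players.items():
            # JSON object keys are always strings, so normalise IDs to str
            lookup[(team_id, str(player_id))] = info
    return lookup


# Hypothetical first line with made-up player IDs
sample_first_line = {
    "match_data": {"date": "2024-03-01", "match_id": 553262},
    "players_data": {
        "team0_id": {"101": {"name": "Name Surname", "number": 10, "position": "DC"}},
        "team1_id": {"202": {"name": "Name Surname", "number": 10, "position": "DC"}},
    },
}

lookup = build_player_lookup(sample_first_line)
player = lookup[("team0_id", "101")]
```

Converting IDs to strings on both sides avoids a mismatch when an ID appears as a JSON number elsewhere in the file but as a string key in this dictionary.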
All lines after the first share a common structure, shown below:
{
  "frame": number,
  "vid_timestamp": number, // Seconds elapsed in original video
  "period": number,
  "ball": [x, y], // The x and y coordinates of the ball
  "data": {
    "team0_id": [ // Array of players that play for team0_id
      {
        "id": number, // Corresponds with look-up dict
        "x": number, // The x coordinate of the player
        "y": number, // The y coordinate of the player
        "vis": boolean
      },
      { ... } // Remaining players for team0_id
    ],
    "team1_id": [
      ... // The same structure for the players of team1_id
    ]
  },
  "cam": [ // Polygon describing broadcast camera view
    [x0, y0], // The first coordinate
    [x1, y1], // The second coordinate
    [x2, y2], // The third coordinate
    [x3, y3] // The fourth coordinate
  ]
}
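The structure above can be flattened into one row per player, which is often easier to analyse. The sketch below also checks whether each player lies inside the `cam` polygon using a standard ray-casting test. The helper names and all sample values are hypothetical. The meaning of `vis` is not defined in this section, so the sketch simply passes it through and does not assume it matches the polygon test.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside polygon [[x0, y0], ...]."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def frame_to_rows(frame):
    """Flatten one frame into one dict per player."""
    rows = []
    for team_id, players in frame["data"].items():
        for player in players:
            rows.append({
                "frame": frame["frame"],
                "period": frame["period"],
                "team_id": team_id,
                "player_id": player["id"],
                "x": player["x"],
                "y": player["y"],
                "vis": player["vis"],
                "in_cam_polygon": point_in_polygon(
                    player["x"], player["y"], frame["cam"]
                ),
            })
    return rows


# Hypothetical frame with made-up values
sample_frame = {
    "frame": 10, "vid_timestamp": 2.0, "period": 1, "ball": [50.0, 30.0],
    "data": {
        "team0_id": [{"id": 1, "x": 40.0, "y": 30.0, "vis": True}],
        "team1_id": [{"id": 2, "x": 90.0, "y": 10.0, "vis": False}],
    },
    "cam": [[20.0, 0.0], [70.0, 0.0], [80.0, 60.0], [10.0, 60.0]],
}

rows = frame_to_rows(sample_frame)
```

Rows in this shape can be collected across all frames and loaded into a table or data frame for further analysis.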