Download Tracking

Download raw player and ball tracking data.

Performance Note

This endpoint streams large files and may take several seconds to respond. Results are deterministic for a given analysis — we recommend caching the downloaded file on your end to avoid repeated transfers.
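
The caching recommendation can be sketched as a small download-once helper. This is a minimal illustration, not part of the API: the function name `cached_tracking_file`, the `tracking_cache` directory, and the `fetch` callable (which you would implement with your own HTTP client) are all assumptions.

```python
from pathlib import Path
from typing import Callable


def cached_tracking_file(
    analysis_id: int,
    fetch: Callable[[int, str], bytes],
    fmt: str = "parquet",
    cache_dir: Path = Path("tracking_cache"),  # hypothetical cache location
) -> Path:
    """Return the local path of the tracking file, downloading it only once."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / f"analysis_{analysis_id}.{fmt}"
    if not target.exists():
        # Write under a temporary name first so an interrupted download
        # never leaves a truncated file under the final name.
        partial = target.parent / (target.name + ".part")
        partial.write_bytes(fetch(analysis_id, fmt))
        partial.replace(target)
    return target
```

Because results are deterministic for a given analysis, the analysis id and format are enough as a cache key in this sketch.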

Request

GET /analyses/{id}/tracking/download?format=parquet

Query Parameters

Parameter	Type	Default	Options	Description
`format`	string	`parquet`	`parquet`, `json`	Output format

Response

Returns a file download:

Parquet: Binary file, ~20-50 MB (recommended)
JSON: Text file, ~200-500 MB

Data Schema

Column	Type	Description
`timestamp`	int32	Time in milliseconds
`role`	string	`player`, `goal_keeper`, `ball`, `referee`
`player_id`	string	Player identifier (e.g., "0-10") or null
`player_id_conf`	float16	Confidence of player identification (0-1)
`x`	int16	X position in centimeters
`y`	int16	Y position in centimeters
`bbox_x`	int16	Pixel X location of player in the video. Center of the player bounding box. Null on projected rows
`bbox_y`	int16	Pixel Y location of player in the video. Bottom of the player bounding box. Null on projected rows
`status`	string	How this row was produced — see Row Status below
`overlap_partner_id`	string	On `detected_overlap` rows: the player this detection overlaps with
`position_conf`	float	On projected rows: confidence in the projected position (0-1)
`identity_conf`	float	On projected rows: confidence in the player identity (0-1)

New in analysis version 0.5.36

The status, overlap_partner_id, position_conf, and identity_conf columns are present for analyses processed with version 0.5.36 or later. Older analyses do not include them.
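
Code that has to handle both older and newer analyses may want to normalize the column set first. The sketch below is one possible approach; the helper names are hypothetical, and filling `status` with `detected` for older files is an assumption based on the statement that older files contain only players seen on camera.

```python
import pandas as pd

NEW_COLUMNS = ["status", "overlap_partner_id", "position_conf", "identity_conf"]


def has_projection_columns(df: pd.DataFrame) -> bool:
    """True when all columns introduced in version 0.5.36 are present."""
    return all(col in df.columns for col in NEW_COLUMNS)


def with_projection_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy that always carries the 0.5.36 columns."""
    out = df.copy()
    if "status" not in out.columns:
        # Assumption: every row in an older file is a camera detection.
        out["status"] = "detected"
    for col in ["overlap_partner_id", "position_conf", "identity_conf"]:
        if col not in out.columns:
            out[col] = None  # unknown for older analyses
    return out
```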

Row Status

Status	Meaning
`detected`	Player directly detected in the video frame
`detected_overlap`	Detected, but overlapping another player (`overlap_partner_id` says which) — identity attribution is less certain
`projected`	Player not visible in the frame; position projected by the tracking model
`projected_nocam`	Player outside all camera coverage; position projected with the highest uncertainty

Off-Camera Players (Projected Rows)

From analysis version 0.5.36, the file covers the full squad: alongside the 25 FPS detection rows it includes projected rows (at 5 Hz) for players the camera does not currently see. Projected rows have pitch coordinates (x, y) but no pixel coordinates (bbox_x, bbox_y are null), and carry position_conf / identity_conf so you can gate on trust.

If you want the previous behavior (only players actually seen on camera), filter on status:

detected = df[df['status'].str.startswith('detected', na=False)]
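
To keep projected rows but gate them on trust, a filter along the following lines may work. It is a sketch only: the function name and the 0.5 thresholds are illustrative assumptions, not recommended values.

```python
import pandas as pd


def trusted_rows(
    df: pd.DataFrame,
    min_position_conf: float = 0.5,  # hypothetical threshold
    min_identity_conf: float = 0.5,  # hypothetical threshold
) -> pd.DataFrame:
    """Keep detection rows plus projected rows that clear both thresholds."""
    status = df["status"].fillna("")
    detected = status.str.startswith("detected")
    projected = status.str.startswith("projected")
    # Comparisons against missing confidences evaluate to False,
    # so projected rows without a confidence value are dropped.
    confident = (df["position_conf"] >= min_position_conf) & (
        df["identity_conf"] >= min_identity_conf
    )
    return df[detected | (projected & confident)]
```

The `startswith("projected")` test also matches `projected_nocam` rows, so those are gated by the same thresholds.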

Sample Data

timestamp  role        player_id  x      y      status
3309680    player      0-4        6606   3256   detected
3309680    player      0-10       4520   4100   detected
3309680    player      1-3        2210   1180   projected
3309680    ball        null       5200   3400   detected
3309720    player      0-4        6610   3259   detected

Understanding the Data

Coordinate System

Origin (0,0) is the bottom-left corner of the pitch
X: 0 to 10500 (centimeters) = 0 to 105 meters, left to right
Y: 0 to 6800 (centimeters) = 0 to 68 meters, bottom to top

See Pitch Coordinates for details.
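
As an illustration of the coordinate system, the helpers below convert centimeter values to meters, check them against the documented range, and shift the origin to the center of the pitch. The function names are hypothetical, and the 105 m by 68 m dimensions are taken from the ranges listed above.

```python
from typing import Tuple

PITCH_LENGTH_CM = 10500  # X range from the coordinate description
PITCH_WIDTH_CM = 6800    # Y range from the coordinate description


def to_meters(x_cm: float, y_cm: float) -> Tuple[float, float]:
    """Convert centimeter pitch coordinates to meters."""
    return x_cm / 100, y_cm / 100


def is_on_pitch(x_cm: float, y_cm: float) -> bool:
    """True when the point lies inside the documented 0..10500 / 0..6800 range."""
    return 0 <= x_cm <= PITCH_LENGTH_CM and 0 <= y_cm <= PITCH_WIDTH_CM


def to_center_origin(x_cm: float, y_cm: float) -> Tuple[float, float]:
    """Shift the origin from the bottom-left corner to the pitch center, in meters."""
    return (x_cm - PITCH_LENGTH_CM / 2) / 100, (y_cm - PITCH_WIDTH_CM / 2) / 100
```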

Frame Rate

Data is recorded at 25 FPS (frames per second):

Timestamps increment by 40ms between frames
1 second = 25 data points per tracked entity
Projected rows (off-camera players, version 0.5.36+) are sampled at 5 Hz (200ms steps)
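
Because detection rows and projected rows use different step sizes, it can be safer to derive time deltas from `timestamp` than to assume 40ms everywhere. The sketch below computes per-row speed for one entity and leaves the speed undefined across gaps; `add_speed` and `max_gap_ms` are hypothetical names.

```python
import pandas as pd

FRAME_MS = 40  # step between detection rows at 25 FPS


def add_speed(track: pd.DataFrame, max_gap_ms: int = FRAME_MS) -> pd.DataFrame:
    """Add dt_ms and speed_mps columns to the rows of a single entity."""
    out = track.sort_values("timestamp").copy()
    out["dt_ms"] = out["timestamp"].diff()
    # x and y are stored as int16; widen them before differencing and squaring.
    dx_m = out["x"].astype("float64").diff() / 100
    dy_m = out["y"].astype("float64").diff() / 100
    out["speed_mps"] = (dx_m**2 + dy_m**2) ** 0.5 / (out["dt_ms"] / 1000)
    # Steps longer than max_gap_ms span missing rows; leave their speed undefined.
    out.loc[out["dt_ms"] > max_gap_ms, "speed_mps"] = float("nan")
    return out
```

For projected rows, passing `max_gap_ms=200` would match the 5 Hz sampling described above.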

Player Identification

player_id: Assigned when jersey is detected with confidence
player_id_conf: Higher = more certain identification
null player_id: Player detected but jersey not readable
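
A common use of these fields is to keep only rows whose identification is reasonably certain and to measure how many rows carry no identity at all. The following is a sketch under assumptions: the helper names and the 0.8 threshold are illustrative, not values defined by the API.

```python
import pandas as pd


def rows_for_player(df: pd.DataFrame, player_id: str, min_conf: float = 0.8) -> pd.DataFrame:
    """Rows attributed to player_id with player_id_conf of at least min_conf."""
    mask = (df["player_id"] == player_id) & (df["player_id_conf"] >= min_conf)
    return df[mask]


def unidentified_share(df: pd.DataFrame) -> float:
    """Fraction of player and goalkeeper rows that have no player_id."""
    players = df[df["role"].isin(["player", "goal_keeper"])]
    if players.empty:
        return 0.0
    return float(players["player_id"].isna().mean())
```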

Working with the Data

Python (Parquet)

import pandas as pd

# Load tracking data
df = pd.read_parquet('tracking.parquet')

# Keep detection-backed rows only (0.5.36+ also contains projected rows,
# which would distort distance calculations)
if 'status' in df.columns:
    df = df[df['status'].str.startswith('detected', na=False)]

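# Filter to a specific player; .copy() makes the new columns below
# write to an independent frame rather than a slice of df
player_10 = df[df['player_id'] == '0-10'].copy()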

# Convert to meters
player_10['x_m'] = player_10['x'] / 100
player_10['y_m'] = player_10['y'] / 100

# Calculate distance traveled
player_10 = player_10.sort_values('timestamp')
player_10['dx'] = player_10['x_m'].diff()
player_10['dy'] = player_10['y_m'].diff()
player_10['dist'] = (player_10['dx']**2 + player_10['dy']**2)**0.5

total_distance = player_10['dist'].sum()
print(f"Player 0-10 traveled {total_distance:.0f} meters")

JavaScript (JSON)

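// Note: JSON files are large and response.json() buffers the whole body; consider streaming
// `id` and `headers` (including the Authorization header) are assumed to be defined by the caller
const response = await fetch(
  `/api/v1/analyses/${id}/tracking/download?format=json`,
  { headers }
);
if (!response.ok) throw new Error(`Download failed: ${response.status}`);
const tracking = await response.json();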

// Group by timestamp for frame-by-frame analysis
const frames = {};
tracking.forEach(row => {
  if (!frames[row.timestamp]) frames[row.timestamp] = [];
  frames[row.timestamp].push(row);
});
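
For comparison, the same frame-by-frame grouping can be written in Python with pandas. This is a minimal sketch; the generator name `iter_frames` is hypothetical.

```python
import pandas as pd


def iter_frames(df: pd.DataFrame):
    """Yield (timestamp, rows) pairs in time order, one pair per timestamp."""
    for timestamp, rows in df.groupby("timestamp", sort=True):
        yield timestamp, rows
```

Each `rows` value is a DataFrame holding every entity recorded at that timestamp, which mirrors the `frames` object built in the JavaScript example.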

Format Comparison

Aspect	Parquet	JSON
File size	20-50 MB	200-500 MB
Load time	Fast	Slow
Streaming	No	Yes
Language support	Python, R, Spark	Universal
Recommended for	Analysis	Web apps

Example

cURL (Parquet)

curl -X GET "https://aiontheball.nl/api/v1/analyses/1246/tracking/download?format=parquet" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -o tracking.parquet

cURL (JSON)

curl -X GET "https://aiontheball.nl/api/v1/analyses/1246/tracking/download?format=json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -o tracking.json

Python

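import requests
import pandas as pd
from io import BytesIO

token = 'YOUR_API_TOKEN'  # replace with your API token

response = requests.get(
    'https://aiontheball.nl/api/v1/analyses/1246/tracking/download',
    headers={'Authorization': f'Bearer {token}'},
    params={'format': 'parquet'}
)
response.raise_for_status()  # fail on HTTP errors instead of parsing an error body

df = pd.read_parquet(BytesIO(response.content))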
print(f"Loaded {len(df)} tracking points")

Large Files

Tracking files can be large. For web applications, consider:

Using the Summary endpoint for aggregated stats
Streaming JSON responses
Server-side processing with Parquet
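
For server-side processing, one way to avoid holding the whole file in memory is to stream the response to disk in chunks. The sketch below assumes the `requests` library; the helper names `write_chunks` and `download_tracking`, the 1 MiB chunk size, and the 60-second timeout are illustrative choices rather than API requirements.

```python
from pathlib import Path
from typing import Iterable


def write_chunks(chunks: Iterable[bytes], path: Path) -> int:
    """Write an iterable of byte chunks to path and return the byte count."""
    written = 0
    with open(path, "wb") as fh:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                fh.write(chunk)
                written += len(chunk)
    return written


def download_tracking(url: str, token: str, path: Path, chunk_size: int = 1 << 20) -> int:
    """Stream the download endpoint to disk in pieces of chunk_size bytes."""
    import requests  # third-party; imported here so write_chunks stays stdlib-only

    with requests.get(
        url,
        headers={"Authorization": f"Bearer {token}"},
        stream=True,  # do not buffer the full body in memory
        timeout=60,
    ) as response:
        response.raise_for_status()
        return write_chunks(response.iter_content(chunk_size=chunk_size), path)
```

The saved file can then be loaded with `pd.read_parquet`, whose `columns` argument restricts reading to the columns you need.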