API Data Limiting

Each job has a data limit of 10GB. If the total data for a job submitted through the Job Request API exceeds this limit, the job is still processed. However, when you retrieve the data through the Job Result API, the offset and limit parameters can only reach data up to the 10GB limit. Requests beyond this threshold return an error.

Key Considerations

  1. Capture Session Size
    Each capture session JSON data file is typically between 5MB and 10MB. Use this as a baseline when planning your queries. The example scenarios in this article use the 7.5MB midpoint.

  2. Date Range Recommendations
    When using the startTime and endTime parameters:

    • Recommended Range: Limit queries to one month or less.
    • This keeps most jobs under the 10GB limit. Clients with high capture volumes may still exceed it within a single month and should use shorter ranges.

  3. Job Splitting for Multi-Year Data
    Clients with multiple years of data should break their requests into monthly or smaller chunks.

    • For example, a request covering one year should be split into at least 12 monthly jobs.
    • This approach keeps job sizes manageable. Any month that still exceeds 10GB needs to be split further.
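
The monthly split described above can be sketched in a few lines of Python. This is an illustrative sketch only: the helper name monthly_chunks is hypothetical, and it assumes half-open ranges (the end date is exclusive). This article does not state whether endTime is inclusive or which timestamp format the API expects, so adjust accordingly.

```python
from datetime import date


def monthly_chunks(start: date, end: date):
    """Split the half-open range [start, end) into calendar-month pieces.

    Hypothetical helper: each (chunk_start, chunk_end) pair would become the
    startTime/endTime of one job, once converted to the API's time format.
    """
    chunks = []
    cursor = start
    while cursor < end:
        # First day of the month after `cursor`.
        if cursor.month == 12:
            next_month = date(cursor.year + 1, 1, 1)
        else:
            next_month = date(cursor.year, cursor.month + 1, 1)
        chunk_end = min(next_month, end)
        chunks.append((cursor, chunk_end))
        cursor = chunk_end
    return chunks


# A full calendar year becomes 12 monthly ranges.
year_2022 = monthly_chunks(date(2022, 1, 1), date(2023, 1, 1))
print(len(year_2022))  # 12
```

A range that starts or ends mid-month simply produces shorter first and last chunks.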

Example Scenarios

To illustrate best practices when dealing with data limits, here are a few examples showing how to split queries properly across multiple months and years.

Scenario 1: Single Job Exceeding Limits

Objective: Query data for a single month.

  • Time Period: January 2024
  • Total Captures: 1,500 sessions
  • Estimated Data Size:
    7.5MB per session × 1,500 sessions = 11.25GB

Outcome: The estimated 11.25GB exceeds the 10GB limit. To retrieve the entire dataset, split the month into two smaller jobs, each covering roughly half of the month (about 5.6GB each if captures are spread evenly).
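
The estimate in this scenario can be reproduced with a short calculation. The sketch below is an assumption-laden illustration, not part of the API: the function names are hypothetical, 7.5MB is the midpoint of the typical 5MB to 10MB session size, and 1GB is treated as 1,000MB to match the arithmetic above.

```python
import math

LIMIT_GB = 10.0   # per-job data limit described in this article
MB_PER_GB = 1000  # decimal units, matching the worked examples


def estimate_gb(sessions: int, mb_per_session: float = 7.5) -> float:
    """Estimated job size in GB for a given number of capture sessions."""
    return sessions * mb_per_session / MB_PER_GB


def jobs_needed(sessions: int, mb_per_session: float = 7.5) -> int:
    """Smallest number of equal-sized jobs that keeps each under the limit."""
    return max(1, math.ceil(estimate_gb(sessions, mb_per_session) / LIMIT_GB))


print(estimate_gb(1500))  # 11.25
print(jobs_needed(1500))  # 2
```

Passing mb_per_session=10 gives a more conservative estimate for clients whose sessions sit at the top of the typical size range.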

Scenario 2: Multi-Year Data Split into Monthly Jobs

Objective: Query data spanning multiple years. To keep retrieval manageable, split the queries into monthly jobs, then split any month whose estimate still exceeds the 10GB limit.

Year 1 (2022)

  • January 2022

    • Total Captures: 3,000 sessions
    • Estimated Data Size:
      7.5MB per session × 3,000 sessions = 22.5GB
    • Action Required: Split January into three smaller jobs, each covering approximately 10 days.
  • February 2022

    • Total Captures: 1,400 sessions
    • Estimated Data Size:
      7.5MB per session × 1,400 sessions = 10.5GB
    • Action Required: Split February into two jobs to ensure each job remains under the 10GB limit.

Year 2 (2023)

  • January 2023
    • Total Captures: 900 sessions
    • Estimated Data Size:
      7.5MB per session × 900 sessions = 6.75GB
    • Action Required: No further splitting needed as the job is under the 10GB limit.
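
Splitting a month into smaller jobs, as required for January and February 2022 above, amounts to dividing a date range into near-equal parts. The following is a minimal sketch under stated assumptions: split_range is a hypothetical helper, ranges are half-open (end date exclusive), and captures are assumed to be spread evenly across the month. If activity is concentrated on certain days, some parts may still exceed the limit.

```python
from datetime import date, timedelta


def split_range(start: date, end: date, parts: int):
    """Split [start, end) into `parts` contiguous ranges of near-equal length."""
    total_days = (end - start).days
    # Boundary i sits at the i-th fraction of the range, rounded down to whole days.
    bounds = [start + timedelta(days=(total_days * i) // parts)
              for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


# January 2022 (22.5GB estimated) split into three jobs of about 10 days each.
for job_start, job_end in split_range(date(2022, 1, 1), date(2022, 2, 1), 3):
    print(job_start, job_end)
```

For January 2022 this yields ranges starting on the 1st, 11th, and 21st, which matches the "approximately 10 days" guidance.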

Scenario 3: Full Year Data Request

Objective: Query data for the entire year of 2022.

  • Total Captures for the Year: 25,000 sessions
  • Estimated Data Size:
    7.5MB per session × 25,000 sessions = 187.5GB
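  • Action Required: Split the request into 12 monthly jobs, then split any month that exceeds 10GB into smaller jobs. At this volume the average month is roughly 15.6GB (187.5GB ÷ 12), so expect many months to need further splitting.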

Error Handling

If your query attempts to retrieve more than the 10GB limit, the Job Result API will return the following error:

{
  "error": "Data limit exceeded",
  "message": "The requested data exceeds the 10GB limit per job. Use offset and limit parameters to retrieve data within the allowed range."
}
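
Client code can check for this response before processing results. The sketch below relies only on the error payload shown above; the function name is hypothetical, and the HTTP status code and transport details are not covered in this article, so they are deliberately left out.

```python
import json

DATA_LIMIT_ERROR = "Data limit exceeded"  # value of the "error" field shown above


def is_data_limit_error(body: str) -> bool:
    """Return True if a response body is the 10GB data-limit error."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all, so it cannot be the documented error payload.
        return False
    return isinstance(payload, dict) and payload.get("error") == DATA_LIMIT_ERROR


sample = '{"error": "Data limit exceeded", "message": "The requested data exceeds the 10GB limit per job."}'
print(is_data_limit_error(sample))  # True
```

When this check returns True, the remaining data is only reachable by resubmitting the query as smaller jobs with narrower date ranges.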

Best Practices Summary

  • Limit Queries to Monthly or Smaller Chunks
    This keeps data retrieval manageable and, for most clients, keeps each job under the 10GB limit. Split further when a single month is still too large.

  • Pre-Estimate Query Size
    Before submitting, multiply the expected number of capture sessions by the typical session size (5MB to 10MB) to approximate the data size of your request.

  • Paginate Results
    Use the offset and limit parameters to retrieve data incrementally. Pagination does not raise the 10GB per-job limit, so data beyond it must be requested through separate, smaller jobs.
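
Incremental retrieval with offset and limit usually follows a simple loop. The sketch below is illustrative only: fetch_page is a hypothetical stand-in for a call to the Job Result API, and it assumes that offset and limit count records and that a page shorter than the requested limit marks the end of the data. This article does not specify either detail, so confirm them against the API reference.

```python
from typing import Callable, Iterator, List


def iter_results(fetch_page: Callable[[int, int], List[dict]],
                 page_size: int = 100) -> Iterator[dict]:
    """Yield every record by requesting consecutive pages."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        if len(page) < page_size:
            # A short (or empty) page means there is nothing left to fetch.
            break
        offset += len(page)


# In-memory stand-in for the real API call, used only to show the loop working.
records = [{"id": i} for i in range(250)]


def fake_fetch(offset: int, limit: int) -> List[dict]:
    return records[offset:offset + limit]


collected = list(iter_results(fake_fetch, page_size=100))
print(len(collected))  # 250
```

Keeping the fetch function as a parameter separates the paging logic from the HTTP client, so the loop can be tested without network access.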