Error Handling
The SDK raises typed exceptions for all API errors. Every exception inherits from DataBlueError, making it easy to catch all errors or handle specific types.
The HTTP client automatically retries transient errors (429, 5xx, and connection errors) with exponential backoff before raising an exception.
Exception Hierarchy
DataBlueError                # Base exception for all SDK errors
    AuthenticationError      # 401: bad or missing API key / JWT
    NotFoundError            # 404: resource does not exist
    RateLimitError           # 429: rate limit exceeded (retryable)
    ServerError              # 5xx: server error (retryable)
    JobFailedError           # Polled job completed with "failed" status
    TimeoutError             # Polling timeout exceeded
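Because every exception shares one base class, the order of `except` clauses decides which handler runs: specific types must come before `DataBlueError`. The sketch below illustrates this with stand-in classes that only mirror part of the hierarchy above; in real code the classes would be imported from `datablue`, and the `classify` helper is a hypothetical name used for illustration.

```python
# Stand-in classes mirroring part of the documented hierarchy.
# In real code these come from `from datablue import ...`.
class DataBlueError(Exception):
    """Base class for every SDK error."""


class RateLimitError(DataBlueError):
    """429: rate limit exceeded."""


class NotFoundError(DataBlueError):
    """404: resource does not exist."""


def classify(exc: Exception) -> str:
    """Return a coarse label for an exception (hypothetical helper).

    Specific handlers are listed before the base class; if DataBlueError
    came first, it would also catch RateLimitError.
    """
    try:
        raise exc
    except RateLimitError:
        return "rate_limited"
    except DataBlueError:
        return "sdk_error"
    except Exception:
        return "other"
```

Here a `NotFoundError` falls through to the `DataBlueError` handler because no more specific clause matches it.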
Basic Error Handling
from datablue import (
    DataBlue,
    DataBlueError,
    AuthenticationError,
    RateLimitError,
    NotFoundError,
    ServerError,
    JobFailedError,
    TimeoutError,
)

with DataBlue(api_key="wh_your_api_key") as client:
    try:
        result = client.scrape("https://example.com")
    except AuthenticationError as e:
        print(f"Auth failed: {e.message}")
        print(f"Status: {e.status_code}")  # 401
        print(f"Docs: {e.docs_url}")  # https://docs.datablue.dev/errors/authentication
    except RateLimitError as e:
        print(f"Rate limited: {e.message}")
        print(f"Retry after: {e.retry_after}s")  # seconds to wait
        print(f"Retryable: {e.is_retryable}")  # True
    except NotFoundError as e:
        print(f"Not found: {e.message}")  # 404
    except ServerError as e:
        print(f"Server error ({e.status_code}): {e.message}")
        print(f"Retryable: {e.is_retryable}")  # True
    except DataBlueError as e:
        # Catch-all for any other SDK error; keep this clause last
        print(f"API error: {e.message}")
        print(f"Status: {e.status_code}")
        print(f"Body: {e.response_body}")
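If a `RateLimitError` still reaches your code after the SDK's own retries, one option is to wait for `retry_after` and try again. The following is a minimal sketch under stated assumptions, not part of the SDK: `call_with_rate_limit_wait`, `max_attempts`, and `default_wait` are hypothetical names, and the `RateLimitError` class is a stand-in that models only the `message` and `retry_after` attributes described on this page.

```python
import time


class RateLimitError(Exception):
    """Stand-in for datablue.RateLimitError; only retry_after is modeled."""

    def __init__(self, message, retry_after=None):
        super().__init__(message)
        self.message = message
        self.retry_after = retry_after


def call_with_rate_limit_wait(fn, max_attempts=3, default_wait=1.0, sleep=time.sleep):
    """Call fn(); on RateLimitError, wait and retry up to max_attempts times.

    Waits for e.retry_after when the error carries one, else default_wait.
    The `sleep` argument is injectable so the helper can be tested quickly.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RateLimitError as e:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            wait = e.retry_after if e.retry_after is not None else default_wait
            sleep(wait)
```

With the real SDK, usage might look like `call_with_rate_limit_wait(lambda: client.scrape("https://example.com"))`, with `RateLimitError` imported from `datablue` instead of the stand-in.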
Job Errors (Crawl / Search)
from datablue import DataBlue, JobFailedError, TimeoutError

with DataBlue(api_key="wh_your_api_key") as client:
    try:
        status = client.crawl(
            "https://example.com",
            max_pages=100,
            timeout=60.0,  # fail if not done in 60s
        )
    except TimeoutError as e:
        print(f"Timed out after {e.elapsed:.1f}s")
        print(f"Job ID: {e.job_id}")
        # Optionally cancel the still-running job
        client.cancel_crawl(e.job_id)
    except JobFailedError as e:
        print(f"Job failed: {e.message}")
        print(f"Job ID: {e.job_id}")
        print(f"Response: {e.response_body}")
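One possible way to build on the pattern above is to retry a timed-out crawl with a longer timeout, cancelling the previous job first. This is a sketch only: `crawl_with_longer_timeouts` and its `timeouts` parameter are hypothetical, `JobTimeoutError` is a stand-in for the SDK's `TimeoutError`, and it assumes that calling `client.crawl` again starts a fresh job.

```python
class JobTimeoutError(Exception):
    """Stand-in for datablue.TimeoutError; models only job_id and elapsed."""

    def __init__(self, message, job_id=None, elapsed=None):
        super().__init__(message)
        self.message = message
        self.job_id = job_id
        self.elapsed = elapsed


def crawl_with_longer_timeouts(client, url, timeouts=(60.0, 300.0), **crawl_kwargs):
    """Try client.crawl with each timeout in turn (hypothetical helper).

    A timed-out job is cancelled before the next attempt so it does not
    keep running in the background. Re-raises the last timeout error if
    every attempt times out.
    """
    if not timeouts:
        raise ValueError("timeouts must contain at least one value")
    last_error = None
    for timeout in timeouts:
        try:
            return client.crawl(url, timeout=timeout, **crawl_kwargs)
        except JobTimeoutError as e:
            last_error = e
            if e.job_id is not None:
                client.cancel_crawl(e.job_id)
    raise last_error
```

Because the SDK's `TimeoutError` shares its name with Python's built-in `TimeoutError`, importing it unqualified rebinds that name in your module. An alias such as `from datablue import TimeoutError as JobTimeoutError` avoids the ambiguity and would let the helper above work without the stand-in class.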
Exception Attributes
| Attribute | Type | Available On | Description |
|---|---|---|---|
| message | str | All | Human-readable error description |
| status_code | int \| None | All | HTTP status code (if from API response) |
| response_body | dict \| None | All | Raw API response body |
| is_retryable | bool | All | Whether the request can be safely retried |
| retry_after | float \| None | RateLimitError | Seconds to wait before retrying |
| docs_url | str \| None | All | Link to documentation for this error type |
| job_id | str \| None | JobFailedError, TimeoutError | Job ID that failed or timed out |
| elapsed | float \| None | TimeoutError | Seconds elapsed before timeout |
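The attributes in the table lend themselves to structured logging. As a rough sketch, the hypothetical helper `error_to_dict` below (not part of the SDK) collects them into a plain dict, using `getattr` with a default so it stays safe for exception types that lack the type-specific attributes.

```python
def error_to_dict(exc):
    """Collect the documented attributes of an exception into a dict.

    Attributes listed as available on all errors are always included
    (None when missing); type-specific ones only when present and set.
    """
    common = ("message", "status_code", "response_body", "is_retryable", "docs_url")
    specific = ("retry_after", "job_id", "elapsed")

    record = {"type": type(exc).__name__}
    for name in common:
        record[name] = getattr(exc, name, None)
    for name in specific:
        value = getattr(exc, name, None)
        if value is not None:
            record[name] = value
    return record
```

The resulting dict can be passed to a JSON logger or attached to an error report without inspecting the exception type first.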
AI-Friendly Error Messages
In v2.0.0, error messages include fix suggestions directly in the message text, which makes them useful to both humans and AI coding assistants:
# AuthenticationError message includes fix instructions:
# "Authentication failed. Set DATABLUE_API_KEY environment variable
# or pass api_key to DataBlue(api_key='wh_...')"
# RateLimitError includes wait time:
# "Rate limit exceeded. Wait 42s before retrying,
# or reduce request frequency."
# TimeoutError includes fix suggestion:
# "Job crawl-abc123 did not complete within 300s.
# Try increasing the timeout parameter."
# ServerError indicates auto-retry:
# "Server error (502). This request will be automatically retried."
Automatic Retries
The SDK automatically retries on transient errors before raising an exception:
- 429 (Rate Limit): waits for the Retry-After header, or uses exponential backoff (max 30s)
- 5xx (Server Error): exponential backoff: 0.5s, 1s, 2s (max 10s per wait)
- Connection errors: same exponential backoff as 5xx
- Max retries: 3 by default, configurable via the max_retries parameter
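The delays listed above are consistent with a doubling schedule that starts at 0.5s and is capped per wait. The sketch below reproduces that arithmetic; it is not the SDK's actual implementation. The function names are hypothetical, and it assumes no jitter and that the 30s cap applies only to the fallback backoff rather than to the Retry-After value.

```python
def backoff_delay(attempt, base=0.5, cap=10.0):
    """Delay in seconds before retry number `attempt` (0-based).

    Doubles from `base` on each attempt and never exceeds `cap`.
    """
    return min(base * (2 ** attempt), cap)


def rate_limit_delay(attempt, retry_after=None, cap=30.0):
    """Delay for a 429 response.

    Prefers the server-provided Retry-After value; otherwise falls back
    to the same doubling schedule with a higher cap (assumption).
    """
    if retry_after is not None:
        return float(retry_after)
    return backoff_delay(attempt, cap=cap)


# With the default of 3 retries, the 5xx schedule is 0.5s, 1s, 2s.
delays = [backoff_delay(n) for n in range(3)]
```

With three retries, the worst-case added latency for a 5xx error under these assumptions is 3.5s before the exception is raised.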