Start a new document scraping job by providing a starting website URL and a natural-language prompt describing the documents to find.
Required request body fields:

| Field | Type | Description |
|---|---|---|
| website | string | Starting URL to scrape (must be a valid HTTP/HTTPS URL) |
| prompt | string | Description of the documents to find (10-500 characters) |
Optional request body fields:

| Field | Type | Default | Description |
|---|---|---|---|
| single_page | boolean | true | Only scrape the provided URL (no navigation) |
| timeout | integer | 1800 | Maximum time in seconds (60-3600) |
| confidence_threshold | float | 0.1 | Minimum AI confidence score (0.0-1.0) |
| file_type | string | "document" | Type of files to extract |
| max_file_size_mb | integer | 100 | Maximum file size in MB (1-500) |
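The constraints in the tables above can be checked client-side before sending a request, so that inputs that would fail with a `400 validation_error` are caught early. The sketch below is an illustration, not part of the official API; the field names, defaults, and ranges come from the tables above, while the helper name is hypothetical.

```python
from urllib.parse import urlparse

# Defaults for the optional fields, as documented above.
DEFAULTS = {
    "single_page": True,
    "timeout": 1800,
    "confidence_threshold": 0.1,
    "file_type": "document",
    "max_file_size_mb": 100,
}

def validate_job_request(website, prompt, **options):
    """Build a full request payload, raising ValueError on invalid input.

    Hypothetical helper: mirrors the documented constraints client-side.
    """
    if urlparse(website).scheme not in ("http", "https"):
        raise ValueError("website must be a valid HTTP/HTTPS URL")
    if not 10 <= len(prompt) <= 500:
        raise ValueError("prompt must be 10-500 characters")

    payload = {**DEFAULTS, **options, "website": website, "prompt": prompt}
    if not 60 <= payload["timeout"] <= 3600:
        raise ValueError("timeout must be 60-3600 seconds")
    if not 0.0 <= payload["confidence_threshold"] <= 1.0:
        raise ValueError("confidence_threshold must be 0.0-1.0")
    if not 1 <= payload["max_file_size_mb"] <= 500:
        raise ValueError("max_file_size_mb must be 1-500")
    return payload
```

Unspecified optional fields fall back to their documented defaults, so the resulting payload always contains every field the endpoint accepts.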
Response fields:

| Field | Type | Description |
|---|---|---|
| job_id | string | Unique identifier for the created job |
| status | string | Initial job status (always `pending`) |
| message | string | Success message |
| estimated_completion | string | ISO 8601 estimated completion time |
| created_at | string | ISO 8601 job creation timestamp |
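Because `created_at` and `estimated_completion` are ISO 8601 strings, a client can parse them to estimate how long to wait before checking on the job. The response body below is illustrative sample data shaped like the table above, not output from a real API call.

```python
from datetime import datetime

# Sample success body matching the documented response fields
# (values are illustrative only).
body = {
    "job_id": "job_abc123",
    "status": "pending",
    "message": "Job created successfully",
    "estimated_completion": "2024-01-01T12:30:00+00:00",
    "created_at": "2024-01-01T12:00:00+00:00",
}

# Parse the ISO 8601 timestamps and compute the expected duration.
created = datetime.fromisoformat(body["created_at"])
eta = datetime.fromisoformat(body["estimated_completion"])
wait_seconds = (eta - created).total_seconds()
```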
Error responses:

| Status | Error Code | Description |
|---|---|---|
| 400 | validation_error | Invalid request parameters |
| 402 | insufficient_credits | Not enough credits |
| 429 | concurrency_limit_exceeded | Too many concurrent jobs |
| 503 | service_unavailable | Required services not configured |
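The error table distinguishes failures that may clear on their own (too many concurrent jobs, services unavailable) from caller errors that will fail identically on retry (bad parameters, no credits). A minimal retry-policy sketch based on those codes, assuming the error code is returned in the response body:

```python
# Transient error codes from the table above: retrying after a backoff
# may succeed without any change to the request.
RETRYABLE = {"concurrency_limit_exceeded", "service_unavailable"}

def should_retry(status: int, error_code: str) -> bool:
    """True when backing off and retrying the same request may succeed."""
    return status in (429, 503) and error_code in RETRYABLE
```

A `400` or `402` should instead be surfaced to the caller, since resending the identical request cannot succeed.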
Authentication requires an API key in the format `sk-xxxxxxxxxxxxx` or `sk_xxxxxxxxxxxxx`.
On success, the API confirms that the job was created; the response body is a JSON object containing the fields listed above.