Chocodata

Universal Web Scraper API

Scrape any URL on the web to JSON, clean HTML, or plain text. Residential proxies and anti-bot handling for sites without a dedicated endpoint.

GET /api/v1/universal/get

The Universal Web Scraper API fetches any URL on the web and returns it in the format you ask for. Use it for sites or page types that do not have a dedicated endpoint, or any time you just want the raw page (or auto-extracted JSON) from an arbitrary URL.

You pass a url, Chocodata handles the proxies, headers, and anti-bot handling, and you get back the page. It is the same engine that powers the dedicated endpoints, exposed as a general-purpose fetcher.

Request

GET /api/v1/universal/get?url=https://example.com/article/123&api_key=cd_live_YOUR_KEY
curl "https://api.chocodata.com/api/v1/universal/get?api_key=cd_live_YOUR_KEY&url=https://example.com/article/123&parse=auto"

Query parameters

| Param | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| url | string | yes | - | The full URL to fetch. URL-encode it if it contains &, ?, or spaces. |
| parse | enum | - | auto | Output format: auto (best-effort structured JSON), html (raw rendered HTML), text (readable plain text), json (alias for auto). |
| country | string | - | auto | Two-letter country code to force proxy egress (us, de, gb). Omit to let us choose. |
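
The encoding rule for url is easy to get wrong by hand, so here is a minimal sketch of building the request URL in Python. It assumes the host shown in the curl example above; the helper name build_universal_url is hypothetical and not part of any Chocodata SDK. urllib.parse.urlencode percent-encodes the target URL so that any &, ?, or spaces inside it are not misread as part of the API's own query string.

```python
from urllib.parse import quote, urlencode

# Host and path taken from the curl example in this page.
BASE = "https://api.chocodata.com/api/v1/universal/get"

# The four documented values of the parse parameter.
VALID_PARSE = {"auto", "html", "text", "json"}


def build_universal_url(api_key, url, parse="auto", country=None):
    """Return a fully encoded request URL (hypothetical helper)."""
    if parse not in VALID_PARSE:
        raise ValueError(f"parse must be one of {sorted(VALID_PARSE)}")
    params = {"api_key": api_key, "url": url, "parse": parse}
    if country is not None:
        # Omit country entirely to let the service choose the egress.
        params["country"] = country
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return BASE + "?" + urlencode(params, quote_via=quote)


request_url = build_universal_url(
    "cd_live_YOUR_KEY",
    "https://example.com/search?q=a b&page=2",
)
```

The resulting string can be passed to any HTTP client. Encoding the whole target URL, rather than only the characters named in the table, is a conservative choice that keeps it a single unambiguous query value.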

Parse modes

The parse parameter decides what you get back:

| parse | Returns | Use when |
| --- | --- | --- |
| auto | Structured JSON: title, main text, links, metadata extracted from the page | You want clean fields without writing selectors |
| html | The full rendered HTML of the page | You want to run your own parser / selectors |
| text | Boilerplate-stripped readable text | You want article text for search, RAG, or summarization |
| json | Same as auto | Explicit alias |

Response (200) - parse=auto

{
  "url": "https://example.com/article/123",
  "resolved_url": "https://example.com/article/123",
  "title": "How tariffs reshaped the supply chain",
  "text": "The new tariffs took effect in March and within weeks ...",
  "links": [
    { "text": "Read the full report", "href": "https://example.com/report" }
  ],
  "metadata": {
    "description": "An analysis of 2026 supply-chain shifts.",
    "author": "J. Rivera",
    "published": "2026-03-14"
  }
}

Response (200) - parse=html

{
  "url": "https://example.com/article/123",
  "resolved_url": "https://example.com/article/123",
  "status": 200,
  "html": "<!doctype html><html> ... </html>"
}

Response (200) - parse=text

{
  "url": "https://example.com/article/123",
  "resolved_url": "https://example.com/article/123",
  "text": "How tariffs reshaped the supply chain\n\nThe new tariffs took effect in March ..."
}
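
The three response shapes above differ mainly in which key carries the page body. The sketch below shows one way to read them uniformly. It relies only on the field names in the sample responses (html, text, links, href); the function names page_content and link_targets are hypothetical, and real responses may carry fields not shown here.

```python
def page_content(payload: dict, parse: str = "auto") -> str:
    """Return the main body string from a decoded Universal response."""
    # In the samples, parse=html carries the body in "html";
    # auto, json, and text carry it in "text".
    key = "html" if parse == "html" else "text"
    if key not in payload:
        raise KeyError(f"expected '{key}' in a parse={parse} response")
    return payload[key]


def link_targets(payload: dict) -> list:
    """Return the href of every link in a parse=auto response."""
    # "links" appears only in the parse=auto sample, so default to empty.
    return [link["href"] for link in payload.get("links", [])]


sample_auto = {
    "url": "https://example.com/article/123",
    "title": "How tariffs reshaped the supply chain",
    "text": "The new tariffs took effect in March and within weeks ...",
    "links": [
        {"text": "Read the full report", "href": "https://example.com/report"}
    ],
}

body = page_content(sample_auto, "auto")
targets = link_targets(sample_auto)
```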

When to use Universal vs a dedicated endpoint

| Use Universal when… | Use a dedicated endpoint when… |
| --- | --- |
| The site / page type has no dedicated endpoint | The directory lists a JSON endpoint for it |
| You want the raw HTML to run your own parser | You want clean, validated, typed fields |
| You are fetching arbitrary or one-off URLs | You are scraping a known site at scale |
| You are feeding page text to an LLM / RAG pipeline | You need precise commerce / job / listing fields |

If a card in the scraper directory is marked "via Universal", that site is scraped through this endpoint rather than a hand-tuned parser. You can call /api/v1/universal/get directly with the target URL.

Cost

A Universal request costs 5 credits, the same as a dedicated request, and follows the same rule: only successful (2xx) responses are billed. Blocked pages, timeouts, and errors are free. See Billing.
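
As a rough illustration of the billing rule, the following sketch tallies credits for a batch of requests from their HTTP status codes. The 5-credit price and the 2xx-only rule come from the paragraph above; the function name credits_billed is hypothetical, and actual invoices are determined by Chocodata's billing system, not by this arithmetic.

```python
CREDITS_PER_REQUEST = 5  # price of one Universal request, per this page


def credits_billed(status_codes) -> int:
    """Estimate credits for a batch: only 2xx responses are charged."""
    return sum(CREDITS_PER_REQUEST for code in status_codes if 200 <= code < 300)


# Three successes, one blocked target (502), one rate limit (429).
estimate = credits_billed([200, 200, 502, 429, 200])
```

Here only the three 200 responses count, so the estimate is 15 credits.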

render_js and screenshot are reserved for a future release and return 501 not_implemented today. They will be opt-in add-ons (+10 credits each) when they ship.

Errors

Universal returns the same error envelope as every other endpoint. The most common cases:

| HTTP | error | Meaning |
| --- | --- | --- |
| 400 | invalid_params | url missing or malformed |
| 401 | unauthorized | Missing / bad API key |
| 429 | rate_limited | Over your concurrency or RPS ceiling |
| 502 | target_unreachable | The target blocked every internal retry |
| 502 | extraction_failed | parse=auto could not extract structure (try parse=html and parse it yourself) |
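
One way a client might react to these cases is sketched below. The error codes are the ones in the table; the action names and the retry policy are illustrative assumptions, not documented Chocodata behavior. Because the envelope's field layout is not shown on this page, the function takes the error code as a plain string instead of assuming where it lives in the response body.

```python
def next_action(http_status: int, error: str) -> str:
    """Map a documented error case to a suggested client action (hypothetical)."""
    if http_status == 429 or error == "rate_limited":
        # Over the concurrency or RPS ceiling: slow down, then retry.
        return "backoff_and_retry"
    if error == "extraction_failed":
        # The table suggests fetching raw HTML and parsing it yourself.
        return "retry_with_parse_html"
    if error == "target_unreachable":
        # Internal retries were exhausted; an immediate retry is unlikely to help.
        return "retry_later"
    if error == "invalid_params":
        return "fix_request"
    if error == "unauthorized":
        return "check_api_key"
    return "report_unexpected_error"


action = next_action(502, "extraction_failed")
```

The two 502 cases share a status code, so the error string, not the HTTP status, has to drive the decision between falling back to parse=html and waiting before a retry.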