GET /v1/scrape

Fetch the raw HTML of a public web page. Returns the target site's status code and the full HTML body.

curl "https://api.chuger.com/v1/scrape?url=https://example.com" \
  -H "Authorization: Bearer YOUR_API_TOKEN"

const res = await fetch(
  `https://api.chuger.com/v1/scrape?url=${encodeURIComponent('https://example.com')}`,
  { headers: { Authorization: `Bearer ${process.env.CHUGER_TOKEN}` } },
);
const data = await res.json();

import os, requests

r = requests.get(
    "https://api.chuger.com/v1/scrape",
    params={"url": "https://example.com"},
    headers={"Authorization": f"Bearer {os.environ['CHUGER_TOKEN']}"},
)
data = r.json()

{
  "url": "https://example.com",
  "statusCode": 200,
  "html": "<!doctype html><html>...</html>"
}

GET https://api.chuger.com/v1/scrape

Use this when you need the full HTML source — for example to detect changes, run your own parser, or feed a downstream pipeline. If you want cleaned-up content, Markdown, or metadata, use /v1/content instead.
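As an illustration of the change-detection use case, here is a minimal sketch that fingerprints the returned `html` string so two fetches can be compared. The helper names `html_fingerprint` and `has_changed` are hypothetical and not part of the Chuger API, and the approach assumes that any byte-level difference counts as a change.

```python
import hashlib


def html_fingerprint(html: str) -> str:
    """Return a SHA-256 hex digest of the HTML text (hypothetical helper)."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint: str, html: str) -> bool:
    """True when the freshly fetched HTML differs from the stored fingerprint."""
    return previous_fingerprint != html_fingerprint(html)


# Store the fingerprint from one scrape, then compare it with the next one.
stored = html_fingerprint("<html><body>v1</body></html>")
print(has_changed(stored, "<html><body>v2</body></html>"))
```

Pages that embed timestamps, nonces, or rotating ad markup will differ on every fetch, so a real pipeline would probably normalize or strip those parts before hashing.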

Authentication

Send your API token as a Bearer token in the Authorization header. See Authentication.

Cost

Plan	Credits per request
Basic	1
Pro	1
Business	1

Credits are only deducted on success.

Query parameters

query

url (string)

Required

The URL to scrape. Must be HTTP or HTTPS, max 180 characters. Raw IP hosts and non-default ports are rejected.
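The constraints above can be checked on the client before spending a request. The following is a rough sketch, not the validation Chuger itself runs: the function name `precheck_url` is hypothetical, and it assumes that an explicitly written default port (`:80` for HTTP, `:443` for HTTPS) is acceptable, which the description does not confirm.

```python
import ipaddress
from urllib.parse import urlsplit

MAX_URL_LENGTH = 180  # limit stated in the parameter description
DEFAULT_PORTS = {"http": 80, "https": 443}


def precheck_url(url: str) -> list[str]:
    """Return a list of problems; an empty list means the URL looks acceptable."""
    problems = []
    if len(url) > MAX_URL_LENGTH:
        problems.append("longer than 180 characters")
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError for a non-numeric or out-of-range port
    except ValueError:
        return problems + ["malformed URL"]
    scheme = parts.scheme.lower()
    if scheme not in DEFAULT_PORTS:
        problems.append("scheme is not http or https")
    host = parts.hostname
    if not host:
        problems.append("missing host")
    else:
        try:
            ipaddress.ip_address(host)  # parses only when the host is a literal IP
            problems.append("raw IP host")
        except ValueError:
            pass  # not an IP literal, so treat it as a hostname
    # Assumption: an explicit default port is treated the same as no port.
    if port is not None and port != DEFAULT_PORTS.get(scheme):
        problems.append("non-default port")
    return problems


print(precheck_url("https://example.com:8443/"))
```

A local check like this only mirrors the documented rules. The API remains the authority and may still answer 422 for URLs that pass it.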

Response fields

url (string)

Required

The URL that was fetched.

statusCode (integer)

Required

The HTTP status code returned by the target site, not by Chuger. A 200 response from Chuger with statusCode: 404 means the fetch itself succeeded and the target page responded with 404.

html (string)

Required

The full HTML response body.
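Because a response carries two status codes (Chuger's own HTTP status and the target site's `statusCode`), client code usually has to check both. The sketch below shows one way to separate them. `describe_result` is a hypothetical helper, and it assumes that a successful Chuger call is signalled by HTTP 200 and that the body has the three fields listed above.

```python
def describe_result(api_status: int, body: dict) -> str:
    """Summarize a scrape outcome from Chuger's status and the parsed JSON body."""
    # First layer: did the Chuger API call itself succeed?
    if api_status != 200:
        return f"Chuger request failed with HTTP {api_status}"
    # Second layer: what did the target site answer?
    target_status = body["statusCode"]
    if 200 <= target_status < 300:
        return f"fetched {body['url']} ({len(body['html'])} characters of HTML)"
    return f"fetched {body['url']}, but the target site answered {target_status}"


print(describe_result(
    200,
    {"url": "https://example.com/missing", "statusCode": 404, "html": "<h1>Not found</h1>"},
))
```

With the `requests` example from the top of this page, the call would be `describe_result(r.status_code, r.json())`.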

Errors

Status	When
`401`	Missing / invalid token
`402`	No plan, or insufficient credits
`422`	`url` missing, malformed, too long, raw IP, or non-default port
`429`	Rate limit or monthly quota exceeded
`503`	Page could not be retrieved

See Errors for the full reference.
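One possible way to act on these statuses in client code is sketched below. The action labels, `next_action`, and `backoff_seconds` are all hypothetical names chosen for illustration. Treating 429 and 503 as worth retrying is an assumption rather than documented behavior, and a 429 caused by an exhausted monthly quota will not clear after a short wait.

```python
# Suggested client reactions keyed by Chuger's HTTP status. The labels are
# hypothetical and are not values defined by the API.
ACTIONS = {
    200: "ok",
    401: "fix_token",
    402: "check_plan_or_credits",
    422: "fix_url",
    429: "wait_then_retry",  # assumption: only useful for rate limits, not quota
    503: "retry_later",      # assumption: the page may be retrievable later
}


def next_action(api_status: int) -> str:
    """Map a Chuger HTTP status to a suggested client action."""
    return ACTIONS.get(api_status, "unexpected")


def backoff_seconds(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff delay for retryable statuses (attempt starts at 0)."""
    return min(cap, base * 2 ** attempt)


print(next_action(422), backoff_seconds(3))
```

Capping the delay keeps a long run of failures from producing unbounded waits. The base and cap values here are arbitrary defaults.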

Tips

For single-page applications (SPAs) and other pages that render content with JavaScript, the raw HTML may not include dynamically rendered content. If you need the rendered content, prefer /v1/content, which handles those cases automatically.
