Arxiv Scraper API

Access any Arxiv page using our Arxiv Scraper API. Point one API request at an Arxiv URL and we get past CAPTCHAs, IP rate limiting, IP blocking, and headless-browser detection through residential proxies. You get back fields such as paper title, authors, abstract, and subject category as structured JSON.

Get your free API key Browse all scrapers

1,000 free requests. No credit card. Residential proxies.

Request

curl "https://api.chocodata.com/api/v1/universal/get?api_key=YOUR_API_KEY&url=https://arxiv.org/"

JSON Response

200 OK

{
  "success": true,
  "url": "https://arxiv.org/",
  "status": 200,
  "content": "parsed Arxiv page content..."
  // + more fields
}

237 targets

Residential proxy network

Structured JSON

1,000 free requests

Arxiv is served through the Universal Web Scraper API

You scrape Arxiv with Chocodata's Universal Web Scraper API: point it at any Arxiv URL and get the page back as structured JSON. Residential proxies, rotation, and retries are handled for you, with no Arxiv-specific setup. The same API key and credits work here as on every dedicated endpoint.

Universal scraper Start free

About Arxiv

arXiv is an open-access repository of preprint scholarly articles in physics, mathematics, computer science, and related fields. Its public listing, abstract, and search pages expose titles, authors, abstracts, subject categories, and submission dates that are worth scraping for research tracking. Each listing links to an individual scholarly work with its abstract and full-text PDF.

Scraping note: arXiv provides a public API and an OAI-PMH interface and asks automated clients to use them within its rate limits. Listing and abstract HTML pages render server-side, so a single fetch returns the full content.
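Because arXiv asks automated clients to respect rate limits, it can help to space requests out on the client side. The following is a minimal sketch of a fixed-interval throttle; the `Throttle` class and the 3-second interval are assumptions for illustration, not part of Chocodata's API or a value taken from this page, so check arXiv's current guidance before choosing a number.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum gap between consecutive calls to wait()."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous call, if any

    def wait(self) -> None:
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Assumed interval (hypothetical); replace with what arXiv currently asks for.
ARXIV_MIN_INTERVAL_SECONDS = 3.0
throttle = Throttle(ARXIV_MIN_INTERVAL_SECONDS)
```

Call `throttle.wait()` immediately before each request. The clock and sleep functions are injectable so the spacing logic can be checked without real delays.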

Example Arxiv page

https://arxiv.org/list/cs.LG/recent

Pass a URL like this as url= to the Universal Web Scraper API and get it back as JSON.
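As a rough sketch of what passing the page as `url=` looks like from code, the snippet below builds the request URL for the example listing page. The endpoint and the `api_key` and `url` parameter names come from the curl example on this page; percent-encoding the target URL is standard query-string practice and is assumed here to be accepted by the API. `build_request_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Endpoint as shown in the curl example on this page.
API_ENDPOINT = "https://api.chocodata.com/api/v1/universal/get"


def build_request_url(api_key: str, target_url: str) -> str:
    """Return the GET URL for one scrape, with the target URL percent-encoded."""
    query = urlencode({"api_key": api_key, "url": target_url})
    return f"{API_ENDPOINT}?{query}"


request_url = build_request_url(
    "YOUR_API_KEY", "https://arxiv.org/list/cs.LG/recent"
)
print(request_url)
```

Encoding matters once the target URL carries its own query string, since an unencoded `&` or `?` would otherwise be read as part of the API request.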

Fields you can extract from Arxiv

paper title
authors
abstract
subject category
arXiv identifier
submission date

Common use: A research-trends tool scrapes arXiv listings to track newly published machine-learning preprints and their authors by subject category.

Everything you need to scrape Arxiv

You never expose your own IP, you get validated JSON instead of raw HTML, and the free tier covers Arxiv scraping with a one-time 1,000 requests and no card required.

Clean structured JSON

The Arxiv page comes back as JSON in the parse mode you choose: auto, html, text, or json. There are no selectors to maintain.

Bypasses CAPTCHAs and anti-bot defenses

We get past CAPTCHAs, IP rate limiting, IP blocking, and headless-browser detection on Arxiv using rotating, country-matched residential proxies, so requests look like real users and your own IP stays private.

One API call

Scraping Arxiv is one simple API call. The same key works across our whole web scraping API: all 237 targets, any language, no SDK required.

Geo-targeting and JS rendering

Pass a country for proxy geo-targeting and turn on JavaScript rendering when an Arxiv page needs it. The API handles dynamic content and region-locked results.

Scrape Arxiv in one API request

Send one API request to the Universal Web Scraper API with an Arxiv URL and read the page back as JSON. Chocodata routes it through residential proxies and retries soft blocks, so scraping works without an Arxiv-specific integration.

JSON in the parse mode you choose
Residential-proxy infrastructure
Only successful 2xx responses are billed

Get your free API key

Request

curl "https://api.chocodata.com/api/v1/universal/get?api_key=YOUR_API_KEY&url=https://arxiv.org/"

Response

{
  "success": true,
  "url": "https://arxiv.org/",
  "status": 200,
  "content": "parsed Arxiv page content..."
  // + more fields
}
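A minimal sketch of reading that response in code, assuming only the four fields shown above (`success`, `url`, `status`, `content`). The sample response elides further fields, so none are relied on here. `extract_content` is a hypothetical helper, and treating any non-2xx `status` as a failure is an assumption.

```python
import json


def extract_content(body: str) -> str:
    """Return the 'content' field of a response body, or raise on failure."""
    payload = json.loads(body)
    status = payload.get("status", 0)
    # Assumption: a usable result has success=true and a 2xx status.
    if not payload.get("success") or not (200 <= status < 300):
        raise RuntimeError(
            f"scrape failed for {payload.get('url')}: status {status}"
        )
    return payload["content"]


# Sample body mirroring the response shown above.
sample = json.dumps(
    {
        "success": True,
        "url": "https://arxiv.org/",
        "status": 200,
        "content": "parsed Arxiv page content...",
    }
)
print(extract_content(sample))
```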

Parameters

Arxiv routes through the Universal Web Scraper API, so the parameters are the universal ones: the URL to fetch, an optional parse mode, and an optional country for proxy geo-targeting.

Parameter	Type	Required	Example
url	string	required	https://arxiv.org/
parse	enum	optional	auto
country	string	optional	us

Enum parameters accept a fixed set of values. For example, parse accepts auto, html, text, json.
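The parameter table can be mirrored in a small client-side helper that rejects an invalid `parse` value before any request is sent. This is a sketch under the assumption that `parse` and `country` are passed as plain query parameters alongside `api_key` and `url`, as the table suggests; `build_query` is a hypothetical name.

```python
from typing import Optional
from urllib.parse import urlencode

# Allowed values for the parse enum, as listed above.
PARSE_MODES = {"auto", "html", "text", "json"}


def build_query(
    api_key: str,
    url: str,
    parse: Optional[str] = None,
    country: Optional[str] = None,
) -> str:
    """Build the query string, including optional parameters only when set."""
    if parse is not None and parse not in PARSE_MODES:
        raise ValueError(f"parse must be one of {sorted(PARSE_MODES)}, got {parse!r}")
    params = {"api_key": api_key, "url": url}
    if parse is not None:
        params["parse"] = parse
    if country is not None:
        params["country"] = country
    return urlencode(params)


query = build_query(
    "YOUR_API_KEY",
    "https://arxiv.org/list/cs.LG/recent",
    parse="text",
    country="us",
)
print(query)
```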

What you can build with Arxiv data

Once the scraping API returns Arxiv search data as clean JSON, these are the patterns teams reach for most.

Data Aggregation

Pull Arxiv search records into your own warehouse and analyze them on your terms.

Market Intelligence

Track Arxiv search data over time to spot market shifts before competitors do.

Competitor Monitoring

Watch rivals on Arxiv and alert your team when prices, ranks, or listings change.

Lead Generation

Turn Arxiv search data into targeted lead lists and feed them into your CRM.

Why developers pick Chocodata for Arxiv

Fast at the tail

Median 2.6 s per request with multi-tier retry, so latency stays predictable even when Arxiv fights back.

Parity-checked output

The Universal scraper returns the live page faithfully in your chosen parse mode.

One key, 237 targets

The same scraping API key that scrapes Arxiv works across every other target, with official Node, Python, and Go SDKs.

Simple pricing that scales with you

Start free with a one-time 1,000 requests and 5,000 credits, no credit card. Scale on monthly plans from $19, or top up pay-as-you-go at $0.90 per 1,000 successful requests. Only successful 2xx responses are billed, and every plan covers the full scraping API: all 237 targets and every endpoint.
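The per-1,000 figures quoted for each plan follow from simple arithmetic, sketched below using the Vibe and Pro prices and request counts listed on this page and the stated rate of 5 credits per request. The function and variable names are illustrative only.

```python
CREDITS_PER_REQUEST = 5  # "One request is 5 credits"
TOP_UP_PER_1K_USD = 0.90  # pay-as-you-go rate


def effective_price_per_1k(monthly_price_usd: float, requests_per_month: int) -> float:
    """Monthly price divided by included requests, scaled to 1,000 requests."""
    return monthly_price_usd / requests_per_month * 1000


# (monthly price in USD, included requests per month), as listed on this page.
plans = {"Vibe": (19, 27_000), "Pro": (49, 82_000)}

for name, (price, requests) in plans.items():
    per_1k = effective_price_per_1k(price, requests)
    credits = requests * CREDITS_PER_REQUEST
    print(f"{name}: ${per_1k:.2f} per 1k requests, {credits:,} credits")
# Vibe: $0.70 per 1k requests, 135,000 credits
# Pro: $0.60 per 1k requests, 410,000 credits
```

Both plans come out below the $0.90 top-up rate, which is why a monthly plan is cheaper per request once usage is regular.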

Free

Forever free on signup

1,000 requests (5,000 credits), one-time
10 concurrent requests
All 237 targets
Full dashboard + analytics
Top-up at $0.90 / 1k
Community support

Start free

Vibe

$19 / month

$0.70 / 1k effective

27,000 requests / month (135,000 credits)
30 concurrent requests
All 237 targets + content-language
Country-matched residential IPs
Per-API-key usage tracking
Top-up at $0.90 / 1k
Email support (1 business day)

Get Vibe

Pro

$49 / month

$0.60 / 1k effective

82,000 requests / month (410,000 credits)
50 concurrent requests
Priority routing queue
Country-matched residential IPs
Team seats (up to 5)
Top-up at $0.90 / 1k
Email + chat support

Get Pro

Custom

$100-$2k / month

Flat $0.50 / 1k effective at every level

200k - 4M+ requests / month
100-500+ concurrent requests
Priority queue (highest)
Premium proxy pool + SLA on request
Unlimited team seats
Wire / invoice / annual PO
Dedicated Slack channel

Pick Custom level

Pay-as-you-go top-up

$0.90 / 1,000 successful requests

Available on every plan including Free. Top up any time when included credits run out. Only 2xx responses charged. Balance never expires.

See full pricing and calculator →

Arxiv Scraper API FAQ

Is it legal to scrape Arxiv?

Scraping publicly accessible Arxiv data is generally legal in the US under the Ninth Circuit's hiQ v. LinkedIn ruling, which held that scraping public web data does not violate the Computer Fraud and Abuse Act. It does not override Arxiv's own Terms of Service, which are a contract matter, so check those for your use case and comply with data laws such as GDPR. This is general information, not legal advice.

How do I scrape Arxiv without getting blocked?

You scrape Arxiv without getting blocked by sending the URL to the Universal Web Scraper API, which fetches it through rotating residential proxies and retries on soft blocks. Your own IP is never used, so Arxiv's anti-bot layer sees ordinary residential traffic instead of a server.

What data does the Arxiv Scraper API return?

Through the Universal Web Scraper API you get the Arxiv page back as JSON: rendered HTML, plain text, or auto-parsed content depending on the parse mode you pass. You choose the URL, Chocodata returns the data.

How much does the Arxiv Scraper API cost?

The Arxiv Scraper API costs nothing to start: the Free plan includes a one-time 1,000 requests (5,000 credits) across all 237 targets. After that, monthly plans begin at $19, or you top up pay-as-you-go at $0.90 per 1,000 successful requests. One request is 5 credits, and non-2xx responses are never charged.

Do I need an Arxiv account or login?

No Arxiv login is needed for public pages. You point the Universal Web Scraper API at a publicly accessible Arxiv URL with your Chocodata API key; you do not hand over any Arxiv credentials. Login-gated content is not supported.

View other scraping APIs

Start scraping Arxiv for free

1,000 free scraping API requests on signup across all 237 targets. No credit card required.

Get your free API key Browse all scrapers