All job types are submitted as a POST request to the /api/v1/search/live endpoint and share the same structure: a top-level type field that identifies the data source, and an arguments object that sets the parameters of the job.

URL Scraping

Extract content from specific web pages as markdown

LLM Processing

Get LLM processed text from the scraped content

scraper

Scrapes content from web pages and returns metadata, markdown, and LLM-processed text.
Parameters:
  • url (string, required): the URL to scrape
  • max_depth (int, optional): maximum link depth to scrape, counted from url
  • max_pages (int, optional): maximum number of pages to scrape
Example:
{
  "type": "web",
  "arguments": {
    "type": "scraper",
    "url": "https://docs.learnbittensor.org",
    "max_depth": 1,
    "max_pages": 3
  }
}
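The request above could be assembled and sent from Python roughly as follows. This is a minimal sketch, not an official client: the helper names build_scraper_job and build_request are hypothetical, and the base URL and the Bearer-token Authorization header are assumptions, since the text here specifies neither the host nor the authentication scheme. Only the endpoint path, the field names, and the example values come from the documentation above.

```python
import json
import urllib.request

API_PATH = "/api/v1/search/live"  # endpoint path stated in the text above


def build_scraper_job(url, max_depth=None, max_pages=None):
    """Build the JSON body for a scraper job (hypothetical helper).

    Optional parameters are left out of the body when not provided.
    """
    arguments = {"type": "scraper", "url": url}
    if max_depth is not None:
        arguments["max_depth"] = max_depth
    if max_pages is not None:
        arguments["max_pages"] = max_pages
    return {"type": "web", "arguments": arguments}


def build_request(base_url, job, api_key=None):
    """Prepare the POST request without sending it (hypothetical helper)."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumed auth scheme; check the service's own docs for the real one.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        base_url.rstrip("/") + API_PATH,
        data=json.dumps(job).encode("utf-8"),
        headers=headers,
        method="POST",
    )


job = build_scraper_job("https://docs.learnbittensor.org", max_depth=1, max_pages=3)
# "https://example.invalid" is a placeholder host, not the real service address.
request = build_request("https://example.invalid", job)
```

To actually submit the job, pass the prepared request to urllib.request.urlopen(request) and decode the response body with json.loads. The response format is not described in this section, so the sketch stops short of parsing it.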