Scrapegraphai Docs

ScrapeGraphAI is an AI-powered web data extraction suite that provides tools for structured data extraction from websites and local HTML content. It offers services like SmartScraper for web scraping, LocalScraper for processing local HTML, and Markdownify for converting web content to Markdown. The platform supports integrations with popular frameworks and provides SDKs for Python and JavaScript, making it suitable for various applications including data analysis and AI training.

https://docs.scrapegraphai.com/

Get Started: Introduction
Welcome to ScrapeGraphAI - AI-Powered Web Data Extraction
Overview
ScrapeGraphAI is a powerful suite of LLM-driven web scraping tools designed to
extract structured data from any website and HTML content. Our API is designed to
be easy to use and integrate with your existing workflows.
Perfect For
AI Applications: Feed your AI agents with structured web data for enhanced decision-making
Data Analysis: Extract and structure web data for research and analysis
Dataset Creation: Build comprehensive datasets from web sources
Platform Building: Create scraping-powered platforms and applications
Getting Started
1. Get API Key: Sign up and access your API key from the dashboard
2. Choose Your Service: Select from our specialized extraction services based on your needs
3. Start Extracting: Begin extracting data using our SDKs or direct API calls (a minimal example follows below)
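
For instance, a first call with the Python SDK (shown in full on the SmartScraper page) can be as small as this sketch:

from scrapegraph_py import Client

client = Client(api_key="your-api-key")
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the main heading and description"
)
print(response)
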
Documentation Structure
Dashboard: Learn how to manage your account, monitor jobs, and access your API keys
Services: Explore our core services: SmartScraper, LocalScraper, and Markdownify
SDKs & Integration: Implement with Python, JavaScript, or integrate with LangChain and LlamaIndex
API Reference: Detailed API documentation for direct integration
Core Services

SmartScraper: AI-powered extraction for any website
LocalScraper: AI-powered extraction for local HTML content
Markdownify: Convert web content to clean Markdown format

Implementation Options
Official SDKs
Production-ready SDKs for Python and JavaScript
Comprehensive error handling and retry logic
Type hints and full IDE support

Integrations
Seamless integration with LangChain
Native support for LlamaIndex
Perfect for AI agent workflows

Examples & Use Cases

Visit our Cookbook to explore real-world examples and implementation patterns:

E-commerce data extraction
News article scraping
Research data collection
Content aggregation

Open Source
ScrapeGraphAI is built with transparency in mind. Check out our open-source core at:
github.com/scrapegraphai/scrapegraph-ai
Ready to Start? Get your API key and start extracting data in minutes!

------- • -------

https://docs.scrapegraphai.com/cookbook/introduction

Cookbook: Introduction
Learn from practical examples using ScrapeGraphAPI
Overview
Welcome to the ScrapeGraphAPI cookbook! Here you’ll find practical examples
implemented as interactive Google Colab notebooks. Each example demonstrates
different integration methods and use cases.
All examples are available as ready-to-use Google Colab notebooks - just click and
start experimenting!
Implementation Methods
Each example is available in multiple implementations:
SDK Direct Usage: Basic implementation using our official SDKs
LangChain Integration: Integration with LangChain for LLM workflows
LlamaIndex Integration: Using ScrapeGraph with LlamaIndex tools
Example Projects
🏢 Company Information: Extract structured company data from websites
🌟 GitHub Trending: Monitor trending repositories and developers
📰 Wired Articles: Extract news articles and content
🏠 Homes Listings: Scrape real estate property data

Advanced Examples
🔬 Research Agent with Tavily: Build a sophisticated research agent combining ScrapeGraph, LangGraph, and Tavily Search
💬 Chat with Webpage: Create a RAG chatbot using ScrapeGraph, Burr, and LanceDB
Getting Started

1. Choose an example that matches your use case
2. Open the Colab notebook for your preferred implementation method
3. Follow the step-by-step instructions
4. Experiment and adapt the code for your needs

Make sure to have your ScrapeGraphAI API key ready. Get one from the dashboard if you haven’t already.

------- • -------

https://docs.scrapegraphai.com/api-reference/introduction

API Documentation: Introduction
Complete reference for the ScrapeGraphAI REST API
Overview
The ScrapeGraphAI API provides powerful endpoints for AI-powered web scraping and
content extraction. Our RESTful API allows you to extract structured data from any
website, process local HTML content, and convert web pages to clean markdown.
Authentication
All API requests require authentication using an API key. You can get your API key
from the dashboard.
SGAI-APIKEY: your-api-key-here

Keep your API key secure and never expose it in client-side code. Use environment
variables to manage your keys safely.
Base URL
https://api.scrapegraphai.com/v1
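
As an illustration, a raw HTTP call combines the base URL with the API key header. The following sketch assumes the SmartScraper endpoint is exposed at POST /v1/smartscraper and accepts the same website_url and user_prompt fields used by the SDKs:

import requests

response = requests.post(
    "https://api.scrapegraphai.com/v1/smartscraper",  # assumed endpoint path
    headers={"SGAI-APIKEY": "your-api-key-here"},
    json={
        "website_url": "https://example.com",
        "user_prompt": "Extract the main heading"
    },
)
print(response.json())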

Available Services
SmartScraper: Extract structured data from any website using AI
LocalScraper: Process local HTML content with AI extraction
Markdownify: Convert web content to clean markdown
User: Manage credits and submit feedback
SDKs & Integration
We provide official SDKs to help you integrate quickly:
Python SDK: Perfect for data science and backend applications
JavaScript SDK: Ideal for web applications and Node.js

AI Framework Integration
LangChain: Use our services in your LLM workflows
LlamaIndex: Build powerful search and QA systems
Error Handling
Our API uses conventional HTTP response codes:

200 - Success
400 - Bad Request
401 - Unauthorized
429 - Too Many Requests
500 - Server Error

Check our error handling guide for detailed information about error responses and
how to handle them.
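
In client code, these codes map naturally to a retry-or-fail decision. The following sketch is illustrative; the backoff policy is not prescribed by the API:

import time
import requests

def post_with_retry(url, headers, payload, max_retries=3):
    """POST with a simple exponential backoff on 429 responses."""
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=payload)
        if response.status_code == 429:  # Too Many Requests: back off and retry
            time.sleep(2 ** attempt)
            continue
        response.raise_for_status()  # raise on remaining 4xx/5xx codes
        return response.json()  # 200: success
    raise RuntimeError("Still rate-limited after retries")
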
Support
Need help with the API? We’re here to assist:
Discord Community: Get help from our community
Email Support: Contact our technical team

------- • -------

https://docs.scrapegraphai.com/services/smartscraper

Services: SmartScraper
AI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently extracts structured data from any website. Using advanced LLMs, it understands context and content like a human would, making web data extraction more reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal Compatibility: Works with any website structure, including JavaScript-rendered content
AI Understanding: Contextual understanding of content for accurate extraction
Structured Output: Returns clean, structured data in your preferred format
Schema Support: Define custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation
News article extraction
Blog post summarization
Product information gathering
Research data collection

Data Analysis
Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training
Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
    website_url="https://scrapegraphai.com/",
    user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response:
{
  "request_id": "sg-req-abc123",
  "status": "completed",
  "website_url": "https://scrapegraphai.com/",
  "user_prompt": "Extract info about the company",
  "result": {
    "company_name": "ScrapeGraphAI",
    "description": "ScrapeGraphAI is a powerful AI scraping API designed for efficient web data extraction to power LLM applications and AI agents...",
    "features": [
      "Effortless, cost-effective, and AI-powered data extraction",
      "Handles proxy rotation and rate limits",
      "Supports a wide variety of websites"
    ],
    "contact_email": "[email protected]",
    "social_links": {
      "github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
      "linkedin": "https://www.linkedin.com/company/101881123",
      "twitter": "https://x.com/scrapegraphai"
    }
  },
  "error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)
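
Because every request is tracked by its request_id, you can check on a job later. A sketch, assuming the client exposes a get_smartscraper(request_id) method mirroring the "Get SmartScraper Status" endpoint in the API reference:

# Assumed status-check method; the name mirrors the REST endpoint.
status = client.get_smartscraper(request_id="sg-req-abc123")
print(status["status"])  # "completed", "running", or "failed"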

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:
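A sketch using the same Pydantic pattern shown on the Python SDK page (the field names here are illustrative):

from pydantic import BaseModel, Field

class CompanyInfo(BaseModel):
    company_name: str = Field(description="Name of the company")
    description: str = Field(description="Short company description")
    contact_email: str = Field(description="Contact email address")

response = client.smartscraper(
    website_url="https://scrapegraphai.com/",
    user_prompt="Extract info about the company",
    output_schema=CompanyInfo
)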

Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():
    async with AsyncClient(api_key="your-api-key") as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main content"
        )
        print(response)

# Run the async function
asyncio.run(main())

Integration Options
Official SDKs
Python SDK - Perfect for data science and backend applications
JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations
LangChain Integration - Use SmartScraper in your LLM workflows
LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction
Be specific in your prompts
Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries

Rate Limiting
Implement reasonable delays between requests
Use async clients for better performance (see the sketch below)
Monitor your API usage
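
A sketch of pacing concurrent extractions with the AsyncClient; the concurrency cap and URL list are illustrative:

import asyncio
from scrapegraph_py import AsyncClient

async def scrape_all(urls, max_concurrent=3):
    semaphore = asyncio.Semaphore(max_concurrent)  # cap in-flight requests

    async with AsyncClient(api_key="your-api-key") as client:
        async def scrape_one(url):
            async with semaphore:
                return await client.smartscraper(
                    website_url=url,
                    user_prompt="Extract the main content"
                )
        return await asyncio.gather(*(scrape_one(u) for u in urls))

results = asyncio.run(scrape_all(["https://example.com"]))
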
Example Projects
Check out our cookbook for real-world examples:
E-commerce product scraping
News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:
Start Scraping Job
Get Job Status

Support & Resources
Documentation: Comprehensive guides and tutorials
API Reference: Detailed API documentation
Community: Join our Discord community
GitHub: Check out our open-source projects

Ready to Start? Sign up now and get your API key to begin extracting data with SmartScraper!

------- • -------

https://docs.scrapegraphai.com/services/localscraper

Services: LocalScraper
AI-powered extraction from local HTML content
Overview
LocalScraper brings the same powerful AI extraction capabilities as SmartScraper
but works with your local HTML content. This makes it perfect for scenarios where
you already have the HTML content or need to process cached pages, internal
documents, or dynamically generated content.
Try LocalScraper instantly in our interactive playground - no coding required!
Key Features
Local Processing: Process HTML content directly without making external requests
AI Understanding: Same powerful AI extraction as SmartScraper
Faster Processing: No network latency or website loading delays
Full Control: Complete control over your HTML input and processing
Use Cases
Internal Systems
Process internally cached pages
Extract from intranet content
Handle dynamic JavaScript renders
Process email templates

Batch Processing
Archive data extraction
Historical content analysis
Bulk document processing
Offline content processing

Development & Testing
Test extraction logic locally
Debug content processing
Prototype without API calls
Validate schemas offline

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

html_content = """
<html>
  <body>
    <h1>ScrapeGraphAI</h1>
    <div class="description">
      <p>AI-powered web scraping for modern applications.</p>
    </div>
    <div class="features">
      <ul>
        <li>Smart Extraction</li>
        <li>Local Processing</li>
        <li>Schema Support</li>
      </ul>
    </div>
  </body>
</html>
"""

response = client.localscraper(
    website_html=html_content,
    user_prompt="Extract the company information and features"
)

Get your API key from the dashboard


Example Response:
{
  "request_id": "sg-req-xyz789",
  "status": "completed",
  "user_prompt": "Extract the company information and features",
  "result": {
    "company_name": "ScrapeGraphAI",
    "description": "AI-powered web scraping for modern applications.",
    "features": [
      "Smart Extraction",
      "Local Processing",
      "Schema Support"
    ]
  },
  "error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:
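As with SmartScraper, you can pass a Pydantic model as output_schema; a sketch following the pattern from the Python SDK page (the field names are illustrative):

from typing import List
from pydantic import BaseModel, Field

class PageInfo(BaseModel):
    company_name: str = Field(description="Name of the company")
    features: List[str] = Field(description="Listed product features")

response = client.localscraper(
    website_html=html_content,
    user_prompt="Extract the company information and features",
    output_schema=PageInfo
)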

Async Support
For applications requiring asynchronous execution, LocalScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():
    html_content = """
    <html>
      <body>
        <h1>Product: Gaming Laptop</h1>
        <div class="price">$999.99</div>
        <div class="description">
          High-performance gaming laptop with RTX 3080.
        </div>
      </body>
    </html>
    """

    async with AsyncClient(api_key="your-api-key") as client:
        response = await client.localscraper(
            website_html=html_content,
            user_prompt="Extract the product information"
        )
        print(response)

# Run the async function
asyncio.run(main())

Integration Options
Official SDKs
Python SDK - Perfect for data science and backend applications
JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations
LangChain Integration - Use LocalScraper in your LLM workflows
LlamaIndex Integration - Build powerful search and QA systems

Best Practices
HTML Preparation
Ensure HTML is well-formed
Include relevant content only
Clean up unnecessary markup
Handle character encoding properly

Optimization Tips
Remove unnecessary scripts and styles (see the sketch below)
Clean up dynamic content placeholders
Preserve important semantic structure
Include relevant metadata
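
A sketch of pre-cleaning HTML before submitting it, assuming BeautifulSoup (bs4) is installed; the set of stripped tags is illustrative:

from bs4 import BeautifulSoup

def clean_html(raw_html: str) -> str:
    """Strip tags that add tokens but carry no extractable content."""
    soup = BeautifulSoup(raw_html, "html.parser")
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()
    return str(soup)

response = client.localscraper(
    website_html=clean_html(html_content),
    user_prompt="Extract the company information"
)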

Example Projects
Check out our cookbook for real-world examples:
Dynamic content extraction
Email template processing
Cached content analysis
Batch HTML processing

API Reference
For detailed API documentation, see:
Start Scraping Job
Get Job Status

Support & Resources
Documentation: Comprehensive guides and tutorials
API Reference: Detailed API documentation
Community: Join our Discord community
GitHub: Check out our open-source projects
Main Website: Visit our official website

Ready to Start? Sign up now and get your API key to begin processing your HTML content with LocalScraper!

------- • -------

https://docs.scrapegraphai.com/services/markdownify

Services: Markdownify
Convert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart Conversion: Intelligent content structure preservation
Clean Output: Removes ads, navigation, and irrelevant content
Format Retention: Maintains headings, lists, and text formatting
Asset Handling: Preserves images and handles external links
Use Cases
Content Migration
Convert blog posts to markdown
Transform documentation
Migrate knowledge bases
Archive web content

Documentation
Create technical documentation
Build wikis and guides
Generate README files
Maintain developer docs

Content Management
Prepare content for CMS import
Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
    website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response:
{
  "request_id": "sg-req-md456",
  "status": "completed",
  "website_url": "https://example.com/article",
  "result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved significantly with the advent of AI technologies...\n\n## Key Benefits\n\n- Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
  "error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: The converted markdown content as a string
error: Error message (if any occurred during conversion)
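
Since the result is plain markdown, persisting it takes one line; a sketch, assuming the response is the JSON object shown above (the file name is illustrative):

# Write the returned markdown to disk for later use.
with open("article.md", "w", encoding="utf-8") as f:
    f.write(response["result"])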

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():
    async with AsyncClient(api_key="your-api-key") as client:
        response = await client.markdownify(
            website_url="https://example.com/article"
        )
        print(response)

# Run the async function
asyncio.run(main())

Integration Options
Official SDKs
Python SDK - Perfect for automation and content processing
JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations
LangChain Integration - Use Markdownify in your content pipelines
LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization
Verify source content quality
Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips
Handle large content in chunks
Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:
Blog migration tools
Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:
Start Conversion Job
Get Job Status

Support & Resources
Documentation: Comprehensive guides and tutorials
API Reference: Detailed API documentation
Community: Join our Discord community
GitHub: Check out our open-source projects
Main Website: Visit our official website

Ready to Start? Sign up now and get your API key to begin converting web content to clean markdown!

------- • -------

https://docs.scrapegraphai.com/sdks/python
Official SDKs: Python SDK
Official Python SDK for ScrapeGraphAI
Installation
Install the package using pip:
pip install scrapegraph-py

Features

AI-Powered Extraction: Advanced web scraping using artificial intelligence
Flexible Clients: Both synchronous and asynchronous support
Type Safety: Structured output with Pydantic schemas
Production Ready: Detailed logging and automatic retries
Developer Friendly: Comprehensive error handling

Quick Start
Initialize the client with your API key:
from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

You can also set the SGAI_API_KEY environment variable and initialize the client
without parameters: client = Client()
Services
SmartScraper
Extract specific information from any webpage using AI:
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the main heading and description"
)

Basic Schema Example
Define a simple schema for basic data extraction:

from pydantic import BaseModel, Field

class ArticleData(BaseModel):
    title: str = Field(description="The article title")
    author: str = Field(description="The author's name")
    publish_date: str = Field(description="Article publication date")
    content: str = Field(description="Main article content")
    category: str = Field(description="Article category")

response = client.smartscraper(
    website_url="https://example.com/blog/article",
    user_prompt="Extract the article information",
    output_schema=ArticleData
)

print(f"Title: {response.title}")
print(f"Author: {response.author}")
print(f"Published: {response.publish_date}")

Advanced Schema Example
Define a complex schema for nested data structures:

from typing import List
from pydantic import BaseModel, Field

class Employee(BaseModel):
    name: str = Field(description="Employee's full name")
    position: str = Field(description="Job title")
    department: str = Field(description="Department name")
    email: str = Field(description="Email address")

class Office(BaseModel):
    location: str = Field(description="Office location/city")
    address: str = Field(description="Full address")
    phone: str = Field(description="Contact number")

class CompanyData(BaseModel):
    name: str = Field(description="Company name")
    description: str = Field(description="Company description")
    industry: str = Field(description="Industry sector")
    founded_year: int = Field(description="Year company was founded")
    employees: List[Employee] = Field(description="List of key employees")
    offices: List[Office] = Field(description="Company office locations")
    website: str = Field(description="Company website URL")

# Extract comprehensive company information
response = client.smartscraper(
    website_url="https://example.com/about",
    user_prompt="Extract detailed company information including employees and offices",
    output_schema=CompanyData
)

# Access nested data
print(f"Company: {response.name}")
print("\nKey Employees:")
for employee in response.employees:
    print(f"- {employee.name} ({employee.position})")

print("\nOffice Locations:")
for office in response.offices:
    print(f"- {office.location}: {office.address}")

LocalScraper
Process local HTML content with AI extraction:
html_content = """
<html>
<body>
<h1>Company Name</h1>
<p>We are a technology company focused on AI solutions.</p>
<div class="contact">
<p>Email: [email protected]</p>
</div>
</body>
</html>
"""

response = client.localscraper(
user_prompt="Extract the company description",
website_html=html_content
)
Markdownify
Convert any webpage into clean, formatted markdown:
response = client.markdownify(
    website_url="https://example.com"
)

Async Support
All endpoints support asynchronous operations:
import asyncio
from scrapegraph_py import AsyncClient

async def main():
    async with AsyncClient() as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main content"
        )
        print(response)

asyncio.run(main())

Feedback
Help us improve by submitting feedback programmatically:
client.submit_feedback(
    request_id="your-request-id",
    rating=5,
    feedback_text="Great results!"
)
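
You can also keep an eye on remaining credits; a sketch, assuming the client exposes a get_credits() method mirroring the JavaScript SDK's getCredits and the API's "Get Credits" endpoint:

# Assumed method name; see the "Get Credits" endpoint in the API reference.
credits = client.get_credits()
print(credits)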

Support
GitHub: Report issues and contribute to the SDK
Email Support: Get help from our development team

License
This project is licensed under the MIT License. See the LICENSE file for details.

------- • -------

https://docs.scrapegraphai.com/sdks/javascript

Official SDKs: JavaScript SDK
Official JavaScript/TypeScript SDK for ScrapeGraphAI
Installation
Install the package using npm or yarn:
# Using npm
npm i scrapegraph-js

# Using yarn
yarn add scrapegraph-js

Features
AI-Powered Extraction: Smart web scraping with artificial intelligence
Async by Design: Fully asynchronous architecture
Type Safety: Built-in TypeScript support with Zod schemas
Production Ready: Automatic retries and detailed logging
Developer Friendly: Comprehensive error handling

Quick Start
Initialize with your API key:
import { smartScraper } from 'scrapegraph-js';

const apiKey = process.env.SGAI_APIKEY;
const websiteUrl = 'https://example.com';
const prompt = 'Extract the main heading and description';

try {
  const response = await smartScraper(apiKey, websiteUrl, prompt);
  console.log(response.result);
} catch (error) {
  console.error('Error:', error);
}

Store your API keys securely in environment variables. Use .env files and libraries
like dotenv to load them into your app.
Services
SmartScraper
Extract specific information from any webpage using AI:
const response = await smartScraper(
  apiKey,
  'https://example.com',
  'Extract the main content'
);

Basic Schema Example
Define a simple schema using Zod:

import { z } from 'zod';

const ArticleSchema = z.object({
  title: z.string().describe('The article title'),
  author: z.string().describe('The author\'s name'),
  publishDate: z.string().describe('Article publication date'),
  content: z.string().describe('Main article content'),
  category: z.string().describe('Article category')
});

const response = await smartScraper(
  apiKey,
  'https://example.com/blog/article',
  'Extract the article information',
  ArticleSchema
);

console.log(`Title: ${response.result.title}`);
console.log(`Author: ${response.result.author}`);
console.log(`Published: ${response.result.publishDate}`);

Advanced Schema Example
Define a complex schema for nested data structures:

import { z } from 'zod';

const EmployeeSchema = z.object({
  name: z.string().describe('Employee\'s full name'),
  position: z.string().describe('Job title'),
  department: z.string().describe('Department name'),
  email: z.string().describe('Email address')
});

const OfficeSchema = z.object({
  location: z.string().describe('Office location/city'),
  address: z.string().describe('Full address'),
  phone: z.string().describe('Contact number')
});

const CompanySchema = z.object({
  name: z.string().describe('Company name'),
  description: z.string().describe('Company description'),
  industry: z.string().describe('Industry sector'),
  foundedYear: z.number().describe('Year company was founded'),
  employees: z.array(EmployeeSchema).describe('List of key employees'),
  offices: z.array(OfficeSchema).describe('Company office locations'),
  website: z.string().url().describe('Company website URL')
});

// Extract comprehensive company information
const response = await smartScraper(
  apiKey,
  'https://example.com/about',
  'Extract detailed company information including employees and offices',
  CompanySchema
);

// Access nested data
console.log(`Company: ${response.result.name}`);
console.log('\nKey Employees:');
response.result.employees.forEach(employee => {
  console.log(`- ${employee.name} (${employee.position})`);
});

console.log('\nOffice Locations:');
response.result.offices.forEach(office => {
  console.log(`- ${office.location}: ${office.address}`);
});

LocalScraper
Process local HTML content with AI extraction:
import { localScraper } from 'scrapegraph-js';

const html = `
<html>
  <body>
    <h1>Company Name</h1>
    <p>We are a technology company focused on AI solutions.</p>
    <div class="contact">
      <p>Email: [email protected]</p>
    </div>
  </body>
</html>
`;

const response = await localScraper(
  apiKey,
  html,
  'Extract the company description'
);
Markdownify
Convert any webpage into clean, formatted markdown:
import { markdownify } from 'scrapegraph-js';

const response = await markdownify(
  apiKey,
  'https://example.com'
);

API Credits
Check your available API credits:
import { getCredits } from 'scrapegraph-js';

try {
  const credits = await getCredits(apiKey);
  console.log('Available credits:', credits);
} catch (error) {
  console.error('Error fetching credits:', error);
}

Feedback
Help us improve by submitting feedback programmatically:
import { sendFeedback } from 'scrapegraph-js';

try {
  await sendFeedback(
    apiKey,
    'request-id',
    5,
    'Great results!'
  );
} catch (error) {
  console.error('Error sending feedback:', error);
}

Support
GitHub: Report issues and contribute to the SDK
Email Support: Get help from our development team

License
This project is licensed under the MIT License. See the LICENSE file for details.

------- • -------

https://docs.scrapegraphai.com/integrations/langchain

Integrations: 🦜 LangChain
Supercharge your LangChain agents with AI-powered web scraping
Overview
The LangChain integration enables your agents to extract structured data from
websites using natural language. This powerful combination allows you to build
sophisticated AI agents that can understand and process web content intelligently.
Official LangChain Documentation: View the integration in LangChain’s official documentation
Installation
Install the package using pip:
pip install langchain-scrapegraph

Available Tools
SmartScraperTool
Extract structured data from any webpage using natural language prompts:
from langchain_scrapegraph.tools import SmartScraperTool

# Initialize the tool (uses SGAI_API_KEY from environment)
tool = SmartScraperTool()

# Extract information using natural language
result = tool.invoke({
    "website_url": "https://www.example.com",
    "user_prompt": "Extract the main heading and first paragraph"
})

Using Output Schemas
Define the structure of the output using Pydantic models:

from typing import List
from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import SmartScraperTool

class WebsiteInfo(BaseModel):
    title: str = Field(description="The main title of the webpage")
    description: str = Field(description="The main description or first paragraph")
    urls: List[str] = Field(description="The URLs inside the webpage")

# Initialize with schema
tool = SmartScraperTool(llm_output_schema=WebsiteInfo)

result = tool.invoke({
    "website_url": "https://www.example.com",
    "user_prompt": "Extract the website information"
})

LocalScraperTool
Process HTML content directly with AI extraction:
from langchain_scrapegraph.tools import LocalScraperTool

tool = LocalScraperTool()
result = tool.invoke({
    "user_prompt": "Extract all contact information",
    "website_html": "<html>...</html>"
})

Using Output Schemas

from typing import Optional
from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import LocalScraperTool

class CompanyInfo(BaseModel):
    name: str = Field(description="The company name")
    description: str = Field(description="The company description")
    email: Optional[str] = Field(description="Contact email if available")
    phone: Optional[str] = Field(description="Contact phone if available")

tool = LocalScraperTool(llm_output_schema=CompanyInfo)

html_content = """
<html>
  <body>
    <h1>TechCorp Solutions</h1>
    <p>We are a leading AI technology company.</p>
    <div class="contact">
      <p>Email: [email protected]</p>
      <p>Phone: (555) 123-4567</p>
    </div>
  </body>
</html>
"""

result = tool.invoke({
    "website_html": html_content,
    "user_prompt": "Extract the company information"
})

MarkdownifyTool
Convert any webpage into clean, formatted markdown:
from langchain_scrapegraph.tools import MarkdownifyTool

tool = MarkdownifyTool()
markdown = tool.invoke({"website_url": "https://example.com"})

Example Agent
Create a research agent that can gather and analyze web data:
from langchain.agents import initialize_agent, AgentType
from langchain_openai import ChatOpenAI
from langchain_scrapegraph.tools import SmartScraperTool

# Initialize tools
tools = [
    SmartScraperTool(),
]

# Create an agent
agent = initialize_agent(
    tools=tools,
    llm=ChatOpenAI(temperature=0),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# Use the agent
response = agent.run("""
Visit example.com, make a summary of the content and extract the main heading
and first paragraph
""")

Configuration
Set your ScrapeGraph API key in your environment:
export SGAI_API_KEY="your-api-key-here"

Or set it programmatically:
import os
os.environ["SGAI_API_KEY"] = "your-api-key-here"

Get your API key from the dashboard


Use Cases
Research Agents: Create agents that gather and analyze web data
Data Collection: Automate structured data extraction from websites
Content Processing: Convert web content into markdown for further processing
Information Extraction: Extract specific data points using natural language
Support
Need help with the integration?
GitHub Issues: Report bugs and request features
Discord Community: Get help from our community

------- • -------

https://docs.scrapegraphai.com/integrations/llamaindex

Integrations: 🦙 LlamaIndex
Integrate ScrapeGraphAI with LlamaIndex for powerful data ingestion
Overview
This tool integrates ScrapeGraph with LlamaIndex, providing intelligent web
scraping capabilities with structured data extraction.
Official LlamaHub Documentation: View the integration on LlamaHub
Installation
Install the package using pip:
pip install llama-index-tools-scrapegraphai

Usage
First, import and initialize the ScrapegraphToolSpec:
from llama_index.tools.scrapegraph.base import ScrapegraphToolSpec

scrapegraph_tool = ScrapegraphToolSpec()

Available Functions
Smart Scraping (Sync)
Extract structured data using a schema:
from pydantic import BaseModel, Field

class FounderSchema(BaseModel):
    name: str = Field(description="Name of the founder")
    role: str = Field(description="Role of the founder")
    social_media: str = Field(description="Social media URL of the founder")

class ListFoundersSchema(BaseModel):
    founders: list[FounderSchema] = Field(description="List of founders")

response = scrapegraph_tool.scrapegraph_smartscraper(
    prompt="Extract the list of founders",
    url="https://scrapegraphai.com/",
    api_key="sgai-***",
    schema=ListFoundersSchema,
)

result = response["result"]
for founder in result["founders"]:
    print(founder)

Smart Scraping (Async)
Asynchronous version of the smart scraper:

result = await scrapegraph_tool.scrapegraph_smartscraper_async(
    prompt="Extract product information",
    url="https://example.com/product",
    api_key="your-api-key",
    schema=schema,
)

Submit Feedback
Provide feedback on extraction results:
response = scrapegraph_tool.scrapegraph_feedback(
    request_id="request-id",
    api_key="your-api-key",
    rating=5,
    feedback_text="Great results!",
)

Check Credits
Monitor your API credit usage:
credits = scrapegraph_tool.scrapegraph_get_credits(api_key="your-api-key")

Use Cases
RAG Applications: Build powerful retrieval-augmented generation systems
Knowledge Bases: Create and maintain up-to-date knowledge bases
Web Research: Automate web research and data collection
Content Indexing: Index and structure web content for search
Support
Need help with the integration?
GitHub Issues: Report bugs and request features
Discord Community: Get help from our community

------- • -------

https://docs.scrapegraphai.com/integrations/crewai

Integrations: 👥 CrewAI
Use CrewAI with Scrapegraph
Overview
CrewAI is a framework for orchestrating role-playing AI agents. With the
Scrapegraph CrewAI integration, you can easily incorporate web scraping
capabilities into your agent workflows.
Try it in Google Colab: Interactive example notebook to get started with CrewAI and Scrapegraph
Installation
Install the required packages:
pip install crewai scrapegraph-tools python-dotenv

Available Tools
ScrapegraphScrapeTool
The ScrapegraphScrapeTool provides web scraping capabilities to your CrewAI agents:
from crewai import Agent, Crew, Task
from crewai_tools import ScrapegraphScrapeTool
from dotenv import load_dotenv

# Initialize the tool
tool = ScrapegraphScrapeTool()

# Create an agent with the tool
agent = Agent(
    role="Web Researcher",
    goal="Research and extract accurate information from websites",
    backstory="You are an expert web researcher with experience in extracting and analyzing information from various websites.",
    tools=[tool],
)

Complete Example

from crewai import Agent, Crew, Task
from crewai_tools import ScrapegraphScrapeTool
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Initialize the Scrapegraph tool
tool = ScrapegraphScrapeTool()

# Create an agent with the Scrapegraph tool
agent = Agent(
    role="Web Researcher",
    goal="Research and extract accurate information from websites",
    backstory="You are an expert web researcher with experience in extracting and analyzing information from various websites.",
    tools=[tool],
)

# Define a task for the agent
task = Task(
    name="scraping task",
    description="Visit the website https://scrapegraphai.com and extract detailed information about the founders, including their names, roles, and any relevant background information.",
    expected_output="A file with the information extracted from the website.",
    agent=agent,
)

# Create a crew with the agent and task
crew = Crew(
    agents=[agent],
    tasks=[task],
)

# Execute the task
result = crew.kickoff()

Configuration
Set your Scrapegraph API key in your environment:
export SCRAPEGRAPH_API_KEY="your-api-key-here"
Or using a .env file:
SCRAPEGRAPH_API_KEY=your_api_key_here

Get your API key from the dashboard


Use Cases
Content Research: Gather information from multiple websites for market research or competitive analysis
Data Collection: Extract structured data from websites for analysis or database population
Automated Monitoring: Keep track of changes on specific web pages
Information Extraction: Extract specific data points using natural language
Best Practices
Rate Limiting: Be mindful of website rate limits and implement appropriate delays
Error Handling: Implement proper error handling for failed requests
Data Validation: Verify extracted data meets requirements
Ethical Scraping: Respect robots.txt and website terms of service
Support
Need help with the integration?
GitHub Repository: Report bugs and request features
Discord Community: Get help from our community

------- • -------

https://docs.scrapegraphai.com/integrations/phidata

Integrations: 🦐 Phidata
Build AI Assistants with ScrapeGraph using Phidata
Overview
Phidata is a development framework for building production-ready AI Assistants.
This integration allows you to easily add ScrapeGraph’s web scraping capabilities
to your Phidata-powered AI agents.
Official Phidata Documentation: Learn more about building AI Assistants with Phidata
Installation
Install the required packages:
pip install -U phidata
pip install scrapegraph-py

Usage
Basic Example
Create an AI Assistant with ScrapeGraph tools:
from phi.agent import Agent
from phi.tools.scrapegraph_tools import ScrapeGraphTools

# Initialize with smartscraper enabled
scrapegraph = ScrapeGraphTools(smartscraper=True)

# Create an agent with the tools
agent = Agent(
    tools=[scrapegraph],
    show_tool_calls=True,
    markdown=True,
    stream=True
)

# Use smartscraper to extract structured data
agent.print_response("""
Use smartscraper to extract the following from https://www.wired.com/category/science/:
- News articles
- Headlines
- Images
- Links
- Author
""")

Markdown Conversion
You can also use ScrapeGraph to convert web pages to markdown:
from phi.agent import Agent
from phi.tools.scrapegraph_tools import ScrapeGraphTools

# Initialize with only markdownify enabled
scrapegraph_md = ScrapeGraphTools(smartscraper=False)

# Create an agent for markdown conversion
agent_md = Agent(
    tools=[scrapegraph_md],
    show_tool_calls=True,
    markdown=True
)

# Convert webpage to markdown
agent_md.print_response(
    "Fetch and convert https://www.wired.com/category/science/ to markdown format"
)

Features
Smart Scraping: Extract structured data using natural language
Markdown Conversion: Convert web pages to clean markdown
Streaming Support: Real-time responses with streaming
Tool Visibility: Debug with visible tool calls
Support
Need help with the integration?
Phidata Discord: Join the Phidata community
GitHub Cookbook: Check out the source code

------- • -------

https://docs.scrapegraphai.com/contribute/opensource

Contribute: Open Source
ScrapeGraphAI open-source ecosystem
Our Open Source Projects
ScrapeGraphAI is committed to the open-source community. We maintain several
projects to help developers integrate and extend our services.
Core Project
scrapegraph-ai: Our main open-source repository containing the core AI-powered web scraping engine. This is the foundation of our API service.
Features

Advanced AI extraction engine
Smart content processing
Intelligent schema handling
Modular architecture

Integration Tools
scrapegraph-sdk: Official Python and JavaScript SDKs for easy API integration.
langchain-scrapegraph: Official LangChain integration for LLM workflows.
Installation

The scrapegraphai package is our core library that powers the API service. For most
use cases, we recommend using our SDKs (scrapegraph-py or scrapegraph-js) which
provide a convenient interface to the API.
Resources
Documentation: Comprehensive guides and API reference
Examples: Real-world usage examples and tutorials
Discord Community: Join our developer community
GitHub Organization: Browse all our open-source projects
Support
Need help with our open-source projects? We’re here to assist:
GitHub Issues: Report bugs and request features
Discord Support: Get help from our community

------- • -------

https://docs.scrapegraphai.com/contribute/feedback

Contribute: Feedback
Help us improve ScrapeGraphAI
Share Your Experience
Your feedback helps us improve ScrapeGraphAI. There are several ways to share your
thoughts and experiences with us.
API Feedback
You can rate extraction results programmatically through the feedback endpoints in our SDKs.

Feature Requests
Have an idea for a new feature? We’d love to hear it!
Submit Feature Request: Share your ideas on our GitHub repository
Bug Reports
Found a bug? Help us improve by reporting it:
Core Library: Report issues with the core functionality
SDK Issues: Report SDK-specific problems
Community Channels
Join our community to discuss ideas, share experiences, and get help:
Discord Community: Chat with other developers and our team
GitHub Discussions: Participate in technical discussions
Contact Us
Need to reach us directly?
Email Support: Contact our support team for direct assistance
For urgent issues or private inquiries, please email us directly. For general questions and discussions, we encourage using our Discord community or GitHub discussions.

------- • -------

https://docs.scrapegraphai.com/resources/badge

Resources: 🕷️ ScrapeGraphAI Badge
Use our badges for ScrapeGraphAI integration

We love seeing what you build with the ScrapeGraphAI API each day. For projects and demos built with ScrapeGraphAI, please use our "Powered by ScrapeGraphAI" badge in your application’s user interface.

Installation
You can use the following HTML code snippet to integrate our badge into your user
interface:
<a href="https://scrapegraphai.com" target="_blank" rel="noopener noreferrer">
<img
src="https://i.ibb.co/BryKk4x/Screenshot-2025-01-06-at-18-48-45-removebg-
preview.png"
alt="Powered by ScrapeGraphAI for simple and fast scraping."
/>
</a>
Was this page helpful?YesNoFeedbackxgithublinkedinPowered by MintlifyOn this
pageInstallation

------- • -------

https://docs.scrapegraphai.com/dashboard/overview

Dashboard
Overview of your ScrapeGraphAI dashboard

Dashboard Overview
The ScrapeGraphAI dashboard is your central hub for managing all your web scraping
operations. Here you can monitor your usage, start new jobs, and manage your
account settings.

Main Dashboard Elements

API Key: Your personal authentication key required for accessing all services (see the sketch below)
Total Requests: Counter showing your total API calls across all services
Last Used: Timestamp of your most recent API request
Quick Actions: Buttons to start new scraping jobs or access common features
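
For reference, the sketch below shows how the API key from your dashboard is typically passed to the official Python SDK. The Client constructor and smartscraper method are assumptions for illustration; see the Python SDK pages for the authoritative interface.

# Minimal sketch, assuming the scrapegraph-py SDK's Client interface:
# the dashboard API key authenticates every request.
from scrapegraph_py import Client

client = Client(api_key="sgai-xxxxxxxx")  # key copied from the dashboard

# One SmartScraper call; it counts toward the Total Requests metric above.
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the page title and main heading",
)
print(response)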

Usage Analytics
Track your API usage patterns with our detailed analytics view:

The usage graph provides:

Service-specific metrics: Track usage for SmartScraper, LocalScraper, and Markdownify separately
Time-based analysis: View usage patterns over different time periods
Interactive tooltips: Hover over data points to see detailed information
Trend analysis: Identify usage patterns and optimize your API consumption

Key Features

Usage Statistics: Monitor your API usage and remaining credits
Recent Jobs: View and manage your recent scraping jobs
Quick Actions: Start new scraping jobs with just a few clicks
System Status: Check the current system status and any ongoing maintenance

Getting Started

Log in to your dashboard
View your API key in the settings section
Check your available credits
Start your first scraping job (see the sketch below)
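
If you prefer direct API calls over the SDKs, a first job might look like the sketch below. The endpoint path and header name here are assumptions for illustration only; the API Reference holds the authoritative values.

# Hypothetical direct REST call -- endpoint and auth header are assumed.
import requests

API_KEY = "sgai-xxxxxxxx"  # from the dashboard settings section

resp = requests.post(
    "https://api.scrapegraphai.com/v1/smartscraper",  # assumed endpoint
    headers={"SGAI-APIKEY": API_KEY},                 # assumed header name
    json={
        "website_url": "https://example.com",
        "user_prompt": "Extract the page title",
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json())  # the finished job also appears under Recent Jobs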

Need Help? Check out our quickstart guide or contact support.

------- • -------

https://docs.scrapegraphai.com/cookbook/introduction#overview

Cookbook: Introduction
Learn from practical examples using ScrapeGraphAPI

Overview
Welcome to the ScrapeGraphAPI cookbook! Here you’ll find practical examples
implemented as interactive Google Colab notebooks. Each example demonstrates
different integration methods and use cases.
All examples are available as ready-to-use Google Colab notebooks - just click and
start experimenting!
Implementation Methods
Each example is available in multiple implementations:
SDK Direct Usage: basic implementation using our official SDKs
LangChain Integration: integration with LangChain for LLM workflows (see the sketch below)
LlamaIndex Integration: using ScrapeGraph with LlamaIndex tools
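
As an illustration of the LangChain path, here is a minimal sketch. It assumes the langchain-scrapegraph package and its SmartScraperTool, and that the tool picks up the API key from the environment; treat the import path and parameter names as illustrative rather than authoritative.

# Hypothetical LangChain integration sketch (assumed tool and env var).
from langchain_scrapegraph.tools import SmartScraperTool

tool = SmartScraperTool()  # assumed to read SGAI_API_KEY from the environment

result = tool.invoke({
    "website_url": "https://example.com",
    "user_prompt": "Extract the page title and description",
})
print(result)
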
Example Projects
🏢 Company Information: extract structured company data from websites
🌟 GitHub Trending: monitor trending repositories and developers
📰 Wired Articles: extract news articles and content
🏠 Homes Listings: scrape real estate property data
Advanced Examples
🔬 Research Agent with Tavily: build a sophisticated research agent combining ScrapeGraph, LangGraph, and Tavily Search
💬 Chat with Webpage: create a RAG chatbot using ScrapeGraph, Burr, and LanceDB
Getting Started

Choose an example that matches your use case
Open the Colab notebook for your preferred implementation method
Follow the step-by-step instructions
Experiment and adapt the code for your needs

Make sure to have your ScrapeGraphAI API key ready. Get one from the dashboard if
you haven’t already.

------- • -------

https://docs.scrapegraphai.com/cookbook/introduction#implementation-methods
ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationCookbookIntroduction
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogCookbookIntroductionExamples🏢
Company Information🌟 GitHub Trending📰 Wired Articles🏠 Homes Listings🔬 News Research
Agent💬 Chat with WebpageCookbookIntroductionLearn from practical examples using
ScrapeGraphAPIOverview
Welcome to the ScrapeGraphAPI cookbook! Here you’ll find practical examples
implemented as interactive Google Colab notebooks. Each example demonstrates
different integration methods and use cases.
All examples are available as ready-to-use Google Colab notebooks - just click and
start experimenting!
Implementation Methods
Each example is available in multiple implementations:
SDK Direct UsageBasic implementation using our official SDKsLangChain
IntegrationIntegration with LangChain for LLM workflowsLlamaIndex IntegrationUsing
ScrapeGraph with LlamaIndex tools
Example Projects
🏢 Company InformationExtract structured company data from websites🌟 GitHub
TrendingMonitor trending repositories and developers📰 Wired ArticlesExtract news
articles and content🏠 Homes ListingsScrape real estate property data
Advanced Examples
🔬 Research Agent with TavilyBuild a sophisticated research agent combining
ScrapeGraph, LangGraph, and Tavily Search💬 Chat with WebpageCreate a RAG chatbot
using ScrapeGraph, Burr, and LanceDB
Getting Started

Choose an example that matches your use case


Open the Colab notebook for your preferred implementation method
Follow the step-by-step instructions
Experiment and adapt the code for your needs

Make sure to have your ScrapeGraphAI API key ready. Get one from the dashboard if
you haven’t already.Was this page helpful?YesNo🏢 Company
InformationxgithublinkedinPowered by MintlifyOn this pageOverviewImplementation
MethodsExample ProjectsAdvanced ExamplesGetting Started

------- • -------

https://docs.scrapegraphai.com/api-reference/introduction

API Documentation: Introduction
Complete reference for the ScrapeGraphAI REST API

Overview
The ScrapeGraphAI API provides powerful endpoints for AI-powered web scraping and
content extraction. Our RESTful API allows you to extract structured data from any
website, process local HTML content, and convert web pages to clean markdown.
Authentication
All API requests require authentication using an API key. You can get your API key
from the dashboard.
SGAI-APIKEY: your-api-key-here
Keep your API key secure and never expose it in client-side code. Use environment
variables to manage your keys safely.
Base URL
https://api.scrapegraphai.com/v1

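Putting the header and base URL together, here is a minimal sketch of an authenticated request in Python; the /credits path is an assumption based on the "Get Credits" endpoint listed above, so confirm it on that endpoint's reference page:

import os
import requests

# Read the key from the environment rather than hard-coding it.
response = requests.get(
    "https://api.scrapegraphai.com/v1/credits",  # assumed path for the "Get Credits" endpoint
    headers={"SGAI-APIKEY": os.environ["SGAI_API_KEY"]},
    timeout=30,
)
print(response.json())
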
Available Services
SmartScraper: Extract structured data from any website using AI
LocalScraper: Process local HTML content with AI extraction
Markdownify: Convert web content to clean markdown
User: Manage credits and submit feedback
SDKs & Integration
We provide official SDKs to help you integrate quickly:
Python SDK: Perfect for data science and backend applications
JavaScript SDK: Ideal for web applications and Node.js
AI Framework Integration
LangChain: Use our services in your LLM workflows
LlamaIndex: Build powerful search and QA systems
Error Handling
Our API uses conventional HTTP response codes:

200 - Success
400 - Bad Request
401 - Unauthorized
429 - Too Many Requests
500 - Server Error

Check our error handling guide for detailed information about error responses and
how to handle them.
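As a sketch, these codes can be handled explicitly around any endpoint call; the retry-on-429 policy below is an illustration, not an API requirement:

import os
import time
import requests

def call_api(url: str, payload: dict) -> dict:
    # Generic POST wrapper that maps the documented status codes to actions.
    headers = {"SGAI-APIKEY": os.environ["SGAI_API_KEY"]}
    for attempt in range(3):
        response = requests.post(url, headers=headers, json=payload, timeout=60)
        if response.status_code == 200:
            return response.json()
        if response.status_code == 429:
            time.sleep(2 ** attempt)  # back off and retry on rate limits
            continue
        if response.status_code == 401:
            raise RuntimeError("Unauthorized: check your SGAI-APIKEY")
        response.raise_for_status()  # raises on 400, 500, and anything else
    raise RuntimeError("Rate limited after repeated retries")
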
Support
Need help with the API? We’re here to assist:
Discord Community: Get help from our community
Email Support: Contact our technical team

------- • -------
https://docs.scrapegraphai.com/services/smartscraper

Services: SmartScraper
AI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently extracts structured data from any website. It understands context and content much as a human reader would, making web data extraction more reliable than brittle selector-based scraping.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal Compatibility: Works with any website structure, including JavaScript-rendered content
AI Understanding: Contextual understanding of content for accurate extraction
Structured Output: Returns clean, structured data in your preferred format
Schema Support: Define custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction
Blog post summarization
Product information gathering
Research data collection

Data Analysis

Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
    website_url="https://scrapegraphai.com/",
    user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response

{
  "request_id": "sg-req-abc123",
  "status": "completed",
  "website_url": "https://scrapegraphai.com/",
  "user_prompt": "Extract info about the company",
  "result": {
    "company_name": "ScrapeGraphAI",
    "description": "ScrapeGraphAI is a powerful AI scraping API designed for efficient web data extraction to power LLM applications and AI agents...",
    "features": [
      "Effortless, cost-effective, and AI-powered data extraction",
      "Handles proxy rotation and rate limits",
      "Supports a wide variety of websites"
    ],
    "contact_email": "[email protected]",
    "social_links": {
      "github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
      "linkedin": "https://www.linkedin.com/company/101881123",
      "twitter": "https://x.com/scrapegraphai"
    }
  },
  "error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

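A minimal sketch using Pydantic, assuming the SDK accepts the model via an output_schema parameter (that parameter name is an assumption; check the Python SDK reference for the exact signature):

from pydantic import BaseModel
from scrapegraph_py import Client

class CompanyInfo(BaseModel):
    company_name: str
    description: str
    features: list[str]

client = Client(api_key="your-api-key")

response = client.smartscraper(
    website_url="https://scrapegraphai.com/",
    user_prompt="Extract info about the company",
    output_schema=CompanyInfo,  # assumed parameter name; constrains the extracted fields
)
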
Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():
    async with AsyncClient(api_key="your-api-key") as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main content"
        )
        print(response)

# Run the async function
asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications
JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows
LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction

Be specific in your prompts
Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries

Rate Limiting

Implement reasonable delays between requests (see the sketch below)
Use async clients for better performance
Monitor your API usage

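A minimal sketch of spacing out SDK calls; the one-second delay is an arbitrary illustration, so tune it to your plan's limits:

import time
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

def polite_scrape(urls, prompt, delay_seconds=1.0):
    # Space out sequential requests to stay well under rate limits.
    results = []
    for url in urls:
        results.append(client.smartscraper(website_url=url, user_prompt=prompt))
        time.sleep(delay_seconds)
    return results
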
Example Projects
Check out our cookbook for real-world examples:

E-commerce product scraping
News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:

Start Scraping Job
Get Job Status

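Since extraction runs as a job, the typical pattern is to start it and poll for status until it leaves the "running" state. This sketch uses raw HTTP with assumed /smartscraper paths (confirm the exact paths on the endpoint pages); the request_id and status fields come from the example response above:

import os
import time
import requests

BASE_URL = "https://api.scrapegraphai.com/v1"
HEADERS = {"SGAI-APIKEY": os.environ["SGAI_API_KEY"]}

# Start the job (path and payload assumed for illustration).
job = requests.post(
    f"{BASE_URL}/smartscraper",
    headers=HEADERS,
    json={"website_url": "https://example.com", "user_prompt": "Extract the main content"},
    timeout=60,
).json()

# Poll until the job is no longer running.
while True:
    status = requests.get(
        f"{BASE_URL}/smartscraper/{job['request_id']}",  # assumed status path
        headers=HEADERS,
        timeout=60,
    ).json()
    if status["status"] != "running":
        break
    time.sleep(2)

print(status["result"] if status["status"] == "completed" else status["error"])
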
Support & Resources

Documentation: Comprehensive guides and tutorials
API Reference: Detailed API documentation
Community: Join our Discord community
GitHub: Check out our open-source projects
Ready to Start? Sign up now and get your API key to begin extracting data with SmartScraper!

------- • -------

https://docs.scrapegraphai.com/services/smartscraper#key-features

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesSmartScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesSmartScraperAI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently
extracts structured data from any website. Using advanced LLM models, it
understands context and content like a human would, making web data extraction more
reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal CompatibilityWorks with any website structure, including JavaScript-
rendered contentAI UnderstandingContextual understanding of content for accurate
extractionStructured OutputReturns clean, structured data in your preferred
formatSchema SupportDefine custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction


Blog post summarization
Product information gathering
Research data collection

Data Analysis

Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
website_url="https://scrapegraphai.com/",
user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-abc123",
"status": "completed",
"website_url": "https://scrapegraphai.com/",
"user_prompt": "Extract info about the company",
"result": {
"company_name": "ScrapeGraphAI",
"description": "ScrapeGraphAI is a powerful AI scraping API designed for
efficient web data extraction to power LLM applications and AI agents...",
"features": [
"Effortless, cost-effective, and AI-powered data extraction",
"Handles proxy rotation and rate limits",
"Supports a wide variety of websites"
],
"contact_email": "[email protected]",
"social_links": {
"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
"linkedin": "https://www.linkedin.com/company/101881123",
"twitter": "https://x.com/scrapegraphai"
}
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction

Be specific in your prompts


Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries

Rate Limiting

Implement reasonable delays between requests


Use async clients for better performance
Monitor your API usage

Example Projects
Check out our cookbook for real-world examples:

E-commerce product scraping


News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projects
Ready to Start?Sign up now and get your API key to begin extracting data with
SmartScraper!Was this page helpful?YesNoUser
SettingsLocalScraperxgithublinkedinPowered by MintlifyOn this pageOverviewKey
FeaturesUse CasesContent AggregationData AnalysisAI TrainingGetting StartedQuick
StartAdvanced UsageCustom Schema ExampleAsync SupportIntegration OptionsOfficial
SDKsAI Framework IntegrationsBest PracticesOptimizing ExtractionRate
LimitingExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/smartscraper#use-cases

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesSmartScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesSmartScraperAI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently
extracts structured data from any website. Using advanced LLM models, it
understands context and content like a human would, making web data extraction more
reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal CompatibilityWorks with any website structure, including JavaScript-
rendered contentAI UnderstandingContextual understanding of content for accurate
extractionStructured OutputReturns clean, structured data in your preferred
formatSchema SupportDefine custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction


Blog post summarization
Product information gathering
Research data collection

Data Analysis

Market research
Competitor analysis
Price monitoring
Trend tracking
AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
website_url="https://scrapegraphai.com/",
user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-abc123",
"status": "completed",
"website_url": "https://scrapegraphai.com/",
"user_prompt": "Extract info about the company",
"result": {
"company_name": "ScrapeGraphAI",
"description": "ScrapeGraphAI is a powerful AI scraping API designed for
efficient web data extraction to power LLM applications and AI agents...",
"features": [
"Effortless, cost-effective, and AI-powered data extraction",
"Handles proxy rotation and rate limits",
"Supports a wide variety of websites"
],
"contact_email": "[email protected]",
"social_links": {
"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
"linkedin": "https://www.linkedin.com/company/101881123",
"twitter": "https://x.com/scrapegraphai"
}
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio
async def main():
async with AsyncClient(api_key="your-api-key") as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction

Be specific in your prompts


Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries

Rate Limiting

Implement reasonable delays between requests


Use async clients for better performance
Monitor your API usage

Example Projects
Check out our cookbook for real-world examples:

E-commerce product scraping


News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projects
Ready to Start?Sign up now and get your API key to begin extracting data with
SmartScraper!Was this page helpful?YesNoUser
SettingsLocalScraperxgithublinkedinPowered by MintlifyOn this pageOverviewKey
FeaturesUse CasesContent AggregationData AnalysisAI TrainingGetting StartedQuick
StartAdvanced UsageCustom Schema ExampleAsync SupportIntegration OptionsOfficial
SDKsAI Framework IntegrationsBest PracticesOptimizing ExtractionRate
LimitingExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/smartscraper#content-aggregation

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesSmartScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesSmartScraperAI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently
extracts structured data from any website. Using advanced LLM models, it
understands context and content like a human would, making web data extraction more
reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal CompatibilityWorks with any website structure, including JavaScript-
rendered contentAI UnderstandingContextual understanding of content for accurate
extractionStructured OutputReturns clean, structured data in your preferred
formatSchema SupportDefine custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction


Blog post summarization
Product information gathering
Research data collection

Data Analysis

Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
website_url="https://scrapegraphai.com/",
user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-abc123",
"status": "completed",
"website_url": "https://scrapegraphai.com/",
"user_prompt": "Extract info about the company",
"result": {
"company_name": "ScrapeGraphAI",
"description": "ScrapeGraphAI is a powerful AI scraping API designed for
efficient web data extraction to power LLM applications and AI agents...",
"features": [
"Effortless, cost-effective, and AI-powered data extraction",
"Handles proxy rotation and rate limits",
"Supports a wide variety of websites"
],
"contact_email": "[email protected]",
"social_links": {
"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
"linkedin": "https://www.linkedin.com/company/101881123",
"twitter": "https://x.com/scrapegraphai"
}
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js
AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction

Be specific in your prompts


Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries

Rate Limiting

Implement reasonable delays between requests


Use async clients for better performance
Monitor your API usage

Example Projects
Check out our cookbook for real-world examples:

E-commerce product scraping


News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projects
Ready to Start?Sign up now and get your API key to begin extracting data with
SmartScraper!Was this page helpful?YesNoUser
SettingsLocalScraperxgithublinkedinPowered by MintlifyOn this pageOverviewKey
FeaturesUse CasesContent AggregationData AnalysisAI TrainingGetting StartedQuick
StartAdvanced UsageCustom Schema ExampleAsync SupportIntegration OptionsOfficial
SDKsAI Framework IntegrationsBest PracticesOptimizing ExtractionRate
LimitingExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/smartscraper#data-analysis

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesSmartScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesSmartScraperAI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently
extracts structured data from any website. Using advanced LLM models, it
understands context and content like a human would, making web data extraction more
reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal CompatibilityWorks with any website structure, including JavaScript-
rendered contentAI UnderstandingContextual understanding of content for accurate
extractionStructured OutputReturns clean, structured data in your preferred
formatSchema SupportDefine custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction


Blog post summarization
Product information gathering
Research data collection

Data Analysis

Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
website_url="https://scrapegraphai.com/",
user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-abc123",
"status": "completed",
"website_url": "https://scrapegraphai.com/",
"user_prompt": "Extract info about the company",
"result": {
"company_name": "ScrapeGraphAI",
"description": "ScrapeGraphAI is a powerful AI scraping API designed for
efficient web data extraction to power LLM applications and AI agents...",
"features": [
"Effortless, cost-effective, and AI-powered data extraction",
"Handles proxy rotation and rate limits",
"Supports a wide variety of websites"
],
"contact_email": "[email protected]",
"social_links": {
"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
"linkedin": "https://www.linkedin.com/company/101881123",
"twitter": "https://x.com/scrapegraphai"
}
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction

Be specific in your prompts


Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries

Rate Limiting

Implement reasonable delays between requests


Use async clients for better performance
Monitor your API usage
Example Projects
Check out our cookbook for real-world examples:

E-commerce product scraping


News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projects
Ready to Start?Sign up now and get your API key to begin extracting data with
SmartScraper!Was this page helpful?YesNoUser
SettingsLocalScraperxgithublinkedinPowered by MintlifyOn this pageOverviewKey
FeaturesUse CasesContent AggregationData AnalysisAI TrainingGetting StartedQuick
StartAdvanced UsageCustom Schema ExampleAsync SupportIntegration OptionsOfficial
SDKsAI Framework IntegrationsBest PracticesOptimizing ExtractionRate
LimitingExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/smartscraper#ai-training

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesSmartScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesSmartScraperAI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently
extracts structured data from any website. Using advanced LLM models, it
understands context and content like a human would, making web data extraction more
reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal CompatibilityWorks with any website structure, including JavaScript-
rendered contentAI UnderstandingContextual understanding of content for accurate
extractionStructured OutputReturns clean, structured data in your preferred
formatSchema SupportDefine custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction


Blog post summarization
Product information gathering
Research data collection

Data Analysis
Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
website_url="https://scrapegraphai.com/",
user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-abc123",
"status": "completed",
"website_url": "https://scrapegraphai.com/",
"user_prompt": "Extract info about the company",
"result": {
"company_name": "ScrapeGraphAI",
"description": "ScrapeGraphAI is a powerful AI scraping API designed for
efficient web data extraction to power LLM applications and AI agents...",
"features": [
"Effortless, cost-effective, and AI-powered data extraction",
"Handles proxy rotation and rate limits",
"Supports a wide variety of websites"
],
"contact_email": "[email protected]",
"social_links": {
"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
"linkedin": "https://www.linkedin.com/company/101881123",
"twitter": "https://x.com/scrapegraphai"
}
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:
Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction

Be specific in your prompts


Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries

Rate Limiting

Implement reasonable delays between requests


Use async clients for better performance
Monitor your API usage

Example Projects
Check out our cookbook for real-world examples:

E-commerce product scraping


News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projects
Ready to Start?Sign up now and get your API key to begin extracting data with
SmartScraper!Was this page helpful?YesNoUser
SettingsLocalScraperxgithublinkedinPowered by MintlifyOn this pageOverviewKey
FeaturesUse CasesContent AggregationData AnalysisAI TrainingGetting StartedQuick
StartAdvanced UsageCustom Schema ExampleAsync SupportIntegration OptionsOfficial
SDKsAI Framework IntegrationsBest PracticesOptimizing ExtractionRate
LimitingExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/smartscraper#getting-started

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesSmartScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesSmartScraperAI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently
extracts structured data from any website. Using advanced LLM models, it
understands context and content like a human would, making web data extraction more
reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal CompatibilityWorks with any website structure, including JavaScript-
rendered contentAI UnderstandingContextual understanding of content for accurate
extractionStructured OutputReturns clean, structured data in your preferred
formatSchema SupportDefine custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction


Blog post summarization
Product information gathering
Research data collection

Data Analysis

Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client
client = Client(api_key="your-api-key")

response = client.smartscraper(
website_url="https://scrapegraphai.com/",
user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-abc123",
"status": "completed",
"website_url": "https://scrapegraphai.com/",
"user_prompt": "Extract info about the company",
"result": {
"company_name": "ScrapeGraphAI",
"description": "ScrapeGraphAI is a powerful AI scraping API designed for
efficient web data extraction to power LLM applications and AI agents...",
"features": [
"Effortless, cost-effective, and AI-powered data extraction",
"Handles proxy rotation and rate limits",
"Supports a wide variety of websites"
],
"contact_email": "[email protected]",
"social_links": {
"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
"linkedin": "https://www.linkedin.com/company/101881123",
"twitter": "https://x.com/scrapegraphai"
}
},
"error": ""
}
The response includes:

request_id: Unique identifier for tracking your request (used in the lookup sketch below)
status: Current status of the extraction ("completed", "running", "failed")
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)
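
Because each extraction runs as a tracked job, you can also look a result up later by its request_id. A minimal sketch, assuming the Python SDK exposes a get_smartscraper status lookup (verify the method name against your SDK version):

from scrapegraph_py import Client

client = Client(api_key="your-api-key")

# Assumed status-lookup call; the method name may differ in your SDK version
job = client.get_smartscraper(request_id="sg-req-abc123")
if job["status"] == "completed":
    print(job["result"])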

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:
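A minimal sketch using a Pydantic model, assuming the SDK's smartscraper method accepts an output_schema argument in line with the Schema Support feature above (the model and field names are illustrative):

from pydantic import BaseModel, Field
from scrapegraph_py import Client

# Illustrative Pydantic model describing the desired output shape
class CompanyInfo(BaseModel):
    company_name: str = Field(description="Name of the company")
    description: str = Field(description="Short company description")
    contact_email: str = Field(description="Public contact email")

client = Client(api_key="your-api-key")

response = client.smartscraper(
    website_url="https://scrapegraphai.com/",
    user_prompt="Extract info about the company",
    output_schema=CompanyInfo,  # assumed parameter name
)

The result should then follow the CompanyInfo shape rather than a free-form structure.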

Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():
    async with AsyncClient(api_key="your-api-key") as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main content"
        )
        print(response)

# Run the async function
asyncio.run(main())
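
The async client also makes it easy to fan out several extractions concurrently. A minimal sketch building on the example above (the URLs are illustrative):

from scrapegraph_py import AsyncClient
import asyncio

urls = ["https://example.com/page1", "https://example.com/page2"]

async def scrape_all():
    async with AsyncClient(api_key="your-api-key") as client:
        # Start every extraction at once and wait for all results
        tasks = [
            client.smartscraper(website_url=url, user_prompt="Extract the main content")
            for url in urls
        ]
        return await asyncio.gather(*tasks)

results = asyncio.run(scrape_all())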

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications
JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows (see the sketch below)
LlamaIndex Integration - Build powerful search and QA systems
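
For instance, a minimal LangChain sketch, assuming the langchain-scrapegraph package exposes a SmartScraperTool and reads the API key from an SGAI_API_KEY environment variable (import path, class name, and input keys may differ across versions):

import os
from langchain_scrapegraph.tools import SmartScraperTool  # assumed import path

os.environ["SGAI_API_KEY"] = "your-api-key"  # assumed environment variable

tool = SmartScraperTool()
result = tool.invoke({
    "website_url": "https://scrapegraphai.com/",
    "user_prompt": "Extract info about the company",
})
print(result)

Inside an agent, the tool is registered like any other LangChain tool and invoked with the same inputs.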

Best Practices
Optimizing Extraction

Be specific in your prompts
Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries (see the sketch after this list)

Rate Limiting

Implement reasonable delays between requests
Use async clients for better performance
Monitor your API usage
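
A minimal sketch combining both practices, spacing requests out and retrying with exponential backoff. The helper and its defaults are hypothetical, and the broad except should be narrowed to the SDK's own error types where available:

import time
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

# Hypothetical helper: retry a request with exponentially growing delays
def scrape_with_retries(url, prompt, max_attempts=3, base_delay=1.0):
    for attempt in range(max_attempts):
        try:
            return client.smartscraper(website_url=url, user_prompt=prompt)
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...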

Example Projects
Check out our cookbook for real-world examples:

E-commerce product scraping
News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:

Start Scraping Job
Get Job Status

Support & Resources

Documentation: Comprehensive guides and tutorials
API Reference: Detailed API documentation
Community: Join our Discord community
GitHub: Check out our open-source projects

Ready to start? Sign up now and get your API key to begin extracting data with SmartScraper!

------- • -------

https://docs.scrapegraphai.com/services/smartscraper#quick-start

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesSmartScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesSmartScraperAI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently
extracts structured data from any website. Using advanced LLM models, it
understands context and content like a human would, making web data extraction more
reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal CompatibilityWorks with any website structure, including JavaScript-
rendered contentAI UnderstandingContextual understanding of content for accurate
extractionStructured OutputReturns clean, structured data in your preferred
formatSchema SupportDefine custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction


Blog post summarization
Product information gathering
Research data collection

Data Analysis

Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
website_url="https://scrapegraphai.com/",
user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-abc123",
"status": "completed",
"website_url": "https://scrapegraphai.com/",
"user_prompt": "Extract info about the company",
"result": {
"company_name": "ScrapeGraphAI",
"description": "ScrapeGraphAI is a powerful AI scraping API designed for
efficient web data extraction to power LLM applications and AI agents...",
"features": [
"Effortless, cost-effective, and AI-powered data extraction",
"Handles proxy rotation and rate limits",
"Supports a wide variety of websites"
],
"contact_email": "[email protected]",
"social_links": {
"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
"linkedin": "https://www.linkedin.com/company/101881123",
"twitter": "https://x.com/scrapegraphai"
}
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction

Be specific in your prompts


Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries
Rate Limiting

Implement reasonable delays between requests


Use async clients for better performance
Monitor your API usage

Example Projects
Check out our cookbook for real-world examples:

E-commerce product scraping


News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projects
Ready to Start?Sign up now and get your API key to begin extracting data with
SmartScraper!Was this page helpful?YesNoUser
SettingsLocalScraperxgithublinkedinPowered by MintlifyOn this pageOverviewKey
FeaturesUse CasesContent AggregationData AnalysisAI TrainingGetting StartedQuick
StartAdvanced UsageCustom Schema ExampleAsync SupportIntegration OptionsOfficial
SDKsAI Framework IntegrationsBest PracticesOptimizing ExtractionRate
LimitingExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/smartscraper#advanced-usage

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesSmartScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesSmartScraperAI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently
extracts structured data from any website. Using advanced LLM models, it
understands context and content like a human would, making web data extraction more
reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal CompatibilityWorks with any website structure, including JavaScript-
rendered contentAI UnderstandingContextual understanding of content for accurate
extractionStructured OutputReturns clean, structured data in your preferred
formatSchema SupportDefine custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction


Blog post summarization
Product information gathering
Research data collection

Data Analysis

Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
website_url="https://scrapegraphai.com/",
user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-abc123",
"status": "completed",
"website_url": "https://scrapegraphai.com/",
"user_prompt": "Extract info about the company",
"result": {
"company_name": "ScrapeGraphAI",
"description": "ScrapeGraphAI is a powerful AI scraping API designed for
efficient web data extraction to power LLM applications and AI agents...",
"features": [
"Effortless, cost-effective, and AI-powered data extraction",
"Handles proxy rotation and rate limits",
"Supports a wide variety of websites"
],
"contact_email": "[email protected]",
"social_links": {
"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
"linkedin": "https://www.linkedin.com/company/101881123",
"twitter": "https://x.com/scrapegraphai"
}
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)
Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction

Be specific in your prompts


Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries

Rate Limiting

Implement reasonable delays between requests


Use async clients for better performance
Monitor your API usage

Example Projects
Check out our cookbook for real-world examples:

E-commerce product scraping


News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projects
Ready to Start?Sign up now and get your API key to begin extracting data with
SmartScraper!Was this page helpful?YesNoUser
SettingsLocalScraperxgithublinkedinPowered by MintlifyOn this pageOverviewKey
FeaturesUse CasesContent AggregationData AnalysisAI TrainingGetting StartedQuick
StartAdvanced UsageCustom Schema ExampleAsync SupportIntegration OptionsOfficial
SDKsAI Framework IntegrationsBest PracticesOptimizing ExtractionRate
LimitingExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/smartscraper#custom-schema-example

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesSmartScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesSmartScraperAI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently
extracts structured data from any website. Using advanced LLM models, it
understands context and content like a human would, making web data extraction more
reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal CompatibilityWorks with any website structure, including JavaScript-
rendered contentAI UnderstandingContextual understanding of content for accurate
extractionStructured OutputReturns clean, structured data in your preferred
formatSchema SupportDefine custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction


Blog post summarization
Product information gathering
Research data collection

Data Analysis

Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
website_url="https://scrapegraphai.com/",
user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-abc123",
"status": "completed",
"website_url": "https://scrapegraphai.com/",
"user_prompt": "Extract info about the company",
"result": {
"company_name": "ScrapeGraphAI",
"description": "ScrapeGraphAI is a powerful AI scraping API designed for
efficient web data extraction to power LLM applications and AI agents...",
"features": [
"Effortless, cost-effective, and AI-powered data extraction",
"Handles proxy rotation and rate limits",
"Supports a wide variety of websites"
],
"contact_email": "[email protected]",
"social_links": {
"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
"linkedin": "https://www.linkedin.com/company/101881123",
"twitter": "https://x.com/scrapegraphai"
}
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)
# Run the async function
asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction

Be specific in your prompts


Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries

Rate Limiting

Implement reasonable delays between requests


Use async clients for better performance
Monitor your API usage

Example Projects
Check out our cookbook for real-world examples:

E-commerce product scraping


News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projects
Ready to Start?Sign up now and get your API key to begin extracting data with
SmartScraper!Was this page helpful?YesNoUser
SettingsLocalScraperxgithublinkedinPowered by MintlifyOn this pageOverviewKey
FeaturesUse CasesContent AggregationData AnalysisAI TrainingGetting StartedQuick
StartAdvanced UsageCustom Schema ExampleAsync SupportIntegration OptionsOfficial
SDKsAI Framework IntegrationsBest PracticesOptimizing ExtractionRate
LimitingExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/smartscraper#async-support
ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesSmartScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesSmartScraperAI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently
extracts structured data from any website. Using advanced LLM models, it
understands context and content like a human would, making web data extraction more
reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal CompatibilityWorks with any website structure, including JavaScript-
rendered contentAI UnderstandingContextual understanding of content for accurate
extractionStructured OutputReturns clean, structured data in your preferred
formatSchema SupportDefine custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction


Blog post summarization
Product information gathering
Research data collection

Data Analysis

Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
website_url="https://scrapegraphai.com/",
user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-abc123",
"status": "completed",
"website_url": "https://scrapegraphai.com/",
"user_prompt": "Extract info about the company",
"result": {
"company_name": "ScrapeGraphAI",
"description": "ScrapeGraphAI is a powerful AI scraping API designed for
efficient web data extraction to power LLM applications and AI agents...",
"features": [
"Effortless, cost-effective, and AI-powered data extraction",
"Handles proxy rotation and rate limits",
"Supports a wide variety of websites"
],
"contact_email": "[email protected]",
"social_links": {
"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
"linkedin": "https://www.linkedin.com/company/101881123",
"twitter": "https://x.com/scrapegraphai"
}
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction
Be specific in your prompts
Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries

Rate Limiting

Implement reasonable delays between requests


Use async clients for better performance
Monitor your API usage

Example Projects
Check out our cookbook for real-world examples:

E-commerce product scraping


News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projects
Ready to Start?Sign up now and get your API key to begin extracting data with
SmartScraper!Was this page helpful?YesNoUser
SettingsLocalScraperxgithublinkedinPowered by MintlifyOn this pageOverviewKey
FeaturesUse CasesContent AggregationData AnalysisAI TrainingGetting StartedQuick
StartAdvanced UsageCustom Schema ExampleAsync SupportIntegration OptionsOfficial
SDKsAI Framework IntegrationsBest PracticesOptimizing ExtractionRate
LimitingExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/smartscraper#integration-options

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesSmartScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesSmartScraperAI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently
extracts structured data from any website. Using advanced LLM models, it
understands context and content like a human would, making web data extraction more
reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal CompatibilityWorks with any website structure, including JavaScript-
rendered contentAI UnderstandingContextual understanding of content for accurate
extractionStructured OutputReturns clean, structured data in your preferred
formatSchema SupportDefine custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction


Blog post summarization
Product information gathering
Research data collection

Data Analysis

Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
website_url="https://scrapegraphai.com/",
user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-abc123",
"status": "completed",
"website_url": "https://scrapegraphai.com/",
"user_prompt": "Extract info about the company",
"result": {
"company_name": "ScrapeGraphAI",
"description": "ScrapeGraphAI is a powerful AI scraping API designed for
efficient web data extraction to power LLM applications and AI agents...",
"features": [
"Effortless, cost-effective, and AI-powered data extraction",
"Handles proxy rotation and rate limits",
"Supports a wide variety of websites"
],
"contact_email": "[email protected]",
"social_links": {
"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
"linkedin": "https://www.linkedin.com/company/101881123",
"twitter": "https://x.com/scrapegraphai"
}
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction

Be specific in your prompts


Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries

Rate Limiting

Implement reasonable delays between requests


Use async clients for better performance
Monitor your API usage

Example Projects
Check out our cookbook for real-world examples:

E-commerce product scraping


News aggregation
Research data collection
Content monitoring
API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projects
Ready to Start?Sign up now and get your API key to begin extracting data with
SmartScraper!Was this page helpful?YesNoUser
SettingsLocalScraperxgithublinkedinPowered by MintlifyOn this pageOverviewKey
FeaturesUse CasesContent AggregationData AnalysisAI TrainingGetting StartedQuick
StartAdvanced UsageCustom Schema ExampleAsync SupportIntegration OptionsOfficial
SDKsAI Framework IntegrationsBest PracticesOptimizing ExtractionRate
LimitingExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/smartscraper#official-sdks

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesSmartScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesSmartScraperAI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently
extracts structured data from any website. Using advanced LLM models, it
understands context and content like a human would, making web data extraction more
reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal CompatibilityWorks with any website structure, including JavaScript-
rendered contentAI UnderstandingContextual understanding of content for accurate
extractionStructured OutputReturns clean, structured data in your preferred
formatSchema SupportDefine custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction


Blog post summarization
Product information gathering
Research data collection

Data Analysis

Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
website_url="https://scrapegraphai.com/",
user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-abc123",
"status": "completed",
"website_url": "https://scrapegraphai.com/",
"user_prompt": "Extract info about the company",
"result": {
"company_name": "ScrapeGraphAI",
"description": "ScrapeGraphAI is a powerful AI scraping API designed for
efficient web data extraction to power LLM applications and AI agents...",
"features": [
"Effortless, cost-effective, and AI-powered data extraction",
"Handles proxy rotation and rate limits",
"Supports a wide variety of websites"
],
"contact_email": "[email protected]",
"social_links": {
"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
"linkedin": "https://www.linkedin.com/company/101881123",
"twitter": "https://x.com/scrapegraphai"
}
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction

Be specific in your prompts


Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries

Rate Limiting

Implement reasonable delays between requests


Use async clients for better performance
Monitor your API usage

Example Projects
Check out our cookbook for real-world examples:

E-commerce product scraping


News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projects
Ready to Start?Sign up now and get your API key to begin extracting data with
SmartScraper!Was this page helpful?YesNoUser
SettingsLocalScraperxgithublinkedinPowered by MintlifyOn this pageOverviewKey
FeaturesUse CasesContent AggregationData AnalysisAI TrainingGetting StartedQuick
StartAdvanced UsageCustom Schema ExampleAsync SupportIntegration OptionsOfficial
SDKsAI Framework IntegrationsBest PracticesOptimizing ExtractionRate
LimitingExample ProjectsAPI ReferenceSupport & Resources
------- • -------

https://docs.scrapegraphai.com/services/smartscraper#ai-framework-integrations

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesSmartScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesSmartScraperAI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently
extracts structured data from any website. Using advanced LLM models, it
understands context and content like a human would, making web data extraction more
reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal CompatibilityWorks with any website structure, including JavaScript-
rendered contentAI UnderstandingContextual understanding of content for accurate
extractionStructured OutputReturns clean, structured data in your preferred
formatSchema SupportDefine custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction


Blog post summarization
Product information gathering
Research data collection

Data Analysis

Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
website_url="https://scrapegraphai.com/",
user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-abc123",
"status": "completed",
"website_url": "https://scrapegraphai.com/",
"user_prompt": "Extract info about the company",
"result": {
"company_name": "ScrapeGraphAI",
"description": "ScrapeGraphAI is a powerful AI scraping API designed for
efficient web data extraction to power LLM applications and AI agents...",
"features": [
"Effortless, cost-effective, and AI-powered data extraction",
"Handles proxy rotation and rate limits",
"Supports a wide variety of websites"
],
"contact_email": "[email protected]",
"social_links": {
"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
"linkedin": "https://www.linkedin.com/company/101881123",
"twitter": "https://x.com/scrapegraphai"
}
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction

Be specific in your prompts


Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries

Rate Limiting

Implement reasonable delays between requests


Use async clients for better performance
Monitor your API usage

Example Projects
Check out our cookbook for real-world examples:

E-commerce product scraping


News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projects
Ready to Start?Sign up now and get your API key to begin extracting data with
SmartScraper!Was this page helpful?YesNoUser
SettingsLocalScraperxgithublinkedinPowered by MintlifyOn this pageOverviewKey
FeaturesUse CasesContent AggregationData AnalysisAI TrainingGetting StartedQuick
StartAdvanced UsageCustom Schema ExampleAsync SupportIntegration OptionsOfficial
SDKsAI Framework IntegrationsBest PracticesOptimizing ExtractionRate
LimitingExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/smartscraper#best-practices

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesSmartScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesSmartScraperAI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently
extracts structured data from any website. Using advanced LLM models, it
understands context and content like a human would, making web data extraction more
reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal CompatibilityWorks with any website structure, including JavaScript-
rendered contentAI UnderstandingContextual understanding of content for accurate
extractionStructured OutputReturns clean, structured data in your preferred
formatSchema SupportDefine custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction


Blog post summarization
Product information gathering
Research data collection

Data Analysis

Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
website_url="https://scrapegraphai.com/",
user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-abc123",
"status": "completed",
"website_url": "https://scrapegraphai.com/",
"user_prompt": "Extract info about the company",
"result": {
"company_name": "ScrapeGraphAI",
"description": "ScrapeGraphAI is a powerful AI scraping API designed for
efficient web data extraction to power LLM applications and AI agents...",
"features": [
"Effortless, cost-effective, and AI-powered data extraction",
"Handles proxy rotation and rate limits",
"Supports a wide variety of websites"
],
"contact_email": "[email protected]",
"social_links": {
"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
"linkedin": "https://www.linkedin.com/company/101881123",
"twitter": "https://x.com/scrapegraphai"
}
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction

Be specific in your prompts


Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries

Rate Limiting

Implement reasonable delays between requests


Use async clients for better performance
Monitor your API usage

Example Projects
Check out our cookbook for real-world examples:
E-commerce product scraping
News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projects
Ready to Start?Sign up now and get your API key to begin extracting data with
SmartScraper!Was this page helpful?YesNoUser
SettingsLocalScraperxgithublinkedinPowered by MintlifyOn this pageOverviewKey
FeaturesUse CasesContent AggregationData AnalysisAI TrainingGetting StartedQuick
StartAdvanced UsageCustom Schema ExampleAsync SupportIntegration OptionsOfficial
SDKsAI Framework IntegrationsBest PracticesOptimizing ExtractionRate
LimitingExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/smartscraper#optimizing-extraction

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesSmartScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesSmartScraperAI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently
extracts structured data from any website. Using advanced LLM models, it
understands context and content like a human would, making web data extraction more
reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal CompatibilityWorks with any website structure, including JavaScript-
rendered contentAI UnderstandingContextual understanding of content for accurate
extractionStructured OutputReturns clean, structured data in your preferred
formatSchema SupportDefine custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction


Blog post summarization
Product information gathering
Research data collection

Data Analysis

Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
website_url="https://scrapegraphai.com/",
user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-abc123",
"status": "completed",
"website_url": "https://scrapegraphai.com/",
"user_prompt": "Extract info about the company",
"result": {
"company_name": "ScrapeGraphAI",
"description": "ScrapeGraphAI is a powerful AI scraping API designed for
efficient web data extraction to power LLM applications and AI agents...",
"features": [
"Effortless, cost-effective, and AI-powered data extraction",
"Handles proxy rotation and rate limits",
"Supports a wide variety of websites"
],
"contact_email": "[email protected]",
"social_links": {
"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
"linkedin": "https://www.linkedin.com/company/101881123",
"twitter": "https://x.com/scrapegraphai"
}
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction

Be specific in your prompts


Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries

Rate Limiting

Implement reasonable delays between requests


Use async clients for better performance
Monitor your API usage

Example Projects
Check out our cookbook for real-world examples:

E-commerce product scraping


News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projects
Ready to Start?Sign up now and get your API key to begin extracting data with
SmartScraper!Was this page helpful?YesNoUser
SettingsLocalScraperxgithublinkedinPowered by MintlifyOn this pageOverviewKey
FeaturesUse CasesContent AggregationData AnalysisAI TrainingGetting StartedQuick
StartAdvanced UsageCustom Schema ExampleAsync SupportIntegration OptionsOfficial
SDKsAI Framework IntegrationsBest PracticesOptimizing ExtractionRate
LimitingExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/smartscraper#rate-limiting

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesSmartScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesSmartScraperAI-powered web scraping for any website
Overview
SmartScraper is our flagship LLM-powered web scraping service that intelligently
extracts structured data from any website. Using advanced LLM models, it
understands context and content like a human would, making web data extraction more
reliable and efficient than ever.
Try SmartScraper instantly in our interactive playground - no coding required!
Key Features
Universal CompatibilityWorks with any website structure, including JavaScript-
rendered contentAI UnderstandingContextual understanding of content for accurate
extractionStructured OutputReturns clean, structured data in your preferred
formatSchema SupportDefine custom output schemas using Pydantic or Zod
Use Cases
Content Aggregation

News article extraction


Blog post summarization
Product information gathering
Research data collection

Data Analysis

Market research
Competitor analysis
Price monitoring
Trend tracking

AI Training

Dataset creation
Training data collection
Content classification
Knowledge base building

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.smartscraper(
website_url="https://scrapegraphai.com/",
user_prompt="Extract info about the company"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-abc123",
"status": "completed",
"website_url": "https://scrapegraphai.com/",
"user_prompt": "Extract info about the company",
"result": {
"company_name": "ScrapeGraphAI",
"description": "ScrapeGraphAI is a powerful AI scraping API designed for
efficient web data extraction to power LLM applications and AI agents...",
"features": [
"Effortless, cost-effective, and AI-powered data extraction",
"Handles proxy rotation and rate limits",
"Supports a wide variety of websites"
],
"contact_email": "[email protected]",
"social_links": {
"github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
"linkedin": "https://www.linkedin.com/company/101881123",
"twitter": "https://x.com/scrapegraphai"
}
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction (“completed”, “running”, “failed”)
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, SmartScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use SmartScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
Optimizing Extraction

Be specific in your prompts


Use schemas for structured data
Handle pagination for multi-page content
Implement error handling and retries

Rate Limiting

Implement reasonable delays between requests


Use async clients for better performance
Monitor your API usage

Example Projects
Check out our cookbook for real-world examples:

E-commerce product scraping


News aggregation
Research data collection
Content monitoring

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projects
Ready to Start?Sign up now and get your API key to begin extracting data with
SmartScraper!Was this page helpful?YesNoUser
SettingsLocalScraperxgithublinkedinPowered by MintlifyOn this pageOverviewKey
FeaturesUse CasesContent AggregationData AnalysisAI TrainingGetting StartedQuick
StartAdvanced UsageCustom Schema ExampleAsync SupportIntegration OptionsOfficial
SDKsAI Framework IntegrationsBest PracticesOptimizing ExtractionRate
LimitingExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/localscraper#overview

LocalScraper
AI-powered extraction from local HTML content
Overview
LocalScraper brings the same powerful AI extraction capabilities as SmartScraper
but works with your local HTML content. This makes it perfect for scenarios where
you already have the HTML content or need to process cached pages, internal
documents, or dynamically generated content.
Try LocalScraper instantly in our interactive playground - no coding required!
Key Features
Local Processing: Process HTML content directly without making external requests
AI Understanding: Same powerful AI extraction as SmartScraper
Faster Processing: No network latency or website loading delays
Full Control: Complete control over your HTML input and processing
Use Cases
Internal Systems

Process internally cached pages
Extract from intranet content
Handle dynamic JavaScript renders
Process email templates

Batch Processing (a concurrency sketch follows these lists)

Archive data extraction
Historical content analysis
Bulk document processing
Offline content processing

Development & Testing

Test extraction logic locally
Debug content processing
Prototype without API calls
Validate schemas offline
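
To make the batch-processing use case concrete, here is a minimal sketch that fans a list of HTML documents out over the documented AsyncClient. The helper name, the placeholder inputs, and the asyncio.gather pattern are illustrative assumptions, not part of the official docs.

import asyncio
from scrapegraph_py import AsyncClient

async def process_batch(html_documents: list[str]) -> list:
    # One concurrent localscraper call per document.
    async with AsyncClient(api_key="your-api-key") as client:
        tasks = [
            client.localscraper(
                website_html=html,
                user_prompt="Extract the main content",
            )
            for html in html_documents
        ]
        # Run all extractions concurrently and collect the responses.
        return await asyncio.gather(*tasks)

# Hypothetical usage with two placeholder documents.
results = asyncio.run(process_batch(["<html>...</html>", "<html>...</html>"]))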

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

# Initialize the client with your API key
client = Client(api_key="your-api-key")

html_content = """
<html>
  <body>
    <h1>ScrapeGraphAI</h1>
    <div class="description">
      <p>AI-powered web scraping for modern applications.</p>
    </div>
    <div class="features">
      <ul>
        <li>Smart Extraction</li>
        <li>Local Processing</li>
        <li>Schema Support</li>
      </ul>
    </div>
  </body>
</html>
"""

# Extract structured data from the local HTML string
response = client.localscraper(
    website_html=html_content,
    user_prompt="Extract the company information and features"
)

Get your API key from the dashboard


Example Response:

{
  "request_id": "sg-req-xyz789",
  "status": "completed",
  "user_prompt": "Extract the company information and features",
  "result": {
    "company_name": "ScrapeGraphAI",
    "description": "AI-powered web scraping for modern applications.",
    "features": [
      "Smart Extraction",
      "Local Processing",
      "Schema Support"
    ]
  },
  "error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)
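
As a quick illustration of consuming these fields, the sketch below branches on them; it assumes the SDK hands back a plain dict shaped like the example response above.

# Assumes `response` is a dict shaped like the example response.
if response["status"] == "completed":
    data = response["result"]
    print(data["company_name"], data["features"])
else:
    raise RuntimeError(f"Extraction {response['status']}: {response['error']}")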

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:
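
The code sample for this section did not survive extraction, so here is a minimal sketch of how a Pydantic schema could be wired in. The output_schema parameter name is an assumption inferred from the SDK's advertised schema support; confirm it against the Python SDK reference.

from pydantic import BaseModel, Field
from scrapegraph_py import Client

# Hypothetical schema describing exactly the fields we want back.
class CompanyInfo(BaseModel):
    company_name: str = Field(description="Name of the company")
    description: str = Field(description="Short company description")
    features: list[str] = Field(description="List of product features")

client = Client(api_key="your-api-key")

response = client.localscraper(
    website_html=html_content,  # HTML string from the Quick Start above
    user_prompt="Extract the company information and features",
    output_schema=CompanyInfo,  # assumed parameter name; see the SDK docs
)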

Async Support
For applications requiring asynchronous execution, LocalScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():
    html_content = """
    <html>
      <body>
        <h1>Product: Gaming Laptop</h1>
        <div class="price">$999.99</div>
        <div class="description">
          High-performance gaming laptop with RTX 3080.
        </div>
      </body>
    </html>
    """

    async with AsyncClient(api_key="your-api-key") as client:
        response = await client.localscraper(
            website_html=html_content,
            user_prompt="Extract the product information"
        )
        print(response)

# Run the async function
asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications
JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use LocalScraper in your LLM workflows
LlamaIndex Integration - Build powerful search and QA systems
Best Practices
HTML Preparation

Ensure HTML is well-formed
Include relevant content only
Clean up unnecessary markup
Handle character encoding properly

Optimization Tips (a cleanup sketch follows this list)

Remove unnecessary scripts and styles
Clean up dynamic content placeholders
Preserve important semantic structure
Include relevant metadata
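
One way to apply these tips programmatically is to strip non-content tags before sending the HTML. The sketch below uses BeautifulSoup, an external library that this page does not mention, so treat it as one option rather than the recommended tool.

from bs4 import BeautifulSoup

def clean_html(raw_html: str) -> str:
    soup = BeautifulSoup(raw_html, "html.parser")
    # Drop tags that add tokens but no extractable content.
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()
    return str(soup)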

Example Projects
Check out our cookbook for real-world examples:

Dynamic content extraction
Email template processing
Cached content analysis
Batch HTML processing

API Reference
For detailed API documentation, see:

Start Scraping Job
Get Job Status
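
For orientation only, here is a hedged sketch of what calling these endpoints over raw HTTP might look like. The base URL, endpoint paths, and header name below are assumptions and should be verified against the API reference before use.

import requests

API_KEY = "your-api-key"
BASE = "https://api.scrapegraphai.com/v1"  # assumed base URL

# Start a scraping job (endpoint path assumed).
job = requests.post(
    f"{BASE}/localscraper",
    headers={"SGAI-APIKEY": API_KEY},  # assumed header name
    json={"website_html": "<html>...</html>", "user_prompt": "Extract the title"},
).json()

# Poll the job status by request_id (endpoint path assumed).
status = requests.get(
    f"{BASE}/localscraper/{job['request_id']}",
    headers={"SGAI-APIKEY": API_KEY},
).json()
print(status["status"])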

Support & Resources

Documentation: Comprehensive guides and tutorials
API Reference: Detailed API documentation
Community: Join our Discord community
GitHub: Check out our open-source projects
Main Website: Visit our official website
Ready to Start? Sign up now and get your API key to begin processing your HTML content with LocalScraper!

------- • -------
https://docs.scrapegraphai.com/services/localscraper#key-features

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesLocalScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesLocalScraperAI-powered extraction from local HTML content
Overview
LocalScraper brings the same powerful AI extraction capabilities as SmartScraper
but works with your local HTML content. This makes it perfect for scenarios where
you already have the HTML content or need to process cached pages, internal
documents, or dynamically generated content.
Try LocalScraper instantly in our interactive playground - no coding required!
Key Features
Local ProcessingProcess HTML content directly without making external requestsAI
UnderstandingSame powerful AI extraction as SmartScraperFaster ProcessingNo network
latency or website loading delaysFull ControlComplete control over your HTML input
and processing
Use Cases
Internal Systems

Process internally cached pages


Extract from intranet content
Handle dynamic JavaScript renders
Process email templates

Batch Processing

Archive data extraction


Historical content analysis
Bulk document processing
Offline content processing

Development & Testing

Test extraction logic locally


Debug content processing
Prototype without API calls
Validate schemas offline

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

html_content = """
<html>
<body>
<h1>ScrapeGraphAI</h1>
<div class="description">
<p>AI-powered web scraping for modern applications.</p>
</div>
<div class="features">
<ul>
<li>Smart Extraction</li>
<li>Local Processing</li>
<li>Schema Support</li>
</ul>
</div>
</body>
</html>
"""

response = client.localscraper(
website_html=html_content,
user_prompt="Extract the company information and features"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-xyz789",
"status": "completed",
"user_prompt": "Extract the company information and features",
"result": {
"company_name": "ScrapeGraphAI",
"description": "AI-powered web scraping for modern applications.",
"features": [
"Smart Extraction",
"Local Processing",
"Schema Support"
]
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, LocalScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


html_content = """
<html>
<body>
<h1>Product: Gaming Laptop</h1>
<div class="price">$999.99</div>
<div class="description">
High-performance gaming laptop with RTX 3080.
</div>
</body>
</html>
"""
async with AsyncClient(api_key="your-api-key") as client:
response = await client.localscraper(
website_html=html_content,
user_prompt="Extract the product information"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use LocalScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
HTML Preparation

Ensure HTML is well-formed


Include relevant content only
Clean up unnecessary markup
Handle character encoding properly

Optimization Tips

Remove unnecessary scripts and styles


Clean up dynamic content placeholders
Preserve important semantic structure
Include relevant metadata

Example Projects
Check out our cookbook for real-world examples:

Dynamic content extraction


Email template processing
Cached content analysis
Batch HTML processing

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin processing your HTML
content with LocalScraper!Was this page helpful?
YesNoSmartScraperMarkdownifyxgithublinkedinPowered by MintlifyOn this
pageOverviewKey FeaturesUse CasesInternal SystemsBatch ProcessingDevelopment &
TestingGetting StartedQuick StartAdvanced UsageCustom Schema ExampleAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest PracticesHTML
PreparationOptimization TipsExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/localscraper#use-cases

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesLocalScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesLocalScraperAI-powered extraction from local HTML content
Overview
LocalScraper brings the same powerful AI extraction capabilities as SmartScraper
but works with your local HTML content. This makes it perfect for scenarios where
you already have the HTML content or need to process cached pages, internal
documents, or dynamically generated content.
Try LocalScraper instantly in our interactive playground - no coding required!
Key Features
Local ProcessingProcess HTML content directly without making external requestsAI
UnderstandingSame powerful AI extraction as SmartScraperFaster ProcessingNo network
latency or website loading delaysFull ControlComplete control over your HTML input
and processing
Use Cases
Internal Systems

Process internally cached pages


Extract from intranet content
Handle dynamic JavaScript renders
Process email templates

Batch Processing

Archive data extraction


Historical content analysis
Bulk document processing
Offline content processing

Development & Testing

Test extraction logic locally


Debug content processing
Prototype without API calls
Validate schemas offline

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

html_content = """
<html>
<body>
<h1>ScrapeGraphAI</h1>
<div class="description">
<p>AI-powered web scraping for modern applications.</p>
</div>
<div class="features">
<ul>
<li>Smart Extraction</li>
<li>Local Processing</li>
<li>Schema Support</li>
</ul>
</div>
</body>
</html>
"""

response = client.localscraper(
website_html=html_content,
user_prompt="Extract the company information and features"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-xyz789",
"status": "completed",
"user_prompt": "Extract the company information and features",
"result": {
"company_name": "ScrapeGraphAI",
"description": "AI-powered web scraping for modern applications.",
"features": [
"Smart Extraction",
"Local Processing",
"Schema Support"
]
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, LocalScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


html_content = """
<html>
<body>
<h1>Product: Gaming Laptop</h1>
<div class="price">$999.99</div>
<div class="description">
High-performance gaming laptop with RTX 3080.
</div>
</body>
</html>
"""

async with AsyncClient(api_key="your-api-key") as client:


response = await client.localscraper(
website_html=html_content,
user_prompt="Extract the product information"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use LocalScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
HTML Preparation

Ensure HTML is well-formed


Include relevant content only
Clean up unnecessary markup
Handle character encoding properly

Optimization Tips

Remove unnecessary scripts and styles


Clean up dynamic content placeholders
Preserve important semantic structure
Include relevant metadata

Example Projects
Check out our cookbook for real-world examples:

Dynamic content extraction


Email template processing
Cached content analysis
Batch HTML processing

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin processing your HTML
content with LocalScraper!Was this page helpful?
YesNoSmartScraperMarkdownifyxgithublinkedinPowered by MintlifyOn this
pageOverviewKey FeaturesUse CasesInternal SystemsBatch ProcessingDevelopment &
TestingGetting StartedQuick StartAdvanced UsageCustom Schema ExampleAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest PracticesHTML
PreparationOptimization TipsExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/localscraper#internal-systems

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesLocalScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesLocalScraperAI-powered extraction from local HTML content
Overview
LocalScraper brings the same powerful AI extraction capabilities as SmartScraper
but works with your local HTML content. This makes it perfect for scenarios where
you already have the HTML content or need to process cached pages, internal
documents, or dynamically generated content.
Try LocalScraper instantly in our interactive playground - no coding required!
Key Features
Local ProcessingProcess HTML content directly without making external requestsAI
UnderstandingSame powerful AI extraction as SmartScraperFaster ProcessingNo network
latency or website loading delaysFull ControlComplete control over your HTML input
and processing
Use Cases
Internal Systems

Process internally cached pages


Extract from intranet content
Handle dynamic JavaScript renders
Process email templates

Batch Processing

Archive data extraction


Historical content analysis
Bulk document processing
Offline content processing

Development & Testing

Test extraction logic locally


Debug content processing
Prototype without API calls
Validate schemas offline

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")
html_content = """
<html>
<body>
<h1>ScrapeGraphAI</h1>
<div class="description">
<p>AI-powered web scraping for modern applications.</p>
</div>
<div class="features">
<ul>
<li>Smart Extraction</li>
<li>Local Processing</li>
<li>Schema Support</li>
</ul>
</div>
</body>
</html>
"""

response = client.localscraper(
website_html=html_content,
user_prompt="Extract the company information and features"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-xyz789",
"status": "completed",
"user_prompt": "Extract the company information and features",
"result": {
"company_name": "ScrapeGraphAI",
"description": "AI-powered web scraping for modern applications.",
"features": [
"Smart Extraction",
"Local Processing",
"Schema Support"
]
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, LocalScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


html_content = """
<html>
<body>
<h1>Product: Gaming Laptop</h1>
<div class="price">$999.99</div>
<div class="description">
High-performance gaming laptop with RTX 3080.
</div>
</body>
</html>
"""

async with AsyncClient(api_key="your-api-key") as client:


response = await client.localscraper(
website_html=html_content,
user_prompt="Extract the product information"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use LocalScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
HTML Preparation

Ensure HTML is well-formed


Include relevant content only
Clean up unnecessary markup
Handle character encoding properly

Optimization Tips

Remove unnecessary scripts and styles


Clean up dynamic content placeholders
Preserve important semantic structure
Include relevant metadata

Example Projects
Check out our cookbook for real-world examples:

Dynamic content extraction


Email template processing
Cached content analysis
Batch HTML processing

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status
Support & Resources
DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin processing your HTML
content with LocalScraper!Was this page helpful?
YesNoSmartScraperMarkdownifyxgithublinkedinPowered by MintlifyOn this
pageOverviewKey FeaturesUse CasesInternal SystemsBatch ProcessingDevelopment &
TestingGetting StartedQuick StartAdvanced UsageCustom Schema ExampleAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest PracticesHTML
PreparationOptimization TipsExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/localscraper#batch-processing

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesLocalScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesLocalScraperAI-powered extraction from local HTML content
Overview
LocalScraper brings the same powerful AI extraction capabilities as SmartScraper
but works with your local HTML content. This makes it perfect for scenarios where
you already have the HTML content or need to process cached pages, internal
documents, or dynamically generated content.
Try LocalScraper instantly in our interactive playground - no coding required!
Key Features
Local ProcessingProcess HTML content directly without making external requestsAI
UnderstandingSame powerful AI extraction as SmartScraperFaster ProcessingNo network
latency or website loading delaysFull ControlComplete control over your HTML input
and processing
Use Cases
Internal Systems

Process internally cached pages


Extract from intranet content
Handle dynamic JavaScript renders
Process email templates

Batch Processing

Archive data extraction


Historical content analysis
Bulk document processing
Offline content processing

Development & Testing

Test extraction logic locally


Debug content processing
Prototype without API calls
Validate schemas offline

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

html_content = """
<html>
<body>
<h1>ScrapeGraphAI</h1>
<div class="description">
<p>AI-powered web scraping for modern applications.</p>
</div>
<div class="features">
<ul>
<li>Smart Extraction</li>
<li>Local Processing</li>
<li>Schema Support</li>
</ul>
</div>
</body>
</html>
"""

response = client.localscraper(
website_html=html_content,
user_prompt="Extract the company information and features"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-xyz789",
"status": "completed",
"user_prompt": "Extract the company information and features",
"result": {
"company_name": "ScrapeGraphAI",
"description": "AI-powered web scraping for modern applications.",
"features": [
"Smart Extraction",
"Local Processing",
"Schema Support"
]
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, LocalScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


html_content = """
<html>
<body>
<h1>Product: Gaming Laptop</h1>
<div class="price">$999.99</div>
<div class="description">
High-performance gaming laptop with RTX 3080.
</div>
</body>
</html>
"""

async with AsyncClient(api_key="your-api-key") as client:


response = await client.localscraper(
website_html=html_content,
user_prompt="Extract the product information"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use LocalScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
HTML Preparation

Ensure HTML is well-formed


Include relevant content only
Clean up unnecessary markup
Handle character encoding properly

Optimization Tips

Remove unnecessary scripts and styles


Clean up dynamic content placeholders
Preserve important semantic structure
Include relevant metadata

Example Projects
Check out our cookbook for real-world examples:

Dynamic content extraction


Email template processing
Cached content analysis
Batch HTML processing
API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin processing your HTML
content with LocalScraper!Was this page helpful?
YesNoSmartScraperMarkdownifyxgithublinkedinPowered by MintlifyOn this
pageOverviewKey FeaturesUse CasesInternal SystemsBatch ProcessingDevelopment &
TestingGetting StartedQuick StartAdvanced UsageCustom Schema ExampleAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest PracticesHTML
PreparationOptimization TipsExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/localscraper#development-and-testing

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesLocalScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesLocalScraperAI-powered extraction from local HTML content
Overview
LocalScraper brings the same powerful AI extraction capabilities as SmartScraper
but works with your local HTML content. This makes it perfect for scenarios where
you already have the HTML content or need to process cached pages, internal
documents, or dynamically generated content.
Try LocalScraper instantly in our interactive playground - no coding required!
Key Features
Local ProcessingProcess HTML content directly without making external requestsAI
UnderstandingSame powerful AI extraction as SmartScraperFaster ProcessingNo network
latency or website loading delaysFull ControlComplete control over your HTML input
and processing
Use Cases
Internal Systems

Process internally cached pages


Extract from intranet content
Handle dynamic JavaScript renders
Process email templates

Batch Processing

Archive data extraction


Historical content analysis
Bulk document processing
Offline content processing

Development & Testing

Test extraction logic locally


Debug content processing
Prototype without API calls
Validate schemas offline

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

html_content = """
<html>
<body>
<h1>ScrapeGraphAI</h1>
<div class="description">
<p>AI-powered web scraping for modern applications.</p>
</div>
<div class="features">
<ul>
<li>Smart Extraction</li>
<li>Local Processing</li>
<li>Schema Support</li>
</ul>
</div>
</body>
</html>
"""

response = client.localscraper(
website_html=html_content,
user_prompt="Extract the company information and features"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-xyz789",
"status": "completed",
"user_prompt": "Extract the company information and features",
"result": {
"company_name": "ScrapeGraphAI",
"description": "AI-powered web scraping for modern applications.",
"features": [
"Smart Extraction",
"Local Processing",
"Schema Support"
]
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:
Async Support
For applications requiring asynchronous execution, LocalScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


html_content = """
<html>
<body>
<h1>Product: Gaming Laptop</h1>
<div class="price">$999.99</div>
<div class="description">
High-performance gaming laptop with RTX 3080.
</div>
</body>
</html>
"""

async with AsyncClient(api_key="your-api-key") as client:


response = await client.localscraper(
website_html=html_content,
user_prompt="Extract the product information"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use LocalScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
HTML Preparation

Ensure HTML is well-formed


Include relevant content only
Clean up unnecessary markup
Handle character encoding properly

Optimization Tips

Remove unnecessary scripts and styles


Clean up dynamic content placeholders
Preserve important semantic structure
Include relevant metadata

Example Projects
Check out our cookbook for real-world examples:
Dynamic content extraction
Email template processing
Cached content analysis
Batch HTML processing

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin processing your HTML
content with LocalScraper!Was this page helpful?
YesNoSmartScraperMarkdownifyxgithublinkedinPowered by MintlifyOn this
pageOverviewKey FeaturesUse CasesInternal SystemsBatch ProcessingDevelopment &
TestingGetting StartedQuick StartAdvanced UsageCustom Schema ExampleAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest PracticesHTML
PreparationOptimization TipsExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/localscraper#getting-started

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesLocalScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesLocalScraperAI-powered extraction from local HTML content
Overview
LocalScraper brings the same powerful AI extraction capabilities as SmartScraper
but works with your local HTML content. This makes it perfect for scenarios where
you already have the HTML content or need to process cached pages, internal
documents, or dynamically generated content.
Try LocalScraper instantly in our interactive playground - no coding required!
Key Features
Local ProcessingProcess HTML content directly without making external requestsAI
UnderstandingSame powerful AI extraction as SmartScraperFaster ProcessingNo network
latency or website loading delaysFull ControlComplete control over your HTML input
and processing
Use Cases
Internal Systems

Process internally cached pages


Extract from intranet content
Handle dynamic JavaScript renders
Process email templates

Batch Processing

Archive data extraction


Historical content analysis
Bulk document processing
Offline content processing
Development & Testing

Test extraction logic locally


Debug content processing
Prototype without API calls
Validate schemas offline

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

html_content = """
<html>
<body>
<h1>ScrapeGraphAI</h1>
<div class="description">
<p>AI-powered web scraping for modern applications.</p>
</div>
<div class="features">
<ul>
<li>Smart Extraction</li>
<li>Local Processing</li>
<li>Schema Support</li>
</ul>
</div>
</body>
</html>
"""

response = client.localscraper(
website_html=html_content,
user_prompt="Extract the company information and features"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-xyz789",
"status": "completed",
"user_prompt": "Extract the company information and features",
"result": {
"company_name": "ScrapeGraphAI",
"description": "AI-powered web scraping for modern applications.",
"features": [
"Smart Extraction",
"Local Processing",
"Schema Support"
]
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, LocalScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


html_content = """
<html>
<body>
<h1>Product: Gaming Laptop</h1>
<div class="price">$999.99</div>
<div class="description">
High-performance gaming laptop with RTX 3080.
</div>
</body>
</html>
"""

async with AsyncClient(api_key="your-api-key") as client:


response = await client.localscraper(
website_html=html_content,
user_prompt="Extract the product information"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use LocalScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
HTML Preparation

Ensure HTML is well-formed


Include relevant content only
Clean up unnecessary markup
Handle character encoding properly

Optimization Tips

Remove unnecessary scripts and styles


Clean up dynamic content placeholders
Preserve important semantic structure
Include relevant metadata

Example Projects
Check out our cookbook for real-world examples:

Dynamic content extraction


Email template processing
Cached content analysis
Batch HTML processing

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin processing your HTML
content with LocalScraper!Was this page helpful?
YesNoSmartScraperMarkdownifyxgithublinkedinPowered by MintlifyOn this
pageOverviewKey FeaturesUse CasesInternal SystemsBatch ProcessingDevelopment &
TestingGetting StartedQuick StartAdvanced UsageCustom Schema ExampleAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest PracticesHTML
PreparationOptimization TipsExample ProjectsAPI ReferenceSupport & Resources

------- • -------

https://docs.scrapegraphai.com/services/localscraper#quick-start

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesLocalScraper
HomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesLocalScraperAI-powered extraction from local HTML content
Overview
LocalScraper brings the same powerful AI extraction capabilities as SmartScraper
but works with your local HTML content. This makes it perfect for scenarios where
you already have the HTML content or need to process cached pages, internal
documents, or dynamically generated content.
Try LocalScraper instantly in our interactive playground - no coding required!
Key Features
Local ProcessingProcess HTML content directly without making external requestsAI
UnderstandingSame powerful AI extraction as SmartScraperFaster ProcessingNo network
latency or website loading delaysFull ControlComplete control over your HTML input
and processing
Use Cases
Internal Systems

Process internally cached pages


Extract from intranet content
Handle dynamic JavaScript renders
Process email templates

Batch Processing
Archive data extraction
Historical content analysis
Bulk document processing
Offline content processing

Development & Testing

Test extraction logic locally


Debug content processing
Prototype without API calls
Validate schemas offline

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

html_content = """
<html>
<body>
<h1>ScrapeGraphAI</h1>
<div class="description">
<p>AI-powered web scraping for modern applications.</p>
</div>
<div class="features">
<ul>
<li>Smart Extraction</li>
<li>Local Processing</li>
<li>Schema Support</li>
</ul>
</div>
</body>
</html>
"""

response = client.localscraper(
website_html=html_content,
user_prompt="Extract the company information and features"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-xyz789",
"status": "completed",
"user_prompt": "Extract the company information and features",
"result": {
"company_name": "ScrapeGraphAI",
"description": "AI-powered web scraping for modern applications.",
"features": [
"Smart Extraction",
"Local Processing",
"Schema Support"
]
},
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the extraction
result: The extracted data in structured JSON format
error: Error message (if any occurred during extraction)

Advanced Usage
Custom Schema Example
Define exactly what data you want to extract:

Async Support
For applications requiring asynchronous execution, LocalScraper provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


html_content = """
<html>
<body>
<h1>Product: Gaming Laptop</h1>
<div class="price">$999.99</div>
<div class="description">
High-performance gaming laptop with RTX 3080.
</div>
</body>
</html>
"""

async with AsyncClient(api_key="your-api-key") as client:


response = await client.localscraper(
website_html=html_content,
user_prompt="Extract the product information"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for data science and backend applications


JavaScript SDK - Ideal for web applications and Node.js

AI Framework Integrations

LangChain Integration - Use LocalScraper in your LLM workflows


LlamaIndex Integration - Build powerful search and QA systems

Best Practices
HTML Preparation

Ensure HTML is well-formed


Include relevant content only
Clean up unnecessary markup
Handle character encoding properly
Optimization Tips

Remove unnecessary scripts and styles


Clean up dynamic content placeholders
Preserve important semantic structure
Include relevant metadata

Example Projects
Check out our cookbook for real-world examples:

Dynamic content extraction


Email template processing
Cached content analysis
Batch HTML processing

API Reference
For detailed API documentation, see:

Start Scraping Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin processing your HTML
content with LocalScraper!Was this page helpful?
YesNoSmartScraperMarkdownifyxgithublinkedinPowered by MintlifyOn this
pageOverviewKey FeaturesUse CasesInternal SystemsBatch ProcessingDevelopment &
TestingGetting StartedQuick StartAdvanced UsageCustom Schema ExampleAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest PracticesHTML
PreparationOptimization TipsExample ProjectsAPI ReferenceSupport & Resources


------- • -------

https://docs.scrapegraphai.com/services/markdownify#overview

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------
https://docs.scrapegraphai.com/services/markdownify#key-features

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:
Blog migration tools
Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------

https://docs.scrapegraphai.com/services/markdownify#use-cases

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs
Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools
AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------

https://docs.scrapegraphai.com/services/markdownify#content-migration

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status
Support & Resources
DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------

https://docs.scrapegraphai.com/services/markdownify#documentation

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------

https://docs.scrapegraphai.com/services/markdownify#content-management

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration
Convert blog posts to markdown
Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources
------- • -------

https://docs.scrapegraphai.com/services/markdownify#getting-started

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------

https://docs.scrapegraphai.com/services/markdownify#quick-start

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs
Python SDK - Perfect for automation and content processing
JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------

https://docs.scrapegraphai.com/services/markdownify#async-support

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------

https://docs.scrapegraphai.com/services/markdownify#integration-options

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization
Verify source content quality
Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------

https://docs.scrapegraphai.com/services/markdownify#official-sdks

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio
async def main():
async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------

https://docs.scrapegraphai.com/services/markdownify

Services: Markdownify
Convert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency
Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------

https://docs.scrapegraphai.com/services/markdownify#best-practices

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------

https://docs.scrapegraphai.com/services/markdownify#content-optimization

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:
Start Conversion Job
Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------

https://docs.scrapegraphai.com/services/markdownify#processing-tips

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles
Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------

https://docs.scrapegraphai.com/services/markdownify#example-projects

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------

https://docs.scrapegraphai.com/services/markdownify#api-reference

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation

Create technical documentation


Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)
Get your API key from the dashboard
Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())

Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------

https://docs.scrapegraphai.com/services/markdownify#support-and-resources

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationServicesMarkdownifyH
omeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeServicesMarkdownifyConvert web content to clean, structured markdown
Overview
Markdownify is our specialized service that transforms web content into clean,
well-formatted markdown. It intelligently preserves the content’s structure while
removing unnecessary elements, making it perfect for content migration,
documentation creation, and knowledge base building.
Try Markdownify instantly in our interactive playground - no coding required!
Key Features
Smart ConversionIntelligent content structure preservationClean OutputRemoves ads,
navigation, and irrelevant contentFormat RetentionMaintains headings, lists, and
text formattingAsset HandlingPreserves images and handles external links
Use Cases
Content Migration

Convert blog posts to markdown


Transform documentation
Migrate knowledge bases
Archive web content

Documentation
Create technical documentation
Build wikis and guides
Generate README files
Maintain developer docs

Content Management

Prepare content for CMS import


Create portable content
Build learning resources
Format articles

Want to learn more about our AI-powered scraping technology? Visit our main website
to discover how we’re revolutionizing web data extraction.
Getting Started
Quick Start
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

response = client.markdownify(
website_url="https://example.com/article"
)

Get your API key from the dashboard


Example Response{
"request_id": "sg-req-md456",
"status": "completed",
"website_url": "https://example.com/article",
"result": "# Understanding AI-Powered Web Scraping\n\nWeb scraping has evolved
significantly with the advent of AI technologies...\n\n## Key Benefits\n\n-
Improved accuracy\n- Intelligent extraction\n- Structured output\n\n![AI Scraping
Process](https://example.com/images/ai-scraping.png)\n\n> AI-powered scraping
represents the future of web data extraction.\n\n### Getting Started\n\n1. Choose
your target website\n2. Define extraction goals\n3. Select appropriate tools\n",
"error": ""
}
The response includes:
request_id: Unique identifier for tracking your request
status: Current status of the conversion
result: Object containing the markdown content and metadata
error: Error message (if any occurred during conversion)

Async Support
For applications requiring asynchronous execution, Markdownify provides async
support through the AsyncClient:
from scrapegraph_py import AsyncClient
import asyncio

async def main():


async with AsyncClient(api_key="your-api-key") as client:
response = await client.markdownify(
website_url="https://example.com/article"
)
print(response)

# Run the async function


asyncio.run(main())
Integration Options
Official SDKs

Python SDK - Perfect for automation and content processing


JavaScript SDK - Ideal for web applications and content tools

AI Framework Integrations

LangChain Integration - Use Markdownify in your content pipelines


LlamaIndex Integration - Create searchable knowledge bases

Best Practices
Content Optimization

Verify source content quality


Check image and link preservation
Review markdown formatting
Validate output structure

Processing Tips

Handle large content in chunks


Preserve important metadata
Maintain content hierarchy
Check for formatting consistency

Example Projects
Check out our cookbook for real-world examples:

Blog migration tools


Documentation generators
Content archival systems
Knowledge base builders

API Reference
For detailed API documentation, see:

Start Conversion Job


Get Job Status

Support & Resources


DocumentationComprehensive guides and tutorialsAPI ReferenceDetailed API
documentationCommunityJoin our Discord communityGitHubCheck out our open-source
projectsMain WebsiteVisit our official website
Ready to Start?Sign up now and get your API key to begin converting web content to
clean markdown!Was this page helpful?YesNoLocalScraperFirefoxxgithublinkedinPowered
by MintlifyOn this pageOverviewKey FeaturesUse CasesContent
MigrationDocumentationContent ManagementGetting StartedQuick StartAsync
SupportIntegration OptionsOfficial SDKsAI Framework IntegrationsBest
PracticesContent OptimizationProcessing TipsExample ProjectsAPI ReferenceSupport &
Resources

------- • -------

https://docs.scrapegraphai.com/sdks/python

Python SDK
Official Python SDK for ScrapeGraphAI
Installation
Install the package using pip:
pip install scrapegraph-py

Features

AI-Powered Extraction: Advanced web scraping using artificial intelligence
Flexible Clients: Both synchronous and asynchronous support
Type Safety: Structured output with Pydantic schemas
Production Ready: Detailed logging and automatic retries
Developer Friendly: Comprehensive error handling
Quick Start
Initialize the client with your API key:
from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

You can also set the SGAI_API_KEY environment variable and initialize the client without parameters:

client = Client()
Services
SmartScraper
Extract specific information from any webpage using AI:
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the main heading and description"
)

Basic Schema Example
Define a simple schema for basic data extraction:

from pydantic import BaseModel, Field

class ArticleData(BaseModel):
    title: str = Field(description="The article title")
    author: str = Field(description="The author's name")
    publish_date: str = Field(description="Article publication date")
    content: str = Field(description="Main article content")
    category: str = Field(description="Article category")

response = client.smartscraper(
    website_url="https://example.com/blog/article",
    user_prompt="Extract the article information",
    output_schema=ArticleData
)

print(f"Title: {response.title}")
print(f"Author: {response.author}")
print(f"Published: {response.publish_date}")

Advanced Schema Example
Define a complex schema for nested data structures:

from typing import List
from pydantic import BaseModel, Field

class Employee(BaseModel):
    name: str = Field(description="Employee's full name")
    position: str = Field(description="Job title")
    department: str = Field(description="Department name")
    email: str = Field(description="Email address")

class Office(BaseModel):
    location: str = Field(description="Office location/city")
    address: str = Field(description="Full address")
    phone: str = Field(description="Contact number")

class CompanyData(BaseModel):
    name: str = Field(description="Company name")
    description: str = Field(description="Company description")
    industry: str = Field(description="Industry sector")
    founded_year: int = Field(description="Year company was founded")
    employees: List[Employee] = Field(description="List of key employees")
    offices: List[Office] = Field(description="Company office locations")
    website: str = Field(description="Company website URL")

# Extract comprehensive company information
response = client.smartscraper(
    website_url="https://example.com/about",
    user_prompt="Extract detailed company information including employees and offices",
    output_schema=CompanyData
)

# Access nested data
print(f"Company: {response.name}")
print("\nKey Employees:")
for employee in response.employees:
    print(f"- {employee.name} ({employee.position})")

print("\nOffice Locations:")
for office in response.offices:
    print(f"- {office.location}: {office.address}")

LocalScraper
Process local HTML content with AI extraction:
html_content = """
<html>
<body>
<h1>Company Name</h1>
<p>We are a technology company focused on AI solutions.</p>
<div class="contact">
<p>Email: [email protected]</p>
</div>
</body>
</html>
"""

response = client.localscraper(
user_prompt="Extract the company description",
website_html=html_content
)
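In practice the HTML often comes from disk rather than an inline string. A minimal variant (the file name is a placeholder):

from pathlib import Path

# Load previously saved HTML (file name is illustrative).
html_content = Path("saved_page.html").read_text(encoding="utf-8")

response = client.localscraper(
    user_prompt="Extract the company description",
    website_html=html_content
)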

Markdownify
Convert any webpage into clean, formatted markdown:
response = client.markdownify(
    website_url="https://example.com"
)

Async Support
All endpoints support asynchronous operations:
import asyncio
from scrapegraph_py import AsyncClient

async def main():
    async with AsyncClient() as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main content"
        )
        print(response)

asyncio.run(main())

Feedback
Help us improve by submitting feedback programmatically:
client.submit_feedback(
    request_id="your-request-id",
    rating=5,
    feedback_text="Great results!"
)
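Since every response carries a request_id, feedback can be wired directly into a normal workflow. A sketch (the dictionary-style field access follows the example responses shown earlier and is an assumption):

response = client.markdownify(website_url="https://example.com/article")

# Reuse the request_id returned with the response.
client.submit_feedback(
    request_id=response["request_id"],
    rating=4,
    feedback_text="Markdown structure was preserved well."
)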

Support

GitHub - Report issues and contribute to the SDK
Email Support - Get help from our development team
License
This project is licensed under the MIT License. See the LICENSE file for details.

------- • -------

https://docs.scrapegraphai.com/sdks/python#features

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationOfficial SDKsPython
SDKHomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI BadgeOfficial
SDKsPython SDKOfficial Python SDK for ScrapeGraphAI
PyPI PackagePython Support
Installation
Install the package using pip:
pip install scrapegraph-py

Features

AI-Powered Extraction: Advanced web scraping using artificial intelligence


Flexible Clients: Both synchronous and asynchronous support
Type Safety: Structured output with Pydantic schemas
Production Ready: Detailed logging and automatic retries
Developer Friendly: Comprehensive error handling

Quick Start
Initialize the client with your API key:
from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

You can also set the SGAI_API_KEY environment variable and initialize the client
without parameters: client = Client()
Services
SmartScraper
Extract specific information from any webpage using AI:
response = client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main heading and description"
)

Basic Schema ExampleDefine a simple schema for basic data extraction:from pydantic
import BaseModel, Field

class ArticleData(BaseModel):
title: str = Field(description="The article title")
author: str = Field(description="The author's name")
publish_date: str = Field(description="Article publication date")
content: str = Field(description="Main article content")
category: str = Field(description="Article category")

response = client.smartscraper(
website_url="https://example.com/blog/article",
user_prompt="Extract the article information",
output_schema=ArticleData
)

print(f"Title: {response.title}")
print(f"Author: {response.author}")
print(f"Published: {response.publish_date}")

Advanced Schema ExampleDefine a complex schema for nested data structures:from


typing import List
from pydantic import BaseModel, Field

class Employee(BaseModel):
name: str = Field(description="Employee's full name")
position: str = Field(description="Job title")
department: str = Field(description="Department name")
email: str = Field(description="Email address")

class Office(BaseModel):
location: str = Field(description="Office location/city")
address: str = Field(description="Full address")
phone: str = Field(description="Contact number")

class CompanyData(BaseModel):
name: str = Field(description="Company name")
description: str = Field(description="Company description")
industry: str = Field(description="Industry sector")
founded_year: int = Field(description="Year company was founded")
employees: List[Employee] = Field(description="List of key employees")
offices: List[Office] = Field(description="Company office locations")
website: str = Field(description="Company website URL")
# Extract comprehensive company information
response = client.smartscraper(
website_url="https://example.com/about",
user_prompt="Extract detailed company information including employees and
offices",
output_schema=CompanyData
)

# Access nested data


print(f"Company: {response.name}")
print("\nKey Employees:")
for employee in response.employees:
print(f"- {employee.name} ({employee.position})")

print("\nOffice Locations:")
for office in response.offices:
print(f"- {office.location}: {office.address}")

LocalScraper
Process local HTML content with AI extraction:
html_content = """
<html>
<body>
<h1>Company Name</h1>
<p>We are a technology company focused on AI solutions.</p>
<div class="contact">
<p>Email: [email protected]</p>
</div>
</body>
</html>
"""

response = client.localscraper(
user_prompt="Extract the company description",
website_html=html_content
)

Markdownify
Convert any webpage into clean, formatted markdown:
response = client.markdownify(
website_url="https://example.com"
)

Async Support
All endpoints support asynchronous operations:
import asyncio
from scrapegraph_py import AsyncClient

async def main():


async with AsyncClient() as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

asyncio.run(main())

Feedback
Help us improve by submitting feedback programmatically:
client.submit_feedback(
request_id="your-request-id",
rating=5,
feedback_text="Great results!"
)

Support
GitHubReport issues and contribute to the SDKEmail SupportGet help from our
development team
LicenseThis project is licensed under the MIT License. See the LICENSE file for
details.Was this page helpful?YesNoFirefoxJavaScript SDKxgithublinkedinPowered by
MintlifyOn this pageInstallationFeaturesQuick
StartServicesSmartScraperLocalScraperMarkdownifyAsync SupportFeedbackSupport

------- • -------

https://docs.scrapegraphai.com/sdks/python#quick-start

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationOfficial SDKsPython
SDKHomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI BadgeOfficial
SDKsPython SDKOfficial Python SDK for ScrapeGraphAI
PyPI PackagePython Support
Installation
Install the package using pip:
pip install scrapegraph-py

Features

AI-Powered Extraction: Advanced web scraping using artificial intelligence


Flexible Clients: Both synchronous and asynchronous support
Type Safety: Structured output with Pydantic schemas
Production Ready: Detailed logging and automatic retries
Developer Friendly: Comprehensive error handling

Quick Start
Initialize the client with your API key:
from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

You can also set the SGAI_API_KEY environment variable and initialize the client
without parameters: client = Client()
Services
SmartScraper
Extract specific information from any webpage using AI:
response = client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main heading and description"
)

Basic Schema ExampleDefine a simple schema for basic data extraction:from pydantic
import BaseModel, Field

class ArticleData(BaseModel):
title: str = Field(description="The article title")
author: str = Field(description="The author's name")
publish_date: str = Field(description="Article publication date")
content: str = Field(description="Main article content")
category: str = Field(description="Article category")

response = client.smartscraper(
website_url="https://example.com/blog/article",
user_prompt="Extract the article information",
output_schema=ArticleData
)

print(f"Title: {response.title}")
print(f"Author: {response.author}")
print(f"Published: {response.publish_date}")

Advanced Schema ExampleDefine a complex schema for nested data structures:from


typing import List
from pydantic import BaseModel, Field

class Employee(BaseModel):
name: str = Field(description="Employee's full name")
position: str = Field(description="Job title")
department: str = Field(description="Department name")
email: str = Field(description="Email address")

class Office(BaseModel):
location: str = Field(description="Office location/city")
address: str = Field(description="Full address")
phone: str = Field(description="Contact number")

class CompanyData(BaseModel):
name: str = Field(description="Company name")
description: str = Field(description="Company description")
industry: str = Field(description="Industry sector")
founded_year: int = Field(description="Year company was founded")
employees: List[Employee] = Field(description="List of key employees")
offices: List[Office] = Field(description="Company office locations")
website: str = Field(description="Company website URL")

# Extract comprehensive company information


response = client.smartscraper(
website_url="https://example.com/about",
user_prompt="Extract detailed company information including employees and
offices",
output_schema=CompanyData
)

# Access nested data


print(f"Company: {response.name}")
print("\nKey Employees:")
for employee in response.employees:
print(f"- {employee.name} ({employee.position})")

print("\nOffice Locations:")
for office in response.offices:
print(f"- {office.location}: {office.address}")

LocalScraper
Process local HTML content with AI extraction:
html_content = """
<html>
<body>
<h1>Company Name</h1>
<p>We are a technology company focused on AI solutions.</p>
<div class="contact">
<p>Email: [email protected]</p>
</div>
</body>
</html>
"""

response = client.localscraper(
user_prompt="Extract the company description",
website_html=html_content
)

Markdownify
Convert any webpage into clean, formatted markdown:
response = client.markdownify(
website_url="https://example.com"
)

Async Support
All endpoints support asynchronous operations:
import asyncio
from scrapegraph_py import AsyncClient

async def main():


async with AsyncClient() as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

asyncio.run(main())

Feedback
Help us improve by submitting feedback programmatically:
client.submit_feedback(
request_id="your-request-id",
rating=5,
feedback_text="Great results!"
)

Support
GitHubReport issues and contribute to the SDKEmail SupportGet help from our
development team
LicenseThis project is licensed under the MIT License. See the LICENSE file for
details.Was this page helpful?YesNoFirefoxJavaScript SDKxgithublinkedinPowered by
MintlifyOn this pageInstallationFeaturesQuick
StartServicesSmartScraperLocalScraperMarkdownifyAsync SupportFeedbackSupport

------- • -------

https://docs.scrapegraphai.com/sdks/python#services
ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationOfficial SDKsPython
SDKHomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI BadgeOfficial
SDKsPython SDKOfficial Python SDK for ScrapeGraphAI
PyPI PackagePython Support
Installation
Install the package using pip:
pip install scrapegraph-py

Features

AI-Powered Extraction: Advanced web scraping using artificial intelligence


Flexible Clients: Both synchronous and asynchronous support
Type Safety: Structured output with Pydantic schemas
Production Ready: Detailed logging and automatic retries
Developer Friendly: Comprehensive error handling

Quick Start
Initialize the client with your API key:
from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

You can also set the SGAI_API_KEY environment variable and initialize the client
without parameters: client = Client()
Services
SmartScraper
Extract specific information from any webpage using AI:
response = client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main heading and description"
)

Basic Schema ExampleDefine a simple schema for basic data extraction:from pydantic
import BaseModel, Field

class ArticleData(BaseModel):
title: str = Field(description="The article title")
author: str = Field(description="The author's name")
publish_date: str = Field(description="Article publication date")
content: str = Field(description="Main article content")
category: str = Field(description="Article category")

response = client.smartscraper(
website_url="https://example.com/blog/article",
user_prompt="Extract the article information",
output_schema=ArticleData
)

print(f"Title: {response.title}")
print(f"Author: {response.author}")
print(f"Published: {response.publish_date}")

Advanced Schema ExampleDefine a complex schema for nested data structures:from


typing import List
from pydantic import BaseModel, Field
class Employee(BaseModel):
name: str = Field(description="Employee's full name")
position: str = Field(description="Job title")
department: str = Field(description="Department name")
email: str = Field(description="Email address")

class Office(BaseModel):
location: str = Field(description="Office location/city")
address: str = Field(description="Full address")
phone: str = Field(description="Contact number")

class CompanyData(BaseModel):
name: str = Field(description="Company name")
description: str = Field(description="Company description")
industry: str = Field(description="Industry sector")
founded_year: int = Field(description="Year company was founded")
employees: List[Employee] = Field(description="List of key employees")
offices: List[Office] = Field(description="Company office locations")
website: str = Field(description="Company website URL")

# Extract comprehensive company information


response = client.smartscraper(
website_url="https://example.com/about",
user_prompt="Extract detailed company information including employees and
offices",
output_schema=CompanyData
)

# Access nested data


print(f"Company: {response.name}")
print("\nKey Employees:")
for employee in response.employees:
print(f"- {employee.name} ({employee.position})")

print("\nOffice Locations:")
for office in response.offices:
print(f"- {office.location}: {office.address}")

LocalScraper
Process local HTML content with AI extraction:
html_content = """
<html>
<body>
<h1>Company Name</h1>
<p>We are a technology company focused on AI solutions.</p>
<div class="contact">
<p>Email: [email protected]</p>
</div>
</body>
</html>
"""

response = client.localscraper(
user_prompt="Extract the company description",
website_html=html_content
)

Markdownify
Convert any webpage into clean, formatted markdown:
response = client.markdownify(
website_url="https://example.com"
)

Async Support
All endpoints support asynchronous operations:
import asyncio
from scrapegraph_py import AsyncClient

async def main():


async with AsyncClient() as client:
response = await client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main content"
)
print(response)

asyncio.run(main())

Feedback
Help us improve by submitting feedback programmatically:
client.submit_feedback(
request_id="your-request-id",
rating=5,
feedback_text="Great results!"
)

Support
GitHubReport issues and contribute to the SDKEmail SupportGet help from our
development team
LicenseThis project is licensed under the MIT License. See the LICENSE file for
details.Was this page helpful?YesNoFirefoxJavaScript SDKxgithublinkedinPowered by
MintlifyOn this pageInstallationFeaturesQuick
StartServicesSmartScraperLocalScraperMarkdownifyAsync SupportFeedbackSupport

------- • -------

https://docs.scrapegraphai.com/sdks/python#smartscraper

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationOfficial SDKsPython
SDKHomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI BadgeOfficial
SDKsPython SDKOfficial Python SDK for ScrapeGraphAI
PyPI PackagePython Support
Installation
Install the package using pip:
pip install scrapegraph-py

Features

AI-Powered Extraction: Advanced web scraping using artificial intelligence


Flexible Clients: Both synchronous and asynchronous support
Type Safety: Structured output with Pydantic schemas
Production Ready: Detailed logging and automatic retries
Developer Friendly: Comprehensive error handling
Quick Start
Initialize the client with your API key:
from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

You can also set the SGAI_API_KEY environment variable and initialize the client
without parameters: client = Client()
Services
SmartScraper
Extract specific information from any webpage using AI:
response = client.smartscraper(
website_url="https://example.com",
user_prompt="Extract the main heading and description"
)

Basic Schema ExampleDefine a simple schema for basic data extraction:from pydantic
import BaseModel, Field

class ArticleData(BaseModel):
title: str = Field(description="The article title")
author: str = Field(description="The author's name")
publish_date: str = Field(description="Article publication date")
content: str = Field(description="Main article content")
category: str = Field(description="Article category")

response = client.smartscraper(
website_url="https://example.com/blog/article",
user_prompt="Extract the article information",
output_schema=ArticleData
)

print(f"Title: {response.title}")
print(f"Author: {response.author}")
print(f"Published: {response.publish_date}")

Advanced Schema Example
Define a complex schema for nested data structures:

from typing import List
from pydantic import BaseModel, Field

class Employee(BaseModel):
    name: str = Field(description="Employee's full name")
    position: str = Field(description="Job title")
    department: str = Field(description="Department name")
    email: str = Field(description="Email address")

class Office(BaseModel):
    location: str = Field(description="Office location/city")
    address: str = Field(description="Full address")
    phone: str = Field(description="Contact number")

class CompanyData(BaseModel):
    name: str = Field(description="Company name")
    description: str = Field(description="Company description")
    industry: str = Field(description="Industry sector")
    founded_year: int = Field(description="Year company was founded")
    employees: List[Employee] = Field(description="List of key employees")
    offices: List[Office] = Field(description="Company office locations")
    website: str = Field(description="Company website URL")

# Extract comprehensive company information
response = client.smartscraper(
    website_url="https://example.com/about",
    user_prompt="Extract detailed company information including employees and offices",
    output_schema=CompanyData
)

# Access nested data
print(f"Company: {response.name}")
print("\nKey Employees:")
for employee in response.employees:
    print(f"- {employee.name} ({employee.position})")

print("\nOffice Locations:")
for office in response.offices:
    print(f"- {office.location}: {office.address}")

LocalScraper
Process local HTML content with AI extraction:

html_content = """
<html>
  <body>
    <h1>Company Name</h1>
    <p>We are a technology company focused on AI solutions.</p>
    <div class="contact">
      <p>Email: [email protected]</p>
    </div>
  </body>
</html>
"""

response = client.localscraper(
    user_prompt="Extract the company description",
    website_html=html_content
)
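
In practice the HTML often comes from disk rather than an inline string. A small sketch under that assumption (the page.html filename is hypothetical):

from pathlib import Path

# Hypothetical: read a previously saved page from disk
html_content = Path("page.html").read_text(encoding="utf-8")

response = client.localscraper(
    user_prompt="Extract the company description",
    website_html=html_content
)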

Markdownify
Convert any webpage into clean, formatted Markdown:

response = client.markdownify(
    website_url="https://example.com"
)
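
A common follow-up is saving the converted Markdown. A minimal sketch, hedged because the response shape is not documented on this page, so it falls back to the string form:

from pathlib import Path

# Assumption: str(response) yields the Markdown payload; adjust if the
# response object exposes a dedicated field instead.
Path("example.md").write_text(str(response), encoding="utf-8")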

Async Support
All endpoints support asynchronous operations:

import asyncio
from scrapegraph_py import AsyncClient

async def main():
    async with AsyncClient() as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main content"
        )
        print(response)

asyncio.run(main())
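
Because every endpoint is awaitable, requests can also run concurrently. A sketch using asyncio.gather with the documented AsyncClient (the URL list is illustrative):

import asyncio
from scrapegraph_py import AsyncClient

async def main():
    urls = ["https://example.com", "https://example.org"]  # illustrative targets
    async with AsyncClient() as client:
        # Fire all requests at once and wait for every result
        tasks = [
            client.smartscraper(website_url=url, user_prompt="Extract the main content")
            for url in urls
        ]
        for result in await asyncio.gather(*tasks):
            print(result)

asyncio.run(main())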

Feedback
Help us improve by submitting feedback programmatically:

client.submit_feedback(
    request_id="your-request-id",
    rating=5,
    feedback_text="Great results!"
)

Support
GitHub: Report issues and contribute to the SDK
Email Support: Get help from our development team
License
This project is licensed under the MIT License. See the LICENSE file for details.

------- • -------

https://docs.scrapegraphai.com/sdks/javascript

JavaScript SDK
Official JavaScript/TypeScript SDK for ScrapeGraphAI
Installation
Install the package using npm or yarn:
# Using npm
npm i scrapegraph-js

# Using yarn
yarn add scrapegraph-js

Features

AI-Powered Extraction: Smart web scraping with artificial intelligence
Async by Design: Fully asynchronous architecture
Type Safety: Built-in TypeScript support with Zod schemas
Production Ready: Automatic retries and detailed logging
Developer Friendly: Comprehensive error handling

Quick Start
Initialize with your API key:

import { smartScraper } from 'scrapegraph-js';

const apiKey = process.env.SGAI_APIKEY;
const websiteUrl = 'https://example.com';
const prompt = 'Extract the main heading and description';

try {
  const response = await smartScraper(apiKey, websiteUrl, prompt);
  console.log(response.result);
} catch (error) {
  console.error('Error:', error);
}

Store your API keys securely in environment variables. Use .env files and libraries like dotenv to load them into your app.
Services

SmartScraper
Extract specific information from any webpage using AI:

const response = await smartScraper(
  apiKey,
  'https://example.com',
  'Extract the main content'
);

Basic Schema Example
Define a simple schema using Zod:

import { z } from 'zod';

const ArticleSchema = z.object({
  title: z.string().describe('The article title'),
  author: z.string().describe('The author\'s name'),
  publishDate: z.string().describe('Article publication date'),
  content: z.string().describe('Main article content'),
  category: z.string().describe('Article category')
});

const response = await smartScraper(
  apiKey,
  'https://example.com/blog/article',
  'Extract the article information',
  ArticleSchema
);

console.log(`Title: ${response.result.title}`);
console.log(`Author: ${response.result.author}`);
console.log(`Published: ${response.result.publishDate}`);

Advanced Schema Example
Define a complex schema for nested data structures:

import { z } from 'zod';

const EmployeeSchema = z.object({
  name: z.string().describe('Employee\'s full name'),
  position: z.string().describe('Job title'),
  department: z.string().describe('Department name'),
  email: z.string().describe('Email address')
});

const OfficeSchema = z.object({
  location: z.string().describe('Office location/city'),
  address: z.string().describe('Full address'),
  phone: z.string().describe('Contact number')
});

const CompanySchema = z.object({
  name: z.string().describe('Company name'),
  description: z.string().describe('Company description'),
  industry: z.string().describe('Industry sector'),
  foundedYear: z.number().describe('Year company was founded'),
  employees: z.array(EmployeeSchema).describe('List of key employees'),
  offices: z.array(OfficeSchema).describe('Company office locations'),
  website: z.string().url().describe('Company website URL')
});

// Extract comprehensive company information
const response = await smartScraper(
  apiKey,
  'https://example.com/about',
  'Extract detailed company information including employees and offices',
  CompanySchema
);

// Access nested data
console.log(`Company: ${response.result.name}`);
console.log('\nKey Employees:');
response.result.employees.forEach(employee => {
  console.log(`- ${employee.name} (${employee.position})`);
});

console.log('\nOffice Locations:');
response.result.offices.forEach(office => {
  console.log(`- ${office.location}: ${office.address}`);
});

LocalScraper
Process local HTML content with AI extraction:

import { localScraper } from 'scrapegraph-js';

const html = `
<html>
  <body>
    <h1>Company Name</h1>
    <p>We are a technology company focused on AI solutions.</p>
    <div class="contact">
      <p>Email: [email protected]</p>
    </div>
  </body>
</html>
`;

const response = await localScraper(
  apiKey,
  html,
  'Extract the company description'
);

Markdownify
Convert any webpage into clean, formatted Markdown:

import { markdownify } from 'scrapegraph-js';

const response = await markdownify(
  apiKey,
  'https://example.com'
);

API Credits
Check your available API credits:

import { getCredits } from 'scrapegraph-js';

try {
  const credits = await getCredits(apiKey);
  console.log('Available credits:', credits);
} catch (error) {
  console.error('Error fetching credits:', error);
}

Feedback
Help us improve by submitting feedback programmatically:

import { sendFeedback } from 'scrapegraph-js';

try {
  await sendFeedback(
    apiKey,
    'request-id',
    5,
    'Great results!'
  );
} catch (error) {
  console.error('Error sending feedback:', error);
}

Support
GitHub: Report issues and contribute to the SDK
Email Support: Get help from our development team
License
This project is licensed under the MIT License. See the LICENSE file for details.

------- • -------

https://docs.scrapegraphai.com/sdks/javascript#features

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationOfficial
SDKsJavaScript SDKHomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI BadgeOfficial
SDKsJavaScript SDKOfficial JavaScript/TypeScript SDK for ScrapeGraphAI
NPM PackageLicense
Installation
Install the package using npm or yarn:
# Using npm
npm i scrapegraph-js

# Using yarn
yarn add scrapegraph-js

Features

AI-Powered Extraction: Smart web scraping with artificial intelligence


Async by Design: Fully asynchronous architecture
Type Safety: Built-in TypeScript support with Zod schemas
Production Ready: Automatic retries and detailed logging
Developer Friendly: Comprehensive error handling

Quick Start
Initialize with your API key:
import { smartScraper } from 'scrapegraph-js';

const apiKey = process.env.SGAI_APIKEY;


const websiteUrl = 'https://example.com';
const prompt = 'Extract the main heading and description';

try {
const response = await smartScraper(apiKey, websiteUrl, prompt);
console.log(response.result);
} catch (error) {
console.error('Error:', error);
}

Store your API keys securely in environment variables. Use .env files and libraries
like dotenv to load them into your app.
Services
SmartScraper
Extract specific information from any webpage using AI:
const response = await smartScraper(
apiKey,
'https://example.com',
'Extract the main content'
);

Basic Schema ExampleDefine a simple schema using Zod:import { z } from 'zod';

const ArticleSchema = z.object({


title: z.string().describe('The article title'),
author: z.string().describe('The author\'s name'),
publishDate: z.string().describe('Article publication date'),
content: z.string().describe('Main article content'),
category: z.string().describe('Article category')
});

const response = await smartScraper(


apiKey,
'https://example.com/blog/article',
'Extract the article information',
ArticleSchema
);

console.log(`Title: ${response.result.title}`);
console.log(`Author: ${response.result.author}`);
console.log(`Published: ${response.result.publishDate}`);

Advanced Schema ExampleDefine a complex schema for nested data structures:import


{ z } from 'zod';

const EmployeeSchema = z.object({


name: z.string().describe('Employee\'s full name'),
position: z.string().describe('Job title'),
department: z.string().describe('Department name'),
email: z.string().describe('Email address')
});

const OfficeSchema = z.object({


location: z.string().describe('Office location/city'),
address: z.string().describe('Full address'),
phone: z.string().describe('Contact number')
});

const CompanySchema = z.object({


name: z.string().describe('Company name'),
description: z.string().describe('Company description'),
industry: z.string().describe('Industry sector'),
foundedYear: z.number().describe('Year company was founded'),
employees: z.array(EmployeeSchema).describe('List of key employees'),
offices: z.array(OfficeSchema).describe('Company office locations'),
website: z.string().url().describe('Company website URL')
});

// Extract comprehensive company information


const response = await smartScraper(
apiKey,
'https://example.com/about',
'Extract detailed company information including employees and offices',
CompanySchema
);

// Access nested data


console.log(`Company: ${response.result.name}`);
console.log('\nKey Employees:');
response.result.employees.forEach(employee => {
console.log(`- ${employee.name} (${employee.position})`);
});

console.log('\nOffice Locations:');
response.result.offices.forEach(office => {
console.log(`- ${office.location}: ${office.address}`);
});

LocalScraper
Process local HTML content with AI extraction:
const html = `
<html>
<body>
<h1>Company Name</h1>
<p>We are a technology company focused on AI solutions.</p>
<div class="contact">
<p>Email: [email protected]</p>
</div>
</body>
</html>
`;

const response = await localScraper(


apiKey,
html,
'Extract the company description'
);

Markdownify
Convert any webpage into clean, formatted markdown:
const response = await markdownify(
apiKey,
'https://example.com'
);

API Credits
Check your available API credits:
import { getCredits } from 'scrapegraph-js';

try {
const credits = await getCredits(apiKey);
console.log('Available credits:', credits);
} catch (error) {
console.error('Error fetching credits:', error);
}

Feedback
Help us improve by submitting feedback programmatically:
import { sendFeedback } from 'scrapegraph-js';

try {
await sendFeedback(
apiKey,
'request-id',
5,
'Great results!'
);
} catch (error) {
console.error('Error sending feedback:', error);
}

Support
GitHubReport issues and contribute to the SDKEmail SupportGet help from our
development team
LicenseThis project is licensed under the MIT License. See the LICENSE file for
details.Was this page helpful?YesNoPython SDK🦜 LangChainxgithublinkedinPowered by
MintlifyOn this pageInstallationFeaturesQuick
StartServicesSmartScraperLocalScraperMarkdownifyAPI CreditsFeedbackSupport

------- • -------

https://docs.scrapegraphai.com/sdks/javascript#quick-start

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationOfficial
SDKsJavaScript SDKHomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI BadgeOfficial
SDKsJavaScript SDKOfficial JavaScript/TypeScript SDK for ScrapeGraphAI
NPM PackageLicense
Installation
Install the package using npm or yarn:
# Using npm
npm i scrapegraph-js

# Using yarn
yarn add scrapegraph-js

Features

AI-Powered Extraction: Smart web scraping with artificial intelligence


Async by Design: Fully asynchronous architecture
Type Safety: Built-in TypeScript support with Zod schemas
Production Ready: Automatic retries and detailed logging
Developer Friendly: Comprehensive error handling

Quick Start
Initialize with your API key:
import { smartScraper } from 'scrapegraph-js';

const apiKey = process.env.SGAI_APIKEY;


const websiteUrl = 'https://example.com';
const prompt = 'Extract the main heading and description';

try {
const response = await smartScraper(apiKey, websiteUrl, prompt);
console.log(response.result);
} catch (error) {
console.error('Error:', error);
}

Store your API keys securely in environment variables. Use .env files and libraries
like dotenv to load them into your app.
Services
SmartScraper
Extract specific information from any webpage using AI:
const response = await smartScraper(
apiKey,
'https://example.com',
'Extract the main content'
);

Basic Schema ExampleDefine a simple schema using Zod:import { z } from 'zod';

const ArticleSchema = z.object({


title: z.string().describe('The article title'),
author: z.string().describe('The author\'s name'),
publishDate: z.string().describe('Article publication date'),
content: z.string().describe('Main article content'),
category: z.string().describe('Article category')
});

const response = await smartScraper(


apiKey,
'https://example.com/blog/article',
'Extract the article information',
ArticleSchema
);

console.log(`Title: ${response.result.title}`);
console.log(`Author: ${response.result.author}`);
console.log(`Published: ${response.result.publishDate}`);

Advanced Schema ExampleDefine a complex schema for nested data structures:import


{ z } from 'zod';

const EmployeeSchema = z.object({


name: z.string().describe('Employee\'s full name'),
position: z.string().describe('Job title'),
department: z.string().describe('Department name'),
email: z.string().describe('Email address')
});

const OfficeSchema = z.object({


location: z.string().describe('Office location/city'),
address: z.string().describe('Full address'),
phone: z.string().describe('Contact number')
});

const CompanySchema = z.object({


name: z.string().describe('Company name'),
description: z.string().describe('Company description'),
industry: z.string().describe('Industry sector'),
foundedYear: z.number().describe('Year company was founded'),
employees: z.array(EmployeeSchema).describe('List of key employees'),
offices: z.array(OfficeSchema).describe('Company office locations'),
website: z.string().url().describe('Company website URL')
});

// Extract comprehensive company information


const response = await smartScraper(
apiKey,
'https://example.com/about',
'Extract detailed company information including employees and offices',
CompanySchema
);

// Access nested data


console.log(`Company: ${response.result.name}`);
console.log('\nKey Employees:');
response.result.employees.forEach(employee => {
console.log(`- ${employee.name} (${employee.position})`);
});

console.log('\nOffice Locations:');
response.result.offices.forEach(office => {
console.log(`- ${office.location}: ${office.address}`);
});

LocalScraper
Process local HTML content with AI extraction:
const html = `
<html>
<body>
<h1>Company Name</h1>
<p>We are a technology company focused on AI solutions.</p>
<div class="contact">
<p>Email: [email protected]</p>
</div>
</body>
</html>
`;

const response = await localScraper(


apiKey,
html,
'Extract the company description'
);

Markdownify
Convert any webpage into clean, formatted markdown:
const response = await markdownify(
apiKey,
'https://example.com'
);

API Credits
Check your available API credits:
import { getCredits } from 'scrapegraph-js';

try {
const credits = await getCredits(apiKey);
console.log('Available credits:', credits);
} catch (error) {
console.error('Error fetching credits:', error);
}

Feedback
Help us improve by submitting feedback programmatically:
import { sendFeedback } from 'scrapegraph-js';

try {
await sendFeedback(
apiKey,
'request-id',
5,
'Great results!'
);
} catch (error) {
console.error('Error sending feedback:', error);
}

Support
GitHubReport issues and contribute to the SDKEmail SupportGet help from our
development team
LicenseThis project is licensed under the MIT License. See the LICENSE file for
details.Was this page helpful?YesNoPython SDK🦜 LangChainxgithublinkedinPowered by
MintlifyOn this pageInstallationFeaturesQuick
StartServicesSmartScraperLocalScraperMarkdownifyAPI CreditsFeedbackSupport

------- • -------

https://docs.scrapegraphai.com/sdks/javascript#services
ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationOfficial
SDKsJavaScript SDKHomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI BadgeOfficial
SDKsJavaScript SDKOfficial JavaScript/TypeScript SDK for ScrapeGraphAI
NPM PackageLicense
Installation
Install the package using npm or yarn:
# Using npm
npm i scrapegraph-js

# Using yarn
yarn add scrapegraph-js

Features

AI-Powered Extraction: Smart web scraping with artificial intelligence


Async by Design: Fully asynchronous architecture
Type Safety: Built-in TypeScript support with Zod schemas
Production Ready: Automatic retries and detailed logging
Developer Friendly: Comprehensive error handling

Quick Start
Initialize with your API key:
import { smartScraper } from 'scrapegraph-js';

const apiKey = process.env.SGAI_APIKEY;


const websiteUrl = 'https://example.com';
const prompt = 'Extract the main heading and description';

try {
const response = await smartScraper(apiKey, websiteUrl, prompt);
console.log(response.result);
} catch (error) {
console.error('Error:', error);
}

Store your API keys securely in environment variables. Use .env files and libraries
like dotenv to load them into your app.
Services
SmartScraper
Extract specific information from any webpage using AI:
const response = await smartScraper(
apiKey,
'https://example.com',
'Extract the main content'
);

Basic Schema ExampleDefine a simple schema using Zod:import { z } from 'zod';

const ArticleSchema = z.object({


title: z.string().describe('The article title'),
author: z.string().describe('The author\'s name'),
publishDate: z.string().describe('Article publication date'),
content: z.string().describe('Main article content'),
category: z.string().describe('Article category')
});
const response = await smartScraper(
apiKey,
'https://example.com/blog/article',
'Extract the article information',
ArticleSchema
);

console.log(`Title: ${response.result.title}`);
console.log(`Author: ${response.result.author}`);
console.log(`Published: ${response.result.publishDate}`);

Advanced Schema ExampleDefine a complex schema for nested data structures:import


{ z } from 'zod';

const EmployeeSchema = z.object({


name: z.string().describe('Employee\'s full name'),
position: z.string().describe('Job title'),
department: z.string().describe('Department name'),
email: z.string().describe('Email address')
});

const OfficeSchema = z.object({


location: z.string().describe('Office location/city'),
address: z.string().describe('Full address'),
phone: z.string().describe('Contact number')
});

const CompanySchema = z.object({


name: z.string().describe('Company name'),
description: z.string().describe('Company description'),
industry: z.string().describe('Industry sector'),
foundedYear: z.number().describe('Year company was founded'),
employees: z.array(EmployeeSchema).describe('List of key employees'),
offices: z.array(OfficeSchema).describe('Company office locations'),
website: z.string().url().describe('Company website URL')
});

// Extract comprehensive company information


const response = await smartScraper(
apiKey,
'https://example.com/about',
'Extract detailed company information including employees and offices',
CompanySchema
);

// Access nested data


console.log(`Company: ${response.result.name}`);
console.log('\nKey Employees:');
response.result.employees.forEach(employee => {
console.log(`- ${employee.name} (${employee.position})`);
});

console.log('\nOffice Locations:');
response.result.offices.forEach(office => {
console.log(`- ${office.location}: ${office.address}`);
});

LocalScraper
Process local HTML content with AI extraction:
const html = `
<html>
<body>
<h1>Company Name</h1>
<p>We are a technology company focused on AI solutions.</p>
<div class="contact">
<p>Email: [email protected]</p>
</div>
</body>
</html>
`;

const response = await localScraper(


apiKey,
html,
'Extract the company description'
);

Markdownify
Convert any webpage into clean, formatted markdown:
const response = await markdownify(
apiKey,
'https://example.com'
);

API Credits
Check your available API credits:
import { getCredits } from 'scrapegraph-js';

try {
const credits = await getCredits(apiKey);
console.log('Available credits:', credits);
} catch (error) {
console.error('Error fetching credits:', error);
}

Feedback
Help us improve by submitting feedback programmatically:
import { sendFeedback } from 'scrapegraph-js';

try {
await sendFeedback(
apiKey,
'request-id',
5,
'Great results!'
);
} catch (error) {
console.error('Error sending feedback:', error);
}

Support
GitHubReport issues and contribute to the SDKEmail SupportGet help from our
development team
LicenseThis project is licensed under the MIT License. See the LICENSE file for
details.Was this page helpful?YesNoPython SDK🦜 LangChainxgithublinkedinPowered by
MintlifyOn this pageInstallationFeaturesQuick
StartServicesSmartScraperLocalScraperMarkdownifyAPI CreditsFeedbackSupport
------- • -------

https://docs.scrapegraphai.com/sdks/javascript#smartscraper

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationOfficial
SDKsJavaScript SDKHomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI BadgeOfficial
SDKsJavaScript SDKOfficial JavaScript/TypeScript SDK for ScrapeGraphAI
NPM PackageLicense
Installation
Install the package using npm or yarn:
# Using npm
npm i scrapegraph-js

# Using yarn
yarn add scrapegraph-js

Features

AI-Powered Extraction: Smart web scraping with artificial intelligence


Async by Design: Fully asynchronous architecture
Type Safety: Built-in TypeScript support with Zod schemas
Production Ready: Automatic retries and detailed logging
Developer Friendly: Comprehensive error handling

Quick Start
Initialize with your API key:
import { smartScraper } from 'scrapegraph-js';

const apiKey = process.env.SGAI_APIKEY;


const websiteUrl = 'https://example.com';
const prompt = 'Extract the main heading and description';

try {
const response = await smartScraper(apiKey, websiteUrl, prompt);
console.log(response.result);
} catch (error) {
console.error('Error:', error);
}

Store your API keys securely in environment variables. Use .env files and libraries
like dotenv to load them into your app.
Services
SmartScraper
Extract specific information from any webpage using AI:
const response = await smartScraper(
apiKey,
'https://example.com',
'Extract the main content'
);

Basic Schema ExampleDefine a simple schema using Zod:import { z } from 'zod';

const ArticleSchema = z.object({


title: z.string().describe('The article title'),
author: z.string().describe('The author\'s name'),
publishDate: z.string().describe('Article publication date'),
content: z.string().describe('Main article content'),
category: z.string().describe('Article category')
});

const response = await smartScraper(


apiKey,
'https://example.com/blog/article',
'Extract the article information',
ArticleSchema
);

console.log(`Title: ${response.result.title}`);
console.log(`Author: ${response.result.author}`);
console.log(`Published: ${response.result.publishDate}`);

Advanced Schema ExampleDefine a complex schema for nested data structures:import


{ z } from 'zod';

const EmployeeSchema = z.object({


name: z.string().describe('Employee\'s full name'),
position: z.string().describe('Job title'),
department: z.string().describe('Department name'),
email: z.string().describe('Email address')
});

const OfficeSchema = z.object({


location: z.string().describe('Office location/city'),
address: z.string().describe('Full address'),
phone: z.string().describe('Contact number')
});

const CompanySchema = z.object({


name: z.string().describe('Company name'),
description: z.string().describe('Company description'),
industry: z.string().describe('Industry sector'),
foundedYear: z.number().describe('Year company was founded'),
employees: z.array(EmployeeSchema).describe('List of key employees'),
offices: z.array(OfficeSchema).describe('Company office locations'),
website: z.string().url().describe('Company website URL')
});

// Extract comprehensive company information


const response = await smartScraper(
apiKey,
'https://example.com/about',
'Extract detailed company information including employees and offices',
CompanySchema
);

// Access nested data


console.log(`Company: ${response.result.name}`);
console.log('\nKey Employees:');
response.result.employees.forEach(employee => {
console.log(`- ${employee.name} (${employee.position})`);
});

console.log('\nOffice Locations:');
response.result.offices.forEach(office => {
console.log(`- ${office.location}: ${office.address}`);
});

LocalScraper
Process local HTML content with AI extraction:
const html = `
<html>
<body>
<h1>Company Name</h1>
<p>We are a technology company focused on AI solutions.</p>
<div class="contact">
<p>Email: [email protected]</p>
</div>
</body>
</html>
`;

const response = await localScraper(


apiKey,
html,
'Extract the company description'
);

Markdownify
Convert any webpage into clean, formatted markdown:
const response = await markdownify(
apiKey,
'https://example.com'
);

API Credits
Check your available API credits:
import { getCredits } from 'scrapegraph-js';

try {
const credits = await getCredits(apiKey);
console.log('Available credits:', credits);
} catch (error) {
console.error('Error fetching credits:', error);
}

Feedback
Help us improve by submitting feedback programmatically:
import { sendFeedback } from 'scrapegraph-js';

try {
await sendFeedback(
apiKey,
'request-id',
5,
'Great results!'
);
} catch (error) {
console.error('Error sending feedback:', error);
}

Support
GitHubReport issues and contribute to the SDKEmail SupportGet help from our
development team
LicenseThis project is licensed under the MIT License. See the LICENSE file for
details.Was this page helpful?YesNoPython SDK🦜 LangChainxgithublinkedinPowered by
MintlifyOn this pageInstallationFeaturesQuick
StartServicesSmartScraperLocalScraperMarkdownifyAPI CreditsFeedbackSupport

------- • -------

https://docs.scrapegraphai.com/integrations/langchain

Integrations: 🦜 LangChain
Supercharge your LangChain agents with AI-powered web scraping

Overview
The LangChain integration enables your agents to extract structured data from
websites using natural language. This powerful combination allows you to build
sophisticated AI agents that can understand and process web content intelligently.
Official LangChain Documentation: View the integration in LangChain's official documentation.
Installation
Install the package using pip:
pip install langchain-scrapegraph

Available Tools
SmartScraperTool
Extract structured data from any webpage using natural language prompts:
from langchain_scrapegraph.tools import SmartScraperTool

# Initialize the tool (uses SGAI_API_KEY from environment)
tool = SmartScraperTool()

# Extract information using natural language
result = tool.invoke({
    "website_url": "https://www.example.com",
    "user_prompt": "Extract the main heading and first paragraph"
})

Using Output Schemas
Define the structure of the output using Pydantic models:
from typing import List
from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import SmartScraperTool

class WebsiteInfo(BaseModel):
    title: str = Field(description="The main title of the webpage")
    description: str = Field(description="The main description or first paragraph")
    urls: List[str] = Field(description="The URLs inside the webpage")

# Initialize with schema
tool = SmartScraperTool(llm_output_schema=WebsiteInfo)

result = tool.invoke({
    "website_url": "https://www.example.com",
    "user_prompt": "Extract the website information"
})
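The structured result mirrors the schema's fields. A sketch, assuming the tool returns a dict keyed by those field names (adjust if your version returns a model object):

# Field access assumes a dict-shaped result
print(result["title"])
print(result["description"])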

LocalScraperTool
Process HTML content directly with AI extraction:
from langchain_scrapegraph.tools import LocalScraperTool

tool = LocalScraperTool()
result = tool.invoke({
    "user_prompt": "Extract all contact information",
    "website_html": "<html>...</html>"
})

Using Output Schemas
from typing import Optional
from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import LocalScraperTool

class CompanyInfo(BaseModel):
    name: str = Field(description="The company name")
    description: str = Field(description="The company description")
    email: Optional[str] = Field(description="Contact email if available")
    phone: Optional[str] = Field(description="Contact phone if available")

tool = LocalScraperTool(llm_output_schema=CompanyInfo)

html_content = """
<html>
<body>
    <h1>TechCorp Solutions</h1>
    <p>We are a leading AI technology company.</p>
    <div class="contact">
        <p>Email: [email protected]</p>
        <p>Phone: (555) 123-4567</p>
    </div>
</body>
</html>
"""

result = tool.invoke({
    "website_html": html_content,
    "user_prompt": "Extract the company information"
})

MarkdownifyTool
Convert any webpage into clean, formatted markdown:
from langchain_scrapegraph.tools import MarkdownifyTool

tool = MarkdownifyTool()
markdown = tool.invoke({"website_url": "https://example.com"})
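The markdown output is convenient to chunk for retrieval pipelines. A minimal sketch, assuming the langchain-text-splitters package is installed (chunk sizes are illustrative):

from langchain_text_splitters import RecursiveCharacterTextSplitter

# Split the scraped markdown into overlapping chunks for embedding/retrieval
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_text(markdown)
print(f"Produced {len(chunks)} chunks")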

Example Agent
Create a research agent that can gather and analyze web data:
from langchain.agents import initialize_agent, AgentType
from langchain_scrapegraph.tools import SmartScraperTool
from langchain_openai import ChatOpenAI

# Initialize tools
tools = [
    SmartScraperTool(),
]

# Create an agent
agent = initialize_agent(
    tools=tools,
    llm=ChatOpenAI(temperature=0),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# Use the agent
response = agent.run("""
Visit example.com, make a summary of the content and extract the main heading
and first paragraph
""")

Configuration
Set your ScrapeGraph API key in your environment:
export SGAI_API_KEY="your-api-key-here"

Or set it programmatically:
import os
os.environ["SGAI_API_KEY"] = "your-api-key-here"

Get your API key from the dashboard


Use Cases
Research Agents: Create agents that gather and analyze web data
Data Collection: Automate structured data extraction from websites
Content Processing: Convert web content into markdown for further processing
Information Extraction: Extract specific data points using natural language
Support
Need help with the integration?
GitHub Issues: Report bugs and request features
Discord Community: Get help from our community

------- • -------

https://docs.scrapegraphai.com/integrations/langchain#installation

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationIntegrations🦜
LangChainHomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeIntegrations🦜 LangChainSupercharge your LangChain agents with AI-powered web
scrapingOverview
The LangChain integration enables your agents to extract structured data from
websites using natural language. This powerful combination allows you to build
sophisticated AI agents that can understand and process web content intelligently.
Official LangChain DocumentationView the integration in LangChain’s official
documentation
Installation
Install the package using pip:
pip install langchain-scrapegraph

Available Tools
SmartScraperTool
Extract structured data from any webpage using natural language prompts:
from langchain_scrapegraph.tools import SmartScraperTool

# Initialize the tool (uses SGAI_API_KEY from environment)


tool = SmartscraperTool()

# Extract information using natural language


result = tool.invoke({
"website_url": "https://www.example.com",
"user_prompt": "Extract the main heading and first paragraph"
})

Using Output SchemasDefine the structure of the output using Pydantic models:from
typing import List
from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import SmartScraperTool

class WebsiteInfo(BaseModel):
title: str = Field(description="The main title of the webpage")
description: str = Field(description="The main description or first paragraph")
urls: List[str] = Field(description="The URLs inside the webpage")

# Initialize with schema


tool = SmartScraperTool(llm_output_schema=WebsiteInfo)

result = tool.invoke({
"website_url": "https://www.example.com",
"user_prompt": "Extract the website information"
})

LocalScraperTool
Process HTML content directly with AI extraction:
from langchain_scrapegraph.tools import LocalScraperTool
tool = LocalScraperTool()
result = tool.invoke({
"user_prompt": "Extract all contact information",
"website_html": "<html>...</html>"
})

Using Output Schemasfrom typing import Optional


from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import LocalScraperTool

class CompanyInfo(BaseModel):
name: str = Field(description="The company name")
description: str = Field(description="The company description")
email: Optional[str] = Field(description="Contact email if available")
phone: Optional[str] = Field(description="Contact phone if available")

tool = LocalScraperTool(llm_output_schema=CompanyInfo)

html_content = """
<html>
<body>
<h1>TechCorp Solutions</h1>
<p>We are a leading AI technology company.</p>
<div class="contact">
<p>Email: [email protected]</p>
<p>Phone: (555) 123-4567</p>
</div>
</body>
</html>
"""

result = tool.invoke({
"website_html": html_content,
"user_prompt": "Extract the company information"
})

MarkdownifyTool
Convert any webpage into clean, formatted markdown:
from langchain_scrapegraph.tools import MarkdownifyTool

tool = MarkdownifyTool()
markdown = tool.invoke({"website_url": "https://example.com"})

Example Agent
Create a research agent that can gather and analyze web data:
from langchain.agents import initialize_agent, AgentType
from langchain_scrapegraph.tools import SmartScraperTool
from langchain_openai import ChatOpenAI

# Initialize tools
tools = [
SmartScraperTool(),
]

# Create an agent
agent = initialize_agent(
tools=tools,
llm=ChatOpenAI(temperature=0),
agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
verbose=True
)

# Use the agent


response = agent.run("""
Visit example.com, make a summary of the content and extract the main heading
and first paragraph
""")

Configuration
Set your ScrapeGraph API key in your environment:
export SGAI_API_KEY="your-api-key-here"

Or set it programmatically:
import os
os.environ["SGAI_API_KEY"] = "your-api-key-here"

Get your API key from the dashboard


Use Cases
Research AgentsCreate agents that gather and analyze web dataData
CollectionAutomate structured data extraction from websitesContent
ProcessingConvert web content into markdown for further processingInformation
ExtractionExtract specific data points using natural language
Support
Need help with the integration?
GitHub IssuesReport bugs and request featuresDiscord CommunityGet help from our
communityWas this page helpful?YesNoJavaScript SDK🦙
LlamaIndexxgithublinkedinPowered by MintlifyOn this
pageOverviewInstallationAvailable
ToolsSmartScraperToolLocalScraperToolMarkdownifyToolExample AgentConfigurationUse
CasesSupport

------- • -------

https://docs.scrapegraphai.com/integrations/langchain#available-tools

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationIntegrations🦜
LangChainHomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeIntegrations🦜 LangChainSupercharge your LangChain agents with AI-powered web
scrapingOverview
The LangChain integration enables your agents to extract structured data from
websites using natural language. This powerful combination allows you to build
sophisticated AI agents that can understand and process web content intelligently.
Official LangChain DocumentationView the integration in LangChain’s official
documentation
Installation
Install the package using pip:
pip install langchain-scrapegraph

Available Tools
SmartScraperTool
Extract structured data from any webpage using natural language prompts:
from langchain_scrapegraph.tools import SmartScraperTool

# Initialize the tool (uses SGAI_API_KEY from environment)


tool = SmartscraperTool()

# Extract information using natural language


result = tool.invoke({
"website_url": "https://www.example.com",
"user_prompt": "Extract the main heading and first paragraph"
})

Using Output SchemasDefine the structure of the output using Pydantic models:from
typing import List
from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import SmartScraperTool

class WebsiteInfo(BaseModel):
title: str = Field(description="The main title of the webpage")
description: str = Field(description="The main description or first paragraph")
urls: List[str] = Field(description="The URLs inside the webpage")

# Initialize with schema


tool = SmartScraperTool(llm_output_schema=WebsiteInfo)

result = tool.invoke({
"website_url": "https://www.example.com",
"user_prompt": "Extract the website information"
})

LocalScraperTool
Process HTML content directly with AI extraction:
from langchain_scrapegraph.tools import LocalScraperTool

tool = LocalScraperTool()
result = tool.invoke({
"user_prompt": "Extract all contact information",
"website_html": "<html>...</html>"
})

Using Output Schemasfrom typing import Optional


from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import LocalScraperTool

class CompanyInfo(BaseModel):
name: str = Field(description="The company name")
description: str = Field(description="The company description")
email: Optional[str] = Field(description="Contact email if available")
phone: Optional[str] = Field(description="Contact phone if available")

tool = LocalScraperTool(llm_output_schema=CompanyInfo)

html_content = """
<html>
<body>
<h1>TechCorp Solutions</h1>
<p>We are a leading AI technology company.</p>
<div class="contact">
<p>Email: [email protected]</p>
<p>Phone: (555) 123-4567</p>
</div>
</body>
</html>
"""

result = tool.invoke({
"website_html": html_content,
"user_prompt": "Extract the company information"
})

MarkdownifyTool
Convert any webpage into clean, formatted markdown:
from langchain_scrapegraph.tools import MarkdownifyTool

tool = MarkdownifyTool()
markdown = tool.invoke({"website_url": "https://example.com"})

Example Agent
Create a research agent that can gather and analyze web data:
from langchain.agents import initialize_agent, AgentType
from langchain_scrapegraph.tools import SmartScraperTool
from langchain_openai import ChatOpenAI

# Initialize tools
tools = [
SmartScraperTool(),
]

# Create an agent
agent = initialize_agent(
tools=tools,
llm=ChatOpenAI(temperature=0),
agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
verbose=True
)

# Use the agent


response = agent.run("""
Visit example.com, make a summary of the content and extract the main heading
and first paragraph
""")

Configuration
Set your ScrapeGraph API key in your environment:
export SGAI_API_KEY="your-api-key-here"

Or set it programmatically:
import os
os.environ["SGAI_API_KEY"] = "your-api-key-here"

Get your API key from the dashboard


Use Cases
Research AgentsCreate agents that gather and analyze web dataData
CollectionAutomate structured data extraction from websitesContent
ProcessingConvert web content into markdown for further processingInformation
ExtractionExtract specific data points using natural language
Support
Need help with the integration?
GitHub IssuesReport bugs and request featuresDiscord CommunityGet help from our
communityWas this page helpful?YesNoJavaScript SDK🦙
LlamaIndexxgithublinkedinPowered by MintlifyOn this
pageOverviewInstallationAvailable
ToolsSmartScraperToolLocalScraperToolMarkdownifyToolExample AgentConfigurationUse
CasesSupport

------- • -------

https://docs.scrapegraphai.com/integrations/langchain#smartscrapertool

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationIntegrations🦜
LangChainHomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeIntegrations🦜 LangChainSupercharge your LangChain agents with AI-powered web
scrapingOverview
The LangChain integration enables your agents to extract structured data from
websites using natural language. This powerful combination allows you to build
sophisticated AI agents that can understand and process web content intelligently.
Official LangChain DocumentationView the integration in LangChain’s official
documentation
Installation
Install the package using pip:
pip install langchain-scrapegraph

Available Tools
SmartScraperTool
Extract structured data from any webpage using natural language prompts:
from langchain_scrapegraph.tools import SmartScraperTool

# Initialize the tool (uses SGAI_API_KEY from environment)


tool = SmartscraperTool()

# Extract information using natural language


result = tool.invoke({
"website_url": "https://www.example.com",
"user_prompt": "Extract the main heading and first paragraph"
})

Using Output SchemasDefine the structure of the output using Pydantic models:from
typing import List
from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import SmartScraperTool

class WebsiteInfo(BaseModel):
title: str = Field(description="The main title of the webpage")
description: str = Field(description="The main description or first paragraph")
urls: List[str] = Field(description="The URLs inside the webpage")

# Initialize with schema


tool = SmartScraperTool(llm_output_schema=WebsiteInfo)

result = tool.invoke({
"website_url": "https://www.example.com",
"user_prompt": "Extract the website information"
})

LocalScraperTool
Process HTML content directly with AI extraction:
from langchain_scrapegraph.tools import LocalScraperTool
tool = LocalScraperTool()
result = tool.invoke({
"user_prompt": "Extract all contact information",
"website_html": "<html>...</html>"
})

Using Output Schemasfrom typing import Optional


from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import LocalScraperTool

class CompanyInfo(BaseModel):
name: str = Field(description="The company name")
description: str = Field(description="The company description")
email: Optional[str] = Field(description="Contact email if available")
phone: Optional[str] = Field(description="Contact phone if available")

tool = LocalScraperTool(llm_output_schema=CompanyInfo)

html_content = """
<html>
<body>
<h1>TechCorp Solutions</h1>
<p>We are a leading AI technology company.</p>
<div class="contact">
<p>Email: [email protected]</p>
<p>Phone: (555) 123-4567</p>
</div>
</body>
</html>
"""

result = tool.invoke({
"website_html": html_content,
"user_prompt": "Extract the company information"
})

MarkdownifyTool
Convert any webpage into clean, formatted markdown:
from langchain_scrapegraph.tools import MarkdownifyTool

tool = MarkdownifyTool()
markdown = tool.invoke({"website_url": "https://example.com"})

Example Agent
Create a research agent that can gather and analyze web data:
from langchain.agents import initialize_agent, AgentType
from langchain_scrapegraph.tools import SmartScraperTool
from langchain_openai import ChatOpenAI

# Initialize tools
tools = [
SmartScraperTool(),
]

# Create an agent
agent = initialize_agent(
tools=tools,
llm=ChatOpenAI(temperature=0),
agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
verbose=True
)

# Use the agent


response = agent.run("""
Visit example.com, make a summary of the content and extract the main heading
and first paragraph
""")

Configuration
Set your ScrapeGraph API key in your environment:
export SGAI_API_KEY="your-api-key-here"

Or set it programmatically:
import os
os.environ["SGAI_API_KEY"] = "your-api-key-here"

Get your API key from the dashboard


Use Cases
Research AgentsCreate agents that gather and analyze web dataData
CollectionAutomate structured data extraction from websitesContent
ProcessingConvert web content into markdown for further processingInformation
ExtractionExtract specific data points using natural language
Support
Need help with the integration?
GitHub IssuesReport bugs and request featuresDiscord CommunityGet help from our
communityWas this page helpful?YesNoJavaScript SDK🦙
LlamaIndexxgithublinkedinPowered by MintlifyOn this
pageOverviewInstallationAvailable
ToolsSmartScraperToolLocalScraperToolMarkdownifyToolExample AgentConfigurationUse
CasesSupport

------- • -------

https://docs.scrapegraphai.com/integrations/langchain#localscrapertool

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationIntegrations🦜
LangChainHomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeIntegrations🦜 LangChainSupercharge your LangChain agents with AI-powered web
scrapingOverview
The LangChain integration enables your agents to extract structured data from
websites using natural language. This powerful combination allows you to build
sophisticated AI agents that can understand and process web content intelligently.
Official LangChain DocumentationView the integration in LangChain’s official
documentation
Installation
Install the package using pip:
pip install langchain-scrapegraph

Available Tools
SmartScraperTool
Extract structured data from any webpage using natural language prompts:
from langchain_scrapegraph.tools import SmartScraperTool
# Initialize the tool (uses SGAI_API_KEY from environment)
tool = SmartscraperTool()

# Extract information using natural language


result = tool.invoke({
"website_url": "https://www.example.com",
"user_prompt": "Extract the main heading and first paragraph"
})

Using Output SchemasDefine the structure of the output using Pydantic models:from
typing import List
from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import SmartScraperTool

class WebsiteInfo(BaseModel):
title: str = Field(description="The main title of the webpage")
description: str = Field(description="The main description or first paragraph")
urls: List[str] = Field(description="The URLs inside the webpage")

# Initialize with schema


tool = SmartScraperTool(llm_output_schema=WebsiteInfo)

result = tool.invoke({
"website_url": "https://www.example.com",
"user_prompt": "Extract the website information"
})

LocalScraperTool
Process HTML content directly with AI extraction:
from langchain_scrapegraph.tools import LocalScraperTool

tool = LocalScraperTool()
result = tool.invoke({
"user_prompt": "Extract all contact information",
"website_html": "<html>...</html>"
})

Using Output Schemasfrom typing import Optional


from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import LocalScraperTool

class CompanyInfo(BaseModel):
name: str = Field(description="The company name")
description: str = Field(description="The company description")
email: Optional[str] = Field(description="Contact email if available")
phone: Optional[str] = Field(description="Contact phone if available")

tool = LocalScraperTool(llm_output_schema=CompanyInfo)

html_content = """
<html>
<body>
<h1>TechCorp Solutions</h1>
<p>We are a leading AI technology company.</p>
<div class="contact">
<p>Email: [email protected]</p>
<p>Phone: (555) 123-4567</p>
</div>
</body>
</html>
"""

result = tool.invoke({
"website_html": html_content,
"user_prompt": "Extract the company information"
})

MarkdownifyTool
Convert any webpage into clean, formatted markdown:
from langchain_scrapegraph.tools import MarkdownifyTool

tool = MarkdownifyTool()
markdown = tool.invoke({"website_url": "https://example.com"})

Example Agent
Create a research agent that can gather and analyze web data:
from langchain.agents import initialize_agent, AgentType
from langchain_scrapegraph.tools import SmartScraperTool
from langchain_openai import ChatOpenAI

# Initialize tools
tools = [
SmartScraperTool(),
]

# Create an agent
agent = initialize_agent(
tools=tools,
llm=ChatOpenAI(temperature=0),
agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
verbose=True
)

# Use the agent


response = agent.run("""
Visit example.com, make a summary of the content and extract the main heading
and first paragraph
""")

Configuration
Set your ScrapeGraph API key in your environment:
export SGAI_API_KEY="your-api-key-here"

Or set it programmatically:
import os
os.environ["SGAI_API_KEY"] = "your-api-key-here"

Get your API key from the dashboard


Use Cases
Research AgentsCreate agents that gather and analyze web dataData
CollectionAutomate structured data extraction from websitesContent
ProcessingConvert web content into markdown for further processingInformation
ExtractionExtract specific data points using natural language
Support
Need help with the integration?
GitHub IssuesReport bugs and request featuresDiscord CommunityGet help from our
communityWas this page helpful?YesNoJavaScript SDK🦙
LlamaIndexxgithublinkedinPowered by MintlifyOn this
pageOverviewInstallationAvailable
ToolsSmartScraperToolLocalScraperToolMarkdownifyToolExample AgentConfigurationUse
CasesSupport

------- • -------

https://docs.scrapegraphai.com/integrations/langchain#markdownifytool

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationIntegrations🦜
LangChainHomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeIntegrations🦜 LangChainSupercharge your LangChain agents with AI-powered web
scrapingOverview
The LangChain integration enables your agents to extract structured data from
websites using natural language. This powerful combination allows you to build
sophisticated AI agents that can understand and process web content intelligently.
Official LangChain DocumentationView the integration in LangChain’s official
documentation
Installation
Install the package using pip:
pip install langchain-scrapegraph

Available Tools
SmartScraperTool
Extract structured data from any webpage using natural language prompts:
from langchain_scrapegraph.tools import SmartScraperTool

# Initialize the tool (uses SGAI_API_KEY from environment)


tool = SmartscraperTool()

# Extract information using natural language


result = tool.invoke({
"website_url": "https://www.example.com",
"user_prompt": "Extract the main heading and first paragraph"
})

Using Output SchemasDefine the structure of the output using Pydantic models:from
typing import List
from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import SmartScraperTool

class WebsiteInfo(BaseModel):
title: str = Field(description="The main title of the webpage")
description: str = Field(description="The main description or first paragraph")
urls: List[str] = Field(description="The URLs inside the webpage")

# Initialize with schema


tool = SmartScraperTool(llm_output_schema=WebsiteInfo)

result = tool.invoke({
"website_url": "https://www.example.com",
"user_prompt": "Extract the website information"
})

LocalScraperTool
Process HTML content directly with AI extraction:
from langchain_scrapegraph.tools import LocalScraperTool

tool = LocalScraperTool()
result = tool.invoke({
"user_prompt": "Extract all contact information",
"website_html": "<html>...</html>"
})

Using Output Schemasfrom typing import Optional


from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import LocalScraperTool

class CompanyInfo(BaseModel):
name: str = Field(description="The company name")
description: str = Field(description="The company description")
email: Optional[str] = Field(description="Contact email if available")
phone: Optional[str] = Field(description="Contact phone if available")

tool = LocalScraperTool(llm_output_schema=CompanyInfo)

html_content = """
<html>
<body>
<h1>TechCorp Solutions</h1>
<p>We are a leading AI technology company.</p>
<div class="contact">
<p>Email: [email protected]</p>
<p>Phone: (555) 123-4567</p>
</div>
</body>
</html>
"""

result = tool.invoke({
"website_html": html_content,
"user_prompt": "Extract the company information"
})

MarkdownifyTool
Convert any webpage into clean, formatted markdown:
from langchain_scrapegraph.tools import MarkdownifyTool

tool = MarkdownifyTool()
markdown = tool.invoke({"website_url": "https://example.com"})

Example Agent
Create a research agent that can gather and analyze web data:
from langchain.agents import initialize_agent, AgentType
from langchain_scrapegraph.tools import SmartScraperTool
from langchain_openai import ChatOpenAI

# Initialize tools
tools = [
SmartScraperTool(),
]

# Create an agent
agent = initialize_agent(
tools=tools,
llm=ChatOpenAI(temperature=0),
agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
verbose=True
)

# Use the agent


response = agent.run("""
Visit example.com, make a summary of the content and extract the main heading
and first paragraph
""")

Configuration
Set your ScrapeGraph API key in your environment:
export SGAI_API_KEY="your-api-key-here"

Or set it programmatically:
import os
os.environ["SGAI_API_KEY"] = "your-api-key-here"

Get your API key from the dashboard


Use Cases
Research AgentsCreate agents that gather and analyze web dataData
CollectionAutomate structured data extraction from websitesContent
ProcessingConvert web content into markdown for further processingInformation
ExtractionExtract specific data points using natural language
Support
Need help with the integration?
GitHub IssuesReport bugs and request featuresDiscord CommunityGet help from our
communityWas this page helpful?YesNoJavaScript SDK🦙
LlamaIndexxgithublinkedinPowered by MintlifyOn this
pageOverviewInstallationAvailable
ToolsSmartScraperToolLocalScraperToolMarkdownifyToolExample AgentConfigurationUse
CasesSupport

------- • -------

https://docs.scrapegraphai.com/integrations/langchain#example-agent

ScrapeGraphAI home
pageSearch...StatusSupportDashboardDashboardSearch...NavigationIntegrations🦜
LangChainHomeCookbookAPI ReferenceOfficial WebsiteCommunityBlogGet
StartedIntroductionDashboardServicesSmartScraperLocalScraperMarkdownifyBrowser
ExtensionsOfficial SDKsPython SDKJavaScript SDKIntegrations🦜 LangChain🦙 LlamaIndex👥
CrewAI PhidataContributeOpen SourceFeedbackResources🕷️ScrapeGraphAI
BadgeIntegrations🦜 LangChainSupercharge your LangChain agents with AI-powered web
scrapingOverview
The LangChain integration enables your agents to extract structured data from
websites using natural language. This powerful combination allows you to build
sophisticated AI agents that can understand and process web content intelligently.
Official LangChain DocumentationView the integration in LangChain’s official
documentation
Installation
Install the package using pip:
pip install langchain-scrapegraph

Available Tools
SmartScraperTool
Extract structured data from any webpage using natural language prompts:
from langchain_scrapegraph.tools import SmartScraperTool
# Initialize the tool (uses SGAI_API_KEY from environment)
tool = SmartscraperTool()

# Extract information using natural language


result = tool.invoke({
"website_url": "https://www.example.com",
"user_prompt": "Extract the main heading and first paragraph"
})

Using Output SchemasDefine the structure of the output using Pydantic models:from
typing import List
from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import SmartScraperTool

class WebsiteInfo(BaseModel):
title: str = Field(description="The main title of the webpage")
description: str = Field(description="The main description or first paragraph")
urls: List[str] = Field(description="The URLs inside the webpage")

# Initialize with schema


tool = SmartScraperTool(llm_output_schema=WebsiteInfo)

result = tool.invoke({
"website_url": "https://www.example.com",
"user_prompt": "Extract the website information"
})

LocalScraperTool
Process HTML content directly with AI extraction:
from langchain_scrapegraph.tools import LocalScraperTool

tool = LocalScraperTool()
result = tool.invoke({
"user_prompt": "Extract all contact information",
"website_html": "<html>...</html>"
})

Using Output Schemas
from typing import Optional

from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import LocalScraperTool

class CompanyInfo(BaseModel):
    name: str = Field(description="The company name")
    description: str = Field(description="The company description")
    email: Optional[str] = Field(description="Contact email if available")
    phone: Optional[str] = Field(description="Contact phone if available")

tool = LocalScraperTool(llm_output_schema=CompanyInfo)

html_content = """
<html>
    <body>
        <h1>TechCorp Solutions</h1>
        <p>We are a leading AI technology company.</p>
        <div class="contact">
            <p>Email: [email protected]</p>
            <p>Phone: (555) 123-4567</p>
        </div>
    </body>
</html>
"""

result = tool.invoke({
    "website_html": html_content,
    "user_prompt": "Extract the company information"
})

MarkdownifyTool
Convert any webpage into clean, formatted markdown:
from langchain_scrapegraph.tools import MarkdownifyTool

tool = MarkdownifyTool()
markdown = tool.invoke({"website_url": "https://example.com"})
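The returned markdown string can feed any downstream pipeline; for example, a minimal sketch that persists it to disk (page.md is an illustrative filename):

from pathlib import Path

from langchain_scrapegraph.tools import MarkdownifyTool

tool = MarkdownifyTool()
markdown = tool.invoke({"website_url": "https://example.com"})

# Save the markdown for later chunking or indexing
Path("page.md").write_text(markdown, encoding="utf-8")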

Example Agent
Create a research agent that can gather and analyze web data:
from langchain.agents import initialize_agent, AgentType
from langchain_scrapegraph.tools import SmartScraperTool
from langchain_openai import ChatOpenAI

# Initialize tools
tools = [
    SmartScraperTool(),
]

# Create an agent
agent = initialize_agent(
    tools=tools,
    llm=ChatOpenAI(temperature=0),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# Use the agent
response = agent.run("""
Visit example.com, make a summary of the content and extract the main heading
and first paragraph
""")

Configuration
Set your ScrapeGraph API key in your environment:
export SGAI_API_KEY="your-api-key-here"

Or set it programmatically:
import os
os.environ["SGAI_API_KEY"] = "your-api-key-here"
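For local development, a common (ScrapeGraph-agnostic) pattern is keeping the key in a .env file and loading it at startup; a minimal sketch assuming python-dotenv is installed:

import os

from dotenv import load_dotenv  # pip install python-dotenv

# Reads SGAI_API_KEY=... from a local .env file into the environment
load_dotenv()
assert os.getenv("SGAI_API_KEY"), "SGAI_API_KEY is not set"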

Get your API key from the dashboard


Use Cases
Research Agents: Create agents that gather and analyze web data
Data Collection: Automate structured data extraction from websites
Content Processing: Convert web content into markdown for further processing
Information Extraction: Extract specific data points using natural language
Support
Need help with the integration?
GitHub Issues: Report bugs and request features
Discord Community: Get help from our community

------- • -------


https://docs.scrapegraphai.com/integrations/llamaindex

Integrations: 🦙 LlamaIndex
Integrate ScrapeGraphAI with LlamaIndex for powerful data ingestion

Overview
This tool integrates ScrapeGraph with LlamaIndex, providing intelligent web
scraping capabilities with structured data extraction.
Official LlamaHub Documentation: View the integration on LlamaHub
Installation
Install the package using pip:
pip install llama-index-tools-scrapegraphai

Usage
First, import and initialize the ScrapegraphToolSpec:
from llama_index.tools.scrapegraph.base import ScrapegraphToolSpec

scrapegraph_tool = ScrapegraphToolSpec()
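The spec's methods can also be handed to a LlamaIndex agent; a sketch assuming llama-index-core and llama-index-llms-openai are installed (the model name is illustrative):

from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI
from llama_index.tools.scrapegraph.base import ScrapegraphToolSpec

# to_tool_list() converts each spec method into an agent-callable tool
tools = ScrapegraphToolSpec().to_tool_list()
agent = ReActAgent.from_tools(tools, llm=OpenAI(model="gpt-4o-mini"), verbose=True)

response = agent.chat("Who are the founders listed on https://scrapegraphai.com/?")
print(response)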

Available Functions
Smart Scraping (Sync)
Extract structured data using a schema:
from pydantic import BaseModel, Field

class FounderSchema(BaseModel):
    name: str = Field(description="Name of the founder")
    role: str = Field(description="Role of the founder")
    social_media: str = Field(description="Social media URL of the founder")

class ListFoundersSchema(BaseModel):
    founders: list[FounderSchema] = Field(description="List of founders")

response = scrapegraph_tool.scrapegraph_smartscraper(
    prompt="Extract information about the founders",
    url="https://scrapegraphai.com/",
    api_key="sgai-***",
    schema=ListFoundersSchema,
)

result = response["result"]

for founder in result["founders"]:
    print(founder)

Smart Scraping (Async)
Asynchronous version of the smart scraper:
result = await scrapegraph_tool.scrapegraph_smartscraper_async(
    prompt="Extract product information",
    url="https://example.com/product",
    api_key="your-api-key",
    schema=schema,
)
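Because this is a coroutine, it must run inside an event loop; a minimal sketch reusing the ListFoundersSchema defined above:

import asyncio

async def main():
    result = await scrapegraph_tool.scrapegraph_smartscraper_async(
        prompt="Extract information about the founders",
        url="https://scrapegraphai.com/",
        api_key="your-api-key",
        schema=ListFoundersSchema,
    )
    print(result)

asyncio.run(main())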

Submit Feedback
Provide feedback on extraction results:
response = scrapegraph_tool.scrapegraph_feedback(
    request_id="request-id",
    api_key="your-api-key",
    rating=5,
    feedback_text="Great results!",
)

Check Credits
Monitor your API credit usage:
credits = scrapegraph_tool.scrapegraph_get_credits(api_key="your-api-key")

Use Cases
RAG Applications: Build powerful retrieval-augmented generation systems
Knowledge Bases: Create and maintain up-to-date knowledge bases
Web Research: Automate web research and data collection
Content Indexing: Index and structure web content for search
Support
Need help with the integration?
GitHub Issues: Report bugs and request features
Discord Community: Get help from our community

------- • -------
