Webscraping with LLMs

    Up-to-date web scraping combined with LLMs.

    LLM for Devs

    Recently added

    Advanced Web Crawling Extraction with Firecrawl /extract
    Lesson 4

    This lesson demonstrates efficient web scraping techniques using the Firecrawl library, focusing on extracting AI model pricing data from multiple websites concurrently. By comparing individual and batch extraction methods, it highlights the significant speed improvements achieved through asynchronous programming and parallel processing.

    15m · Feb 5, 2025
    Member
    Firecrawl /extract vs scrape + LLM extract
    Lesson 3

    Master web data extraction with Firecrawl's powerful `/extract` feature, leveraging AI to transform websites into structured JSON data using simple prompts and schemas. Explore the user-friendly playground API, compare different extraction methods, and efficiently validate your data using Pydantic for streamlined workflows.

    13m · Feb 3, 2025
    Free
    Scrape any website with OpenAI Functions & LangChain
    Lesson 11

    This lesson teaches you to build an AI-assisted web scraping and data extraction system in Python. It covers Playwright for scraping, LangChain for LLM-powered information extraction, and Pydantic for schema-based data validation.

    24m · Jan 11, 2025
    Free
    Batch Scrape URLs instead of one at a time
    Lesson 10

    This lesson demonstrates efficient web scraping of Anthropic's job postings using the `firecrawl` library, extracting key details like job titles, skills, and salary ranges. The asynchronous approach, leveraging `asyncio`, allows for concurrent processing of multiple URLs, significantly improving the speed and efficiency of data extraction.

    6m · Dec 4, 2024
    Member
    Agentically scrape the web with Firecrawl & LangGraph (LangChain)
    Lesson 9

    This lesson teaches you to build web scraping agents using the Python libraries `langgraph` and `firecrawl-py`. By combining agent-based design with a state machine, you'll learn to extract specific product information from websites like Canada Goose while handling errors and optimizing for speed.

    15m · Nov 3, 2024
    Free

    All lessons

    Why is web scraping popular with LLMs now?
    Lesson 1

    This lesson explores AI-powered web scraping tools, comparing user-friendly platforms like ScrapeGraphAI and Tavily with the more advanced, customizable Firecrawl API. Key features, pricing models, and integrations with LLMs are discussed to help viewers choose the best tool for their data extraction needs.

    5m · Oct 29, 2024
    Member
    Baby step: scrape your first website and pipe to an LLM
    Lesson 2

    This lesson teaches you how to use the Firecrawl Python package to scrape pricing data from Stripe and Paddle, then pass it to a large language model to compare their features and costs. The tutorial covers web scraping techniques, LLM integration, and a detailed analysis of the two pricing models, highlighting key differences and advantages.

    11m · Oct 31, 2024
    Member
    Firecrawl /extract vs scrape + LLM extract
    Lesson 3

    Master web data extraction with Firecrawl's powerful `/extract` feature, leveraging AI to transform websites into structured JSON data using simple prompts and schemas. Explore the user-friendly playground API, compare different extraction methods, and efficiently validate your data using Pydantic for streamlined workflows.

    13m · Feb 3, 2025
    Free
    Advanced Web Crawling Extraction with Firecrawl /extract
    Lesson 4

    This lesson demonstrates efficient web scraping techniques using the Firecrawl library, focusing on extracting AI model pricing data from multiple websites concurrently. By comparing individual and batch extraction methods, it highlights the significant speed improvements achieved through asynchronous programming and parallel processing.

    15m · Feb 5, 2025
    Member
    Onboard new users by scraping their sites and extract required info with LLM
    Lesson 5

    This lesson showcases Firecrawl, a user-friendly web scraping SaaS, integrating seamlessly with Python and LLMs for efficient data extraction. Learn how to quickly onboard customers by automating data retrieval from websites like Whit's Custard, leveraging Firecrawl's frequent updates and ease of use for streamlined workflows.

    24m · Oct 29, 2024
    Member
    Scrape interactive sites like Perplexity or Meetup.com with Actions from Firecrawl
    Lesson 6

    This lesson teaches you how to use Firecrawl for web scraping, focusing on extracting event data from websites like Meetup, Amazon, and Perplexity. It also demonstrates how to combine web scraping with LLMs for OCR and data analysis, overcoming challenges like unreliable website structures and improving data extraction accuracy.

    16m · Oct 29, 2024
    Member
    Don't wait and poll crawl jobs - use webhooks to get notified of when they're done
    Lesson 7

    This lesson teaches efficient web scraping using Firecrawl and webhooks, eliminating inefficient polling methods. It demonstrates building a FastAPI server to receive real-time job completion notifications from Firecrawl, enabling asynchronous processing and analysis of scraped data, such as comparing Stripe and Paddle pricing with an LLM.

    7m · Oct 31, 2024
    Member
    Evaluate several software solutions at once with Firecrawl's crawl and LLM for judging
    Lesson 8

    This lesson teaches you how to rapidly evaluate software vendors using Python, leveraging the Firecrawl API to automate web scraping and either Groq or OpenAI to analyze the gathered data against predefined criteria (like SOC 2 compliance and Fortune 500 testimonials). The approach makes software selection much faster than manual research, and the lesson focuses on platforms like Drata, Vanta, and Secureframe.

    9m · Nov 1, 2024
    Member