Scrapy: Powerful Web Scraping & Crawling with Python
Python Scrapy Tutorial – Learn how to scrape websites and build a powerful web crawler using Scrapy, Splash and Python
What you’ll learn
- Creating a web crawler in Scrapy
- Crawling single or multiple pages and scraping data
- Deploying & Scheduling Spiders to ScrapingHub
- Logging into Websites with Scrapy
- Running Scrapy as a Standalone Script
- Integrating Splash with Scrapy to scrape JavaScript-rendered websites
- Using Scrapy with Selenium in Special Cases, e.g. to Scrape JavaScript-Driven Web Pages
- Building an Advanced Scrapy Spider
- Additional functions that Scrapy offers after a spider is done scraping
- Editing and Using Scrapy Parameters
- Exporting data extracted by Scrapy into CSV, Excel, XML, or JSON files
- Storing data extracted by Scrapy into MySQL and MongoDB databases
- Several real-life web scraping projects, including Craigslist, LinkedIn and many others
- Downloadable Python source code for all exercises in this Scrapy tutorial
- A Q&A board where you can post your questions and get them answered quickly
Requirements
- Python Level: Intermediate. This Scrapy tutorial assumes that you already know the basics of writing simple Python programs and that you are generally familiar with Python’s core features (data structures, file handling, functions, classes, modules, common libraries, etc.).
- Python 2.7+ or Python 3.3+
- Any operating system (Linux, Mac, Windows) will work.
- A positive attitude and a willingness to learn new things and to ask questions, if you have any, on the course Q&A board.
- If you do not know what Scrapy is or why you should use it, please read the course description and watch the preview lectures BEFORE joining the course.
Who this course is for:
- This Scrapy tutorial is meant for those who are familiar with Python and want to learn how to create an efficient web crawler and scraper to navigate through websites and scrape content from pages that contain useful information.
- New update: This Scrapy course now includes a dedicated section about Splash and how to use it with Scrapy to extract data from JavaScript-driven websites.