Wednesday 3:20 PM–5:20 PM in Track 3 - Rainier

Python Web Scraping

Lingqiang Kong

Audience level:
Novice

Description

Data science projects often require collecting data from the web. At Metis, one of our projects focuses on data collection using web scraping. In this tutorial, you will learn to write your own script to retrieve and extract information programmatically using Python packages, and explore ways to extract data from a website, so that you end up with a fully functional Python web scraper.

Abstract

Detailed Abstract:

In this tutorial, we will start by understanding the basic components of web pages: HTML, CSS, JS, etc. Then we will build a fully functional web scraper in Python:
- retrieving page content through a GET request
- parsing HTML with BeautifulSoup
- searching for instances through tags, classes and ids
- searching for instances using CSS selectors
- extracting and organizing information from the page
- combining data into a dataframe and

Extra credit:
- Ways to scrape from dynamic web pages
- Retrieving data using the Selenium web driver
- Interactive data retrieval: search bars

Hands on:
- Build a web scraper to retrieve the top 100 grossing movies from boxofficemojo.com
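
As a rough illustration of the core steps in the abstract (parsing with BeautifulSoup, searching by tag, class and id, using CSS selectors, and organizing the results), here is a minimal sketch. It is not the tutorial's actual code. The HTML snippet, the `top-movies` id, the `movie`, `rank`, `title` and `gross` class names, the placeholder values, and the `parse_movies` helper are all invented for illustration; the markup of a real site will differ. The GET request is replaced by an inline string so the sketch runs offline.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with placeholder values, invented for this sketch.
# In practice the HTML would come from a GET request to the target page.
SAMPLE_HTML = """
<table id="top-movies">
  <tr class="movie">
    <td class="rank">1</td><td class="title">Movie A</td><td class="gross">$300</td>
  </tr>
  <tr class="movie">
    <td class="rank">2</td><td class="title">Movie B</td><td class="gross">$200</td>
  </tr>
</table>
"""


def parse_movies(html):
    """Return one dict per movie row found in the (assumed) table layout."""
    soup = BeautifulSoup(html, "html.parser")

    # Search by tag and id, then by tag and class.
    table = soup.find("table", id="top-movies")
    rows = []
    for tr in table.find_all("tr", class_="movie"):
        # CSS selectors pick out the individual cells.
        rows.append({
            "rank": int(tr.select_one("td.rank").get_text(strip=True)),
            "title": tr.select_one("td.title").get_text(strip=True),
            "gross": tr.select_one("td.gross").get_text(strip=True),
        })
    return rows


movies = parse_movies(SAMPLE_HTML)
for movie in movies:
    print(movie["rank"], movie["title"], movie["gross"])
```

With a live page, the string would typically come from something like `requests.get(url).text`, and the resulting list of dicts could be passed to `pandas.DataFrame(movies)` to get a dataframe.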
