In 2014, Facebook acquired WhatsApp for 22 billion US dollars, or about 55 dollars per user. What was the reason behind this acquisition? To get more user data! Diverse, representative, good-quality data is the lifeblood of an analytics pipeline.
The number of websites on the internet is estimated to be around 2 billion. Web scraping turns the entire World Wide Web into your data set. In this webinar, we will show how to scrape a website using the BeautifulSoup package in Python. We will discuss how to navigate the HTML DOM to find the data that interests you, cover some best practices and the legality of web scraping, and briefly touch on how to build and automate a web scraper in the cloud using Azure Functions.
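As a small taste of what navigating the HTML DOM with BeautifulSoup can look like, here is a minimal sketch. The HTML snippet, the `events` class name, and the URLs are made up for illustration and are not taken from the webinar material; a real scraper would first download the page (for example with an HTTP client) before parsing it.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page (an assumption for this sketch).
html = """
<html><body>
  <h1 id="title">Upcoming Webinars</h1>
  <ul class="events">
    <li><a href="/events/web-scraping">Web Scraping with Python</a></li>
    <li><a href="/events/intro-ml">Intro to Machine Learning</a></li>
  </ul>
</body></html>
"""

# Parse the markup with Python's built-in HTML parser.
soup = BeautifulSoup(html, "html.parser")

# Find a single element by tag name and id, then read its text.
title = soup.find("h1", id="title").get_text(strip=True)

# Walk down the DOM: locate the list by class, then collect every link inside it.
event_list = soup.find("ul", class_="events")
events = [
    {"name": a.get_text(strip=True), "url": a["href"]}
    for a in event_list.find_all("a")
]

print(title)
for event in events:
    print(event["name"], event["url"])
```

The same pattern of finding a container and then iterating over its children carries over to most pages, although the tags and class names to look for differ from site to site.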
Arham Noman is a Data Scientist at Data Science Dojo. He has worked on a variety of projects ranging from building cloud-based machine learning pipelines to indexing and extracting insights from large unstructured datasets. Arham is also an instructor at Data Science Dojo and takes joy in practicing the “Data Science for everyone” philosophy through his sessions.