For my app Dunbar, I needed to scrape information about people. The goal was to collect profile picture options to show in the app. It wasn't easy, and a whole new world opened up to me: the world of data scraping!

Puppeteer (Node.js) is a tool for automating browsing with a headless browser. I created a bot with it that scrapes Facebook and LinkedIn images. See this repo for a working example. The biggest thing I had to learn was CSS selectors, which are how you pick the information you want out of a web page.
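To give an idea of what this looks like, here is a minimal sketch (not the code from my repo) of collecting image URLs with Puppeteer and a CSS selector. It assumes `puppeteer` is installed via npm. The function names `cleanImageUrls` and `scrapeImageUrls`, the example URL, and the selector `img.profile-photo` are made up for illustration; real sites will need their own selectors.

```javascript
// Sketch: collect image URLs from a page with Puppeteer and a CSS selector.
// Assumes `npm install puppeteer`; the URL and selector below are placeholders.

// Pure helper: drop empty values, non-http(s) URLs, and duplicates.
function cleanImageUrls(srcs) {
  const seen = new Set();
  const result = [];
  for (const src of srcs) {
    if (typeof src !== 'string' || !/^https?:\/\//.test(src)) continue;
    if (seen.has(src)) continue;
    seen.add(src);
    result.push(src);
  }
  return result;
}

async function scrapeImageUrls(url, selector) {
  // Required lazily so the helper above works even without Puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch(); // headless by default
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle2' });
    // $$eval runs the callback inside the page on every element matching the selector.
    const srcs = await page.$$eval(selector, (imgs) => imgs.map((img) => img.src));
    return cleanImageUrls(srcs);
  } finally {
    // Always close the browser, even if navigation or the selector fails.
    await browser.close();
  }
}

// Usage (hypothetical URL and selector):
// scrapeImageUrls('https://example.com/profile', 'img.profile-photo').then(console.log);
```

The selector is the part that takes the most trial and error: it has to match exactly the `<img>` elements you care about, and it tends to break whenever the site changes its markup.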

Unfortunately, after hours and hours of work, I noticed that the very websites I wanted to scrape have anti-scraping protection in place. They rate-limit requests, and sometimes they show an auth wall. How do you counteract this?
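A low-tech first step against rate limiting is simply to slow down and retry with increasing pauses. The sketch below is a generic exponential backoff helper, not something taken from any of the tools listed here. The names `withRetry` and `backoffDelay` and the default delays are my own assumptions, and it will not get you past an auth wall.

```javascript
// Sketch: retry an async task with exponential backoff.
// All names and default values here are illustrative assumptions.

function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Delay before retry number `attempt` (0-based):
// baseMs, 2 * baseMs, 4 * baseMs, ... capped at maxMs.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Runs `task` until it succeeds or `retries` retries have been used up.
async function withRetry(task, { retries = 3, baseMs = 1000, maxMs = 30000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (attempt >= retries) throw err; // out of retries: surface the last error
      await sleep(backoffDelay(attempt, baseMs, maxMs));
    }
  }
}

// Usage (scrapeProfile is a hypothetical function that throws when rate limited):
// withRetry(() => scrapeProfile('https://example.com/profile'), { retries: 5 });
```

When this is not enough, the services below take over the harder parts.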

https://proxycrawl.com/ is a paid SDK that puts a proxy in front of any webpage.

Apify WebScraper is a web scraper that has many features baked in to bypass anti-scraping protection.

https://serpapi.com/ is an API for Google Search.