Thoughts

Trying New Things

What do you do when you need data from a website that does not have an API?

An Application Programming Interface (API) provides a set of definitions for interacting with data on a server.

For sites without an API, the only way is to visit them through a browser and scrape the data manually. Doing that by hand is repetitive, so there are solutions that use a command-driven browser (a browser that can be controlled from the console) to automatically navigate a website and scrape data from it.
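To make the scraping step concrete, here is a minimal sketch in Python using only the standard library. It is not browser automation: it assumes the data sits in plain `<td>` table cells of a static page, which will not hold for sites that need JavaScript or a multi-step flow (those are the cases where a scriptable browser comes in). The names `CellCollector`, `scrape_cells`, and `fetch` are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class CellCollector(HTMLParser):
    """Collects the text of every <td> cell in a page (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        # Start a new, empty cell whenever a <td> opens.
        if tag == "td":
            self.in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        # Text inside a cell may arrive in several chunks, so append.
        if self.in_cell:
            self.cells[-1] += data


def scrape_cells(html):
    """Return the stripped text of each table cell found in the HTML string."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return [cell.strip() for cell in parser.cells]


def fetch(url):
    """Download a page as text. Assumes the server returns static HTML."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")
```

With a real page, this would be used as `scrape_cells(fetch(url))`, where `url` is whatever page holds the table. If the site redirects scripted requests to a warning page, as can happen, the result will simply be an empty or unrelated list.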

I spent time yesterday fiddling with PhantomJS, but the attempt was unsuccessful because of some IPv4/IPv6 issue that no Google search could solve. The target site rejected my request to proceed to the success page and redirected me to a warning page with no data to scrape.

I was on the verge of giving up…

A no is unacceptable

That’s what one of my colleagues said. I am going to try another browser automation tool, Selenium or CasperJS. Fingers crossed.

#summer-2017 #uber-internship