I really wanted to write a guide for this myself but didn’t get the time. Here Naren arya wrote a great post and I think that you should definitely give it a look.
We all scraped web pages.HTML content returned as response has our data and we scrape it for fetching certain results.If web page has JavaScript implementation, original data is obtained after rendering process. When we use normal requests package in that situation then responses those are returned contains no data in them.Browsers know how to render and display the final result,but how a program can know?. So I came with a power pack solution to scrape any JavaScript rendered website very easily.
Many of us use below libraries to perform scraping.
1)Lxml
2)BeautifulSoup
I don’t mention scrapy or dragline frameworks here since underlying basic scraper is lxml .My favorite one is lxml.why? ,It has the element traversal methods rather than relying on regular expressions methodology like BeautifulSoup.Here I am going to take a very interesting example.I am so amazed after finding that ,my article is appeared in recent PyCoders weekly issue…
View original post 681 more words