web-scraping Archives

Programming, Python

IT Nursery

Web scraping with Python [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers. We don’t allow questions seeking recommendations for books, ...

June 4, 2022

Programming, Python

How to save an image locally using Python whose URL address I already know?

I know the URL of an image on the Internet, e.g. http://www.digimouth.com/news/media/2011/09/google-logo.jpg, which contains the logo of Google. Now, how can I download this ...

June 2, 2022

Java, Programming

How can I efficiently parse HTML with Java?

I do a lot of HTML parsing in my line of work. Up until now, I was using the HtmlUnit headless browser for ...

May 31, 2022

JavaScript, Programming

How can I pass variable into an evaluate function?

I’m trying to pass a variable into a page.evaluate() function in Puppeteer, but when I use the following very simplified example, the variable ...

May 27, 2022

Scraping: SSL: CERTIFICATE_VERIFY_FAILED error for http://en.wikipedia.org

I’m practicing the code from ‘Web Scraping with Python’, and I keep having this certificate problem: from urllib.request import urlopen from bs4 import ...

May 26, 2022

Web-scraping JavaScript page with Python

I’m trying to develop a simple web scraper. I want to extract text without the HTML code. It works on plain HTML, but ...

May 22, 2022

How can I get the Google cache age of any URL or web page? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers. Want to improve this question? Update the question ...

May 20, 2022

Headless Browser and scraping – solutions [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers. Want to improve this question? Update the question ...

May 10, 2022

How to find elements by class

I’m having trouble parsing HTML elements with a “class” attribute using BeautifulSoup. The code looks like this: soup = BeautifulSoup(sdata) mydivs = soup.findAll('div') for ...

May 1, 2022

Which HTML Parser is the best?

Self plug: I have just released a new Java HTML parser: jsoup. I mention it here because I think it will do what you ...

April 7, 2022