from mechanize import Browser

br = Browser()
br.open('http://somewebpage')
# Read the response body as a list of lines
html = br.response().readlines()
for line in html:
    print(line)

When printing a line from an HTML file, I want to show only the text content of each HTML element, not the markup itself. For example, given '<a href="https://stackoverflow.com/questions/753052/whatever.com">some text</a>' it should print 'some text', given '<b>hello</b>' it should print 'hello', and so on. How would one go about doing this?
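
One possible approach, sketched here under the assumption that Python 3 and its standard-library `html.parser` module are available: subclass `HTMLParser` and keep only the text it reports between tags. The names `TextExtractor` and `strip_tags` are made up for illustration and are not part of mechanize or the standard library.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text between tags and ignores the tags themselves."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain text.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for every run of text that is not part of a tag.
        self.parts.append(data)

    def get_text(self):
        return "".join(self.parts)


def strip_tags(html):
    """Return the text content of an HTML fragment (hypothetical helper)."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()  # flush any text still buffered by the parser
    return extractor.get_text()


print(strip_tags('<a href="whatever.com">some text</a>'))
print(strip_tags('<b>hello</b>'))
```

Since a tag can span several lines, it is probably safer to pass the whole page to `strip_tags` at once than to call it line by line. Note that this sketch also keeps the contents of `<script>` and `<style>` elements, which would need separate handling on a real page.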
