Python: find_all in Beautiful Soup does not return what is expected
I have HTML like this:
<ul class='whs-nw m-0 items'> <li> <a href='/news/stocks-hold-slight-gains-amid-140642829.html' class='d-b fz-s fw-400' data-ylk='rspns:nav;t3:sub0;elm:hdln;elmt:ct;itc:0;pkgt:15;g:e3b49674-fd8a-3acb-9395-4ac0811af672;ct:1;cpos:2;'> <div class='p-0 whs-n'> <div class='m-0 pt-2 ov-h'> <p class='m-0 d-i'>dow closes down more 150 wal-mart, boeing weigh</p> </div> </div> </a> </li> ... </ul>
I am trying to use BeautifulSoup to extract /news/stocks-hold-slight-gains-amid-140642829.html by doing this:
soup = BeautifulSoup(html)
tmp = soup.find_all('ul', attrs={'class': 'whs-nw m-0 items'})
but tmp is empty when I look at it. What am I doing wrong?
For reference, the page I am trying to scrape is here.
Try tmp = soup.findAll('ul', {'class': 'whs-nw m-0 items'})
or tmp = soup.find_all('ul', attrs={'class': 'whs-nw m-0 items'})
Working code:
# Python 2 (urllib2); note the explicit 'html.parser' argument
import urllib2
from bs4 import BeautifulSoup
response = urllib2.urlopen('http://finance.yahoo.com/')
html = response.read()
soup = BeautifulSoup(html, 'html.parser')
tmp = soup.findAll('ul', {'class': 'whs-nw m-0 items'})
for i in tmp:
    print i.get_text()
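The code above prints the text of each matching list, while the question asks for the link itself. A minimal Python 3 sketch of one way to pull out the href, assuming the markup looks like the sample in the question (the html string below is a trimmed copy of that sample, not the live page, and the variable name links is only illustrative). It uses a CSS selector rather than the exact class string, so it should still match if the classes appear in a different order:

```python
from bs4 import BeautifulSoup

# Trimmed copy of the sample markup from the question (not fetched from the live page).
html = """
<ul class='whs-nw m-0 items'>
  <li>
    <a href='/news/stocks-hold-slight-gains-amid-140642829.html' class='d-b fz-s fw-400'>
      <p class='m-0 d-i'>dow closes down more 150 wal-mart, boeing weigh</p>
    </a>
  </li>
</ul>
"""

# Name the parser explicitly so the result does not depend on which parsers are installed.
soup = BeautifulSoup(html, "html.parser")

# Select every <a> with an href inside a <ul> carrying all three classes.
links = [a["href"] for a in soup.select("ul.whs-nw.m-0.items a[href]")]

print(links)
```

If the live page builds this list with JavaScript, the downloaded HTML may not contain the ul at all, and no selector will find it; checking the raw response for the class name is a quick way to tell.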