python - Extracting website (URL) from Google local search when URL not found in source?

- July 15, 2014

I'm looking to extract (with WebDriver, XPath, a CSS selector, class or id) the URL that lives behind each of the website images in a Google local search results page such as this.

When I mouse over any of these, I can see the URL that would be reached if I clicked the image. Yet if I view the full page source and search for any of these URLs, they're not found. Here is the source around one of the images:

This suggests the URLs are perhaps read in dynamically, though that is where my knowledge of web design ends. Is it possible to construct an XPath or CSS selector, or indeed a plain-text search, for these URLs?

Clarification: when I say URL, I mean the ultimate URLs. Mouse over the website images and you'll see URLs such as bodinbalanceny.com, lamchiropractic.com, etc. These are the URLs I'm looking to extract.

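You can use urlparse. Once you fetch the href attribute, prepend "https://www.google.com" to it and try the code below. The destination is carried, percent-encoded, in the url query parameter of the Google redirect link.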

>>> import urlparse
>>> url = """https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0cbaqgu8wagovchmi6c6mhpvjyaivqyeuch0eiaai&url=http%3a%2f%2fwww.taihealthsolutions.com%2f&usg=afqjcnhhovnrx0zdxz1cu4p2xiueffczta&bvm=bv.105841590,d.dgo"""
>>> parsed = urlparse.urlparse(url)
>>> print urlparse.parse_qs(parsed.query)['url'][0]
http://www.taihealthsolutions.com/

Note: this is Python 2.x. In Python 3 the code is different: urlparse and parse_qs live in the urllib.parse module, and print is a function.
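For Python 3, the same idea might be sketched as follows. This is only an illustration under a few assumptions: the helper name extract_target_url and the constant GOOGLE_BASE are made up for this example, the sample href is a shortened form of the link shown above, and the href is assumed to have been fetched already (for instance with WebDriver's get_attribute). urljoin is used so that both relative and absolute hrefs are handled.

```python
from urllib.parse import urljoin, urlparse, parse_qs

# Assumed base for relative hrefs such as "/url?..." (hypothetical constant name).
GOOGLE_BASE = "https://www.google.com"


def extract_target_url(href, base=GOOGLE_BASE):
    """Return the destination stored in the 'url' query parameter, or None.

    Hypothetical helper: 'href' is the raw attribute value taken from a
    result link; it may be relative ("/url?...") or already absolute.
    """
    # urljoin leaves absolute URLs untouched and resolves relative ones.
    full_url = urljoin(base, href)
    # parse_qs percent-decodes the values and maps each key to a list.
    query = parse_qs(urlparse(full_url).query)
    values = query.get("url")
    return values[0] if values else None


# Shortened version of the redirect link from the answer above.
href = "/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&url=http%3a%2f%2fwww.taihealthsolutions.com%2f"
print(extract_target_url(href))
```

Returning None when there is no url parameter lets the caller skip links that are not redirects instead of catching a KeyError.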
