Using urlopen I can get the html of the page, but a crucial part is missing

I am trying to write a script that finds similar images on Google from an image URL, reusing part of this code.

The problem is that I want to get to this link, because from it I can reach the images themselves by clicking on the “search by image” link. When I fetch the page with the script, though, I get exactly the same page but without the “search by image” link.

I would like to know why this happens and whether there is a way to fix it.

Thanks a lot in advance!

P.S. Here’s the code

import os
from urllib2 import Request, urlopen
from cookielib import LWPCookieJar

USER_AGENT = r"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)"
LOCAL_PATH = r"C:\scripts\google_search"
COOKIE_JAR_FILE = r".google-cookie"

class google_search(object):
    def cleanup(self):
        if os.path.isfile(self.cookie_jar_path):
            os.remove(self.cookie_jar_path)

        os.chdir(LOCAL_PATH)
        for html in os.listdir("."):
            if html.endswith(".html"):
                os.remove(html)

    def __init__(self, cookie_jar_path):
        self.cookie_jar_path = cookie_jar_path
        self.cookie_jar = LWPCookieJar(self.cookie_jar_path)
        self.counter = 0
        self.cleanup()
        try:
            # Load saved cookies; this fails harmlessly when no jar file exists.
            self.cookie_jar.load()
        except Exception:
            pass


    def get_html(self, url):
        request = Request(url = url)

        request.add_header("User-Agent", USER_AGENT)
        self.cookie_jar.add_cookie_header(request)
        response = urlopen(request)
        self.cookie_jar.extract_cookies(response, request)
        html_response = response.read()
        response.close()
        self.cookie_jar.save()
        return html_response


def main():
    url_2 = r"http://www.google.com/search?hl=en&q=http%3A%2F%2Fi.imgur.com%2FqGRxTNA.jpg&btnG=Google+Search"
    search = google_search(os.path.join(LOCAL_PATH, COOKIE_JAR_FILE))
    html_2 = search.get_html(url_2)


if __name__ == '__main__':
    main()
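
For readers on Python 3, here is a rough sketch of the same request setup. It assumes the Python 3 module names (`urllib.request`, `urllib.parse`, `http.cookiejar`), and the helper names `build_search_url`, `build_request`, `make_opener` and `fetch_html` are made up for illustration. It only rebuilds the URL and headers used above and lets an opener handle the cookies; it is not expected to make the missing link appear.

```python
from http.cookiejar import LWPCookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, Request, build_opener

USER_AGENT = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)"
SEARCH_ENDPOINT = "http://www.google.com/search"


def build_search_url(image_url):
    """Build the same query string as url_2 above, letting urlencode escape it."""
    params = {"hl": "en", "q": image_url, "btnG": "Google Search"}
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def build_request(url):
    """Create a request carrying the browser-like User-Agent header."""
    request = Request(url)
    request.add_header("User-Agent", USER_AGENT)
    return request


def make_opener(cookie_jar_path):
    """Return an opener that sends and stores cookies through the given jar."""
    cookie_jar = LWPCookieJar(cookie_jar_path)
    opener = build_opener(HTTPCookieProcessor(cookie_jar))
    return opener, cookie_jar


def fetch_html(opener, url):
    """Fetch the raw HTML bytes; this performs a real network request."""
    with opener.open(build_request(url)) as response:
        return response.read()
```

Calling `fetch_html(opener, build_search_url("http://i.imgur.com/qGRxTNA.jpg"))` would return the page as the server sends it, before any JavaScript has run.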


  1. Mario Vilas

    I believe the “search by image” feature may not work because it wasn’t supported by the legacy API, which is the one being used by google.py. The newer API is fully implemented in JavaScript and uses Ajax all the time, so it’s harder to emulate from Python (and more likely to break when it changes, since it’s an internal API!).

    That being said, if I’m wrong and you find a way to do this, let me know! :)
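
One way to test the explanation in the comment above is to check whether the HTML that the script receives contains the link at all. The sketch below is written for Python 3 and uses only the standard `html.parser` module; `LinkTextCollector` and `has_link_with_text` are hypothetical names, and the phrase to search for is assumed to be the visible link text. If the link is added by JavaScript after the page loads, the check should come back false for the raw response even though a browser shows the link.

```python
from html.parser import HTMLParser


class LinkTextCollector(HTMLParser):
    """Collect the visible text of every <a> element in a page."""

    def __init__(self):
        super().__init__()
        self._depth = 0        # greater than zero while inside an <a> element
        self._buffer = []      # text fragments of the link being read
        self.link_texts = []   # finished link texts, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.link_texts.append("".join(self._buffer).strip())
                self._buffer = []

    def handle_data(self, data):
        # Text inside <script> elements arrives here too, but it is ignored
        # unless it sits inside an <a> element.
        if self._depth:
            self._buffer.append(data)


def has_link_with_text(html, phrase):
    """Return True if any link's text contains phrase (case-insensitive)."""
    collector = LinkTextCollector()
    collector.feed(html)
    collector.close()
    wanted = phrase.lower()
    return any(wanted in text.lower() for text in collector.link_texts)
```

With the script above, this could be called as `has_link_with_text(html_2, "search by image")`, decoding the bytes to text first on Python 3.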
