Mining google search result using python – Making SEO kit with python

- - Tutorials

whatpageofsearchamion program in python

By the end of this read, you will be able to make a fully functional SEO kit where you can feed the keywords associated with certain url and the program will show the position of the url in google search. If you know some tools like “whatpageofsearchamion.com”, our program will be similar to that tool.

I was writing for a trekking based website and I had to make sure my writeup appears in a good position in google search. The keywords being competitive, I had to keep track of day to day report of the article’s status. Therefore I decided to write a simple program that would find my article’s position in google search.

Modules required

1. Mechanize

sudo pip install python-mechanize

The module handles the cookies on it’s own and simplifies making request over the web. Therefore mechanize would be a wise decision for a program like this.

What page of search am I on google program in python (Explanation)

Importing the module shall not be a problem. Heading forward, we now initiate the browser because we are about to make request over the web. The user agent thing defines us to the web. This is our identity on the web. Asking the user for a keyword associated with certain website. We need string input therefore we are using raw_input, which takes the input value as a string. After we have the keyword from the user, we manipulate the string and replace the (spaces) with “+”. I’ll get to why we did this thing later.

The counter variable at the initial stage has a value of zero. Let me explain a little bit about the structure of the google’s search url. The url of the Google search result has a key/value parameter “start=value” where value is 0 for the first page, 10 for the second page, 20 for the third page and so on. Therefore the initial value of the counter variable is assigned to 0 meaning first page of search result.

We now create a url. The keyword is a value for the key “q”. The url does not contain spaces that is why we manipulated the raw_input to replace the spaces with +.

Followed by the q=keyword parameter the next parameter is start=counter.

We now have the url designed. All we need to do is request the search result for the 10 pages or more as requirement and look for the presence of the url associated with the keyword. If found it exits from the loop and prints the page number of the Google result where the url was found for that keyword else prints not found.

Codes for mining google using python (Making whatpageofsearchamion using python)

#sudo apt-get install python-pip
import mechanize #sudo pip install python-mechanize

br = mechanize.Browser() #initiating a browser

br.set_handle_robots(False) #we're acting like we're not a bot/robot

br.addheaders = [("User-agent","Mozilla/5.0")] #our identity in the web

q = raw_input("enter the keyword::") #keyword/keyphrase

qe=""

alert = ""
for i in range(0,len(q)):
    if q[i] ==" ":
        qe+="+"
    else:
        qe+=q[i]

counter = 0

for i in range(0,9):
    google_url = br.open("https://www.google.com/search?q=" + qe + "&start=" + str(counter))
    search_keyword = google_url.read()
    if "http://www.thetaranights.com" in search_keyword:
        alert = "found"
        break
    counter+=10

if alert == "found":
    print "Found at page:: ",i+1
else:
    print "not found"

 

bhishan

I am Bhishan Bhandari, a CS student and life hacker. I specialize in automation. I sell my services on fiverr. You can hire me for projects here Buy Services Follow me on github for code updates Github You can always communicate your thoughts/wishes/questions to me at bbhishan@gmail.com

Leave a Reply

Your email address will not be published. Required fields are marked *