Creative Use of HttpRevealer


How does Google Toolbar work? (Written in 2002)

Nowadays, there are many search engines that you can use to search the web. To name just a few: Yahoo, AltaVista, Excite, and so on.

However, my favorite one (I believe it's also the favorite search engine of a lot of people) is Google. Not only does it give you fast and accurate search results, but it also offers a very handy search toolbar that you can integrate with IE5 or above to make Google part of your browser.

When you install Google Toolbar, you may opt to use the "advanced features", which Google indicates will have some privacy implications.

In this article, we are going to discuss what those privacy implications are and how Google Toolbar works.


Searching the web with Google Toolbar

Alright, let's start with searching the web using Google Toolbar. Say, we type in "white house" in the toolbar and hit the "Search the Web" button...

Search the web with Google's toolbar


What happens behind the scenes is that the toolbar makes an HTTP request to www.google.com with "white house" as the search keywords. This process is no different from doing a search for "white house" on Google's homepage. How did I get to know this? HttpRevealer told me the answer:

HTTP Request made by Google Toolbar


If we focus on the first line of the HTTP request, we see this:

GET /search?sourceid=navclient&querytime=4KuE&q=white+house HTTP/1.0

Obviously, the q parameter is the query because its value is "white+house" (the plus sign represents a space after being URL-encoded). By now, you may have noticed the sourceid parameter. It carries the value "navclient". Apparently, it tells the Google search engine that this search request came from the handy Google Toolbar (i.e. the navclient).
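To make the breakdown above concrete, here is a minimal sketch in Python (not something HttpRevealer itself produces) that splits the captured request line and decodes its query parameters. The request line is copied from the capture above; the helper name parse_request_line is just an illustrative choice.

```python
from urllib.parse import urlsplit, parse_qs

# First line of the HTTP request, as captured in the article.
request_line = "GET /search?sourceid=navclient&querytime=4KuE&q=white+house HTTP/1.0"

def parse_request_line(line):
    """Split an HTTP request line into method, path, query parameters, and version.

    parse_request_line is a hypothetical helper written for this example.
    """
    method, target, version = line.split(" ")
    parts = urlsplit(target)
    # parse_qs turns '+' into a space and decodes percent-escapes.
    params = {name: values[0] for name, values in parse_qs(parts.query).items()}
    return method, parts.path, params, version

method, path, params, version = parse_request_line(request_line)
print(method, path, version)
for name, value in params.items():
    print(f"{name} = {value!r}")
```

Running it shows q decoded back to "white house" and sourceid set to "navclient", which matches the reading of the request given above.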

In response to the HTTP request, Google's search engine simply returns the search results in HTML to the browser as usual. The browser displays the results as if the search was done from Google's home page.


"Privacy Information" being Sent Quietly

The above was surprisingly straightforward and easy, huh? Now let's turn our attention to another behavior of the toolbar. We are going to see how it "betrays" you as you are happily surfing the web.

Say, if you visit the White House's website (www.whitehouse.gov), your browser will naturally make a number of HTTP requests to www.whitehouse.gov to retrieve the home page as well as all the needed images. That's not surprising at all.

However, something else is going on without your noticing (if you have the Advanced features turned on). That is, the Toolbar quietly informs the Google server of the URL you are visiting, and the server in return passes back some information about the page, such as its ranking and category.

How did I get to know that? Hahaaa, see this:

XML returned by Google


The above HTTP request/response takes place as soon as you load the White House homepage. Like I said, the HTTP request was initiated by the Google Toolbar installed on your PC. Okay, let's take a closer look at the first line of the request.

GET /search?client=navclient-auto&ch=5248559537
&q=info:http%3A%2F%2Fwww%2Ewhitehouse%2Egov%2F HTTP/1.0

This "GET" line was originally one single long line. I split it into 2 lines for better readability. It looks a bit complicated, doesn't it? Don't worry. We will just discuss the important bits.

This time, the sourceid parameter is absent. Instead, there is a client parameter whose value is "navclient-auto". From this alone, you can guess it's telling the www.google.com server that the HTTP request was made by the Toolbar (i.e. navclient) of its own accord.

The q parameter has the value "info:http%3A%2F%2Fwww%2Ewhitehouse%2Egov%2F" (which is the URL-encoded representation of "info:http://www.whitehouse.gov/"). It tells the server that you are visiting www.whitehouse.gov and more importantly asks it for more information about the page. We will see what information will be returned by the server shortly.
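As a quick check of that decoding, the following Python sketch pulls the client and q parameters out of the request target and recovers the URL being reported. It assumes the two halves of the split line are joined with "&", as query parameters normally are, and the variable names are only illustrative.

```python
from urllib.parse import urlsplit, parse_qs

# Request target from the capture above, rejoined into one line.
# Assumption: the parameters are separated by '&' at the point where
# the line was split for readability.
target = ("/search?client=navclient-auto&ch=5248559537"
          "&q=info:http%3A%2F%2Fwww%2Ewhitehouse%2Egov%2F")

query = parse_qs(urlsplit(target).query)
client = query["client"][0]
info_query = query["q"][0]          # percent-escapes are already decoded here

# Strip the "info:" prefix to get the URL of the page being visited.
prefix = "info:"
visited_url = info_query[len(prefix):] if info_query.startswith(prefix) else None

print("client      =", client)
print("q (decoded) =", info_query)
print("visited URL =", visited_url)
```

The decoded q value comes out as "info:http://www.whitehouse.gov/", so the page address is sent along in a form that anyone watching the traffic can read.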

The above is basically the so-called "privacy information" that is sent back to the Google server for analysis. At other times, more information is sent in addition to what we just discussed, but you now have a rough idea of what type of information is involved. So, when I said earlier that the Toolbar "betrays" you as you surf the web, I was just joking, since the information sent back by Google Toolbar is not really that sensitive. Plus, if you want, you can turn off the Advanced features to prevent your information from being sent back. So, please don't hold it against me :)

Okay, we've seen what information gets sent from the Toolbar to the server. Now it's time to see what's returned by the server.

The server basically returns an XML document that contains information about the page's ranking (i.e. PageRank) and categorization among other things. Let's take a look at the XML document:

[Image: the XML document returned by Google]

Note: The XML's DTD can be obtained at www.google.com/google.dtd.


When the Toolbar receives this XML document, it will parse it and display relevant information graphically and textually. For instance, the PageRank icon gives you a visual cue as to how high the page is ranked:

Ranking of White House

Another piece of information is the category:

Category of White House


Where did the PageRank and category information come from? Hahaa, if you look at the XML document carefully, you will find the following tags between the lines :)

........
........
<RK>9</RK>
........
........
<CAT>
   <GN>
   gwd/Top/Regional/North_America/United_States/.../White_House
   </GN>
   <FVN>
   Top/Regional/North_America/United_States/.../White_House
   </FVN>
</CAT>
........
........
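For readers who want to pull these values out programmatically, here is a small Python sketch using the standard xml.etree.ElementTree module. Only the RK and CAT fragments come from the capture above. The RESULT wrapper element is a hypothetical stand-in for the parts of the document that were elided, and reading FVN as the category path is my guess from its content, not something Google documents here. This is not how the Toolbar itself is implemented.

```python
import xml.etree.ElementTree as ET

# The <RK> and <CAT> fragments are from the article; the <RESULT> wrapper
# is hypothetical and only exists to make the fragment well-formed XML.
# The "..." inside the paths is elided in the article and is kept as-is.
sample = """<RESULT>
<RK>9</RK>
<CAT>
   <GN>
   gwd/Top/Regional/North_America/United_States/.../White_House
   </GN>
   <FVN>
   Top/Regional/North_America/United_States/.../White_House
   </FVN>
</CAT>
</RESULT>"""

def extract_rank_and_category(xml_text):
    """Return (rank, category) from the XML; a hypothetical helper."""
    root = ET.fromstring(xml_text)
    rank = int(root.findtext(".//RK"))                # the ranking value
    category = root.findtext(".//CAT/FVN").strip()    # assumed category path
    return rank, category

rank, category = extract_rank_and_category(sample)
print("rank     =", rank)
print("category =", category)
```

The search expressions use ".//" so that they still find RK and CAT wherever those tags sit in the full document, whose exact nesting is not shown here.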


That's that. I hope you enjoyed the discussion. I found out the above with HttpRevealer. You can explore the web yourself too!

Steven Chau




"Google" and "Google Toolbar" are registered trademarks of Google Inc.
All other company and product names may be trademarks of the respective companies with which they are associated.

 
© 2001-2007 HttpRevealer.com