How does Google Toolbar work? (Updated in July 2004)
Nowadays, there are many search engines that you can use to
search the web. Just to name a few: Yahoo, AltaVista, and Excite.
However, my favorite one (I believe it's also the favorite search
engine of a lot of people) is Google. Not only does it give
you fast and accurate search results, but it also offers a very
handy search toolbar that you can integrate with IE to make
Google part of your browser.
When you install Google Toolbar, you may opt to use the "advanced
features", which Google indicates will have some privacy implications.
In this article, we are going to discuss what those privacy
implications are and how Google Toolbar works. (If you want
to see how Google Toolbar worked 2 years ago, read the old
version of this page.)
Searching the web with Google Toolbar
Alright, let's start with searching the web using Google Toolbar.
Say, we type in "white house" in the toolbar and hit the "Search
the Web" button...
What happens behind the scenes is that the toolbar makes
an HTTP request to www.google.com with "white house"
as the search keywords. This process is no different from
when you do a search for "white house" on Google's homepage.
How did I get to know this? HttpRevealer told me the answer:
If we focus on the first line of the HTTP request, we see this:
GET /search?sourceid=navclient&ie=UTF-8&oe=UTF-8&q=white+house HTTP/1.0
Obviously, the q parameter is the query because its value
is "white+house" (the plus sign represents a space after being
URL-encoded). By now, you may have noticed the sourceid
parameter. It carries the value "navclient". Apparently, it
tells the Google search engine that this search request came
from the handy Google Toolbar (i.e. the navclient).
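If you want to pull those parameters apart yourself, here is a minimal Python sketch. It only parses the request line captured above with the standard library; the helper name parse_request_line is my own invention for illustration and has nothing to do with how the Toolbar itself is written.

```python
from urllib.parse import urlsplit, parse_qs

# First line of the HTTP request, as captured in the article.
request_line = "GET /search?sourceid=navclient&ie=UTF-8&oe=UTF-8&q=white+house HTTP/1.0"


def parse_request_line(line):
    """Split a request line into method, path, query parameters, and version.

    Hypothetical helper for illustration only.
    """
    method, target, version = line.split(" ")
    parts = urlsplit(target)
    # parse_qs undoes the URL encoding, so "+" becomes a space again.
    # Each parameter appears once here, so we keep only the first value.
    params = {name: values[0] for name, values in parse_qs(parts.query).items()}
    return method, parts.path, params, version


method, path, params, version = parse_request_line(request_line)
print(path)                # /search
print(params["q"])         # white house
print(params["sourceid"])  # navclient
```

The ie and oe parameters come out as "UTF-8"; they look like the input and output encodings, but that reading is my guess from the names.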
In response to the HTTP request, Google's search engine simply
returns the search results in HTML to the browser as usual.
The browser displays the results as if the search was done from
Google's home page.
"Privacy Information" being Sent Quietly
The above was surprisingly straightforward and easy, huh? Now
let's turn our attention to another behavior of the Toolbar.
We are going to see how it "betrays" you as you are happily
surfing the web.
Say, if you visit the White House's website (www.whitehouse.gov),
your browser will naturally make a number of HTTP requests to
www.whitehouse.gov to retrieve the home page as well
as all the needed images. That's not surprising at all.
However, something else is going on without you noticing (if
you have the Advanced features turned on): the Toolbar
quietly informs the Google server of the URL you are visiting,
and the server in return passes back some information about
the page, such as its ranking and category.
How did I get to know that? Hahaaa, see this:
The above HTTP request/response took place before your browser
retrieved the White House's homepage. Like I said, the HTTP
request was initiated by Google Toolbar installed on your PC.
Okay, let's take a closer look at the first line of the request.
GET /url?sa=T&ct=res&cd=1&url=http%3A//www.whitehouse.gov/ HTTP/1.0
It looks a bit complicated, doesn't it? Don't worry. We will
just discuss the important bits.
In this GET request, the most crucial parameter was the url
parameter. It contained the value "http%3A//www.whitehouse.gov/"
(which is the URL-encoded representation of "http://www.whitehouse.gov/").
Obviously, the aim of this GET request was to inform the Google
server that you visited www.whitehouse.gov and that would
act as a vote for the importance of the White House website.
The more visitors a website has, the more important it is considered
to be.
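To see the URL encoding at work, here is a small Python sketch that decodes the url value from the request above and then encodes it again. It uses only the standard library's urllib.parse; the variable names are mine, and this is just a way to check the encoding by hand, not a description of the Toolbar's own code.

```python
from urllib.parse import quote, unquote, urlsplit

# Value of the "url" parameter exactly as it appeared in the captured request.
encoded = "http%3A//www.whitehouse.gov/"

# unquote() turns percent-escapes such as %3A back into characters (":").
decoded = unquote(encoded)
print(decoded)  # http://www.whitehouse.gov/

# quote() leaves "/" alone by default but escapes ":", which reproduces
# the form seen on the wire.
print(quote(decoded) == encoded)  # True

# The host name is the part that identifies the site being visited.
host = urlsplit(decoded).hostname
print(host)  # www.whitehouse.gov
```

Note that only the colon was escaped in the captured request; the slashes were sent as they are.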
The above is basically the so-called "privacy information" that
is sent back to the Google server for voting analysis. At some other
times, more information will be sent back in addition to the
one we just discussed. But you now get a rough idea as to what
type of information is sent back. So, when I previously said
"the Toolbar betrays you as you are surfing the web",
I was just joking since the information sent back by
Google Toolbar is not really that sensitive. Plus, if you want,
you can turn off the Advanced feature to prevent your information
from being sent back. So, please don't hold this against me :)
After the Google Toolbar cast the ballot, it would make another
request to its server to get more information about the website
you were visiting:

This request obtained two pieces of information about www.whitehouse.gov
from the Google server, namely its PageRank
and Category. As you can see in the response
to the request, the message body contained only two lines:
Rank_1:2:10
FVN_1:114:Top/Regional/North_America/United_States/Government/Executive_Branch/Executive_Office_of_the_President/White_House
The first line indicated that www.whitehouse.gov
had a full score of 10 in its PageRank while the second line
contained the category the website belonged in.
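If you look closely at the two-line response body, each line seems to follow a name:number:value pattern, and the middle number happens to equal the number of characters in the value (2 for "10", 114 for the category path). The Python sketch below parses the body on that assumption. The format is my own inference from this single capture, not anything documented, and parse_toolbar_response is a made-up name.

```python
# The two-line response body from the capture, with the wrapped
# category line joined back together.
body = (
    "Rank_1:2:10\n"
    "FVN_1:114:Top/Regional/North_America/United_States"
    "/Government/Executive_Branch/Executive_Office_of_the_President"
    "/White_House\n"
)


def parse_toolbar_response(text):
    """Parse lines of the assumed form name:length:value into a dict.

    Hypothetical helper; the meaning of the middle field is a guess.
    """
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first two colons only, so the value may contain colons.
        name, length, value = line.split(":", 2)
        fields[name] = (int(length), value)
    return fields


fields = parse_toolbar_response(body)
pagerank = int(fields["Rank_1"][1])
category = fields["FVN_1"][1]
print(pagerank)  # 10
print(category.split("/")[-1])  # White_House

# Check the guess that the middle field is the length of the value.
lengths_match = all(length == len(value) for length, value in fields.values())
print(lengths_match)  # True
```

Splitting the category on "/" gives the levels of the directory path, from "Top" down to "White_House".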
When the Toolbar received the info, it would display it both
graphically and textually. For instance, the PageRank icon would
give you a visual cue as to how high the page was ranked:
The category would be displayed this way:
That's that. I hope you enjoyed the discussion. I found
out the above with HttpRevealer.
You can explore the web yourself too!
Steven Chau
"Google" and "Google Toolbar" are registered trademarks of
Google Inc.
All other company and product names may be trademarks of the
respective companies with which they are associated.