February 2006

Hello friends,

Last month we had a guest lecture by a guy from Google, and I thought I would share some thoughts about the Google phenomenon. It's not that we don't get visitors from Microsoft, IBM, Bell Labs and so on, but this was the third visit from Google this year, which is quite unusual.

If you think about it, the internet service provided by Google is a very nontraditional product. First, they don't run after their users. The users come to them, without marketing. Google also uses an extremely minimal (but very sophisticated) user interface. Think how many different types of users they have, and how they serve all of them with a single-line textbox. Since the service is free, there is no guarantee about the quality of service, something that is crucial in, for example, the database world. If they miss a relevant page, the user will probably not notice. If they return a page which no longer exists, the user cannot complain. Unlike traditional systems, Google is not the primary data repository, but only a mirror of the internet. If they discover a bug, they can always fetch the data again from the internet. The fact that they are a server-side application working on the entire internet means that nobody (e.g. Microsoft) can reverse-engineer what they are doing, and they can keep their tricks secret.

So why is Google so successful? Many people think only about their search algorithm, which combines computational linguistics, machine learning (i.e. statistics), and the famous PageRank algorithm. That is only half of the story. There is also a substantial systems part. Google is amazingly fast, and stores huge amounts of information on thousands of computers. They don't use a commercial database, but have implemented their own distributed storage management system. Google didn't invent the search engine or many of the other services it provides. They just implemented them far more professionally than others.

Maybe another advantage they have is their academic mentality. They hire a lot of PhDs and even professors. They allow people to come in at noon and work till midnight. At Google, everyone who is a software engineer has to write code, no matter how high they are in the company's hierarchy. They also have the 20% policy, under which anyone can spend 20% of their time on any project they like. Most importantly, they didn't forget where they came from. They offer free pizza to students.

As with any other technology, Google also brings risks. One danger is that people will become lazy. For example, one day I saw in my logs that four people (probably schoolboys) were searching for "explain the Olympic spirit" (try it with the quotes). That brings us to another danger, that Google will become an authority. People believe that what they can't find on Google doesn't exist. Anyone who found something and couldn't find it a few months later knows that Google is very dynamic and not perfect. I think it will also be dangerous if people start to believe that Google understands languages. Texts are not just sequences of letters, but essentially reflect the way humans think. Computational linguistics is an area that is not that popular in Israel, but in the English-speaking world it has always been active. In the past they tried to parse sentences, find the subjects, verbs and so on. Recently they threw a lot of this out of the window and started to rely on statistics. Take Google's spell checker for example. It is based mostly on the statistics of phrases on the internet. This method is amazingly successful, but it gives up on attempts to really understand language, by reducing it to frequent combinations of templates.
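
To make the spell checker point concrete, here is a toy sketch of correcting by statistics alone. It is only my illustration, not Google's actual method: the corpus is a made-up sentence standing in for the internet, it counts single words rather than phrases, and the names CORPUS, one_edit_away and correct are all hypothetical.

```python
from collections import Counter

# Hypothetical stand-in for "the internet": a tiny made-up corpus.
CORPUS = ("the olympic spirit is about the spirit of the olympic games "
          "and the olympic flame")
COUNTS = Counter(CORPUS.split())
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def one_edit_away(word):
    """All strings reachable from word by one delete, swap, replace or insert."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    inserts = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + swaps + replaces + inserts)


def correct(word):
    """Return the most frequent known word within one edit, else the word itself."""
    if word in COUNTS:
        return word
    candidates = [w for w in one_edit_away(word) if w in COUNTS]
    if not candidates:
        return word
    # No grammar and no meaning: the most frequent candidate simply wins.
    return max(candidates, key=lambda w: (COUNTS[w], w))
```

With this corpus, correct("olimpic") gives "olympic" and correct("spirt") gives "spirit", although the program knows nothing about the Olympics or about spirits. It only knows which strings appear often.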
Yet the biggest nightmare is that Google will go down. What are we going to do then?

Ady.