Friday, July 25, 2008

Size of the Indexed Web

Google today announced that it had identified one trillion (1,000,000,000,000 or 10^12) unique URLs. How many of those its Web index actually contains is unstated, but the number is probably closer to 20 billion than to one trillion. That is still a big increase from its original index size of 26 million URLs when it launched in 1998. This does not count dynamic web pages generated in real time, of which there are an infinite number, overlapping URLs that point to the same content or, presumably, spam URLs, which are useless.

Estimates of the size of the Google Web index for June 2003 put the number at about 3.25 billion. If that is correct, and if the index now holds about 20 billion pages, the index has grown roughly 6.2x from 2003 to 2008.
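
The growth figure above can be checked with a few lines of Python. This is only a rough sketch: the 20 billion figure for 2008 is the estimate quoted in this post, not a number Google has published, the five-year span is an approximation, and the function and variable names are made up for illustration.

```python
def growth_factor(old_size: float, new_size: float) -> float:
    """Return how many times larger new_size is than old_size."""
    return new_size / old_size


def annual_growth_rate(factor: float, years: float) -> float:
    """Return the constant yearly multiplier that compounds to `factor`."""
    return factor ** (1.0 / years)


# Estimated index sizes, in pages (both are third-party estimates).
index_2003 = 3.25e9   # June 2003 estimate
index_2008 = 20e9     # assumed 2008 size, per the rough estimate in this post

factor = growth_factor(index_2003, index_2008)
yearly = annual_growth_rate(factor, years=5)  # assumption: about five years

print(f"Total growth: {factor:.1f}x")
print(f"Implied yearly growth: {(yearly - 1) * 100:.0f}%")
```

Under these assumptions the index would have grown by somewhat more than 40 percent a year, though the real growth was clearly not that smooth.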

Source: A research study by Dogpile.com in collaboration with researchers from the University of Pittsburgh and the Pennsylvania State University

According to worldwidewebsize.com, Google's search index has grown rapidly over the past year, with a big spike from 8 billion to 20 billion indexed pages in December 2007.


Worldwidewebsize.com estimates the total size of the indexable Web at about 57 billion pages in June 2008.

Start-up search engine Cuil (pronounced "cool") claims to have indexed 120 billion Web pages, four times larger than what its founders, several of them former Google employees, claim is the size of the Google Web index. One of the founders, Louis Monier, was the architect of AltaVista, one of the Web's earliest search engines.



This Google Search should keep you busy for a while.

