We all know it and we have all used it, but not many of us know how Google was developed. Today, Google is one of the biggest and wealthiest companies in the world, and its history contains all of the classic tropes of a business success story – entrepreneurialism, hard work, and a little luck.
Going from an idea to a world leader in online software, advertising, and computing in less than a decade, Google’s story is as fascinating as it is revolutionary. So let’s take a trip down memory lane and see how they did it.
The History of How Google Was Developed
Google has its origins in BackRub, which was a research project created by Larry Page and Sergey Brin in 1996. They were both Ph.D. students at Stanford University at the time and the project came about when Page was searching for a dissertation theme.
Page had long been considering the mathematical properties of the World Wide Web and its link structure. Luckily for him, his supervisor recommended he pursue this idea. And so began Page’s focus on finding out which web pages link to a given page. This was based on the assumption that the number and nature of backlinks were a valuable piece of information about that page (using citations in academic journals as a blueprint). Page then shared this idea with Scott Hassan, the unofficial “third founder”, who began to write the code that was needed to implement Page’s ideas.
This research project was called “BackRub”, and Brin soon joined it. The initial web crawler was exploring the web by March 1996, with Stanford’s own homepage as the starting point. Page and Brin had to develop an algorithm to convert the backlink data for a given webpage into some sort of measure of importance. This algorithm was named PageRank after Larry Page.
For a given URL, BackRub’s output consisted of a list of backlinks ranked by their importance. Whilst Brin and Page were analyzing this, they realized that their algorithm would provide better search results than existing methods. At the time, search engines simply ranked pages based on how many times the search term appeared.
They were convinced that the webpages with the most backlinks from other highly relevant webpages were the most relevant pages and should appear on a results page. And thus was born ‘Google’, the concept as we know it today.
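To make the backlink idea more concrete, here is a minimal sketch of a PageRank-style calculation in Python. It is an illustration only, not BackRub’s actual code: the function name `pagerank`, the toy link graph, and the damping value of 0.85 are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate importance scores from a link graph.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative defaults.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" has the most backlinks.
toy_web = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
best_page = max(ranks, key=ranks.get)
```

In this toy graph, page "c" collects the most backlinks and ends up with the highest score, and page "a" outranks "b" and "d" because its single backlink comes from the important page "c".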
In August 1996, the first version was released on the Stanford website. It was written in Java and Python and ran on several Intel Pentiums and Sun Ultras running Linux. PageRank’s page-ranking and site-scoring algorithm was influenced by an algorithm created by Robin Li in 1996, the same Robin Li who went on to create the Chinese search engine Baidu in 2000.
The late 1990s
The name Google originated from a misspelling of the word “googol”. A googol is the number 1 followed by one hundred zeros: 10^100. This name was chosen as it represented their goal of building an extremely large-scale search engine. The domain google.com was not registered until September 15, 1997, and it wasn’t until September 4, 1998, that Page and Brin formally incorporated their company.
Initially, Brin and Page had both been against the idea of using advertising pop-ups for their search engine. However, they changed their minds early on in the company’s life and agreed to allow simple text ads. By the end of 1998, Google had grown to an index of 60 million pages. Although the home page was still labeled as ‘BETA’, there were articles already arguing that Google’s search results were head and shoulders above its competition, mainly because it was more technologically innovative than the overloaded portal sites. However, this wasn’t reflected in the stock market – competitors such as Yahoo!, MSN, and AOL were all seen as the “future of the Web”.
Google’s search engine at this point had attracted a loyal following amongst users, and this number was only increasing as more people began using the Internet. In 2000, Google created AdWords (now Google Ads), which would become the single biggest source of revenue for the company to this day. The ads were based on keyword text in order to maintain a clutter-free page design and to maximize the page’s loading speed. These keywords were sold based on click-throughs and price bids, with the bids starting at $0.05 per click.
So far, you could only search for simple pages with text and links. Believe it or not, Jennifer Lopez and her green Versace dress in 2000 at the Grammy Awards changed that. So many people were searching for that picture that it became the most popular search query Google had received to date, and so began the journey to create Google Images. By 2001, Google Images was born and 250 million images were indexed. This number grew to 1 billion images by 2005 and by 2010, over 10 billion.
This was also around the same time that Google became the client search engine for Yahoo!, a partnership that lasted until 2004. By then, 200 million searches a day were being completed on Google, a stark contrast to the 500,000 searches a day at the start of the partnership.
To accommodate this unprecedented growth of mass data, Google had to build 11 data centers globally, each of which contained hundreds and thousands of servers, resulting in a total of several million interlinked computers. To run this smoothly, the whole operation is run through Google’s 3 proprietary pieces of code: GFS (Google File System), Bigtable, and MapReduce.
GFS handles the storage of the data, which is described as “chunks”; Bigtable is the database program used by the company; and MapReduce is what Google uses to create higher-level data. An example of MapReduce’s role is putting together an index of web pages that contain the words “London”, “cinema” and “viewing”.
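The index example above can be sketched in a few lines of Python. This is a single-machine toy that only mimics the map, shuffle, and reduce steps; it is not Google’s MapReduce, and the function names and sample pages are made up for illustration.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step: emit a (word, doc_id) pair for each distinct word in a page."""
    for word in set(text.lower().split()):
        yield word, doc_id


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, doc_ids):
    """Reduce step: turn the grouped values into a sorted list of pages."""
    return word, sorted(set(doc_ids))


def build_index(documents):
    """Run map, shuffle, and reduce in sequence to build a word -> pages index."""
    pairs = []
    for doc_id, text in documents.items():
        pairs.extend(map_phase(doc_id, text))
    return dict(reduce_phase(word, ids) for word, ids in shuffle(pairs).items())


def search(index, *words):
    """Return the pages that contain every one of the given words."""
    postings = [set(index.get(word.lower(), [])) for word in words]
    return sorted(set.intersection(*postings)) if postings else []


# Hypothetical pages, used only to exercise the sketch.
pages = {
    "page1": "London cinema viewing times",
    "page2": "cinema listings in Paris",
    "page3": "private viewing at a London cinema",
}
index = build_index(pages)
matches = search(index, "London", "cinema", "viewing")
```

In a real deployment the map and reduce steps would run in parallel across many machines; here they run one after another so the data flow is easy to follow.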
Google also had its IPO on August 19, 2004, and sold 22.5 million shares whilst raising over $1.9 billion. Shares started at $85 and ended the day at $100.34, and by June 2005, Google was valued at nearly $52 billion.
At the time of its 2004 IPO, Google derived the vast majority of revenue through its advertising and was not using HTTP cookie-based web tracking. But by 2006, Google’s Ad revenue was facing signs of decline since an increasing number of advertisers had refused to purchase display ads from the company. Faced with financial issues during the 2007-08 financial crisis, Google purchased DoubleClick for $3.1 billion which marked the beginning of its use of cookie-based tracking, but also the beginning of Google’s privacy concerns.
By the 2010s, Google had a plethora of well-performing products and services under its belt, such as Gmail, YouTube, Drive, Android, Chrome, Maps, Chromecast, Chromebook, Photos, Calendar, Analytics, Ads, and Play, to name a few.
In 2012, Google’s market capitalization made it one of the biggest companies in the world, and it was the first time that the company had $50 billion in revenue in a single year. In August 2015, the company was reorganized as a subsidiary of the holding company Alphabet Inc., which is how it remains today.
Whilst in recent times there haven’t been many new additions to their products that have changed the game the way Maps, Gmail, or Android did, Google is constantly modifying and optimizing its current products and algorithms to ensure they’re at the forefront of their industry.
The original hardware that Google used when it was located at Stanford University (circa 1998) seems quite primitive now. But it was state of the art then and it included:
- The main machine for the BackRub system was a Sun Microsystems Ultra II with dual 200MHz processors + 256MB RAM
- 2x300MHz dual Intel Pentium II servers, 512MB RAM, 10x9GB Hard Drives between the two.
- IBM’s F50 RS/6000, including 4 processors, 512MB RAM, 8x9GB Hard Disk Drives
- Additional 3x9GB Hard Drives and 6x4GB Hard Disk Drives which were used as the BackRub storage.
- IBM’s 8x9GB Hard Disk Drives with an SSD disk expansion box
- 10x9GB SCSI Hard Disk Drives
As you would expect, most of the software stack used by Google is developed in-house. However, the company does have its go-to languages for creating its various software. To name a few, C++, Java, Python, and Go are Google’s favorites when it comes to programming languages, although it should be noted that Google does not discriminate in this regard.
Google’s software infrastructure today
Google Web Server
This is the proprietary web server software that is used for Google’s web infrastructure. It is only used inside Google’s internal ecosystem for website hosting, and it is written in C++.
Google File System
This is the proprietary distributed file system that is run on a Linux kernel, and it’s used by Google to provide reliable and efficient access to data. It is used primarily for the search engine, where files are divided into chunks, similar to sectors or clusters of a regular file system and it’s designed to run on Google’s computing clusters.
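The idea of dividing a file into chunks can be shown with a short Python sketch. It only demonstrates the splitting and the offset arithmetic on one machine; the function names are hypothetical, and the tiny chunk size is chosen so the result is easy to read (the published GFS design describes 64 MB chunks).

```python
def split_into_chunks(data: bytes, chunk_size: int) -> list:
    """Split a byte string into fixed-size chunks; the last one may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_index_for_offset(offset: int, chunk_size: int) -> int:
    """Work out which chunk holds the byte at a given file offset."""
    return offset // chunk_size


# Illustrative values only: a 10-byte "file" and 4-byte chunks.
file_data = b"abcdefghij"
chunks = split_into_chunks(file_data, chunk_size=4)
last_byte_chunk = chunk_index_for_offset(9, chunk_size=4)
```

Because every chunk has the same size, a reader can jump straight to the chunk that holds a given offset without scanning the rest of the file, and each chunk can be stored and replicated on a different machine.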
Bigtable
This is a high-performance, compressed, proprietary data storage system that is built on the Google File System, Chubby Lock Service, and a few other Google technologies. It is also the underlying driver of Google Cloud Datastore (a fully managed NoSQL database service). The core of Bigtable is written in C++, with Java, Python, Go, and Ruby also used.
Within this, Spanner is used. Spanner stores huge amounts of mutable structured data and allows its users to perform queries using SQL with relational data, all while maintaining a large amount of consistency and availability for that specific data. On top of this, Google has F1, which is a SQL database management system.
Chubby Lock Service
Chubby is Google’s lock service for loosely-coupled distributed systems. It is made for coarse-grained locking and provides a reliable distributed file system – although it is limited. Google File System and BigTable both use Chubby to synchronize accesses to shared resources. However, now it’s more heavily used as a name server, supplanting DNS.
Hummingbird
Hummingbird is the codename given to a significant change in Google Search’s algorithm in 2013. This algorithm update is said to place greater emphasis on natural language queries and to factor in context and meaning over keywords.
Hummingbird was the main update that forced content creators and web developers to optimize their websites by writing in a natural and more human-like manner, instead of stuffing keywords in.
Protocol Buffers
This is a way to serialize structured data. It’s incredibly useful for building programs that communicate with each other or store data. This is what Google uses it for: storing and interchanging all kinds of structured information.
Protocol Buffers serve as the basis for a custom remote procedure call (RPC) system that is used for nearly all inter-machine communication at Google, and they are referred to as “Google’s lingua franca for data”.
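To give a feel for what serializing structured data means at the byte level, here is a small Python sketch of the variable-length integer (“varint”) encoding used by the Protocol Buffers wire format. It is hand-written for illustration and does not use the official protobuf library; the helper names are assumptions.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F          # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # set the high bit: more bytes follow
        else:
            out.append(byte)         # high bit clear: this is the last byte
            return bytes(out)


def decode_varint(data: bytes, pos: int = 0):
    """Decode one varint starting at `pos`; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_int_field(field_number: int, value: int) -> bytes:
    """Encode an integer field as a key (field number + wire type 0) and a value."""
    key = (field_number << 3) | 0    # wire type 0 means "varint"
    return encode_varint(key) + encode_varint(value)


encoded = encode_int_field(1, 150)   # field number 1 holding the value 150
decoded_key, next_pos = decode_varint(encoded)
decoded_value, _ = decode_varint(encoded, next_pos)
```

Small numbers take a single byte and larger ones grow only as needed, which is part of why the format is compact enough to use for communication between machines.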
SSTable
SSTable (Sorted Strings Table) is a persistent, immutable, ordered map from keys to values, where both keys and values are arbitrary byte strings. It is one of the key building blocks of Bigtable.
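The “immutable, ordered map” idea can be sketched in Python as follows. This toy keeps everything in memory, so it is not persistent and says nothing about the real on-disk format; the class and method names are assumptions made for the example.

```python
import bisect


class SSTable:
    """A read-only, sorted map from byte-string keys to byte-string values."""

    def __init__(self, items):
        # Sort once at construction; the table is never modified afterwards.
        pairs = sorted(dict(items).items())
        self._keys = tuple(key for key, _ in pairs)
        self._values = tuple(value for _, value in pairs)

    def get(self, key, default=None):
        """Look up a single key with a binary search over the sorted keys."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return default

    def scan(self, start, end):
        """Return all (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return list(zip(self._keys[lo:hi], self._values[lo:hi]))


# Hypothetical contents, given in unsorted order on purpose.
table = SSTable({b"b": b"2", b"a": b"1", b"c": b"3"})
found = table.get(b"b")
in_range = table.scan(b"a", b"c")
```

Keeping the keys sorted is what makes both single lookups and range scans cheap, and never changing the table after it is built means readers do not need to coordinate with writers.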
It is an incredible story to learn how Google was developed and to see it grow from an idea into one of the biggest companies in the world. And the company continuously strives to build on its all-star lineup of services and products.
In fact, almost every online company in the world uses at least one of Google’s APIs or third-party services in their platforms. Just a few examples are Google Login, Google Maps integration, and even the web crawling APIs used by website analysis tools like the Wiredelta App.
As for the company’s founders, Larry Page and Sergey Brin, they both have a net worth of more than $70 billion. That’s a lot of money, and a long way from the Stanford University dorm room where the initial spark for BackRub was made.
All in all, the story of Google remains one of the most compelling success stories in business to date – a story that is also very much unfinished. And if you liked this story, check out our articles on Medium, Quora, or Facebook to see how they were developed.
Or if you are in need of developing your own product or service, feel free to get in touch. We are happy to help!