Large numbers quickly become unfathomable to us. There is the old story of the inventor of chess:
When the creator of the game (in some tellings an ancient Indian mathematician, in others a legendary brahmin named Sessa or Sissa) showed his invention to the ruler of the country, the ruler was so pleased that he let the inventor name his own reward. The man, who was very wise, asked for this: one grain of wheat (in some tellings, rice) for the first square of the chessboard, two for the second, four for the third, and so forth, doubling the amount each time. The ruler, who was not strong in math, quickly accepted, even taking offense that the inventor had asked for so small a prize, and ordered the treasurer to count out the wheat and hand it over. When the treasurer had spent more than a week on the calculation, the ruler asked the reason for the delay. The treasurer then gave him the result and explained that the reward would be impossible to pay.
The amount of wheat is over 18 quintillion grains (a quintillion is a billion billion), or approximately 80 times what would be produced in one harvest, at modern yields, if all of Earth’s arable land could be devoted to wheat.
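For anyone who wants to check the treasurer's arithmetic, here is a minimal sketch in Python. The variable names are mine, chosen for illustration; the only inputs are the 64 squares and the doubling rule from the story.

```python
# Grains on each of the 64 squares: 1, 2, 4, ... doubling each time.
grains_per_square = [2 ** square for square in range(64)]
total_grains = sum(grains_per_square)

# The closed form for a doubling series gives the same total.
assert total_grains == 2 ** 64 - 1

print(f"{total_grains:,}")
```

The total is 2^64 - 1, which is where the "over 18 quintillion" figure comes from.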
Another example of how quickly large numbers can boggle the mind: suppose you put 25 pennies into a jar, and labeled them 1 to 25. The probability of reaching into the jar and pulling out the penny marked number one is, of course, one in 25. Having done that, the probability of reaching into the jar and pulling out penny number two is one in 24, since there are 24 left. What is the probability of pulling out one and two in order?
That one-in-24 figure is called a “conditional probability” (the chance of the second draw, given that the first one succeeded), and you get the chance of both draws happening by multiplying the two probabilities together. So the answer is: one in (25 x 24), or one in 600. Put another way: while you might get lucky, it will take you 600 tries on average to pull off this feat (without cheating).
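The two-penny case is small enough to verify by brute force. The sketch below is only an illustration with names of my own choosing: it multiplies the two probabilities exactly, then cross-checks by listing every ordered pair of pennies that could be drawn.

```python
from fractions import Fraction
from itertools import permutations

p_first = Fraction(1, 25)               # penny 1 on the first draw
p_second_given_first = Fraction(1, 24)  # penny 2 next, with 24 left
p_both = p_first * p_second_given_first

# Cross-check: enumerate every ordered pair of distinct pennies
# and count how many of them are exactly (1, 2).
ordered_pairs = list(permutations(range(1, 26), 2))
hits = sum(1 for pair in ordered_pairs if pair == (1, 2))
assert Fraction(hits, len(ordered_pairs)) == p_both

# For repeated independent tries, the average number of tries
# until the first success is 1 / p.
expected_tries = 1 / p_both

print(p_both, expected_tries)
```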
Here’s where it gets wild: how long would it take you to pull out all 25 pennies in order? A long, long time. The odds are one in 25 x 24 x 23 x 22… which ends up as one in over 15 septillion, or 15 followed by twenty-four zeros. To put that in context: if every person currently living on earth pulled pennies out of jars, all over the world, without breaking for sleep, and averaged about 15 attempts per minute, we would all need to be immortal. On average, it would take us 289 million years before someone hit the jackpot.
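Here is a rough sketch of that estimate. The population figure is my assumption, not something stated above: 6.8 billion people happens to reproduce the 289-million-year number. I also assume a 365-day year and count one "attempt" as one full try at all 25 pennies.

```python
import math

# Number of possible orderings of 25 pennies: 25 x 24 x ... x 1.
orderings = math.factorial(25)

WORLD_POPULATION = 6_800_000_000   # assumption, not given in the text
ATTEMPTS_PER_MINUTE = 15           # per person, from the text
MINUTES_PER_YEAR = 60 * 24 * 365   # no breaks for sleep

attempts_per_year = WORLD_POPULATION * ATTEMPTS_PER_MINUTE * MINUTES_PER_YEAR

# Average number of attempts until the first success is 1 / p,
# which here equals the number of orderings.
expected_years = orderings / attempts_per_year

print(f"{orderings:,} orderings")
print(f"about {expected_years / 1e6:.0f} million years")
```

Change the population and the answer moves with it, but it stays in the hundreds of millions of years for any realistic value.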
In a similar “large number” story, the blog Managed Networks recently devoted a post to helping us visualize the raw size of what Google does every day. It’s amazing.
In a recent technical paper, Google described MapReduce, a programming model for processing huge amounts of data. The paper revealed the size of the datasets MapReduce handles, and that Google executes 100,000 MapReduce jobs per day. Long story short: Google processes about 20 petabytes of data every day.
I know. This doesn’t mean anything to you. Managed Networks helps you visualize:
One byte is essentially a letter of the alphabet. Picture that letter as one grain of rice. One byte = one letter = one grain of rice.
One kilobyte (often referred to as 1K) is 1024 bytes. It’s a few paragraphs of text. Picture one bowl of rice. One kilobyte = a few paragraphs = one bowl of rice.
Next up is the megabyte, or MB, which is 1024 kilobytes. It’s big enough to contain one novel. It’s represented by a 50-pound bag of rice, now enough to feed 420 people in a sitting. One megabyte = one novel = 50-pound bag.
1024 times the size of a megabyte is a gigabyte (GB). The hard drive in the computer you’re sitting in front of right now is probably many gigabytes — mine is 150GB. One gigabyte would hold almost anyone’s personal library, over 1,000 books. In the rice metaphor, we now have two large shipping containers, the kind you put on an ocean-going vessel, full of rice. So, sitting inside just my laptop is the equivalent of 300 of these shipping containers; to search through my hard drive for a few words of text is exactly as though we dyed a few grains of rice blue, and then sifted through 300 shipping containers to find those grains.
Next up the scale is the terabyte (TB), or 1024 gigabytes. A terabyte would hold nearly every word of every book in the large library of my alma mater, Brigham Young University. Amazingly, you can buy laptops with terabyte hard drives. Continuing our metaphor, that’s 2,048 shipping containers of rice (1024 gigabytes at two containers each), bigger than a huge container ship, enough to feed a meal to everyone in the European Union.
Finally, we get to the petabyte. That’s enough storage for over a billion books, or 15 copies of the worldwide stock of books. In our rice analogy, you’ve now got 210 of the largest container ships ever built, enough rice to feed everyone on the planet 80 bowls, or enough to cover most of the city of New York in three feet of rice.
Now remember: Google processes not one petabyte every day, but 20. That is roughly 4,000 ships bursting at the seams, 1,600 bowls for everyone on the planet, or New York 60 feet deep in the stuff.
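If you want to play with the metaphor yourself, the scaling can be sketched in a few lines of Python. Every ratio below (two containers per gigabyte, 210 ships per petabyte, 80 bowls and three feet of rice per petabyte) is taken from the descriptions above; the names are mine, and none of this is an independent measurement of rice. The ship count comes out a little above the rounded figure.

```python
# Storage units as used above: each step up is a factor of 1024.
UNITS = ["KB", "MB", "GB", "TB", "PB"]
BYTES_PER = {unit: 1024 ** (i + 1) for i, unit in enumerate(UNITS)}

# Ratios from the rice metaphor.
CONTAINERS_PER_GB = 2
SHIPS_PER_PB = 210
BOWLS_PER_PERSON_PER_PB = 80
FEET_OF_RICE_PER_PB = 3

LAPTOP_GB = 150
PETABYTES_PER_DAY = 20

laptop_containers = CONTAINERS_PER_GB * LAPTOP_GB
containers_per_tb = CONTAINERS_PER_GB * (BYTES_PER["TB"] // BYTES_PER["GB"])

# One byte = one grain of rice.
grains_per_day = PETABYTES_PER_DAY * BYTES_PER["PB"]

ships_per_day = SHIPS_PER_PB * PETABYTES_PER_DAY
bowls_per_person = BOWLS_PER_PERSON_PER_PB * PETABYTES_PER_DAY
feet_deep = FEET_OF_RICE_PER_PB * PETABYTES_PER_DAY

print(f"laptop: {laptop_containers} containers")
print(f"terabyte: {containers_per_tb:,} containers")
print(f"20 PB: {grains_per_day:,} grains, {ships_per_day:,} ships, "
      f"{bowls_per_person:,} bowls each, {feet_deep} feet deep")
```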
Is anyone not amazed at what Google does now?