The Noise in my Head

Trying to find the signal. Since 1960.

Visualizing Huge Amounts of Data June 26, 2008

Filed under: Technology — mfmosman @ 11:12 am

Large numbers quickly become unfathomable to us.  There is the old story of the inventor of chess:

When the creator of the game (in some tellings, an ancient Indian mathematician, in others, a legendary brahmin named Sessa or Sissa) showed his invention to the ruler of the country, the ruler was so pleased that he gave the inventor the right to name whatever he wanted as his prize for the invention. The man, who was very wise, asked the king this: that for the first square of the chess board, he would receive one grain of wheat (in some tellings, rice), two for the second one, four on the third one and so forth, doubling the amount each time. The ruler, who was not strong in math, quickly accepted the inventor’s offer, even getting offended by his perceived notion that the inventor was asking for such a low price, and ordered the treasurer to count and hand over the wheat to the inventor. However, when the treasurer took more than a week to calculate the amount of wheat, the ruler asked him for a reason for his tardiness. The treasurer then gave him the result of the calculation, and explained that it would be impossible to give the inventor the reward.

The amount of wheat is over 18 quintillion grains (a quintillion is a billion billion), or approximately 80 times what would be produced in one harvest, at modern yields, if all of Earth’s arable land could be devoted to wheat.
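If you want to check the arithmetic yourself, here is a minimal Python sketch.  It assumes one grain on the first square and a doubling on each of the 64 squares; the function name is mine, made up for illustration.

```python
def grains_on_board(squares: int = 64) -> int:
    """Total grains when square 1 holds 1 grain and each square doubles the last."""
    # 1 + 2 + 4 + ... + 2**(squares - 1) is a geometric series equal to 2**squares - 1.
    return sum(2 ** i for i in range(squares))


total = grains_on_board()
print(f"{total:,}")  # 18,446,744,073,709,551,615
```

That is the "over 18 quintillion" figure: the last square alone holds more than nine quintillion grains.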

Another example of how quickly large numbers can boggle the mind: suppose you put 25 pennies into a jar, and labeled them 1 to 25.  The probability of reaching into the jar and pulling out the penny marked number one is, of course, one in 25.  Having done that, the probability of reaching into the jar and pulling out penny number two is one in 24, since there are 24 left.  What is the probability of pulling out one and two in order?

The chance of pulling penny two once penny one is already out is called a “conditional probability,” and you get the chance of both happening in order by multiplying the two probabilities together.  So the answer is: one in (25 x 24), or one in 600.  Put another way: while you might get lucky, it will on average take you 600 tries to pull off this feat (without cheating).

Here’s where it gets wild: how long would it take you to pull out all 25 pennies in order?  A long, long time.  One in 25 x 24 x 23 x 22…  It ends up as over 15 septillion, or 15 followed by twenty-four zeros.  To put that in context: if every person currently living on earth pulled pennies out of jars, all over the world, without breaking for sleep, and averaged about 15 attempts per minute, we would all need to be immortal: on average, it would take us about 289 million years before someone hit the jackpot.
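Here is a rough Python sketch of the penny arithmetic, for anyone who wants to play with the numbers.  The world population of 6.8 billion is my assumption (roughly the 2008 figure), and the constant and function names are made up for illustration; the 15 attempts per minute comes from the paragraph above.

```python
import math

PENNIES = 25
WORLD_POPULATION = 6_800_000_000      # assumption: roughly the 2008 population
ATTEMPTS_PER_PERSON_PER_MINUTE = 15   # figure used in the text
MINUTES_PER_YEAR = 60 * 24 * 365


def ordered_draws(total: int, drawn: int) -> int:
    """Number of equally likely ordered ways to pull `drawn` pennies from `total`."""
    return math.perm(total, drawn)


first_two = ordered_draws(PENNIES, 2)            # 25 x 24
all_in_order = ordered_draws(PENNIES, PENNIES)   # 25 x 24 x 23 x ... x 1

attempts_per_year = (
    WORLD_POPULATION * ATTEMPTS_PER_PERSON_PER_MINUTE * MINUTES_PER_YEAR
)
years = all_in_order / attempts_per_year

print(first_two)            # 600
print(f"{all_in_order:,}")  # 15,511,210,043,330,985,984,000,000
print(f"{years / 1e6:.0f} million years")
```

Under those assumptions the result lands close to the 289 million years quoted above.  Change the population or the attempt rate and the answer moves, but it never gets anywhere near a human lifetime.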

In a similar “large number” story, the blog Managed Networks recently devoted a post to helping us visualize the raw size of what Google does every day.  It’s amazing.

In a recent technical paper, Google described a programming model, called MapReduce, for processing huge amounts of data.  The paper revealed the size of the datasets MapReduce handles, and that Google executes 100,000 MapReduce jobs per day.  Long story short: Google processes about 20 petabytes of data every day.

I know.  This doesn’t mean anything to you.  Managed Networks helps you visualize:

One byte is essentially a letter of the alphabet.  Picture that letter as one grain of rice.  One byte = one letter = one grain of rice.

One kilobyte (often referred to as 1K) is 1024 bytes.  It’s a few paragraphs of text.  Picture one bowl of rice.  One kilobyte = a few paragraphs = one bowl of rice.

Next up is the megabyte, or MB, which is 1024 kilobytes.  It’s big enough to contain one novel.  It’s represented by a 50-pound bag of rice, enough to feed 420 people in a sitting.  One megabyte = one novel = 50-pound bag.

1024 times the size of a megabyte is a gigabyte (GB).  The hard drive in the computer you’re sitting in front of right now is probably many gigabytes — mine is 150GB.  One gigabyte would hold almost anyone’s personal library, over 1,000 books.  In the rice metaphor, we now have two large shipping containers, the kind you put on an ocean-going vessel, full of rice.  So, sitting inside just my laptop is the equivalent of 300 of these shipping containers; to search through my hard drive for a few words of text is exactly as though we dyed a few grains of rice blue, and then sifted through 300 shipping containers to find those grains.

Next up the scale is the terabyte (TB), or 1024 gigabytes.  A terabyte would hold nearly every word of every book in the large library of my alma mater, Brigham Young University.  Amazingly, you can buy laptops with terabyte hard drives.  Continuing our metaphor, that’s 2,048 shipping containers of rice, bigger than a huge container ship, enough to feed a meal to everyone in the European Union.

Finally, we get to the petabyte (PB), or 1024 terabytes.  That’s enough storage for over a billion books, or 15 copies of the worldwide stock of books.  In our rice analogy, you’ve now got 210 of the largest container ships ever built, enough rice to feed everyone on the planet 80 bowls, or enough to cover most of the city of New York in three feet of rice.

Now remember: Google processes not one petabyte every day, but 20: more than 4,000 ships bursting at the seams, 1,600 bowls for everyone on the planet, or New York 60 feet deep in the stuff.
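The unit ladder is easy to sketch in Python.  This is only an illustration: the names are mine, each step up is taken as 1024 times the one before, and the two-containers-per-gigabyte figure is borrowed from the rice metaphor above.

```python
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte", "terabyte", "petabyte"]
STEP = 1024  # each unit is 1024 of the one below it

CONTAINERS_PER_GIGABYTE = 2  # from the rice metaphor, not a real measurement


def bytes_in(unit: str) -> int:
    """Number of bytes (grains of rice) in one of the named units."""
    return STEP ** UNITS.index(unit)


def containers_for(gigabytes: int) -> int:
    """Shipping containers of rice needed to represent this many gigabytes."""
    return gigabytes * CONTAINERS_PER_GIGABYTE


print(containers_for(150))                # 300  (my 150GB laptop)
print(containers_for(STEP))               # 2048 (one terabyte)
print(f"{20 * bytes_in('petabyte'):,}")   # 22,517,998,136,852,480
```

That last line is the number of grains of rice in Google's day: about 22.5 quadrillion.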

Is anyone not amazed at what Google does now?

 

My Time Management System June 18, 2008

Filed under: Technology, webapps — mfmosman @ 8:48 pm

A couple of people have asked me lately about how I get things done.  I’m my own boss, often working out of my home, so being able to stay on task is critical.  My mechanism for getting my work done is fairly high-tech, but pretty simple in practice.  Here it is:

1.  I capture everything.  You cannot manage your time without ensuring that every last thing you expect yourself to do is captured in a single place.  There is really no exception to this rule.  You have to have a place where every task is captured and every appointment is captured.

To do that, you must have your capture mechanism on your person at all times.  ALWAYS.  So, I asked myself: what is always on me?  And the answer is: my phone.  How can I use my phone as a capture mechanism for all tasks?  Here’s how:

First, I signed up for a task-management web application (Remember the Milk), and then I signed up for a calendar application (Google Calendar).  Remember the Milk is nothing particularly special, but it’s free and it’s easy to use, and (most importantly) it has an interface with both Jott and Google Calendar (more on these coming).  You could use a different to-do list application that interfaces with Jott, though: Vitalist, Toodledo, etc.  Whatever you like best.

Second, I signed up for a Jott account.  This has been critical.  Jott is an application that is capable of transcribing your voice (over the phone, when you call Jott’s number) into text and adding what you said into a number of web applications.

Jott provides you with a telephone number to call.  It answers, and asks, “Who would you like to Jott?”  You can provide any of several answers (once you set it up on the Jott website), but (importantly) two of the potential answers are, “Remember the Milk,” or “Google Calendar.”  It then transcribes whatever you say, and posts it online in the place you indicated.

If you were to walk up to me while I’m standing in line at the grocery store and say, “Could you email me a recommendation for a couple of good books for the summer?”, I would call Jott and tell it that I wanted to leave a message on Remember the Milk.  It would beep, just like a regular voicemail system, and I’d simply speak into the phone, “Email [your name] book recommendations.”  Then I’d hang up.

By the time I got to my computer, if I were to check my to-dos on Remember the Milk, sure enough, that task would appear there.  Jott would have transcribed it from my voice into text, and sent it to Remember the Milk.

The same process would apply for appointments people make with me: if I’m at my computer, I’ll simply enter it in Google Calendar.  If not, I’ll let Jott do it for me.

2.  First thing every morning, I spend five to fifteen minutes establishing my priorities for the day.  Basically, I separate things into four categories: Things that must happen today (Category A), things where it would be very nice if they happen today (B), things that really won’t happen today but still must be done someday (C), and things that I should really forget about (D).

Then, I take everything that is an A and prioritize those.  The most important things are done first, the less important things are done later in the day.  I actually then put tasks, as well as appointments, on my Google Calendar.  I treat them just like appointments: I estimate how long they’ll take, and I make an appointment with myself to do them.

Category B tasks will end up very late in the day, usually, but they also make the list.  The only real key here is: I do not rest if there is an open Category A task.

I pretty much ignore Category C tasks for the day, and only revisit them if I’ve somehow knocked all of the other stuff off the list.  I then delete the Category D tasks from Remember the Milk, never to be revisited again.  They just weren’t important.
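For the programmers in the audience, the morning triage can be sketched in a few lines of Python.  This is just my own illustration of the A/B/C/D sorting described above, not anything Remember the Milk or Google Calendar provides; the `Task` class, the `importance` field, and `plan_day` are all made-up names.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    category: str        # "A", "B", "C", or "D"
    importance: int = 0  # hypothetical score: higher means do it earlier


def plan_day(tasks):
    """Split tasks into (today, deferred, dropped) following the A/B/C/D rule."""
    # Category A goes first, most important at the top.
    must_do = sorted(
        (t for t in tasks if t.category == "A"),
        key=lambda t: t.importance,
        reverse=True,
    )
    # Category B makes the list, but lands after every A.
    nice_to_do = [t for t in tasks if t.category == "B"]
    deferred = [t for t in tasks if t.category == "C"]  # someday, not today
    dropped = [t for t in tasks if t.category == "D"]   # deleted for good
    return must_do + nice_to_do, deferred, dropped


tasks = [
    Task("Email book recommendations", "B"),
    Task("Finish client proposal", "A", importance=2),
    Task("Pay invoice", "A", importance=5),
    Task("Reorganize bookmarks", "D"),
    Task("Plan fall trip", "C"),
]
today, deferred, dropped = plan_day(tasks)
print([t.name for t in today])
# ['Pay invoice', 'Finish client proposal', 'Email book recommendations']
```

Everything in `today` is what would then get a time slot on the calendar.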

3.  I manage my day by text messages to my phone.  For every appointment I have, and for every one of my day’s top priorities, I receive a text message at the appropriate time.  For this I use Google Calendar.

If you have a Google Calendar account, go to the “Settings” tab, and then click on “Mobile Setup.”  This will allow you to enter your cell phone number and carrier.  From now on, when you enter an appointment, you’ll have an option to send yourself reminders.  By default, you’ll get a text message half an hour before the appointment starts.  Do nothing, and you’ll get that.  But you can also set it to send you the reminder at whatever interval you like (and you can even send multiple alerts).  Set an appointment.  Where you see “reminders,” click on SMS.  Send yourself an alert for whenever you like: 5 minutes before the appointment, a half hour before, whatever.

The option for multiple reminders works great for early morning appointments (where I send myself a reminder 12 hours before, and another at about the time I want to wake up), or appointments where I need to do something (like change into a suit) beforehand (then I can send myself a text an hour or so before, just to make sure I get home, change, and get to the appointment).
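If it helps to see the reminder arithmetic spelled out, here is a small Python sketch.  It does not talk to Google Calendar at all; it only works out when each text message would arrive for a given appointment, and the function name and the sample appointment are made up for illustration.

```python
from datetime import datetime, timedelta


def reminder_times(appointment_start, offsets):
    """Return the moments a reminder would fire, earliest first."""
    return sorted(appointment_start - offset for offset in offsets)


# Hypothetical 7:00 am meeting: one reminder 12 hours before, one an hour before.
meeting = datetime(2008, 6, 27, 7, 0)
for when in reminder_times(meeting, [timedelta(hours=1), timedelta(hours=12)]):
    print(when)
# 2008-06-26 19:00:00
# 2008-06-27 06:00:00
```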

As I noted above: important tasks end up as appointments, so the same process is followed.

Now, I cannot possibly forget anything.  I will receive a text to remind me of every single thing I intend to do today.

It sounds more complicated than it is.  It nets out to: I use Jott to capture everything, I prioritize in the morning, I use Google Calendar’s mobile reminder function to send myself text messages as reminders, and I promise myself that all Category “A” tasks will be done before the day ends.  That’s pretty much it.

If you don’t have a system of your own, you might want to try this one.  Every technology I described above is free, by the way.

 

Update on Fuel Crisis Post: The Air Car June 9, 2008

Filed under: Politics, Science, Technology — mfmosman @ 8:52 am

Just thought you all might enjoy this look at what’s possible.  Thanks to Dennis Phillips for pointing this out to me.

 

Dear Matt: I have an old PC or laptop. What should I do? June 3, 2008

Filed under: Technology — mfmosman @ 8:46 pm

Install Linux on it.  Seriously.

Linux used to be a geek’s operating system, but it’s now a few things that might surprise you:

  1. Easy to use.  Linux looks a lot like Windows, and can even be made to look like a Mac.  Same stuff: Icons, menus, etc.
  2. A better operating system than Windows.  I have often said: Windows “hangs” enough that there is in the tech world an acronym for what happens: the BSOD, or Blue Screen of Death.  Here’s the thing: I don’t know what happens when Linux hangs.  I don’t know what that looks like, and I’ve seen Linux a lot.  (I was CEO of a Linux management company for three years.)
  3. Fully loaded.  Every decent version of Linux comes with a full set of applications, including OpenOffice, a software package (word processor, spreadsheet, presentations, etc.) that is compatible with Microsoft’s products (meaning: you can read files that someone else wrote in, say, Microsoft Word; and you can save files as Word, Excel, or PowerPoint files, so a Windows user can utilize your stuff, too).  Oh, and: if you were taking my advice and using web applications, then who really cares if it has Microsoft-compatible apps anyway?
  4. Capable of running your old hardware just fine, thank you.  Every version of Windows requires more and more horsepower, but Linux is designed to run perfectly on just about any decent computer.
  5. Safe.  There are many times more dangers like viruses and spyware written for Windows than there are for Linux.  All that crap bogging down your system?  Won’t happen.

To be honest, I’d run Linux on any PC or laptop in my house that wasn’t running well.  Go to www.ubuntu.com, and you can download a copy to burn onto a CD (if you know how to do that) or order a copy for essentially the cost of a CD plus shipping and handling (I think it’s $9.95).  It will absolutely revitalize an old system, or breathe life into a newer one that seems to be slowing down.  Do it.