
Latency – How far have we come in a decade?

I’d like to bring your attention to Google Fellow Jeff Dean’s keynote talk at LADIS 2009, “Designs, Lessons and Advice from Building Large Distributed Systems”. The slides (PDF) can be found here. In this talk a single server has 16 GB of DRAM and 2 TB of storage. This is Google’s building block for a large distributed architecture that now exceeds 500K nodes. The relevant aspect for this post is slide 24, “Numbers Everyone Should Know”, a summary of system-level latencies. Google employs commodity servers and SATA drives for storage. The Google File System (GFS) is architected, as Jeff noted, on the principle that “Things will crash. Deal with it!” Failure is dealt with by a large distributed system that copies information 3 to 8 times. GFS does not employ RAID for data protection; protection is inherent in the system architecture.
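As a rough reference, the widely circulated figures from that slide can be sketched as a small lookup table. The values below are approximate orders of magnitude as commonly cited from the 2009 talk, not measurements of any specific machine:

```python
# Approximate latencies commonly cited from Jeff Dean's
# "Numbers Everyone Should Know" (LADIS 2009).
# Rough orders of magnitude only, in nanoseconds.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Branch mispredict": 5,
    "L2 cache reference": 7,
    "Main memory reference": 100,
    "Send 2K bytes over 1 Gbps network": 20_000,
    "Read 1 MB sequentially from memory": 250_000,
    "Round trip within same datacenter": 500_000,
    "Disk seek": 10_000_000,
    "Read 1 MB sequentially from disk": 30_000_000,
}

def ratio(slow: str, fast: str) -> float:
    """How many times slower one operation is than another."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]

for name, ns in LATENCY_NS.items():
    print(f"{name:40s} {ns:>14,.1f} ns")

# A disk seek costs about 100,000 main-memory references.
print(f"disk seek / memory ref: {ratio('Disk seek', 'Main memory reference'):,.0f}x")
```

Printing the ratios rather than the raw numbers is what makes the point of the slide: the gaps between adjacent tiers span several orders of magnitude.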

As highlighted in the Leveraging the Cloud for Green IT CMG paper, large distributed systems like Amazon’s AWS are also based on commodity servers and storage. A decade ago the server of choice for performance benchmarking was the dual Pentium III with the 440BX chipset. This machine consistently gave the performance team the highest throughput, in both MB/s and IOPS. So, how far have we come in a decade? Let’s look at latency, from the processor to rotating-media storage, by comparing the PIII/440BX system to Jeff’s “Numbers Everyone Should Know”. Keep in mind that Jeff’s numbers are not high-end enterprise numbers, but the building blocks of some distributed cloud offerings. Later we will look at bandwidth.

Now, your first reaction is that today’s systems perform much better, so how could this be true? Today’s architectures do have next-generation microarchitectures, multiple cores, wider data paths, deeper buffers, and so on. Parallelism is much greater, but under light loads the latency relationships have, within orders of magnitude, changed little over the years.
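The light-load comparison described above can be sketched as a tiny micro-benchmark: time a 1 MB sequential read from memory against a 1 MB read from a file. This is a hypothetical illustration, not the methodology behind the original comparison, and the file read will usually be flattered by the OS page cache unless caches are dropped first:

```python
import os
import tempfile
import time

def time_it(fn, repeats=5):
    """Return the best wall-clock time over several runs, in nanoseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter_ns()
        fn()
        best = min(best, time.perf_counter_ns() - start)
    return best

ONE_MB = 1024 * 1024
buf = bytes(ONE_MB)

# Read 1 MB sequentially from memory (a full copy of the buffer).
mem_ns = time_it(lambda: bytes(memoryview(buf)))

# Read 1 MB sequentially from a file. Note: the page cache will
# likely serve repeat reads, so this understates true disk latency.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(ONE_MB))
    path = f.name

def read_file():
    with open(path, "rb") as fh:
        fh.read()

disk_ns = time_it(read_file)
os.unlink(path)

print(f"memory read: {mem_ns:,} ns    file read: {disk_ns:,} ns")
```

Taking the best of several runs reduces scheduling noise; under light load it is exactly these per-operation latencies, rather than aggregate throughput, that have improved so little.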

A glaring gap exists between the server complex and external storage. More on this later.
