Data Centre Economies of Scale

bytes

At the Waters Power09 event last week, Bob Giffords argued that there are three ‘gravitational’ forces leading to mega data centres and cloud computing.

  • There’s too much data to move; it needs to stay where it’s created.
  • Intra-system and total latency is still a problem, so systems are best co-located with the data.
  • Energy management, he argues, is also a gravitational issue.

It remains true today that moving a terabyte of data over a significant distance by tape and courier is quicker than using a network. This fact shows how much work is needed to get data into a system or cluster. Memory-to-CPU transfer speeds come to seem slow when looking at automated trading solutions. It’s also interesting to observe that the ratio of storage to systems in HPC grids is growing, which is itself an indicator of how new architectures enable enormous data volumes to be trawled; these are a new class of data-rich analytics applications.
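The tape-versus-network claim is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below compares pushing a shipment of data over links of various speeds against an overnight courier; the payload size, link speeds, utilisation figure and 24-hour delivery window are all illustrative assumptions of mine, not figures from the talk.

```python
# Back-of-the-envelope: moving bulk data over a network vs. by courier.
# All figures here are illustrative assumptions, not measurements.

def network_hours(bytes_to_move, link_bits_per_sec, utilisation=0.8):
    """Hours needed to push the data over a link at a sustained utilisation."""
    return (bytes_to_move * 8) / (link_bits_per_sec * utilisation) / 3600

TB = 10**12
payload = 10 * TB      # a modest box of tapes
courier_hours = 24     # overnight delivery, regardless of volume

for label, speed in [("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)]:
    print(f"{label:>10}: {network_hours(payload, speed):7.1f} h vs courier {courier_hours} h")
```

Under these assumptions the courier beats anything slower than a well-utilised gigabit link for a ten-terabyte shipment, and the gap widens linearly with volume.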

I am unsure about the gravitational nature of energy management, unless he’s referring to things like Iceland’s selling of its clean energy as a locational advantage, or Denmark’s decision to localise power generation in order to avoid transmission loss. Funny how even data centre managers are getting to grips with the power-to-cool/power-to-run issues, while everyone ignores how much is dissipated into the air as electricity travels from the generating plant. Certainly, it’s potentially an anti-gravitational effect in the electricity generating network, unless power generation yields massive economies of scale.

ooOOOoo

I wrote the following at about the same time, but obviously failed to bring it across when rescuing the bliki. I inserted this comment in August 2014.

While reading ‘Above the Clouds’ from UC Berkeley, I discovered two nuggets that may shed some light on Bob Giffords’ comments. The first is that

Physics tells us it’s easier to ship photons than electrons; that is, it’s cheaper to ship data over fiber optic cables than to ship electricity over high voltage transmission lines.

They also examine Jim Gray’s work, quoting “Distributed Computing Economics”, which observes that as one combines network, storage and CPU resources there is a break-even point at which remote/cloud processing becomes uneconomic: the network (and storage) costs involved in moving the work to the cloud are too high, and/or the relative cost advantage of the cloud’s economies of scale is insufficient to compensate for the network costs. Gray concludes that one should,

Put the computation near the data
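Gray’s break-even reasoning can be reduced to a one-line rule of thumb. His 2003 paper ballparks the cost of roughly 10,000 CPU instructions at about the same as shipping one byte over the WAN; the sketch below turns that into a crude decision function. The constant is Gray’s dated figure and the function name is my own invention, so treat both as assumptions to recalibrate against current prices.

```python
# Sketch of Gray's break-even rule ("Distributed Computing Economics", 2003):
# roughly 10,000 CPU instructions cost about as much as one byte of WAN traffic.
# The constant is a 2003 ballpark; tune it before relying on it.

BREAK_EVEN_INSTRUCTIONS_PER_BYTE = 10_000

def cheaper_to_ship_to_cloud(instructions_per_byte):
    """Remote processing only pays off for sufficiently CPU-intensive work."""
    return instructions_per_byte > BREAK_EVEN_INSTRUCTIONS_PER_BYTE

print(cheaper_to_ship_to_cloud(100))        # data-heavy scan: keep it near the data
print(cheaper_to_ship_to_cloud(1_000_000))  # CPU-heavy computation: ship it out
```

The point is the asymmetry: a data-heavy scan fails the test by orders of magnitude, which is exactly why the data exerts the ‘gravity’ Giffords describes.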

The Berkeley paper observes that wide-area networking costs have fallen at the slowest rate of all IT resources. This doesn’t consider the fact that the price of electricity has, of course, risen, which is odd, since it’s easier to ship photons…; then again, making electricity consumes real resources, all the time.
