"One rack, 100 TB" goes the tagline for Greenplum's new data warehouse appliance, powered by Sun's x4500 hybrid data server. Those of you who have studied my Tokyo Tech blogs will note that with the x4500 you can configure up to 240 TB of raw storage in a rack, so what is so special about 100 TB? And HPC folks don't generally spend much time thinking about data warehouses, do they? Actually, I think Greenplum's data warehouse appliance might find its way into quite a few HPC centers before long. The performance is striking: scan 1 TB of data in 60 seconds. Pretty amazing. Or is it? Think about the design of the x4500: 24 TB of disk plus a four-core x64 server in the same 4RU chassis. Every unit you add brings its own processors along with its disks, so the scan work is spread across more CPUs as the data grows. That means you can scale your search performance simply by adding x4500 units.
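
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It is only arithmetic on the figures quoted above, not anything from Greenplum or Sun. The function names are made up for illustration, a terabyte is taken as 10^12 bytes, and two things are pure assumptions on my part: that the 60 second figure describes a full rack of x4500 units, and that scan rate grows linearly with the number of units.

```python
TB = 10**12  # decimal terabyte in bytes (assumption: vendor-style units)

def aggregate_scan_rate(bytes_scanned, seconds):
    """Bytes per second implied by a quoted scan figure."""
    return bytes_scanned / seconds

def units_per_rack(raw_tb_per_rack, raw_tb_per_unit):
    """How many storage servers fit the quoted raw capacity of a rack."""
    return raw_tb_per_rack // raw_tb_per_unit

def scan_seconds(data_bytes, per_unit_rate, units):
    """Scan time if every unit scans its own share in parallel.

    Assumes perfectly linear scaling, which real systems only approximate.
    """
    return data_bytes / (per_unit_rate * units)

# Figures from the post: 1 TB in 60 s, 240 TB raw per rack, 24 TB per x4500.
rate = aggregate_scan_rate(1 * TB, 60)   # roughly 16.7 GB/s in aggregate
units = units_per_rack(240, 24)          # 10 units in a fully loaded rack

# Assumption: the quoted rate belongs to one full rack of 10 units.
per_unit_rate = rate / units

print(f"aggregate rate: {rate / 1e9:.1f} GB/s")
print(f"units per rack: {units}")
print(f"1 TB on {units} units:     {scan_seconds(1 * TB, per_unit_rate, units):.0f} s")
print(f"1 TB on {2 * units} units: {scan_seconds(1 * TB, per_unit_rate, 2 * units):.0f} s")
```

Under those assumptions, doubling the number of units halves the time to scan the same terabyte, which is the whole point of putting the processors next to the disks.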

Now, most HPC applications store a lot of data and at some point have to search through it. True, today's HPC applications for the most part don't make use of standard databases and data warehouses, in part because previous commercial data warehouse designs were too expensive and too slow for HPC applications, so HPC developers wrote their own custom applications. And built their own custom supercomputers too. Welcome to the world of general purpose servers and general purpose applications. If I were building a new HPC center today to tackle a new problem, I would seriously look at Greenplum's data warehouse appliance before embarking on a multi-year custom software development project.

The same can be said for Web 2.0 projects, the initial target of Greenplum's appliance. Google may be able to afford to build up its own custom programming staff and hand-assemble its own data warehouse, but for how long? Will we someday look back on Google as the Cray of the Web 2.0 software world?

This blog copyright 2009 by marchamilton