Last week I was at the Minnesota Health Information Institute in Minneapolis, MN, presenting on how to brave the perfect storm of medical image archiving. In this entry I will explain why the data growth rate in health care, if not mitigated by an appropriate archive architecture, will become a problem for IT budgets. I will also discuss why tiered archives are the best solution for performance- and price-efficient long-term medical archiving. In a second entry I will discuss why media independence is a key architectural design criterion for medical image archives.
The first perception we need to discuss is the price decrease of storage media. Some people think it doesn't matter how fast data in health care grows, because Moore's law will take care of it. Unfortunately, this assumption is incorrect.
Data source: Storage Price Index, U.S. Department of Labor Statistics
The above graph contrasts Moore's law with the actual storage price index from the U.S. Department of Labor Statistics (there is a good use of our tax dollars!). Moore's law is based on the prediction that transistor density will double every 18 months, and the assumption commonly derived from it is that prices for computers drop by 50% every 24 months. I started both data sets at the same value of the storage price index. The purple line shows how the price of storage would decrease if storage components did indeed follow Moore's law and halved in price every 24 months. The actual data measured by the U.S. Department of Labor Statistics is depicted in orange - and it confirms that storage prices decline, yet not as fast as Moore's law would suggest. The interpretation of the graph is that Moore's law does not take care of storage growth if demand for storage grows faster than prices fall.
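To make the Moore's law reference line concrete, here is a minimal Python sketch of a price that halves every 24 months. The function name and the starting index value of 100 are my assumptions for illustration only; the measured values of the storage price index are not reproduced here.

```python
def halving_price(start_price: float, months: float, halving_period: float = 24.0) -> float:
    """Price after `months` if it halves once every `halving_period` months."""
    return start_price * 0.5 ** (months / halving_period)


if __name__ == "__main__":
    # Hypothetical starting index value of 100.
    for months in (0, 24, 48, 72):
        print(months, halving_price(100.0, months))
```

Any measured price series that stays above this curve is declining more slowly than the Moore's law assumption.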
And that is indeed the case. First of all, medical practice today relies more than ever on digital imaging procedures. Levin and Rao (2004), for example, point out how many imaging procedures are now performed by non-radiologists. They also estimate that as of 2003, about $16 billion was spent on medically unnecessary imaging procedures - but that is not the topic of this article. Cardiology and pathology are now major contributors to the growing volume of medical image data.
The second aspect is that new imaging devices like CT scanners produce 64 (Siemens Somatom Sensation 64) or 256 slices (Toshiba, GE) per second, versus the conventional 4, 16 or 32 slices of older and current scanners. A 256-slice scanner has fantastic image quality, but it also produces 64 times the image data per second of a 4-slice scanner.
The third factor, besides more imaging areas and more images per study, is resolution: the resolution per image is increasing as well. UCSF, for example, is performing brain imaging with a new 7 Tesla MR scanner, which produces stunning image quality - and higher resolution and thus dramatically more data. Digital pathology will create 5 GB studies, roughly 25 times the size of an average CT or MRI study.
In conclusion, storage prices are falling, but more slowly than Moore's law suggests, while data growth rates outpace Moore's law because of new, better modalities, higher resolution and new digital procedures. Long retention times are a constant in this equation. Pediatric imagery, for example, has to be retained for the life of the patient. My experience, after speaking with many directors of radiology and hospital CIOs, is that most hospitals actually do not delete any images, if only for fear of litigation in which access to an image would be critical. With more image data going into the archive and none being deleted, archives will grow faster than Moore's law can compensate for, and thus become a major challenge for the budgets of hospital CIOs.
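The budget argument can be sketched in a few lines of Python. All numbers in the usage example (first-year volume, growth rate, price, price decline) are hypothetical placeholders that I picked to show the mechanism; they are not measurements from any hospital or from the storage price index.

```python
def yearly_spend(first_year_tb, growth_per_year, first_year_price_per_tb,
                 price_decline_per_year, years):
    """Cost of storing each year's newly created image data.

    Volume grows by `growth_per_year` (0.5 = +50%) while the price per TB
    falls by `price_decline_per_year` (0.2 = -20%) every year.
    """
    spend = []
    volume = first_year_tb
    price = first_year_price_per_tb
    for _ in range(years):
        spend.append(volume * price)
        volume *= 1.0 + growth_per_year
        price *= 1.0 - price_decline_per_year
    return spend


if __name__ == "__main__":
    # Hypothetical: 100 TB in year one, +50% data per year, -20% price per year.
    for year, cost in enumerate(yearly_spend(100, 0.5, 1000, 0.2, 5), start=1):
        print(year, round(cost))
```

Whenever (1 + growth) x (1 - price decline) is greater than 1, the yearly spend rises even though storage gets cheaper, and since nothing is deleted, the spend on earlier years' data does not go away either.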
Since the first generation of PACS systems, tiered archives have been used to balance storage cost, retention times and archive performance. Wirth et al. (2006), from the department of clinical radiology at the Ludwig-Maximilians University of Munich, published a very interesting study on the topic. Their paper suggests an algorithm to determine cache sizes for fast access to relevant priors. Some people think tiered archives are a thing of the past and want to keep everything on disk, but given the realities discussed in the first part of this article, even a 100% disk archive needs tiering between fast, high performance disks (which would constitute the cache) and lower cost, high capacity disks. The Wirth et al. study is good news for everyone considering tiered archives. Among n=400 studies "the number of all priors was 7.6±12.3. Of them, 61% were relevant priors with an average age of 203±385 days". It turns out that the age of relevant priors varies with the originating modality, with the mean age of the last relevant prior at 162.3 ± 361.4 days for CR, 212.7 ± 377.7 days for CT and 284.8 ± 453.0 days for MRI. It is therefore not surprising that if the cache can accommodate 24 months of images, 91.4% of all relevant priors can be served out of cache. A 36 month cache would serve 96% of relevant priors, and a 60 month cache would improve the hit rate to 99.4%.
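The three hit rates quoted above can be turned into a small lookup for sizing a cache. This is only a sketch built on those three data points; the function and variable names are mine, and it is not the algorithm from the Wirth et al. paper.

```python
# Cache window in months -> share of relevant priors served from cache (%),
# as quoted above from Wirth et al. (2006).
HIT_RATE_BY_WINDOW = {24: 91.4, 36: 96.0, 60: 99.4}


def smallest_window(target_hit_rate):
    """Smallest quoted cache window (months) that meets the target, else None."""
    for months in sorted(HIT_RATE_BY_WINDOW):
        if HIT_RATE_BY_WINDOW[months] >= target_hit_rate:
            return months
    return None


if __name__ == "__main__":
    for target in (90.0, 95.0, 99.0, 99.9):
        print(target, smallest_window(target))
```

Note the diminishing returns: going from 24 to 36 months of cache adds about 4.6 percentage points of hit rate, while the next 24 months add only about 3.4.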
The conclusion is that building a tiered archive makes a lot of sense. The probability of requiring a relevant prior decreases exponentially with time. Given the requirement to accommodate vast amounts of data, it is hardly justifiable to keep image data older than 36 months on higher cost, fast access cache.
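As a rough illustration of such an age-based rule, the sketch below assigns a study to a tier by its age in whole months. The tier labels "cache" and "archive", the helper names and the 36 month default are assumptions for this example; this is not how SAM policies are actually expressed.

```python
from datetime import date


def months_between(earlier: date, later: date) -> int:
    """Whole months elapsed between two dates."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the last month is not yet complete
    return months


def tier_for_study(study_date: date, today: date, cache_months: int = 36) -> str:
    """Keep studies up to `cache_months` old on fast cache, older ones on the archive tier."""
    return "cache" if months_between(study_date, today) <= cache_months else "archive"


if __name__ == "__main__":
    today = date(2008, 1, 15)
    print(tier_for_study(date(2007, 1, 15), today))
    print(tier_for_study(date(2004, 1, 15), today))
```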
Sun has been building tiered storage solutions for medical image archives for quite some time, based on the Sun StorageTek Storage Archive Manager (SAM). Implementations in medical image archiving include The Cleveland Clinic with a Siemens PACS, the Mayo Clinic with a Teramedica PACS, Novant with a McKesson PACS, King's Daughters Medical Center in Ashland, KY, with a Philips PACS, Greenville Hospital, and the aforementioned Ludwig-Maximilians University in Munich with an AGFA PACS. Overall, we have several hundred tiered medical image archives in the field, in small and very large installations, with one or many modality PACS systems feeding information and, in cases like The Cleveland Clinic and Novant, multiple locations. Based on this experience, we developed an architectural blueprint for tiered medical image archives, the customer ready infinite archive solution.
In this entry I will not go into the details of how SAM works, so here are just some highlights:
- SAM is media independent, which means that we can define different tiers of storage media, local or remote, on which SAM creates copies of the data received from the PACS system.
- Traditionally, one of these tiers has been and still can be tape. The way SAM is architected allows transparent access to data on tape, which means the data remains visible to applications like a PACS while it is actually archived on tape (which consumes no energy at rest). However, we also offer very low cost disk options based on the X4500 server, which offer incredible price performance at prices under $1.29 per GB in a 48 TByte configuration that takes up just four rack units.
- SAM allows data migration from one medium to another, which is essential for the long term preservation of data. Given that medical image data retention times exceed the MTBF of hard disks and disk systems, data migration needs to be an essential element of medical image archive architectures. This is going to be the focus of a separate entry.
In conclusion, tiered storage archives are an efficient way to create enough capacity for the growing demand of medical image archives. SAM is an open source Sun technology that allows the design of interoperable, media independent, multi-tier archives that scale to the demands of medical image archiving, remain affordable by using low cost components, and maintain high reliability and availability of data.
-----------------
Levin & Rao (2004), Turf wars in radiology, Journal of the American College of Radiology, 1, 3
Wirth, Treitl, Mueller-Lisse, Riege, Mittermeier, Pfeiffer & Reiser (2006), Hard disk online caches in picture archiving systems archives: how big is beautiful?, European Radiology, 16: 1847-1853



