Hierarchical Storage Management (HSM)

Although “storage” is generally thought of in terms of memory, disk, tape and so on, there is a more general concept of storage that has nothing to do with the constraints imposed by physical media. Storage in this broad sense simply holds objects. Mass storage systems based on the idea of a storage hierarchy are one implementation of this general concept. The figure ‘Characteristics of Network Magnetic Disk File System’ shows the usage patterns of files in a network. It shows clearly that some files go unused for long periods and that only a small proportion of files are accessed frequently. This pattern suggests storing files in a hierarchy, with frequently used files on fast-access (expensive) media and infrequently used files on slow-access (cheap) media. The usual hierarchy is disk (expensive), tape library (cheap) and shelf storage (cheaper).
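The placement rule described above can be sketched in a few lines of Python. This is only an illustration: the function name choose_tier and the 30-day and 365-day thresholds are assumptions made for the example, not values taken from any particular HSM product.

```python
def choose_tier(days_since_access, disk_days=30, tape_days=365):
    """Pick a storage tier from the time since a file was last used.

    The thresholds are illustrative assumptions; a real HSM system
    would make them site-configurable.
    """
    if days_since_access < 0:
        raise ValueError("days_since_access must be non-negative")
    if days_since_access <= disk_days:
        return "disk"            # fast access, expensive
    if days_since_access <= tape_days:
        return "tape library"    # slower access, cheap
    return "shelf storage"       # slowest access, cheaper still
```

A file read three days ago would stay on disk under these assumed thresholds, while one untouched for several years would be a candidate for shelf storage.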

HSM Figure

In the last few years, the cost per unit of capacity of disk storage has fallen dramatically. Over the same period, the speeds and capacities of magnetic tapes have increased by equally impressive margins. Although these developments have not changed the storage hierarchy, they have influenced the point at which HSM systems are deployed: in many cases the disk cache can be extended to sizes that were previously cost-prohibitive.

Virtual Disk

The simplest kind of HSM system is often known as the Virtual Disk. It provides a method of expanding the online storage of a single (usually large and fast) computer. Its well-known counterpart, virtual memory, operates on the basic mechanism of a page fault and gives programs executing on a host computer an apparently unlimited address space. The virtual disk likewise operates on the principle of a file fault and provides an apparently unlimited file space. A virtual disk is most useful for a single computer whose existing disk space is becoming exhausted and where it is impractical or uneconomical to keep adding disk. Special ‘hooks’ in the host’s operating system kernel detect a ‘disk full’ condition and activate other software routines that clear disk space according to a selectable algorithm. This algorithm is similar in principle to the page replacement algorithms of virtual memory systems.
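The file fault and ‘disk full’ mechanisms can be illustrated with a toy model. The sketch below is a simulation only, not kernel code: the class name VirtualDisk and its methods are hypothetical, sizes are in arbitrary units, and least-recently-used (LRU) selection is assumed as one possible choice for the selectable algorithm.

```python
from collections import OrderedDict


class VirtualDisk:
    """Toy model: a small online disk backed by a large archive tier."""

    def __init__(self, capacity):
        self.capacity = capacity      # size of the online disk
        self.online = OrderedDict()   # name -> size, least recently used first
        self.archive = {}             # name -> size, files migrated off disk
        self.file_faults = 0

    def _used(self):
        return sum(self.online.values())

    def _make_room(self, needed):
        # 'Disk full' handler: migrate least recently used files to the
        # archive until the incoming file fits (LRU is an assumption).
        while self.online and self._used() + needed > self.capacity:
            name, size = self.online.popitem(last=False)
            self.archive[name] = size

    def write(self, name, size):
        if size > self.capacity:
            raise ValueError("file is larger than the online disk")
        self.online.pop(name, None)   # overwriting replaces any old copy
        self.archive.pop(name, None)
        self._make_room(size)
        self.online[name] = size

    def read(self, name):
        if name in self.online:
            self.online.move_to_end(name)   # mark as most recently used
            return "online"
        if name in self.archive:
            # File fault: recall the file from the archive before use.
            self.file_faults += 1
            size = self.archive.pop(name)
            self._make_room(size)
            self.online[name] = size
            return "recalled"
        raise FileNotFoundError(name)
```

With a capacity of 100, writing a 60-unit file and a 30-unit file, reading the first, and then writing a 40-unit file pushes the 30-unit file to the archive, because it is the least recently used. Reading that file afterwards triggers a file fault and a recall, which in turn migrates another file to make room.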

Virtual Disks

Client Virtual Disk

Client Virtual Disk extends the virtual disk concept to the clients of a file server. Studies have shown that over 80% of the data in a given system may be removed from expensive online storage without significantly sacrificing performance. This principle holds for networked machines and their data just as it does for a single machine. Therefore, much of the data on individual network nodes can be moved off local disks and placed on a common networked storage node.
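To get a rough feel for how much of each node's data might be moved to the common storage node, one could total the bytes held in files that have been idle longer than some threshold. The sketch below assumes a hypothetical function reclaimable_fraction, a 30-day threshold and made-up file inventories. It is not a measurement and does not reproduce the studies cited above.

```python
def reclaimable_fraction(files, idle_days_threshold):
    """Fraction of a node's bytes in files idle for at least the threshold.

    files: list of (size_bytes, days_since_last_access) pairs.
    """
    total = sum(size for size, _ in files)
    if total == 0:
        return 0.0
    idle = sum(size for size, days in files if days >= idle_days_threshold)
    return idle / total


# Made-up inventories for two hypothetical client nodes.
nodes = {
    "client-a": [(500, 2), (1500, 90), (2000, 400)],
    "client-b": [(1000, 1), (3000, 200)],
}

# Assumed policy: files idle for 30 days or more are migration candidates.
fractions = {name: reclaimable_fraction(files, 30) for name, files in nodes.items()}
```

In this invented data set most of the bytes on both nodes sit in files that have not been touched for a month or more, which is the kind of pattern that makes a shared storage node attractive.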

 

Copyright 2010 Infotech SA Inc.