Content-addressable storage
From Wikipedia, the free encyclopedia
Content-addressable storage, also referred to as associative storage or abbreviated CAS, is a mechanism for storing information that can be retrieved based on its content, not its storage location. It is typically used for high-speed storage and retrieval of fixed content, such as documents stored for compliance with government regulations. Roughly speaking, content-addressable storage is the permanent-storage analogue to content-addressable memory.
CAS and FCS
Content Addressable Storage (CAS) and Fixed Content Storage (FCS) are different names for the same type of technology. CAS / FCS technology is intended to store data that does not change over time (fixed content). The difference is that CAS typically exposes the digest (e.g., MD5) generated from the document it refers to, and a digest is subject to collisions (different documents producing the same hash). The main advantage of CAS / FCS technology is that the user does not need to know the location of the actual data or the number of copies. The metaphor for a CAS / FCS is not that of memory and memory locations; the proper metaphor is that of a coat check. The difference is that, with a coat check, once the item has been retrieved it cannot be retrieved again. With CAS / FCS technology a client can retrieve the same data using the same claim check over and over.
History
Between 1990 and 1992 John Canessa, while working with different Hierarchical Storage Management (HSM) software, devised a different metaphor for storing data. The logic behind the technology is as follows. Humans need cues and mechanisms to help them locate data on the disks attached to computers. File systems allow users to organize data in an inverted tree structure starting from a root (e.g., c:\documents\my_company\2009). Leaves in the inverted tree represent the actual data files. Each file in a file system has a unique file ID and occupies a well-defined location on disk. For ease of use, humans limit the number of files in a folder. The names of folders and files, including their extensions, are used as mnemonics for the contents and type of data they hold (e.g., lay_off_workers.doc). All of this is done to help humans organize and locate data.
Computers are not bound by the same rules. As far as a computer is concerned, all files in a file system could be in a single container / folder and given ascending numbers (e.g., block numbers). The idea behind CAS / FCS is to have, at the application level, a unique name associated with the contents of the file. A CAS / FCS system implements a coat check metaphor in which the client presents a Global Unique Identifier (GUID) (or possibly an MD5 digest) and the system returns the associated data as a file or a stream.
Independent of the type of storage used, application layers typically make use of databases to keep track of metadata. They have done this for decades, since the invention of database engines. The only change needed to support a CAS / FCS is to have the field that holds the path to a file (e.g., a deed, invoice, resignation letter, radiology image, car loan application, or the 05:00 PM news on May 22, 2008) hold a GUID instead. Due to hardware and network modifications, the actual path to a file could change over time (e.g., the disk was replaced by a NAS and given a different network path). Instead of storing an absolute file path to the data, the application can store a GUID that is independent of any local or network path. A GUID needs to contain more than a sequential number in order to remain unique across multiple distributed Direct Attached Storage (DAS), Network Attached Storage (NAS) or Storage Area Network (SAN) systems.
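The substitution of a GUID for a file path can be illustrated with a minimal Python sketch. The class `ClaimCheckStore`, its methods, the location strings, and the `metadata_row` layout are all hypothetical names chosen for illustration; they do not describe any product mentioned in this article. The GUID is generated with Python's standard `uuid` module.

```python
import uuid


class ClaimCheckStore:
    """Toy coat check: deposit data, get a GUID, claim the data any number of times."""

    def __init__(self):
        self._location_by_guid = {}  # GUID -> current internal location
        self._data_by_location = {}  # internal location -> stored bytes

    def deposit(self, data: bytes, location: str) -> str:
        guid = str(uuid.uuid4())
        self._data_by_location[location] = data
        self._location_by_guid[guid] = location
        return guid

    def relocate(self, guid: str, new_location: str) -> None:
        # Storage hardware changed: move the bytes, keep the GUID.
        old_location = self._location_by_guid[guid]
        self._data_by_location[new_location] = self._data_by_location.pop(old_location)
        self._location_by_guid[guid] = new_location

    def claim(self, guid: str) -> bytes:
        return self._data_by_location[self._location_by_guid[guid]]


store = ClaimCheckStore()
guid = store.deposit(b"scanned invoice", location="das-disk-1/000001")

# The application's metadata row holds the GUID where it used to hold a path.
metadata_row = {"document_type": "invoice", "document_guid": guid}

# The disk is replaced by a NAS; the metadata row does not change.
store.relocate(guid, new_location="nas-share-7/000001")
retrieved = store.claim(metadata_row["document_guid"])
```

Because only the store knows the internal location, the application's database needs no update when the data moves.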
In April 1992 Ron Rivest, then a professor at the Laboratory for Computer Science at the Massachusetts Institute of Technology (MIT), published “The MD5 Message-Digest Algorithm” as a Request for Comments (RFC 1321). In a nutshell, the MD5 digest is a sequence of 16 bytes (commonly displayed as 32 hexadecimal digits) generated by a mathematical procedure performed on the actual contents of a data file or stream. The MD5 algorithm, when applied to the same data, always returns the same digest.
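The properties described above (a 16-byte digest, shown as 32 hexadecimal digits, that is always the same for the same input) can be checked with Python's standard `hashlib` module. The helper name `md5_digest` is an illustrative choice; the input `b"abc"` and its digest come from the test suite listed in the MD5 RFC.

```python
import hashlib


def md5_digest(data: bytes) -> bytes:
    """Return the raw 16-byte MD5 digest of the given data."""
    return hashlib.md5(data).digest()


digest = md5_digest(b"abc")
hex_digest = digest.hex()

print(len(digest))      # 16
print(len(hex_digest))  # 32
print(hex_digest)       # 900150983cd24fb0d6963f7d28e17f72

# The same data always yields the same digest.
same_again = md5_digest(b"abc") == digest
```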
The first commercial application of a CAS / FCS developed by John Canessa was named Diverse Storage Manager (DSM) by the inventor and Directed Storage Manager (also DSM) by his management. It supported on-line (DAS) and near-line media in the form of Magneto Optical Discs (MOD) in a Hewlett-Packard (HP) automated library. The CAS / FCS supported a Digital Imaging and Communications in Medicine (DICOM) medical archive at the Radiological Society of North America (RSNA) 1993 show in Chicago, Illinois.
In 1994 John Canessa founded Software Engineering Corporation (SENCOR) www.sencor.com.
In 1995 John Canessa spoke a few times with Leonard M. Adleman, at the time professor at Stanford University, one of the founders of RSA and contributor to the MD5 algorithm, to get his insights on the uniqueness of the MD5 digest. Dr. Adleman responded as a true scientist by stating that, for all practical purposes, generating the same MD5 digest for two different documents was mathematically highly improbable, but possible. Because of this, the CAS / FCS software products developed by John Canessa have always had a unique identifier in addition to a hash value (e.g., an MD5 digest).
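One way to picture an identifier that does not rely on the digest being unique is the Python sketch below. It is an assumption-laden illustration, not the design of any SENCOR product: the `ClaimCheck` record and the `issue_claim_check` function are hypothetical names, and the GUID comes from Python's `uuid` module.

```python
import hashlib
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimCheck:
    guid: str     # unique per stored item, independent of content
    md5_hex: str  # digest of the content, usable for integrity checks


def issue_claim_check(data: bytes) -> ClaimCheck:
    return ClaimCheck(guid=str(uuid.uuid4()), md5_hex=hashlib.md5(data).hexdigest())


first = issue_claim_check(b"same bytes")
second = issue_claim_check(b"same bytes")

# Equal digests (here because the content is equal; a collision would look
# the same) still yield distinct claim checks, because the GUIDs differ.
digests_match = first.md5_hex == second.md5_hex
checks_differ = first != second
```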
In 1996 SENCOR spun off a marketing company named FileLink with the intent of selling products based on the SENCOR CAS / FCS technology. FileLink, and later SENCOR, showed DICOM archives at RSNA featuring the second generation of CAS / FCS technology. By then, the two companies had sold hundreds of CAS / FCS systems.
In March 1998 Ron Anderson wrote an article titled “Storing Smart Saves Space” in the now defunct Byte Magazine (www.byte.com). The article described the operation and technical advantages of a CAS / FCS system.
In the late 1990s FileLink was approached by EMC (www.emc.com) to license / purchase the rights to the CAS / FCS software Backup and Archive Manager (BAM). Management was not able to reach a mutually satisfactory agreement.
In the late 1990s Paul Carpentier and Jan van Riel, while working at the Belgian startup FilePool, coined the term Content Addressable Storage (CAS).
By the year 2000, SENCOR had implemented a few commercial products using its CAS / FCS technology to process car loans, manage video for broadcasting, and manage video clips for investing purposes. Today there are over 2,000 CAS implementations worldwide using the SENCOR CAS product supporting on-line, near-line and off-line storage.
In 2001 EMC acquired FilePool whose main product became the Centera platform.
In May 2003 John Canessa submitted to the Network Working Group an Informational draft titled “Fixed Content Storage (FCS) Application Programming Interface (API)” (http://www.watersprings.org/pub/id/draft-canessa-fcs-api-00.txt). At the time it represented the third generation of the CAS / FCS technology.
The Storage Networking Industry Association (SNIA) www.snia.org finalized the first pass on a C programming API for CAS / FCS using XML named “Information Management – extensible Access Method (XAM) – Part 2: C API Version 1.0 TECHNICAL POSITION July 9, 2008”.
John Canessa is about to start work on the fourth generation of the CAS / FCS technology.
Pros and Cons
CAS works most efficiently on data that does not change often. It is of particular interest to large organizations that must comply with document-retention laws, such as Sarbanes-Oxley. In these corporations a large volume of documents will be stored for as much as a decade, with no changes and infrequent access. CAS is designed to make searching for a given document's content very quick, and it provides an assurance that the retrieved document is identical to the one originally stored. (If the documents were different, their content addresses would differ.) In addition, since data is stored in a CAS system by what it contains, there is never a situation where more than one copy of an identical document exists in storage. By definition, two identical documents have the same content address, and so point to the same storage location.
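The two properties described above, that identical documents share one stored copy and that a retrieved document can be checked against its address, can be sketched in a few lines of Python. The `ContentStore` class and its method names are hypothetical, and SHA-256 from `hashlib` is used here as the hash function purely as an assumption for the sketch; the article itself discusses MD5.

```python
import hashlib


class ContentStore:
    """Toy in-memory store addressed by the SHA-256 digest of the content."""

    def __init__(self):
        self._blobs = {}  # content address -> document bytes

    def put(self, document: bytes) -> str:
        address = hashlib.sha256(document).hexdigest()
        # Identical content maps to the same address, so it is stored once.
        self._blobs.setdefault(address, document)
        return address

    def get(self, address: str) -> bytes:
        document = self._blobs[address]
        # Recompute the address to confirm the document is unchanged.
        if hashlib.sha256(document).hexdigest() != address:
            raise ValueError("stored document does not match its content address")
        return document

    def copies_stored(self) -> int:
        return len(self._blobs)


store = ContentStore()
first = store.put(b"retention policy, final")
second = store.put(b"retention policy, final")  # same content, stored again
other = store.put(b"retention policy, draft")   # different content
```

Here `first` and `second` are the same address, and the store holds two blobs rather than three.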
For data that changes frequently, CAS is not as efficient as location-based addressing. In these cases, the CAS device would need to continually recompute the address of data as it was changed, and the client systems would be forced to continually update information regarding where a given document exists. For random access systems, a CAS would also need to handle the possibility of two initially identical documents diverging, requiring a copy of one document to be created on demand.
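The cost of frequent changes can be seen in a small Python sketch. Every name here (`content_address`, `blobs`, `catalog`, `save`) is hypothetical, and SHA-256 is an assumed choice of hash function. Each edit produces a new address, and the client-side record of where the document lives has to be updated every time.

```python
import hashlib

blobs = {}    # content address -> bytes (the store)
catalog = {}  # client-side record: document name -> current content address


def content_address(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def save(name: str, data: bytes) -> str:
    address = content_address(data)  # recomputed on every change
    blobs[address] = data
    catalog[name] = address          # the client must track the new address
    return address


v1 = save("report", b"draft 1")
v2 = save("report", b"draft 2")  # an edit yields a different address
```

After the second call, `catalog["report"]` points at `v2`, while the blob stored under `v1` is still present in the store.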
Typical Implementation
Paul Carpentier and Jan van Riel coined the term CAS while working at a company called FilePool in the late 1990s. FilePool was acquired in 2001 and became the underpinnings of the first commercially available CAS system, which was introduced as EMC's Centera platform[1]. Carpentier and van Riel are now working together again at Caringo, which has introduced advancements in CAS technology with the CAStor content storage software.
The Centera CAS system consists of a series of networked nodes (1-U servers running Linux), divided between storage nodes and access nodes. The access nodes maintain a synchronized directory of content addresses and the corresponding storage node where each address can be found. When a new data element, or blob (binary large object), is added, the device calculates a hash of the content and returns this hash as the blob's content address.[2] As mentioned above, the hash is searched for to verify that identical content is not already present. If the content already exists, the device does not need to perform any additional steps; the content address already points to the proper content. Otherwise, the data is passed off to a storage node and written to the physical media.
When a content address is provided to the device, it first queries the directory for the physical location of the specified content address. The information is then retrieved from a storage node, and the actual hash of the data recomputed and verified. Once this is complete, the device can supply the requested data to the client. Within the Centera system, each content address actually represents a number of distinct data blobs, as well as optional metadata. Whenever a client adds an additional blob to an existing content block, the system recomputes the content address.
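The write and read paths described above can be sketched as follows. This is a toy model under stated assumptions, not Centera's actual software or API: the `StorageNode` and `AccessNode` classes, the least-loaded placement rule, and the use of SHA-256 are all choices made for illustration, and the sketch ignores metadata and multi-blob content blocks.

```python
import hashlib


class StorageNode:
    def __init__(self, name: str):
        self.name = name
        self.media = {}  # content address -> blob, standing in for physical media


class AccessNode:
    def __init__(self, storage_nodes):
        self.storage_nodes = storage_nodes
        self.directory = {}  # content address -> StorageNode holding the blob

    def write(self, blob: bytes) -> str:
        address = hashlib.sha256(blob).hexdigest()
        if address not in self.directory:
            # New content: hand it to a storage node (least loaded, by assumption).
            node = min(self.storage_nodes, key=lambda n: len(n.media))
            node.media[address] = blob
            self.directory[address] = node
        # Existing content needs no further work; the address already resolves.
        return address

    def read(self, address: str) -> bytes:
        node = self.directory[address]  # query the directory for the location
        blob = node.media[address]      # retrieve from the storage node
        if hashlib.sha256(blob).hexdigest() != address:
            raise IOError("stored blob no longer matches its content address")
        return blob


nodes = [StorageNode("storage-1"), StorageNode("storage-2")]
access = AccessNode(nodes)
address = access.write(b"x-ray image bytes")
duplicate = access.write(b"x-ray image bytes")  # already present, nothing written
data = access.read(address)
```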
To provide additional data security, the Centera access nodes, when no read or write operation is in progress, constantly communicate with the storage nodes, checking the presence of at least two copies of each blob as well as their integrity. Additionally, they can be configured to exchange data with a different, e.g. off-site, Centera system, thereby strengthening the precautions against accidental data loss.
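A background check of this kind can be sketched as a function that counts, for every content address, how many intact copies exist across the storage nodes. The function name `under_replicated`, the representation of a node as a plain dictionary, and the use of SHA-256 are assumptions made for the sketch; how the real system schedules or repairs copies is not described here.

```python
import hashlib
from typing import Dict, List


def address_of(blob: bytes) -> str:
    return hashlib.sha256(blob).hexdigest()


def under_replicated(nodes: List[Dict[str, bytes]], min_copies: int = 2) -> Dict[str, int]:
    """Return {address: intact copy count} for blobs with fewer than min_copies."""
    intact: Dict[str, int] = {}
    for media in nodes:
        for address, blob in media.items():
            intact.setdefault(address, 0)
            # A copy counts only if its content still hashes to its address.
            if address_of(blob) == address:
                intact[address] += 1
    return {a: n for a, n in intact.items() if n < min_copies}


good = b"loan application"
other = b"broadcast clip"
node_a = {address_of(good): good, address_of(other): other}
node_b = {address_of(good): good, address_of(other): b"corrupted bytes"}

report = under_replicated([node_a, node_b])
```

In this example `report` flags only the second blob, which has a single intact copy.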
IBM has another flavor of CAS which can be software based (Tivoli Storage Manager 5.3) or hardware based (the IBM DR550). The architecture is different in that it is based on a hierarchical storage management (HSM) design, which supports not only WORM disk but also WORM tape, as well as the migration of data from WORM disk to WORM tape and vice versa. This provides additional flexibility in disaster recovery situations as well as the ability to reduce storage costs by moving data off disk to tape.
Another typical implementation is from iTernity. The iTernity concept is based on containers, each of which is addressed by its hash value. A container holds multiple fixed-content documents, so a container cannot be changed and its hash value is fixed once it has been written.
Open Source Implementations
One of the very first content-addressed storage servers, Venti [3], is available for Plan 9 in the Plan 9 distribution and for Unix as part of Plan 9 from User Space.
A first step towards an open source CAS+ implementation is Twisted Storage[4]. Active development continues on Twisted Storage with a new release being worked on.
While it is generally used as a source code control system, Linus Torvalds's Git program is a userspace CAS filesystem.
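Git's content addressing can be reproduced outside Git. In repositories that use Git's SHA-1 object format, the identifier of a file's contents (a "blob") is the SHA-1 hash of a short header followed by the contents. The function name `git_blob_id` below is an illustrative choice; the result should match what `git hash-object` reports for the same bytes.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to a blob in a SHA-1 repository."""
    header = b"blob %d\x00" % len(content)  # object type, size, NUL separator
    return hashlib.sha1(header + content).hexdigest()


empty_id = git_blob_id(b"")
print(empty_id)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# Identical contents always receive the same ID, whatever the file is called.
same_id = git_blob_id(b"hello\n") == git_blob_id(b"hello\n")
```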
Project Honeycomb is an open source API for CAS systems [1]. The XAM interface being developed under the auspices of SNIA is an attempt to create a standard interface for archiving on CAS (and CAS-like) products and projects.
References
- [1] Content-addressable storage - Storage as I See it. Computer Technology Review (Find Articles at BNET.com)
- [2] http://www.techworld.com/features/index.cfm?featureID=235&printerfriendly=1
- [3] FAST 2002 Technical Program - Abstract
- [4] Twisted Storage