Behind the Wizard’s Curtain (OneFS Revealed, Part II)

So, first I compared OneFS® to a big pot of chili. Then I walked through its innovative engineering and the fundamental architectural advantages of a single file system. Now we’ll walk through how this single file system solves the classic “bin-packing” dilemma and delivers unprecedented data protection.

An often understated benefit of a single scalable file system is the added storage efficiency that comes from having a single “bin” to pack data into. In computer science, the challenge of packing differently sized items into differently sized bins is called “bin-packing”. It is algorithmically hard (the general problem is NP-hard), and practical approaches rarely achieve optimal space efficiency. The storage equivalent is administrators attempting to coerce users into choosing volumes that have the requisite amount of free space, or manually moving the data themselves. Storage efficiency suffers whenever users choose incorrectly, a volume cannot satisfy the performance demands of a particular workflow, an organization cannot address a particular volume, or the storage admin cannot move data transparently and quickly. Industry analysis of real deployments suggests that on average 43% of storage capacity is wasted due to these inefficiencies. An Isilon system has no such constraints, and its storage efficiency typically exceeds 80%.
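To make the bin-packing effect concrete, here is a minimal sketch rather than anything taken from OneFS. It assumes a dataset cannot be split across volumes and uses a simple first-fit placement; the dataset and volume sizes are made-up numbers chosen only to show the effect.

```python
def first_fit(datasets, volume_sizes):
    """Place each dataset in the first volume with enough room.

    Returns (placed, unplaced), where placed is a list of
    (dataset_size, volume_index) pairs.
    """
    free = list(volume_sizes)
    placed, unplaced = [], []
    for size in datasets:
        for i, room in enumerate(free):
            if size <= room:
                free[i] -= size
                placed.append((size, i))
                break
        else:
            # No single volume has enough contiguous free space.
            unplaced.append(size)
    return placed, unplaced


def utilization(datasets, volume_sizes):
    """Fraction of total raw capacity actually holding data."""
    placed, _ = first_fit(datasets, volume_sizes)
    return sum(size for size, _ in placed) / sum(volume_sizes)


# Hypothetical workload: six 6 TB datasets, 40 TB of raw capacity.
datasets = [6, 6, 6, 6, 6, 6]
many_volumes = [10, 10, 10, 10]   # four separate 10 TB volumes
single_pool = [40]                # one 40 TB "bin"

print(f"four volumes: {utilization(datasets, many_volumes):.0%}")
print(f"single pool:  {utilization(datasets, single_pool):.0%}")
```

With four separate volumes, each one ends up with 4 TB free, yet no 6 TB dataset fits anywhere, so two datasets are left stranded. The single pool has the same raw capacity and takes all six.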

As any system scales, techniques that were appropriate at a small size become inadequate at a larger size. In storage systems there is no better example of this than RAID. RAID is effective only if the data can be reconstructed before another failure occurs. Yet as the amount of data grows, the speed at which that data can be accessed does not keep pace, so rebuilds take longer and the probability of an additional failure during a rebuild keeps rising. OneFS addresses this with FlexProtect, a protection scheme founded on solid mathematical constructs: Reed-Solomon encoding. FlexProtect protects against up to four simultaneous failures, whether of full nodes or of individual drives. And as a cluster scales up, FlexProtect meets the requirement that the reconstruction time for an individual failure must decrease.
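The RAID scaling problem comes down to simple arithmetic: a traditional rebuild has to rewrite one whole drive, so rebuild time is roughly capacity divided by sustained throughput. The sketch below is a back-of-the-envelope model only. The 100 MB/s rate and the drive sizes are illustrative assumptions, not measurements, and it ignores competing client I/O.

```python
def rebuild_hours(capacity_tb, rate_mb_per_s):
    """Hours needed to rewrite one full drive at a sustained rate.

    Uses decimal units (1 TB = 1,000,000 MB) and assumes the rebuild
    gets the drive's full sustained throughput, which is optimistic.
    """
    seconds = capacity_tb * 1_000_000 / rate_mb_per_s
    return seconds / 3600


# Hypothetical drive sizes; throughput is held constant to show the trend.
for capacity_tb in (1, 4, 16):
    hours = rebuild_hours(capacity_tb, rate_mb_per_s=100)
    print(f"{capacity_tb:>2} TB drive: {hours:.1f} h to rebuild at 100 MB/s")
```

Capacity grows much faster than per-drive throughput, so the window in which a second failure can strike grows with every drive generation.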

FlexProtect is extremely innovative: it takes a file-specific approach to data protection, storing protection information for each file independently. This allows the protection information to be distributed along with the file data, dramatically increasing the potential parallelism for access and, ultimately, for reconstruction. When a failure (of either a node or a drive) occurs in an Isilon storage system, FlexProtect can identify exactly which portions of which files are affected by that particular failure, and then enlist multiple nodes to reconstruct just the affected portions. Since AutoBalance spreads files out across the cluster, the number of spindles and CPUs available for reconstruction far exceeds what would be found in a typical hardware RAID implementation. In addition, FlexProtect doesn’t need to reconstruct data back to a single spare drive (which would be a bottleneck). Instead, the file data is reconstructed into available space across the cluster, providing a Virtual Hot Spare.
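Here is a toy illustration of the idea, not the actual FlexProtect algorithm. The layout map, file names, and round-robin assignment are all hypothetical. The point is only that a file-aware system can list exactly which pieces were lost and hand them out to several surviving nodes.

```python
from collections import defaultdict


def affected_pieces(layout, failed_node):
    """Return (file_name, piece_index) for every piece on the failed node."""
    return [
        (name, index)
        for name, nodes in layout.items()
        for index, node in enumerate(nodes)
        if node == failed_node
    ]


def assign_rebuild(pieces, surviving_nodes):
    """Spread the lost pieces round-robin across the surviving nodes.

    A real system would also consider where the other pieces of each
    file already live; this sketch ignores that for brevity.
    """
    work = defaultdict(list)
    for n, piece in enumerate(pieces):
        node = surviving_nodes[n % len(surviving_nodes)]
        work[node].append(piece)
    return dict(work)


# Hypothetical placement: each file's pieces live on three of four nodes.
layout = {
    "a.mov": [1, 2, 3],
    "b.mov": [2, 3, 4],
    "c.mov": [3, 4, 1],
    "d.mov": [4, 1, 2],
}

lost = affected_pieces(layout, failed_node=2)
plan = assign_rebuild(lost, surviving_nodes=[1, 3, 4])
for node, pieces in sorted(plan.items()):
    print(f"node {node} rebuilds {pieces}")
```

Note that c.mov never appears in the plan, because none of its pieces lived on the failed node. Nothing is rebuilt that doesn’t need to be, and no single spare drive has to absorb all the writes.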

Because FlexProtect (like OneFS itself) is file-aware, it not only optimizes reconstruction around file-specific behavior but also provides file-specific protection capabilities. An individual file (or, more typically, a directory) can be given a specific protection level, allowing different portions of the file system to be protected at different levels. Critical data can be protected at a higher level, whereas less critical data can be protected at a lower level. This gives storage administrators a very granular protection/capacity trade-off that can be adjusted dynamically, transparently and on the fly, as a cluster scales and a workflow ages.
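The protection/capacity trade-off can be sketched with generic N+M erasure-coding arithmetic: N data units plus M protection units survive any M losses and leave N/(N+M) of the raw space usable. The policy table, directory names, and stripe width below are hypothetical, and this is not OneFS configuration syntax.

```python
def usable_fraction(data_units, parity_units):
    """Share of raw capacity holding user data in an N+M stripe."""
    return data_units / (data_units + parity_units)


def protection_for(path, policies):
    """Return the parity count of the longest directory prefix matching path."""
    matches = [
        prefix for prefix in policies
        if path == prefix or path.startswith(prefix.rstrip("/") + "/")
    ]
    return policies[max(matches, key=len)]


# Hypothetical per-directory policy: number of failures to tolerate.
policies = {
    "/": 1,          # default for everything
    "/finance": 3,   # critical data, higher protection
    "/scratch": 1,   # disposable data, lower protection
}

DATA_UNITS = 8  # assumed stripe width, for illustration only

for path in ("/finance/q3.xls", "/scratch/tmp.bin", "/home/nick/notes.txt"):
    m = protection_for(path, policies)
    share = usable_fraction(DATA_UNITS, m)
    print(f"{path}: tolerates {m} failure(s), {share:.0%} of raw space usable")
```

Raising the level on one directory costs capacity only for the files under it, which is what makes the trade-off so granular.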

About the Author: Nick Kirsch