Object Storage Gaining In Importance

What is object storage and why should anyone care?

Object storage is a highly scalable architecture in which data is stored with a unique ID and a set of attributes that describe it. These IDs and attributes provide fast and easy access to the stored objects. Like many things in the IT industry, it is a concept most have heard about but few have adopted. But with unstructured data growth projected by the Gartner Group at some 80% over the next five years, interest in better, more economical ways to store data and scale storage is on the rise. Data center managers need to act or risk being overwhelmed. Object storage may be the answer for data that needs neither the hierarchical indexing characteristic of file-based storage nor the high-speed data throughput of block-based storage.

To better appreciate what object storage can do, it helps to first describe the challenge of unstructured data growth, then look at the characteristics of objects, how object storage differs from file and block storage, and some typical use cases. After that, we can look at the wider use of object storage for enterprises, service providers, and public clouds.

Focus on Objects

What makes up unstructured data growth?

Unstructured data comes from archiving, Web 2.0, imaging, and cloud-based workloads, and includes the videos, audio, and online documents that make up our backups and Web experiences. With the cost per gigabyte declining, and with compliance requirements and interest in data analytics for business advantage both escalating, all types of data are now being retained, causing data growth to soar.

With this reality, it makes sense to consider an alternative to traditional hierarchical file- and block-based storage that can scale economically to handle the large amount of data. Object storage, as a complement to file- and block-based storage systems, could be the answer.

What are objects?

Objects may be scissors, paper, or rocks to the general populace, but in IT parlance, objects are chunks of data of varying size stored along with their attributes. Because these attributes often vary, each object is stored in a container with a unique object ID or URL and metadata (i.e., an extensible list of the attributes).
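As a rough illustration of that definition, the sketch below models an object as data plus an ID and an extensible attribute list, held in a container. The names `StoredObject`, `Container`, `put`, and `get`, and the attribute keys, are hypothetical and chosen only for this example; real object stores differ in how they assign IDs and hold metadata.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: opaque data plus an extensible set of attributes."""
    data: bytes
    metadata: dict = field(default_factory=dict)
    # Assumption: IDs are random UUIDs; real systems may derive them differently.
    object_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class Container:
    """A flat collection of objects keyed only by object ID (no directories)."""

    def __init__(self, name):
        self.name = name
        self._objects = {}

    def put(self, data, **metadata):
        """Store data with arbitrary attributes and return the new object ID."""
        obj = StoredObject(data=data, metadata=dict(metadata))
        self._objects[obj.object_id] = obj
        return obj.object_id

    def get(self, object_id):
        """Retrieve an object directly by its ID."""
        return self._objects[object_id]


videos = Container("videos")
video_id = videos.put(b"...video bytes...", speaker="Sal DeSimone", event="EMC World")
stored = videos.get(video_id)
```

Because the metadata is an ordinary dictionary, one object can carry two attributes and the next can carry ten, which is the "extensible list" idea in miniature.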

Anything can be stored as an object. How something is stored determines whether it is an object or a file. Take, for example, a video of our own EMC vice president and division CTO Sal DeSimone recorded at EMC World this year. If stored in a file-based system, the video is accessed by following a specific path through the file system to the directory where the video is located. If stored in an object-based system, the video is accessed from anywhere via the combination of its object ID and a number of attributes stored in the video’s object metadata.

Characteristics of File and Object

File

Path: …/wp-content/uploads/2012/05/emc-sal-desimone.mov
File size: 26,492 KB

Object

Object ID: …/wp-content/uploads-2012-05-emc-sal-desimone.mov
File information plus attributes: Sal DeSimone, EMC World, EMC ProSphere, EMC Appsync
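The contrast in the video example can be sketched in a few lines of code. This is a minimal in-memory illustration, not how any particular product works: the object ID `obj-0001`, the metadata keys, and the helper names `read_file` and `find_objects` are all assumptions made for the example, and the path uses only the visible part of the location shown above.

```python
# File-style access: the caller must know the exact directory path.
file_system = {
    "wp-content": {
        "uploads": {
            "2012": {
                "05": {"emc-sal-desimone.mov": b"...video bytes..."}
            }
        }
    }
}


def read_file(tree, path):
    """Walk the directory tree one path component at a time."""
    node = tree
    for part in path.strip("/").split("/"):
        node = node[part]
    return node


# Object-style access: a flat map of ID -> data and attributes.
object_store = {
    "obj-0001": {  # hypothetical object ID
        "data": b"...video bytes...",
        "metadata": {"speaker": "Sal DeSimone", "event": "EMC World"},
    },
}


def find_objects(store, **wanted):
    """Return IDs of objects whose metadata matches every requested attribute."""
    return [
        object_id
        for object_id, obj in store.items()
        if all(obj["metadata"].get(k) == v for k, v in wanted.items())
    ]


by_path = read_file(file_system, "wp-content/uploads/2012/05/emc-sal-desimone.mov")
matches = find_objects(object_store, speaker="Sal DeSimone", event="EMC World")
by_id = object_store[matches[0]]["data"]
```

Both routes return the same bytes. The difference is what the caller has to know: a location in a hierarchy in the first case, an ID or a set of descriptive attributes in the second.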

Object Storage at a Glance

How does object storage differ from file and block?

With object storage, millions of items can be stored without running into the restrictions that might be found in a file-based system. Storing objects with their IDs and attributes makes retrieval easy. This makes object storage well suited to applications that store large amounts of unstructured data, such as archives.

Object storage’s use of unique IDs makes for a flat address space, rather than the specific file locations and lengthy file paths common to file-based systems, as seen in the example above. It also makes for better scaling and lower administrative costs because it does not require setting up and managing LUNs.

Object storage is accessed over HTTP, whereas traditional storage relies on protocols such as NFS for file-based storage and SCSI or Fibre Channel for block-based storage. APIs used with object storage are also oriented to the Web and based on design models like REST, which address each object by its unique ID. Although global addressability and HTTP access have merits, the trade-off is throughput: data rates are typically far lower for object storage than for file- or block-based storage.
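To make the REST-over-HTTP idea concrete, here is a hedged sketch that builds (but does not send) the kind of requests a client might issue. The endpoint `https://objects.example.com`, the URL layout of container then object ID, and the `X-Object-Meta-` header prefix are assumptions for illustration; each object storage service defines its own URL scheme and metadata headers.

```python
import urllib.request

BASE_URL = "https://objects.example.com"  # hypothetical endpoint


def build_put(container, object_id, data, metadata):
    """Build an HTTP PUT that would store an object with its attributes."""
    # Assumption: custom attributes travel as prefixed request headers.
    headers = {f"X-Object-Meta-{key}": value for key, value in metadata.items()}
    headers["Content-Type"] = "application/octet-stream"
    return urllib.request.Request(
        f"{BASE_URL}/{container}/{object_id}",
        data=data,
        headers=headers,
        method="PUT",
    )


def build_get(container, object_id):
    """Build an HTTP GET that would retrieve an object by its ID."""
    return urllib.request.Request(f"{BASE_URL}/{container}/{object_id}", method="GET")


put_req = build_put("videos", "obj-0001", b"...video bytes...", {"Speaker": "Sal DeSimone"})
get_req = build_get("videos", "obj-0001")
```

Passing either request to `urllib.request.urlopen` would send it. The point of the sketch is that storing and retrieving an object are plain HTTP verbs against a URL built from the object ID, with no mount point, LUN, or directory path involved.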

Where does object storage get used?

Object storage saw early adoption in Web-based cloud services. Access any media-heavy website like Facebook to see object storage in use for pictures, videos, and more. In the enterprise, where used, object storage is usually applied to archiving email, data, images, videos, documents, and virtual machine images. In some cases, this means archiving to public clouds, much like data might have been backed up for disaster recovery with a managed services provider in the past.

The table below provides a brief summary of the access characteristics, uses, strengths, weaknesses, and opportunities for each storage type.

Object vs. File-Based vs. Block-Based Storage

| Type | Object Storage | File-Based Storage | Block-Based Storage |
| --- | --- | --- | --- |
| Protocols | HTTP | CIFS, NFS | SCSI, iSCSI, Fibre Channel (FC), FCoE |
| API Design Model | REST, SOAP (open) | Proprietary | Proprietary |
| Metadata Support | Custom attributes | Fixed file-system attributes | Fixed block-system attributes |
| Use | Unstructured data (e.g. images, videos) in cloud | Shared file data | Online transaction processing (OLTP) |
| Strengths | Scalability, distributed access | Simple access to shared files | High performance |
| Limitations | Not good for frequently changing data | Difficult to implement distributed access | Difficult to implement distributed access |
| Opportunities | Repository for rapidly growing unstructured data | Solid methodology for data storage and consistency in shared file applications | OLTP applications |

While object storage provides the scalability needed to accommodate many data needs, it is not a replacement for file- and block-based storage. File-based storage ensures data consistency in shared-file applications, and block-based storage still provides the best performance for OLTP applications.

Wider Use of Object Storage

What can object storage mean to the enterprise, service providers, and the cloud?

Data center managers need to do something to manage the forecasted high unstructured data growth as well as the 40-50% structured data growth projected by Gartner for the next five years. Object storage may be the answer because some of the data needs neither the hierarchical indexing of file-based systems nor the high throughput of block-based systems. After years of looking over their shoulders at object storage, many data center managers are realizing that object storage may be closer to mainstream than ever before.

Add to these storage needs the fact that more than half of the organizations in a recent InformationWeek survey (51%) had or were about to have a private cloud, and one can see how object storage can follow the rising use of cloud infrastructure technologies into the enterprise. For an enterprise building a private cloud or a service provider enabling a hybrid cloud model, it only makes sense to architect the data center to resemble public clouds and the Internet. This approach also means not limiting data storage to just file and block, especially when you need to scale to accommodate rapid data growth.

Object storage is gaining in importance, but there is more to objects than just adding another storage system. Upcoming posts will discuss object storage in more detail, including the challenges and benefits of object storage for enterprises and service providers, and getting to object-as-a-service.

About the Author: Mark Prahl