Object storage is a data storage system designed to manage large volumes of unstructured data. It stores and organizes data into discrete units called objects. Each object is characterized by three main components:
- Data: actual contents of the object
- Metadata: information about the object such as name, created/uploaded date, and file type
- Unique identifier: a unique key, either assigned by the user or generated by the system (for example, by a hashing algorithm), used to identify and retrieve each object
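The three components can be modeled in a few lines. The following is a minimal sketch, not any vendor's API: `StoredObject`, `make_object`, and the metadata field names are hypothetical, and it assumes the identifier is a SHA-256 digest of the contents, which is only one of several possible schemes.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class StoredObject:
    """Hypothetical in-memory model of a single object."""
    data: bytes        # actual contents
    metadata: dict     # descriptive information about the contents
    object_id: str     # unique identifier used for retrieval


def make_object(data: bytes, name: str, content_type: str) -> StoredObject:
    # Assumption: the identifier is derived from the contents with SHA-256.
    object_id = hashlib.sha256(data).hexdigest()
    metadata = {
        "name": name,
        "content_type": content_type,
        "uploaded_at": datetime.now(timezone.utc).isoformat(),
        "size_bytes": len(data),
    }
    return StoredObject(data=data, metadata=metadata, object_id=object_id)


# A plain dict keyed by identifier stands in for the object store.
store = {}
obj = make_object(b"hello world", "greeting.txt", "text/plain")
store[obj.object_id] = obj

# Retrieval needs only the identifier, not a path or a disk location.
fetched = store[obj.object_id]
```

With a content-derived identifier, storing the same bytes twice yields the same key; systems that use user-assigned keys or random IDs behave differently in that respect.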
Since object storage does not impose many restrictions on the type of objects it can store, it is often used for unstructured data that does not fit well into other types of storage, such as pictures, audio, and other digital content. Coupled with cloud technology, object storage forms the backbone of scalable, durable, highly available, and cost-efficient storage options.
Traditionally, unstructured data was stored in file or block storage. These types of storage systems are still used, but for different purposes:
File storage systems like Amazon EFS and Google Filestore store data in a hierarchical structure of folders. Files are placed into folders that can sit at the top level or be nested under other folders. To retrieve a specific file, the path to that file must be known. Multiple devices (e.g., VMs, servers) can access file storage concurrently. Compared to object storage systems, file storage systems are less scalable, but they can be a good option if a hierarchical structure is already in place or if support for POSIX operations (e.g., updates, random reads) or standard file systems is required.
From a user's perspective, block storage systems look and behave like a locally mounted disk drive. Behind the scenes, however, they break files into discrete chunks called blocks and store them separately. Each block is assigned a unique identifier, which is used to retrieve the blocks and reassemble them into the full file when needed. Block storage is best suited for transactional workloads or use cases that require consistent performance or the lowest latency, since it is treated as a local disk. In general, block storage is often cheaper than file storage, but it is harder to share because a volume can usually be attached to only a single device. Some cloud offerings allow multiple devices to attach to the same volume, but typically with restrictions, such as read-only access.
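The split-and-reassemble behavior can be illustrated with a toy model. This is a sketch under stated assumptions rather than how any real block device works: `write_blocks`, `read_blocks`, and the 4-byte `BLOCK_SIZE` are hypothetical, and random UUIDs stand in for whatever identifiers or addresses a real system would assign.

```python
import uuid

BLOCK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


def write_blocks(payload: bytes, block_store: dict) -> list:
    """Split payload into fixed-size blocks, store each, and return the IDs in order."""
    block_ids = []
    for offset in range(0, len(payload), BLOCK_SIZE):
        block_id = uuid.uuid4().hex  # assumption: a random ID per block
        block_store[block_id] = payload[offset:offset + BLOCK_SIZE]
        block_ids.append(block_id)
    return block_ids


def read_blocks(block_ids: list, block_store: dict) -> bytes:
    """Fetch each block by ID and concatenate them in the original order."""
    return b"".join(block_store[block_id] for block_id in block_ids)


block_store = {}
ids = write_blocks(b"transaction log", block_store)
restored = read_blocks(ids, block_store)
```

The ordered list of block IDs is what lets the file be reassembled; the blocks themselves carry no information about which file they belong to.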
Object storage systems store self-contained objects in a flat structure, although a hierarchy can be emulated on top of it, for example through key prefixes. Object storage is best suited for workloads with relatively static data. Examples include static assets (e.g., HTML, pictures, videos) that are read many times but rarely modified. Object storage is massively scalable because its flat architecture allows horizontal scaling by adding more storage nodes. Its biggest advantage is that it is typically significantly cheaper than the other options, especially when less frequently accessed data can be archived or moved to different storage tiers.
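One common way to present a hierarchy over a flat set of keys is to treat a delimiter such as `/` inside the key as a folder separator and to list by prefix. The sketch below assumes that convention; `list_keys` and the sample keys are hypothetical and only loosely mirror the prefix-and-delimiter listing that some object stores expose.

```python
def list_keys(keys, prefix="", delimiter="/"):
    """Return (objects, common_prefixes) found directly under prefix in a flat key set."""
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # The key continues past another delimiter: report it as a "folder".
            common_prefixes.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# Every key lives in one flat namespace; the slashes are just characters.
flat_keys = {
    "index.html",
    "images/logo.png",
    "images/icons/home.svg",
    "videos/intro.mp4",
}

top_objects, top_folders = list_keys(flat_keys)
image_objects, image_folders = list_keys(flat_keys, prefix="images/")
```

Here `top_folders` is `["images/", "videos/"]` even though no folder was ever created: the hierarchy exists only in how the keys are interpreted at listing time.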