Swift

OpenStack Swift is a distributed object storage service designed for scalable and redundant storage of unstructured data, such as documents and media files. It utilizes a unique architecture that ensures data integrity and replication across multiple nodes, allowing for cost-effective storage solutions. Swift is accessible via a RESTful API and supports extensive metadata, making it suitable for applications requiring large-scale data management.

Topics

 Introducing Object Storage
 Features and Benefits
 Object Storage Characteristics
 Swift Components
 Swift Architecture
 Cluster Architecture
 Ring Builder
 Swift Replications
 Cinder Snapshots and Backups
What is Swift?
 Swift is a highly available, distributed, eventually consistent
object storage service
 OpenStack Swift is used to back up and archive unstructured
data, such as documents, images, audio and video files,
emails and virtual machine images
 It is used for redundant, scalable data storage using clusters of
standardized servers
 It is a long-term storage system for large amounts of static
data which can be retrieved and updated easily
Deep Dive into Swift
 Swift uses a distributed architecture with no central point of
control, providing greater scalability, redundancy, and
permanence
 Objects are written to multiple hardware devices, with the
OpenStack software responsible for ensuring data replication
and integrity across the cluster
 If a node fails, OpenStack works to replicate its content from
other active nodes
 Object Storage is ideal for cost-effective, scale-out storage
 It provides a distributed, API-accessible storage platform that
can be integrated directly into applications or used for
backup, archiving, and data retention
Features and Benefits
Swift Interaction with REST API
Swift Components
Object Storage Building Blocks
Proxy Servers
 Proxy servers handle all of the incoming API requests
 Once a proxy server receives a request, it determines
the storage node based on the object's URL
 They also coordinate responses, handle failures, and
coordinate timestamps
 They use a shared-nothing architecture and can be
scaled as needed based on projected workloads
 A minimum of two proxy servers should be deployed
behind a separately managed load balancer. If one
proxy server fails, the others take over
Rings
 A ring represents a mapping between the
names of entities stored in the cluster
and their physical locations on disks
 There are separate rings for accounts,
containers, and objects
 When components of the system need to
perform an operation on an object,
container, or account, they need to
interact with the corresponding ring to
determine the appropriate location in the
cluster
 The ring maintains this mapping using
zones, devices, partitions, and replicas
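The mapping described above can be sketched in a few lines of Python. This is a simplified illustration, not Swift's actual ring code: the `TinyRing` class, the device names, and the round-robin partition assignment are all assumptions made for the example. It hashes the `/account/container/object` path with MD5, keeps the top `part_power` bits as the partition number, and looks up which device holds each replica of that partition.

```python
import hashlib
import struct


class TinyRing:
    """Toy ring (hypothetical): name -> partition -> devices."""

    def __init__(self, devices, part_power=4, replicas=3):
        self.devices = devices            # list of dicts: {"id", "zone", "name"}
        self.part_power = part_power      # 2 ** part_power partitions in total
        self.part_shift = 32 - part_power
        part_count = 2 ** part_power
        # replica2part2dev[r][p] is the index of the device that holds
        # replica r of partition p. Round-robin is an assumption here;
        # a real ring builder balances by device weight and zone.
        self.replica2part2dev = [
            [(p + r) % len(devices) for p in range(part_count)]
            for r in range(replicas)
        ]

    def get_part(self, account, container=None, obj=None):
        path = "/" + "/".join(x for x in (account, container, obj) if x)
        digest = hashlib.md5(path.encode("utf-8")).digest()
        # Read the first 4 bytes as a big-endian integer, keep the top bits.
        return struct.unpack_from(">I", digest)[0] >> self.part_shift

    def get_nodes(self, account, container=None, obj=None):
        part = self.get_part(account, container, obj)
        return part, [self.devices[row[part]] for row in self.replica2part2dev]


# Four made-up devices, one per zone.
devices = [{"id": i, "zone": i + 1, "name": f"sdb{i}"} for i in range(4)]
ring = TinyRing(devices)
part, nodes = ring.get_nodes("AUTH_demo", "photos", "cat.jpg")
print(part, [d["name"] for d in nodes])
```

The same lookup works for an account (`get_nodes("AUTH_demo")`) or a container, which mirrors the idea of separate rings sharing one mapping scheme.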
Zones
 Zones are configured to isolate failure boundaries
 Each data replica resides in a separate zone
 A zone could be a single drive or a grouping of a few drives
 The goal of zones is to allow the cluster to tolerate significant
outages of storage servers without losing all replicas of the
data
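As a rough illustration of why zones matter, the sketch below places each replica in a different zone and then checks how many replicas survive when one zone fails. The function names and the device list are hypothetical, and the first-fit placement is an assumption, not how Swift's ring builder actually chooses devices.

```python
def place_replicas(devices, replicas=3):
    """Pick one device per zone until `replicas` devices are chosen."""
    chosen, used_zones = [], set()
    for dev in devices:
        if dev["zone"] not in used_zones:
            chosen.append(dev)
            used_zones.add(dev["zone"])
        if len(chosen) == replicas:
            return chosen
    raise ValueError("not enough distinct zones for the requested replica count")


def surviving(chosen, failed_zone):
    """Replicas still reachable after every device in `failed_zone` is lost."""
    return [d for d in chosen if d["zone"] != failed_zone]


devices = [
    {"name": "sdb1", "zone": 1},
    {"name": "sdc1", "zone": 1},
    {"name": "sdb2", "zone": 2},
    {"name": "sdb3", "zone": 3},
]
placement = place_replicas(devices)
print([d["name"] for d in placement])            # ['sdb1', 'sdb2', 'sdb3']
print(len(surviving(placement, failed_zone=2)))  # 2
```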
Accounts and Containers
 Each account and container is an individual SQLite database
 They are distributed across the cluster
 An account database contains the list of containers in that account
 A container database contains the list of objects in that container
 Each account in the system has a database that references all of its
containers, and each container database references each object in order
to keep track of object data locations.
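The account and container listings can be pictured with Python's built-in `sqlite3` module. This is only a sketch under stated assumptions: the table names, columns, and helper functions are invented for illustration and do not match Swift's real schema, and the account row is updated immediately here for simplicity.

```python
import hashlib
import sqlite3

# One database per account: each row is a container in that account.
account_db = sqlite3.connect(":memory:")
account_db.execute(
    "CREATE TABLE container (name TEXT PRIMARY KEY, object_count INTEGER)")

# One database per container: each row is an object in that container.
container_dbs = {}


def put_object(container, name, data):
    db = container_dbs.get(container)
    if db is None:
        db = container_dbs[container] = sqlite3.connect(":memory:")
        db.execute(
            "CREATE TABLE object (name TEXT PRIMARY KEY, size INTEGER, etag TEXT)")
    etag = hashlib.md5(data).hexdigest()
    db.execute("INSERT OR REPLACE INTO object VALUES (?, ?, ?)",
               (name, len(data), etag))
    (count,) = db.execute("SELECT COUNT(*) FROM object").fetchone()
    account_db.execute("INSERT OR REPLACE INTO container VALUES (?, ?)",
                       (container, count))


def list_containers():
    return account_db.execute(
        "SELECT name, object_count FROM container ORDER BY name").fetchall()


def list_objects(container):
    rows = container_dbs[container].execute(
        "SELECT name FROM object ORDER BY name")
    return [name for (name,) in rows]


put_object("photos", "cat.jpg", b"meow")
put_object("photos", "dog.jpg", b"woof")
put_object("docs", "notes.txt", b"hello")
print(list_containers())       # [('docs', 1), ('photos', 2)]
print(list_objects("photos"))  # ['cat.jpg', 'dog.jpg']
```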
Partitions
 A partition is a collection of stored data
 This includes account databases, container databases, and objects.
Partitions are core to the replication system
 System replicators and object uploads or downloads operate on partitions
 A partition is just a directory sitting on a disk with a corresponding hash
table of what it contains
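To make "a partition is just a directory" concrete, the sketch below builds the directory path for an object inside a partition. The `/srv/node` mount root, the device name, and the `object_dir` helper are assumptions for the example. It also omits the per-cluster salt that a real deployment mixes into the hash, so the paths it produces will not match a live cluster.

```python
import hashlib
import posixpath


def object_dir(device, partition, account, container, obj, root="/srv/node"):
    """Return <root>/<device>/objects/<partition>/<suffix>/<hash> (hypothetical helper)."""
    name = f"/{account}/{container}/{obj}"
    name_hash = hashlib.md5(name.encode("utf-8")).hexdigest()
    suffix = name_hash[-3:]  # last three hex characters group hashes into suffix dirs
    return posixpath.join(root, device, "objects", str(partition), suffix, name_hash)


path = object_dir("sdb1", 7, "AUTH_demo", "photos", "cat.jpg")
print(path)
```

Everything under `objects/7/` belongs to partition 7, which is why replicators and object servers can treat the partition as a single unit.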
Replicators
 Replicators continuously examine each partition
 For each local partition, the replicator compares it against the replicated copies in
the other zones to see if there are any differences
 Replication takes place by examining hashes (a hash file is created for each
partition)
 If the hashes are different, then it is time to replicate, and the directory that
needs to be replicated is copied
 If a zone goes down, one of the nodes containing a replica notices and proactively
copies data to a handoff location
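The hash comparison can be sketched as follows. This is an in-memory illustration under assumptions: a partition is modelled as a dict of suffix directory to file names, the file names are made up, and `suffix_hashes` and `suffixes_to_sync` are hypothetical helpers rather than Swift functions.

```python
import hashlib


def suffix_hashes(partition):
    """partition: {suffix_dir: [file names]} -> {suffix_dir: md5 hex digest}."""
    hashes = {}
    for suffix, files in partition.items():
        md5 = hashlib.md5()
        for name in sorted(files):
            md5.update(name.encode("utf-8"))
        hashes[suffix] = md5.hexdigest()
    return hashes


def suffixes_to_sync(local, remote):
    """Suffix directories whose hash differs (or is missing) on the remote copy."""
    local_h, remote_h = suffix_hashes(local), suffix_hashes(remote)
    return sorted(s for s in local_h if local_h[s] != remote_h.get(s))


# Made-up contents: the remote copy of suffix "f0c" holds an older file.
local = {"a83": ["1700000000.00000.data"], "f0c": ["1700000005.00000.data"]}
remote = {"a83": ["1700000000.00000.data"], "f0c": ["1690000000.00000.data"]}
print(suffixes_to_sync(local, remote))  # ['f0c']
```

Only the directories whose hashes differ need to be copied, so an unchanged partition costs one cheap comparison.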
Swift in Use
 The following walks through the use cases for object uploads and
downloads and introduces the components involved
Upload
 The client uses the REST API to make an HTTP request to PUT an object into an
existing container
 The cluster receives the request
 First, the system must know where the data is going to go (the account
name, container name, and object name are all used to determine the
partition where this object will be stored)
 A lookup in the ring figures out which storage nodes contain the partition
 The data is then sent to each storage node where it is placed. At least two
of the three writes must be successful before the client is notified that the
upload was successful
 The container database is updated asynchronously to reflect that there is
a new object in it
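The "at least two of three writes" rule above is a quorum check. A minimal sketch, assuming a simple majority and a caller-supplied `send` function (both hypothetical, as are the node names):

```python
def quorum_size(replicas):
    """Simple majority (assumed): 2 of 3, 3 of 5, and so on."""
    return replicas // 2 + 1


def put_object(nodes, send):
    """Write to every node; report success only if a quorum of writes succeeded.

    `send(node)` is a stand-in for the real network write and returns True/False.
    """
    successes = sum(1 for node in nodes if send(node))
    return successes >= quorum_size(len(nodes))


nodes = ["node-a", "node-b", "node-c"]
print(put_object(nodes, lambda n: n != "node-b"))  # True: 2 of 3 succeeded
print(put_object(nodes, lambda n: n == "node-a"))  # False: only 1 of 3 succeeded
```

A write that reaches only two nodes is still reported as successful. Replication is what later brings the third copy up to date.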
Downloads
 A request comes in for an account/container/object
 Using the same consistent hashing, the partition index is
determined
 Lookup in the ring reveals which storage nodes contain that
partition
 A request is made to one of the storage nodes to fetch the
object; if that fails, requests are made to the other nodes
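The download fallback can be sketched as a loop over the nodes returned by the ring lookup. The `get_object` and `fetch` names, and the use of `ConnectionError` to signal a failed node, are assumptions for this example.

```python
def get_object(nodes, fetch):
    """Try each node in turn and return the first successful read.

    `fetch(node)` is a stand-in for the real request; it returns bytes
    or raises ConnectionError when the node cannot serve the object.
    """
    errors = []
    for node in nodes:
        try:
            return fetch(node)
        except ConnectionError as exc:
            errors.append((node, str(exc)))
    raise LookupError(f"object unavailable on all nodes: {errors}")


# Made-up cluster state: node-a is down, node-b holds the object.
store = {"node-b": b"cat picture", "node-c": b"cat picture"}


def fetch(node):
    if node not in store:
        raise ConnectionError(f"{node} is unreachable")
    return store[node]


print(get_object(["node-a", "node-b", "node-c"], fetch))  # b'cat picture'
```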
Swift Architecture - High Level
Architecture Description
 Proxy Server -
 ties together the rest of the Swift architecture
 encodes and decodes object data (when an erasure coding storage policy is in use)
 handles failures
 Storage Policies - provide a way for object storage providers to
differentiate service levels, features, and behaviours of a Swift deployment
 Account Server - handles listings of containers rather than objects
 Container Server - handles listings of objects; the listings are stored as SQLite
database files
 Object Server - stores, retrieves, and deletes objects on local devices
 Auditors - crawl the local server, checking the integrity of the objects,
containers, and accounts
 Updaters - apply container or account updates that could not be processed immediately
 Replication - keeps the system in a consistent state in the face of
temporary error conditions such as network outages or drive failures
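As one illustration of the auditor role listed above, the sketch below recomputes a checksum for each stored object and reports any that no longer match the value recorded at upload time. Comparing MD5 digests is an assumption made for the example, and `audit` and the sample data are hypothetical.

```python
import hashlib


def audit(stored_objects):
    """stored_objects: {name: (data_bytes, recorded_md5)} -> names that fail the check."""
    return sorted(
        name
        for name, (data, recorded) in stored_objects.items()
        if hashlib.md5(data).hexdigest() != recorded
    )


good = b"hello swift"
stored_objects = {
    "ok.txt": (good, hashlib.md5(good).hexdigest()),
    # Simulated bit rot: the bytes changed after the checksum was recorded.
    "rotten.txt": (b"hellp swift", hashlib.md5(good).hexdigest()),
}
print(audit(stored_objects))  # ['rotten.txt']
```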
SWIFT
 The OpenStack Object Store project, known as Swift, offers
cloud storage software so that you can store and retrieve lots
of data with a simple API.
 It's built for scale and optimized for durability, availability, and
concurrency across the entire data set.
 Swift is ideal for storing unstructured data that can grow
without bound.
 OpenStack Object Storage (swift) is used for redundant, scalable data storage using
clusters of standardized servers to store petabytes of accessible data.
 It is a long-term storage system for large amounts of static data which can be
retrieved and updated.
 Object Storage uses a distributed architecture with no central point of control,
providing greater scalability, redundancy, and permanence.
 Objects are written to multiple hardware devices, with the OpenStack software
responsible for ensuring data replication and integrity across the cluster.
 Storage clusters scale horizontally by adding new nodes. Should a node fail,
OpenStack works to replicate its content from other active nodes.
 Because OpenStack uses software logic to ensure data replication and distribution
across different devices, inexpensive commodity hard drives and servers can be
used in lieu of more expensive equipment.
 Object Storage is ideal for cost-effective, scale-out storage. It provides a fully
distributed, API-accessible storage platform that can be integrated directly into
applications or used for backup, archiving, and data retention.
Swift Characteristics
 Swift is an object storage system that is part of the OpenStack project
 Swift is open-source and freely available
 Swift currently powers the largest object storage clouds, including
Rackspace Cloud Files, the HP Cloud, IBM Softlayer Cloud and countless
private object storage clusters
 Swift can be used as a stand-alone storage system or as part of a cloud
compute environment.
 Swift runs on standard Linux distributions and on standard x86 server
hardware
 Swift, like Amazon S3, has an eventual consistency architecture, which
makes it ideal for building massive, highly distributed infrastructures with
lots of unstructured data serving global sites.
 All objects (data) stored in Swift have a URL
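Because every object has a URL, a client only needs to build that URL and send an HTTP request to it. The sketch below constructs (but does not send) a PUT request with Python's standard library. The endpoint host, account name, and token value are placeholders; the `/v1/<account>/<container>/<object>` layout and the `X-Auth-Token` header follow the Object Storage API.

```python
from urllib.parse import quote
from urllib.request import Request


def object_url(endpoint, account, container, obj):
    """Build <endpoint>/v1/<account>/<container>/<object>, percent-encoding each part."""
    return "/".join([
        endpoint.rstrip("/"),
        "v1",
        quote(account, safe=""),
        quote(container, safe=""),
        quote(obj),  # object names may contain "/", so it is left unescaped
    ])


def put_request(url, data, token):
    """Prepare an upload request; the caller decides when to send it."""
    return Request(url, data=data, method="PUT",
                   headers={"X-Auth-Token": token})


url = object_url("https://swift.example.com", "AUTH_demo", "photos", "2024/cat 1.jpg")
req = put_request(url, b"...image bytes...", token="placeholder-token")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would perform the upload against a real cluster. A download is the same URL with the GET method.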
Swift Characteristics
 Applications store and retrieve data in Swift via an industry-
standard RESTful HTTP API
 Objects can have extensive metadata, which can be indexed and
searched
 All objects are stored with multiple copies and are replicated in as-
unique-as-possible availability zones and/or regions
 Swift is scaled by adding additional nodes, which allows for a cost-
effective linear storage expansion
 When adding or replacing hardware, data does not have to be
migrated to a new storage system, i.e. there are no fork-lift
upgrades
 Failed nodes and drives can be swapped out while the cluster is
running with no downtime. New nodes and drives can be added
