Minio Ibm

The document discusses how MinIO object storage running on IBM Power Systems servers with IBM POWER9 processors can deliver high performance for workloads like AI, IoT, and analytics. It provides an overview of the benefits of object storage and how MinIO object storage combined with IBM Power Systems servers offers scalability, fast retrieval of data, cost effectiveness, and protection of large volumes of unstructured data.


IBM Systems June 2019

White Paper

Implement high-performance object storage with MinIO and IBM
Achieve robust performance for AI, IoT and more using MinIO and IBM Power Systems servers with POWER9 processors

Executive summary
Object storage presents several important benefits for accommodating fast-growing volumes of unstructured data. With the right object storage solution and hardware infrastructure, organizations can also achieve the robust performance required for supporting computationally intensive workloads, including artificial intelligence (AI)/machine learning, Internet of Things (IoT), and big data analytics.

Recent benchmark testing shows that MinIO object storage running on IBM Power Systems servers with IBM POWER9 processors can deliver exceptional throughput performance (up to 25 GB/s in aggregate for four servers) plus linear scalability as clusters grow. That level of performance enables organizations to unlock the full value of their data while also capitalizing on the scalability, accessibility, data protection, and cost-effectiveness of object storage.

Launching data-intensive initiatives
Across industries, organizations are launching new technology initiatives that require them to store, access, and analyze large, fast-growing volumes of data. Whether they are implementing artificial intelligence (AI)/machine learning, capitalizing on Internet of Things (IoT) technology, or employing other big data solutions, these organizations might need to store and analyze tens, or hundreds, of petabytes of data.

Much of that data is unstructured. From multimedia files and text documents to web pages and log files, unstructured data can be difficult to query, making it challenging for organizations to work with all of the data they are collecting. Traditional hierarchical file storage systems and block storage are not the best fit for these unstructured data volumes.

Object storage offers an important alternative to file- and block-based storage for big data, as proven by organizations with hyperscale environments. Object storage provides the right combination of cost-effective scalability, data integrity, and accessibility that many organizations need. It also offers the flexibility to disaggregate storage from compute resources, enabling organizations to optimize compute and storage for specific workflows. As a result, object storage is fast becoming the default storage option for these organizations.

Using the right hardware infrastructure, object storage can also provide a fundamentally different performance profile than other types of storage, enabling organizations to implement new use cases and launch more ambitious projects. High-performance object storage can support workloads ranging from training AI algorithms to analyzing IoT data. Running MinIO object storage with IBM Power Systems servers based on IBM POWER9 processors can deliver this level of performance, opening important opportunities for enterprises deploying workloads in private cloud or multicloud environments.

Recognizing the advantages of object storage
For storing large, rapidly expanding volumes of unstructured data, object storage can present your organization with several advantages over more traditional file- or block-based storage.

Scalability
Object storage is designed to scale. Instead of the nested files and folders used by hierarchical file systems, object storage uses a flat structure. That structure enables you to store billions of files without the complexity and performance issues that can develop as you scale hierarchical environments. Object storage also lets you scale incrementally: you can scale performance or capacity simply by adding racks of clusters.

Fast retrieval
With MinIO object storage, each object has metadata and uses the URL as a unique identifier. These tags and ID numbers help eliminate the need to know the exact location of data within the storage environment. Every object is accessible from anywhere through its unique URL; only standard IP routing and DNS mechanisms are required. The right object storage solution can also avoid the bottleneck of a centralized metadata server, storing the metadata alongside objects.
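
The flat structure and URL-based addressing described above can be illustrated with a small in-memory sketch. It is not how MinIO is implemented: the `FlatObjectStore` class, its method names, and the `storage.example.com` endpoint are hypothetical and exist only for this illustration. The point is that "folders" are just key prefixes, metadata travels with each object, and an object's address can be derived from its bucket and key alone.

```python
from urllib.parse import quote

# Hypothetical endpoint, used only for illustration.
ENDPOINT = "https://storage.example.com"


class FlatObjectStore:
    """Toy in-memory store: one flat mapping from (bucket, key) to an object."""

    def __init__(self):
        self._objects = {}

    def put(self, bucket, key, data, metadata=None):
        # Metadata is kept next to the object, not in a separate index server.
        self._objects[(bucket, key)] = {"data": data, "metadata": dict(metadata or {})}

    def get(self, bucket, key):
        return self._objects[(bucket, key)]

    def list(self, bucket, prefix=""):
        # "Folders" are only a naming convention: listing is a prefix filter.
        return sorted(k for (b, k) in self._objects if b == bucket and k.startswith(prefix))

    def url(self, bucket, key):
        # Path-style address: endpoint/bucket/key, with unsafe characters escaped.
        return f"{ENDPOINT}/{quote(bucket)}/{quote(key)}"


store = FlatObjectStore()
store.put("sensors", "2019/06/device-17.json", b"{}", {"content-type": "application/json"})
store.put("sensors", "2019/07/device-17.json", b"{}")
print(store.list("sensors", prefix="2019/06/"))
print(store.url("sensors", "2019/06/device-17.json"))
```

Because lookups depend only on the bucket and key, no directory tree has to be walked or rebalanced as the number of objects grows.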

Data protection and preservation
Object storage solutions protect and preserve data more efficiently than other types of storage architectures. By using data protection capabilities such as erasure coding, object storage can protect data using far less raw storage capacity than RAID-based architectures. Data protection capabilities can also help quickly repair problems on a per-object basis, instead of on a per-disk basis, helping to avoid data loss and to maintain high availability of data.

Cost-effectiveness
The ability of object storage to scale incrementally, without forklift upgrades, can help you control storage costs. In addition, object storage data protection capabilities help eliminate the need for numerous copies of files, reducing the raw storage capacity required to safeguard data and driving down capital expenditures.

Unlocking the full value of data with high-performance object storage
Object storage has not always been used for high-performance workloads. In fact, some organizations employ object storage as a backup environment or a long-term disk-based archive.

Object storage does have advantages for these use cases. By storing objects along with metadata, object storage can make it easier for users to find and retrieve the files, media clips, or entire projects they need among millions or billions of files. At the same time, data protection capabilities can help securely preserve data over the long term.

Yet to maximize the value of data residing in object storage, you need to be able to consume it quickly. High-performance object storage solutions can help you extend the benefits of object storage to new use cases and extract more value from your stored data. If you can achieve sufficient throughput, you can use object storage for big data and IoT analytics, as well as AI/machine learning workloads.

Until recently, big data, IoT, and AI workloads often drove organizations to employ Hadoop Distributed File System (HDFS) storage. With HDFS, you bring the algorithm to the data. Each node computes a part of the algorithm using local storage and then sends the results back to a centralized server, where results are aggregated. This approach can work well for some algorithms, and it can offer scalability for large-scale collections of data.

However, object storage presents several advantages over HDFS. For example, object storage can provide greater flexibility for balancing compute and storage across your environment. Using high-speed networking with your object storage environment, you can consume your compute and storage resources in the optimal way for each particular workload.

Object storage also requires less capacity than HDFS to ensure data protection for the same amount of data. While HDFS stores multiple copies of each file, object storage can use data protection capabilities such as erasure coding to protect data more efficiently. Object storage also helps eliminate the risk of using a single master node, which can become a single point of failure. Overall, high-performance object storage provides a more efficient and reliable way to support data-intensive workloads than HDFS.

Capitalizing on MinIO high-performance object storage with enterprise capabilities
MinIO high-performance distributed object storage is designed for large-scale data environments. It is a well-suited Amazon S3–compatible replacement for HDFS, especially when used for AI/machine learning, IoT, and other big data workloads.
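
As a rough illustration of the capacity comparison with replication-based storage made above, the sketch below estimates the raw capacity needed to protect the same amount of user data. The numbers are assumptions, not figures from the benchmark: three full copies (the usual HDFS default replication factor), and two hypothetical erasure coding layouts, 8 data + 4 parity shards and 8 data + 8 parity shards. The function names are illustrative only.

```python
def raw_capacity_replication(usable_tb, copies):
    """Raw capacity needed when every object is stored `copies` times."""
    return usable_tb * copies


def raw_capacity_erasure(usable_tb, data_shards, parity_shards):
    """Raw capacity needed when each object is split into data shards plus parity shards."""
    return usable_tb * (data_shards + parity_shards) / data_shards


usable_tb = 1000  # 1 PB of user data (assumed for illustration)

print("3 copies:       ", raw_capacity_replication(usable_tb, copies=3), "TB raw")
print("8 data + 4 par.:", raw_capacity_erasure(usable_tb, 8, 4), "TB raw")
print("8 data + 8 par.:", raw_capacity_erasure(usable_tb, 8, 8), "TB raw")
```

Under these assumptions, even the most conservative layout (as many parity shards as data shards) needs twice the usable capacity, compared with three times for triple replication.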

MinIO object storage comprises a server, optional client, and optional software development kits (SDKs):

• MinIO Server is a distributed object storage server that includes an array of enterprise-grade capabilities.
• MinIO Client (“mc”) is a modern alternative to UNIX commands that supports web-scale object storage deployments.
• MinIO Client SDKs include simple APIs for accessing any Amazon S3–compatible object storage.

MinIO is an open source solution that offers several enterprise capabilities for protecting data, maintaining data integrity, tightening security, and maximizing flexibility.

Data protection and integrity
Per-object, inline erasure coding protects against data loss and maintains availability of data, even if multiple drives or devices are lost. Bitrot protection avoids reading corrupted data caused by aging drives, firmware bugs, accidental overwrites, and other problems.

Security
MinIO supports multiple, sophisticated server-side encryption schemes to protect data wherever it resides. MinIO Server encrypts each object with a unique object key. Even if an individual object is compromised, the same decryption key cannot be used with any other object. In addition, MinIO offers a write-once, read-many (WORM) mode, which disables all APIs that can potentially mutate the object data and metadata: once written, data becomes tamperproof.

Support for advanced standards in identity management creates centralized access with temporary and rotated passwords. Fine-grained, configurable access policies facilitate simple support of multitenant and multi-instance deployments.

Flexibility
MinIO allows you to combine multiple data instances to form a unified global namespace. As a result, you can support geographically distributed users while accommodating a variety of applications from a single console. By using an Amazon S3 API, MinIO also gives you the flexibility to support multiple clouds, and incorporate existing storage, while ensuring that your view of data looks exactly the same.

Achieving robust object storage performance with MinIO optimized for POWER9
IBM Power Systems servers based on POWER9 processors provide the high-performance infrastructure required by MinIO high-performance object storage software. Together, these solutions can support demanding workloads such as AI, IoT analytics, and big data analytics.

For many organizations, Power Systems servers offer the right combination of performance, reliability, cloud flexibility, and security.

• Robust performance: Outstanding core performance plus high memory bandwidth help deliver industry-leading performance.
• Reliability: IBM Power Systems servers provide dependable on-premises infrastructure to meet around-the-clock user demands.
• Cloud flexibility: These servers integrate easily into private cloud and multicloud strategies.
• Security: Strong security capabilities, such as accelerated encryption built into the chip, help ensure data remains protected.

To achieve the object storage performance needed for AI, IoT, and big data workloads, the POWER9-based servers take advantage of PCIe 4.0 technology. PCIe 4.0 doubles the bandwidth offered by PCIe 3.0, which remains the standard used by other CPU architectures.

In addition, these servers support nonvolatile memory express (NVMe) storage technology, through which each processor core communicates directly with storage devices using the PCIe bus. NVMe drives can deliver superior performance compared to previous-generation, flash-based storage. These drives also enable you to achieve that performance in dense environments that help control infrastructure costs.

Fast networking is critical for maximizing bandwidth across object storage clusters. By supporting multiple 100 Gb/s Ethernet networking links per server, the Power Systems servers help eliminate networking bottlenecks.

Several POWER9-based servers also feature a storage-rich design that supports processing and analysis of very large data volumes. The Power Systems LC922, which offers the highest storage capacity in the Power Systems portfolio, supports up to 120 TB of capacity in a 2U form factor.

Benchmarking IBM POWER9-based servers with MinIO
MinIO engineers conducted benchmark testing to demonstrate the extreme performance that is possible using MinIO Server with POWER9-based systems. The testing deployed four IBM Power Systems LC922 servers, equipped with POWER9 processors, along with four POWER8-based servers as clients. The POWER9 servers included NVMe-based flash drives in addition to hard-disk drives. The environment used a high-speed 100 Gb private network.

To fully capitalize on the throughput performance of POWER9-based servers, the MinIO team optimized and accelerated MinIO Server for the POWER9 architecture using the Golang (Plan 9) assembly feature.

[Diagram: a 100 GbE top-of-rack switch connecting 4x IBM Power Systems S822LC servers (clients) and 4x IBM Power Systems LC922 servers]

Figure 1: The test environment included four IBM Power Systems LC922 POWER9 servers (right), four IBM Power Systems S822LC servers as clients, and 100 GbE networking.
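
The PCIe and networking figures quoted in this section are easier to compare once they are converted to bytes per second. The sketch below does that arithmetic using the published per-lane signaling rates (8 GT/s for PCIe 3.0, 16 GT/s for PCIe 4.0, both with 128b/130b encoding). The x16 slot width is an assumption chosen for illustration, the function names are hypothetical, and the results are theoretical per-direction line rates that ignore protocol overhead.

```python
def pcie_gb_per_s(gt_per_s, lanes):
    """Theoretical PCIe bandwidth in GB/s for 128b/130b-encoded generations (3.0 and 4.0)."""
    return gt_per_s * (128 / 130) * lanes / 8


def ethernet_gb_per_s(gbit_per_s, links=1):
    """Ethernet line rate in GB/s, ignoring protocol overhead."""
    return gbit_per_s * links / 8


LANES = 16  # assumed slot width, for illustration only

gen3 = pcie_gb_per_s(8, LANES)   # PCIe 3.0: 8 GT/s per lane
gen4 = pcie_gb_per_s(16, LANES)  # PCIe 4.0: 16 GT/s per lane

print(f"PCIe 3.0 x{LANES}: {gen3:.1f} GB/s")
print(f"PCIe 4.0 x{LANES}: {gen4:.1f} GB/s ({gen4 / gen3:.0f}x)")
print(f"One 100 GbE link: {ethernet_gb_per_s(100):.1f} GB/s")
```

Seen this way, a single 100 Gb/s link carries at most 12.5 GB/s, which is the ceiling any one server can deliver over one link in the test environment.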

The MinIO team first evaluated throughput performance for accelerated versions of two computationally intensive algorithms: erasure coding and HighwayHash (for bitrot detection).

Erasure coding
With MinIO, erasure coding is designed to take place inline on a per-object basis. When you store 1 GB of data, MinIO splits up that data across a large number of drives and creates the appropriate amount of parity data on separate drives. Depending on the parity configuration you choose, you can afford to lose up to half of the servers and half of the drives; you will still be able to reconstruct all of your data. Running erasure coding inline, instead of offline, enables you to start protecting data the moment you store it, but it inherently demands high-performance object storage, which MinIO is able to provide.

In the benchmark testing, the optimized erasure coding algorithm running on POWER9 systems achieved throughput of 7–9 GB/s per core, which is critical for saturating the fast 100 Gb network. This level of throughput for the optimized algorithm reflects the robust performance of the POWER9 system architecture, which is particularly well suited for this type of high-throughput workload.

Bitrot detection
Similar to erasure coding, MinIO is designed to run bitrot detection on the fly. MinIO’s implementation of the HighwayHash algorithm helps prevent the reading of corrupt data. The algorithm computes a hash on read and verifies the hash on write from the application. Any change in the hash fingerprint indicates data corruption and requires the use of parity data instead of the corrupted data.

Hashing operations require considerable CPU resources, but the POWER9-based servers can deliver the required performance. In the benchmark testing, the optimized HighwayHash algorithm running on the POWER9 servers achieved throughput of 5 GB/s per core, which can saturate the 100 Gb network.

COSBench
The team also ran COSBench, a commonly used open source benchmarking tool, to measure the performance of object storage services. COSBench testing used four POWER9-based systems, each with four NVMe drives and connected with 100 Gb/s networking.

The team ran COSBench on the four clients with 256 threads per client (1024 total). Each test typically took about an hour, with a prepare (WRITE) stage of 20–30 minutes, a 20-minute main (READ) stage, and a final cleanup stage. The team uploaded and downloaded more than 10 TB of data to mitigate any memory caching effects that could inflate the performance numbers.

Object-size benchmarks: The team used the four-node cluster to benchmark MinIO object storage read and write throughput for objects of increasing size. Read performance reached 18 GB/s and stayed constant through 32 MB and 64 MB object sizes. For larger objects, the write performance achieved 50 percent of the read performance, which is a strong result.

Object Size 10 MB 20 MB 32 MB 64 MB
Read (GB/s) 14.9 18.1 18.7 18.0
Write (GB/s) 5.7 7.3 10.1

Figure 2: Read performance reached 18 GB/s for objects of 20 MB or larger.
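
The interplay between erasure coding and bitrot detection described earlier in this section (a hash mismatch marks a shard as corrupt, and parity is used in its place) can be sketched in a few lines. This is a deliberately simplified illustration, not MinIO's implementation: a single XOR parity shard stands in for a real erasure code and can repair only one damaged shard, and `hashlib.blake2b` stands in for HighwayHash. The function names and the shard layout are assumptions made for the example.

```python
import hashlib


def checksum(shard):
    # blake2b is a stand-in here; it is not the HighwayHash algorithm.
    return hashlib.blake2b(shard, digest_size=16).digest()


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data, n_data):
    """Split data into n_data equal shards plus one XOR parity shard, each with a checksum."""
    size = -(-len(data) // n_data)  # ceiling division
    padded = data.ljust(size * n_data, b"\0")
    shards = [padded[i * size:(i + 1) * size] for i in range(n_data)]
    parity = shards[0]
    for shard in shards[1:]:
        parity = xor_bytes(parity, shard)
    shards.append(parity)
    return shards, [checksum(s) for s in shards]


def read(shards, checksums, n_data, length):
    """Verify every shard; rebuild at most one corrupted shard from the others."""
    bad = [i for i, s in enumerate(shards) if checksum(s) != checksums[i]]
    if len(bad) > 1:
        raise ValueError("single parity can repair only one shard")
    if bad:
        good = [s for i, s in enumerate(shards) if i != bad[0]]
        rebuilt = good[0]
        for shard in good[1:]:
            rebuilt = xor_bytes(rebuilt, shard)
        shards = shards[:bad[0]] + [rebuilt] + shards[bad[0] + 1:]
    return b"".join(shards[:n_data])[:length]


payload = b"unstructured object payload"
shards, sums = encode(payload, n_data=4)
shards[2] = b"\xff" * len(shards[2])  # simulate bitrot on one drive
print(read(shards, sums, n_data=4, length=len(payload)) == payload)
```

A production erasure code tolerates the loss of several shards at once, which is why both the coding and the hashing throughput per core matter for keeping up with the network.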



Cluster scaling benchmarks: The team also benchmarked MinIO cluster scaling by increasing the number of nodes used in the test. The COSBench test demonstrated a maximum read performance of nearly 25 GB/s in aggregate for the four POWER9-based servers.

Expanding the cluster could also boost read performance. Because MinIO clusters can grow to any number of servers, and overall throughput increases as cluster size increases, the total read performance could be higher than 25 GB/s.

Number of Servers    1      2      3      4
Throughput (GB/s)    10.5   19.4   24.1   25.4

Figure 3: MinIO Server performance increases as the cluster size expands.

Benchmarking summary
Results from the erasure coding, bitrot, and COSBench testing all show the impressive throughput performance that can be achieved with MinIO Server on POWER9-based systems. The results of the erasure coding and bitrot detection algorithm testing highlight how well this architecture handles these two specific computationally intensive processes. But the results also suggest that this architecture could deliver strong results for computationally intensive AI, IoT, and big data workloads.

The COSBench testing illustrates how this distributed object storage architecture can deliver outstanding aggregate throughput performance across a cluster, enabling clients to take full advantage of the high-performance nature of MinIO object storage. Whether your organization is running a private or multicloud environment, you can use this architecture to gain the performance you need for parallel processing of large sets of unstructured data.

Moving forward with MinIO and IBM
Object storage provides an important alternative to file and block storage for large and growing volumes of unstructured data. By selecting high-performance object storage, your organization can extend the benefits of object storage to new use cases, including AI/machine learning, IoT, and other big data workloads. Employing MinIO in combination with IBM Power Systems servers based on POWER9 processors can deliver the performance to support those workloads and unlock greater value from data.

Learn more
To discover more about MinIO benefits for AI, IoT, and additional big data workloads, visit: https://min.io

To learn more about the complete line of the IBM Power Systems family, visit: ibm.com/it-infrastructure/power
© Copyright IBM Corporation 2019

IBM Global Services


Route 100
Somers, NY 10589
USA

Produced in the United States of America


June 2019
All Rights Reserved

IBM, the IBM logo and ibm.com are trademarks or registered trademarks
of International Business Machines Corporation in the United States, other
countries, or both. If these and other IBM trademarked terms are marked
on their first occurrence in this information with a trademark symbol
(® or ™), these symbols indicate US registered or common law trademarks
owned by IBM at the time this information was published. Such trademarks
may also be registered or common law trademarks in other countries.
A current list of IBM trademarks is available on the Web at “Copyright and
trademark information” at ibm.com/legal/copytrade.shtml. Other company,
product and service names may be trademarks or service marks of others.

References in this publication to IBM products and services do not
imply that IBM intends to make them available in all countries in which
IBM operates.

Please Recycle

XXX-XXXXX-XXXX-00
