Powerscale Onefs s3 Overview
Powerscale Onefs s3 Overview
October 2021
H18292.2
Introduction
Revisions
Date Description
June 2020 Initial release
October 2021 Update with OneFS 9.3.0 S3 enhancements: MPU, DeleteObjects, Chunked Upload,
non-slash delimiter support for ListObjects/ ListObjectsV2
Acknowledgments
Author: Lieven Lin
The information in this publication is provided “as is.” Dell Inc. makes no representations or warranties of any kind with respect to the information in this
publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.
Use, copying, and distribution of any software described in this publication requires an applicable software license.
Copyright © 2021 Dell Inc. or its subsidiaries. All Rights Reserved. Dell Technologies, Dell, EMC, Dell EMC and other trademarks are trademarks of Dell
Inc. or its subsidiaries. Other trademarks may be trademarks of their respective owners. [10/8/2021] [Technical White Paper] [H18292.2]
Table of contents
Revisions............................................................................................................................................................................. 2
Acknowledgments ............................................................................................................................................................... 2
Table of contents ................................................................................................................................................................ 3
Executive summary ............................................................................................................................................................. 4
1 Introduction ................................................................................................................................................................... 5
1.1 OneFS S3 overview ............................................................................................................................................ 5
2 Use cases ..................................................................................................................................................................... 8
3 OneFS S3 implementation ........................................................................................................................................... 9
3.1 Buckets .............................................................................................................................................................10
3.1.1 Bucket naming rules .........................................................................................................................................10
3.1.2 Bucket operations .............................................................................................................................................11
3.2 Objects ..............................................................................................................................................................11
3.2.1 Multipart upload ................................................................................................................................................12
3.2.2 Object operations..............................................................................................................................................12
3.2.3 OneFS S3 ETag ...............................................................................................................................................13
4 OneFS S3 authentication and authorization ..............................................................................................................15
4.1 OneFS S3 request authentication ....................................................................................................................15
4.2 OneFS S3 request authorization ......................................................................................................................16
4.3 ACL ...................................................................................................................................................................18
4.3.1 Bucket ACL .......................................................................................................................................................18
4.3.2 Object ACL .......................................................................................................................................................18
4.3.3 Canned ACL .....................................................................................................................................................20
4.4 S3 predefined groups .......................................................................................................................................20
A Technical support and resources ...............................................................................................................................21
A.1 Related resources.............................................................................................................................................21
Executive summary
Dell EMC™ PowerScale™ OneFS™ supports the native S3 protocol. With this capability, clients can access
OneFS cluster file-based data as objects. PowerScale combines the benefits of traditional NAS storage and
emerging object storage to provide an enhanced data-lake capability and cost-effective unstructured data
storage solution. OneFS S3 is designed as a first-class protocol, including features for bucket and object
operations, security implementation, and management interface.
This document introduces how the S3 API is implemented in OneFS to provide high-performance data
access. It introduces the benefits of OneFS S3 and provides applicable use cases. This document also details
bucket and object operations, authentication, and the authorization process.
1 Introduction
Data is now a new form of capital. It provides the insights that facilitate your organization’s digital
transformation, and 80% of the information is represented as unstructured data. Organizations in every
industry generate an exponentially larger number of unstructured data volumes than ever before, from edge
to core to cloud. The way of storing and managing unstructured data is evolving. Its goal is to unlock the value
of your data by using both the traditional network attached storage (NAS) system and emerging object
storage.
Many organizations run their critical applications on traditional NAS storage, and develop new modern
applications using object storage. There are some challenges under the heterogeneous storage platforms for
unstructured data:
• Applications running on different storage platforms may need to access the same set of data. In this
case, data migration is required between NAS storage and object storage, and the extra copy of the
data consumes additional storage capacity.
• There is an inability to access object storage through the NAS protocol, like NFS and SMB. Many
object-storage systems provide access to the NAS protocol using a gateway-like architecture which
does not perform adequately when combined with the object storage stack.
• In contrast to NAS storage, object storage is not intended for transactional data where operations-
per-second and latency are critical.
Designed to address these challenges, Dell EMC PowerScale is ideal for demanding enterprise file workloads
and can store, manage, and protect unstructured data with efficiency and massive scalability. It supports
multiple NAS protocols natively for applications.
Starting with OneFS version 9.0, PowerScale enables accessing data through the Amazon Simple Storage
Service (Amazon S3) application programing interface (API) natively. PowerScale implements the S3 API as
a first-class protocol along with other NAS protocols on top of its distributed OneFS file system. Whether your
application is based on traditional NAS storage or on the emerging object storage, the application can access
data in a single scale-out storage platform through NAS protocols or the S3 API as needed. This document
introduces how S3 API is implemented in OneFS and how it can provide high-performance data access.
Figure 1 shows the traditional scale-up NAS platform and the emerging object-storage architecture. The
traditional scale-up NAS platform is only accessible through file protocols and is not easy to scale as the
performance requirement increases. The object storage allows both file and object access, but the file access
is achieved through a gateway, with either a software daemon or additional dedicated hardware. This limits
file-access performance compared to a traditional NAS platform.
S3 SMB/NFS
SMB/NFS
File service
gateway
This is where PowerScale scale-out storage comes in. With the introduction of the OneFS S3 protocol,
PowerScale combines the advantages of both platform types into a single storage system while providing
performance for file and object access.
Starting with OneFS version 9.0, PowerScale OneFS supports the native S3 protocol. OneFS implements S3
as a first-class protocol along with other protocols, including NFS, SMB, and HDFS. The S3 protocol is
implemented over HTTP and secure HTTP (HTTPS). Through OneFS S3, you can access file-based data
stored on your OneFS cluster as objects. Since the S3 API is considered to be a protocol, content and
metadata can be ingested using S3 and concurrently accessed through other protocols that are configured on
the cluster.
Note: The OneFS S3 service is disabled by default. If the service is enabled, it only listens on port 9021 for
HTTPS. Port 9020 for HTTP is disabled by default. These ports are configurable through the OneFS WebUI
and CLI.
In OneFS 9.0, each S3 bucket is mapped to a directory under an access zone base path, each S3 object is
mapped to a file, and the associated object prefix is mapped to directories. A directory for a bucket is created
under the access zone base path by default. See the AWS S3 documentation regarding S3 bucket and object
definition.
2 Use cases
By implementing the S3 protocol, OneFS enhances its data-lake capability by supporting both the traditional
NAS protocol and object storage protocol. You can unify your file and object data access in a single storage
namespace. The S3 protocol on a OneFS cluster provides following benefits:
As the Figure 2 shows, OneFS seamless interoperates between file-based and object-based data access in a
single NAS platform for various workloads.
OneFS
OneFS can now extend its use cases with the benefits of OneFS S3. The following list includes general use
cases for OneFS S3:
• Back up and archive: It is possible to make OneFS as an ideal target for S3-compatible data backup
and archive software.
• File service: Data access for files and data access for objects are easily consolidated in a single
scale-out storage. This provides faster service than cloud and more cost-effective service than
traditional NAS platforms.
• Enhanced multiprotocol data access: The data in a OneFS cluster can be accessed as files or
objects.
3 OneFS S3 implementation
OneFS implements the S3 protocol on top of the file-service engine like other protocols. Clients that connect
to a OneFS cluster with S3 gain access to the single volume of the distributed OneFS file system and take
advantage of the entire cluster’s performance. To work with OneFS S3, clients connect to the S3 service over
HTTP or HTTPS and use standard REST calls such as PUT, GET, and POST to perform bucket and object
operations.
File service
File system
OneFS
Making an analogy with an SMB share which is associated with a path, a OneFS S3 bucket is also created
based on a specific path within the access zone base path. OneFS S3 maps an object to a file and maps the
object prefix to directories correspondingly. For example, assume a file is stored in OneFS with a full path of
/ifs/data/docs/finance/sample.pdf. To access the file with S3, create a bucket bkt01 in OneFS and
associate the bucket with a path /ifs/data/docs/. The object key of /finance/sample.pdf is used to represent
the file.
OneFS support two types of requests when resolving buckets and objects. See the Amazon S3
documentation Virtual Hosting of Buckets for more details about the following:
• Virtual hosted-style requests: Specify a bucket in a request by using the HTTP Host header
• Path-style requests: Specify a bucket by using the first slash-delimited component of the Request-
URI path
With OneFS S3, you can access the file as an object by using GET and PUT operations with the following
URLs (for example) which contain the SmartConnect zone name:
The path-style request is available through both the SmartConnect zone name and IP address of a node.
• Configure the S3 base domain to your SmartConnect zone name using the WebUI or CLI.
3.1 Buckets
OneFS requires a bucket to map to a specific directory in an access zone. This directory is called the bucket
path. If the bucket path is not specified, a default path is used, which is configurable at an access zone level
through the WebUI or CLI. When creating a bucket, OneFS creates a directory with a prefix of .isi_s3_ under
the bucket path, and creates 16 other subdirectories named 0 through 15 under the .isi_s3_ directory. An
example of this name is .isi_s3_ 1_1000000010001. The 16 subdirectories are used to store temporary files
for the PUT operation. OneFS automatically balances different temporary files between the directories for
better performance. Figure 4 shows the process of putting an object to the OneFS cluster which uses the
temporary directory under the bucket.
Bucket: finance
1 Path: /ifs/data/finance
PUT /docs/feb/sample.pdf
Host: finance.sc.example.local:9020
3.2 Objects
The AWS S3 data model is a flat structure—you create a bucket, and the bucket stores objects. There is no
hierarchy of subbuckets or subfolders. However, AWS S3 provides a logical hierarchy using object key name
prefixes and delimiters to support a concept of folders. For example, instead of naming an object sample.jpg
in the bucket named examplebucket, you can name it photos/samples/sample.jpg. The photos/samples/
becomes the object key name prefix using the slash (/) as delimiter.
OneFS maps objects to files and creates directories as object key name prefixes implicitly. The OneFS file
system requires the following rules for object naming:
• Objects cannot contain a prefix that conflicts with an existing object key name in its path. For
example, creating the object /document and /document/sample.docx is not allowed within the
same bucket.
• You cannot use a period (.) or double period (..) as object key name or part of object key name prefix.
For example, you can create the object /.sample.jpg but not the object /./sample.jpg.
• You cannot use .snapshot as the object key name or part of an object key name prefix; this is
reserved for OneFS SnapshotIQ.
Parts are uploaded to the temporary directory .isi_s3_parts_UploadId, and the temporary directory is
created under the target directory. Upon receiving the complete multipart-upload request, OneFS
concatenates the temporary files to the target file. You can access the file as object in your bucket. Starting
from OneFS 9.3.0, MPU complete request is semi-instant if met following conditions:
Most S3 applications do not send the MD5 value in their requests, thus, OneFS generates a unique string for
that file as an ETag in response. This behavior causes many issues with applications that rely on the ETag
value. Starting from OneFS 9.2, OneFS introduces two new options to allow administrators to specify if the
MD5 should be calculated and verified.
These two options are under the S3 zone settings. You can configure them using CLI isi s3 settings zone
modify --use-md5-for-etag=true/false --validate-content-md5=true/false or WebUI, shown in Figure 5.
S3 ETag options
Table 3 shows the different behavior by using the two new options.
--use-md5-for-etag=false --use-md5-for-etag=true
--validate-content-md5=false This is the default value. If “Content-MD5” exists in client
request and its value is properly
If “Content-MD5” exists in client encoded as BASE64 format, OneFS
request, OneFS uses it directly as uses it as the ETag without
the ETag without validation and validation.
checking the BASE64 encoding
format. If “Content-MD5” does not exist in
client request, OneFS calculates the
If “Content-MD5” does not exist in MD5 value as the ETag.
client request, OneFS generates a
unique string for that file as the
ETag.
--validate-content-md5=true If “Content-MD5” exists in client If “Content-MD5” exists in client
request and its value is properly request, OneFS calculates the MD5
encoded as BASE64 format, value and compare with the MD5
OneFS calculates the MD5 value value from client request, if matched,
and compare with the MD5 value uses it as the ETag. Otherwise, an
from client request, if matched, error is returned to client.
uses it as the ETag. Otherwise, an
error is returned to client. If “Content-MD5” does not exist in
client request, OneFS calculates the
If “Content-MD5” does not exist in MD5 value as the ETag.
client request, OneFS generates a
unique string for that file as the
ETag.
Note: Objects created with a multipart upload request do not use MD5 value as ETag.
There are two types of uploading options when authenticating requests using the Authorization header of
AWS Signature Version 4:
Some applications are using the chunked upload option by default, like AWS S3 Java SDK. Starting with
OneFS 9.3.0, the chunked upload is introduced to enable these applications to work with OneFS S3 service
seamlessly.
With chunked upload, you can break up your payload into chunks. These can be fixed or variable-size
chunks. By uploading data in chunks, you avoid reading the entire payload to calculate the signature. Instead,
for the first chunk, you calculate a seed signature that uses only the request headers. The second chunk
contains the signature for the first chunk, and each subsequent chunk contains the signature for the chunk
that precedes it. At the end of the upload, you send a final chunk with 0 bytes of data that contains the
signature of the last chunk of the payload. You can refer to AWS S3 Chunked Upload for more details.
The access ID indicates who the user is. OneFS generates one access ID per user. For example, OneFS
may generate access ID 1_joe_accid for user joe. The prefix number represents the user’s access zone ID.
Each access ID would have a latest secret key without an expiry time set and an old secret key that has an
expiry time set.
The secret key is used to generate the signature value along with several request-header values. After
receiving the signed request, OneFS uses the access ID to retrieve a copy of the secret key internally,
recompute the signature value of the request, and compare it against the received signature. If they match,
the requester is authenticated, and any header value that was used in the signature is now verified to be
untampered as well.
OneFS treats unauthenticated requests as anonymous requests made by the nobody user (UID 65534). If an
object is uploaded to a bucket through an unauthenticated request, the nobody user owns the object and has
FULL_CONTROL to the object. To prevent an anonymous request from modifying OneFS buckets and
objects, we recommend not to use ACLs that give the nobody user write access to your bucket.
For example, use the following OneFS CLI to grant or revoke write permissions on the bucket for the nobody
user.
OneFS
User No Yes
authenticates S3 Bucket ACL
request Anonymous request through Authenticated?
request? signature
No
Access No
denied Authorized?
Yes
Access granted
Figure 8 shows the authentication and authorization evaluation for the object operation.
OneFS
No Yes PUT/DELETE Yes
authenticates S3
Anonymous Bucket ACL
request through Authenticated? objects?
request? signature
No No
No Access
Access
User
denied Authorized? denied
request
Yes
No
4.3 ACL
OneFS implements both the bucket ACL and object ACL to control user access. Every bucket and object has
an associated ACL. The ACL defines which OneFS users or groups are granted access and the type of
access.
• READ: Allows users to read the object data and its metadata.
• WRITE: Not applicable but can be set on an object. When updating an object, S3 overwrites the
entire object instead of modifying existing data. Overwriting an object requires the WRITE permission
at the bucket level. It also requires the WRITE permission on the directory containing the object at the
OneFS level, where the directory is the largest prefix of the object (for example, if the object is
/a/b/c.txt, the directory is /a/b/). When this permission is set on an object, OneFS assigns the
corresponding OneFS ACL permission on disk so that the object data can be modified through other
protocols.
OneFS always handles permission enforcement based on the OneFS ACL. The object ACL is for
representation only. The object ACL only defines ALLOW ACL, whereas OneFS ACL also defines DENY
ACL. When viewing the object ACL through S3, it is possible that only a subset of the actual ACL is shown.
To reconcile the S3 object ACL and OneFS ACL, OneFS provides the following object ACL policy along with
the mapping relationship shown as Table 4:
• Replace: When S3 modifies a OneFS ACL, OneFS removes the existing ACL (including both
ALLOW and DENY settings) and replaces it with the new S3 ACL. This is the default setting.
• Deny: OneFS ACL modification by S3 is not allowed. The S3 client receives a permission-denied
error when trying to modify ACL.
The object ACL permissions are mapped to the following OneFS ACL permissions respectively in Table 4.
See the white paper Access Control Lists on Dell EMC PowerScale OneFS on Dell.com/StorageResources
for more details about OneFS ACL permissions.
S3 predefined groups
S3 predefined Group OneFS predefined Group Note
Authenticated Users Authenticated Users Only authenticated request is allowed in this group
group
All Users group Everyone Both authenticated request and unauthenticated
request are allowed in this group
Log Delivery group Not supported OneFS does not support this group
Storage technical documents and videos provide expertise that helps to ensure customer success on Dell
Technologies storage platforms.