HyperStoreAdminGuide V 7.5
Administration Guide
Version 7.5
Confidentiality Notice
The information contained in this document is confidential to, and is the intellectual property of, Cloudian,
Inc. Neither this document nor any information contained herein may be (1) used in any manner other than
to support the use of Cloudian software in accordance with a valid license obtained from Cloudian, Inc, or (2)
reproduced, disclosed or otherwise provided to others under any circumstances, without the prior written permission of Cloudian, Inc. Without limiting the foregoing, use of any information contained in this document in
connection with the development of a product or service that may be competitive with Cloudian software is
strictly prohibited. Any permitted reproduction of this document or any portion hereof must be accompanied
by this legend.
Contents
What's New in HyperStore Version 7.5 15
APIs -- New Features and Enhancements 15
System Behavior and Management -- New Features and Enhancements 16
Documentation -- New Features and Enhancements 17
1.3.7. Auditing 27
2.6.1. S3 API 66
5.9.1. Repairing a Node That's Been Down for Longer than the Proactive Repair Limit 247
5.13.8. Move the Credentials DB Master Role or QoS DB Master Role 270
7.2.3. Pushing Configuration File Edits to the Cluster and Restarting Services 411
7.3.10. Redis (Credentials DB and QoS DB) and Redis Monitor Logs 530
l CreateVirtualMFADevice
l DeactivateMFADevice
l DeleteVirtualMFADevice
l EnableMFADevice
l ListMFADevices
l ListVirtualMFADevices
l ResyncMFADevice
l GET /user/mfa/list
l POST /user/mfa/deactivateDevice
l POST /user/mfa/enableDevice
l POST /user/mfa/resyncDevice
l POST /user/mfa/verify
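As a rough illustration of how a client might call one of the Admin API MFA endpoints listed above, the sketch below builds a request for GET /user/mfa/list. The endpoint host, port, Basic-auth credentials, and the userId/groupId query parameters are all assumptions (modeled on other /user Admin API calls); consult the Cloudian HyperStore Admin API Reference for the actual request format for your system.

```python
import base64

# Placeholder values -- substitute your own Admin API endpoint and credentials.
ADMIN_ENDPOINT = "http://admin.example.com:18081"
ADMIN_USER, ADMIN_PASS = "admin", "password"

def mfa_list_request(user_id: str, group_id: str):
    """Build the URL and auth header for GET /user/mfa/list.

    The userId/groupId query parameters are an assumption here, modeled
    on other /user calls in the Admin API.
    """
    url = f"{ADMIN_ENDPOINT}/user/mfa/list?userId={user_id}&groupId={group_id}"
    token = base64.b64encode(f"{ADMIN_USER}:{ADMIN_PASS}".encode()).decode()
    return url, {"Authorization": "Basic " + token}
```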
More information:
l The "IAM API" section of the Cloudian HyperStore AWS APIs Support Reference
l The "user" section of the Cloudian HyperStore Admin API Reference
More information:
l PutBucketPolicy in the "S3 API" section of the Cloudian HyperStore AWS APIs Support Reference
15
What's New in HyperStore Version 7.5
S3 Service now restricts "Version" for new bucket policies and IAM policies
The "Version" in a bucket policy or an IAM policy is now required to be "2012-10-17" (as opposed to older AWS
policy versions such as "2008-10-17"). This restriction is applied only to newly created policies. It does not
apply to policies created before the upgrade to HyperStore 7.5.
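To make the restriction concrete, here is a minimal sketch of a bucket policy that uses the required "Version" value, plus a check mirroring the new behavior. The bucket name, statement ID, principal, and action are illustrative only, not taken from HyperStore.

```python
# A minimal bucket policy document using the required "Version" value.
# The Sid, Principal, Action, and Resource below are illustrative only.
policy = {
    "Version": "2012-10-17",       # required for policies created in 7.5+
    "Statement": [{
        "Sid": "AllowPublicRead",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",
    }],
}

def version_accepted(doc: dict) -> bool:
    """Mirror the 7.5 restriction: only "2012-10-17" is accepted for
    newly created bucket policies and IAM policies."""
    return doc.get("Version") == "2012-10-17"
```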
Note For both the S3 Service and the CMC's built-in S3 client, the only bucket ownership types currently supported are "BucketOwnerPreferred" and "ObjectWriter". The "BucketOwnerEnforced" ownership type -- which is supported by AWS -- is not yet supported by HyperStore.
More information:
l If you do not have a HyperStore Single-Node System and want to learn more, contact your Cloudian
sales representative.
l If you have a HyperStore Single-Node System, your HyperStore documentation set is oriented toward
Single-Node operation.
Users who enable MFA on their account can subsequently disable MFA if they no longer wish to use it.
System admins cannot enable MFA on other users' accounts. However, system admins can disable MFA on a
user account for which it has been enabled by the user (which may be necessary if for example the user is
unable to log into the CMC because of a problem with their MFA application).
Note For the MFA feature to work, the IAM Service must be running in your HyperStore system: in "HyperStore Configuration Files" (page 418), the iam_service_enabled setting must be set to true, and you must include the IAM Service endpoint in your DNS set-up if you haven't already done so.
More information:
l For enabling MFA on one's own account: While on the CMC's Security Credentials page, click Help
l For disabling MFA for a user: While on the CMC's Manage Users page, click Help
hsstool whereis command now shows expected location of missing replicas or fragments
The command hsstool whereis -- which shows where the replicas or erasure coded fragments of an object are
located across the cluster -- has been extended to show not just the locations where replicas or erasure coded
fragments are found, but also any locations where replicas or fragments are expected to be but are missing.
Buckets created in HyperStore 7.4 and later use a different metadata structure than older buckets. Starting in
HyperStore 7.5, all buckets created in HyperStore 7.4 and later use an improved tombstone monitoring and
cleanup mechanism that leverages the new metadata structure. Such buckets are able to support mass deletes
without negative service impact (they are not subject to the 100,000 deletes per hour limit beyond which older
buckets can exhibit negative impacts such as difficulty processing S3 ListObjects calls).
If you have buckets that were created prior to HyperStore 7.4 and that need to be able to support mass deletes,
contact Cloudian Support for assistance in upgrading those buckets to the newer metadata structure (known as
"rules based partitioning").
The HyperStore Administrator's Guide (in PDF) has been pared down: the API reference material and the CMC reference material (Help for each CMC screen) have been removed from it. The documentation is now organized as follows:
PDF:
l Contains all of the information from the documents listed above, except for installation instructions
l Also contains CMC reference material (Help for each CMC screen)
Note The Help no longer has section numbering. There are multiple reasons for this change, including that there is no longer an exact correspondence between the Help scope and the Admin Guide scope, so it is no longer possible to have Help section numbering that matches the Admin Guide's section numbering.
Chapter 1. Introduction to HyperStore
The Help is available through the CMC (by clicking the Help button) and is also available in the directory /opt/cloudian-staging/7.5/doc/HyperStoreHelp on your Configuration Master node (in that directory you can open the HyperStoreHelp.html file). The PDF guides are available in the directory /opt/cloudian-staging/7.5/doc/HyperStorePDFManuals on the Configuration Master node.
The Help has all of the content from the Installation Guide, the Administrator's Guide, and the API guides.
The Help features a built-in search engine. The search box is in the upper right of the interface. As with any
search engine, enclose your search phrase in quotes if you want to limit the results to exact match only.
In the Help, in most cases screen shots are presented initially as small thumbnail images. This allows for a
more compact initial view of the content on a page and makes it easier for you to skim through the text on the
page. If you want to see the full-size image, simply hold your cursor over it.
Also in the interest of presenting a compact initial view of the content on a page, the Help often makes use of
expandable/collapsible text. To expand (or subsequently collapse) such text you can click on the triangle icon
to the left of the text or on the text itself.
To expand or collapse all of the expandable/collapsible text on a page, click the button in the upper left of the Help interface.
The Help is also responsive to the type of user logged in to the CMC: a system administrator logged in to the
CMC sees the full Help content, but a group administrator or regular user logged in to the CMC sees only the
content applicable to their role. Also, in the Help that group admins and regular users see, there is no Cloudian
branding and no references to HyperStore or the Cloudian Management Console.
Note A system administrator logged into the CMC can download any of the HyperStore PDF documents listed above. A group administrator or regular user logged into the CMC can only download the Cloudian HyperStore AWS APIs Support Reference PDF document.
The HyperStore system is designed specifically to meet the demands of high volume, multi-tenant data storage:
l Amazon S3 API compliance. The HyperStore system is fully compatible with Amazon S3's HTTP REST API. Customers' existing S3 applications will work with the HyperStore service, and existing S3 development tools and libraries can be used for building HyperStore client applications.
l Secure multi-tenancy. The HyperStore system provides the capability to securely have multiple user groups reside on a single, shared infrastructure. Data for each user is logically separated from other users' data and cannot be accessed by any other user unless access permission is explicitly granted.
l Quality of service controls. HyperStore system administrators can set storage quotas and usage rate limits on a per-group and per-user basis. Group administrators can set quotas and rate controls for individual members of the group.
l Access control rights. Read and write access controls are supported at per-bucket and per-object granularity. Objects can also be exposed via public URLs for regular web access, subject to configurable expiration periods.
l Reporting and billing. The HyperStore system supports usage reporting on a system-wide, group-wide, or individual user basis. Billing of groups or users can be based on storage quotas and usage rates (such as bytes in and bytes out).
l Tiering and replication to external systems. The HyperStore system supports having objects be tiered (moved) to an external S3-compatible destination system on a defined schedule. The system also supports having objects be replicated to an external S3-compatible destination system.
l Object Lock (Write Once Read Many). The HyperStore system supports the S3 APIs for applying Object Lock to buckets and objects, so that object versions cannot be deleted or altered for a defined retention period.
l Horizontal scalability. Running on commodity off-the-shelf hardware, a HyperStore system can scale up to hundreds of nodes across multiple data centers, supporting millions of users and hundreds of petabytes of data. New nodes can be added without service interruption.
l High availability. The HyperStore system has a fully distributed, peer-to-peer architecture, with no single point of failure. The system is resilient to network and node failures with no data loss, due to the automatic replication and recovery processes inherent to the architecture. A HyperStore cluster can be deployed across multiple data centers to provide redundancy and resilience in the event of a data center scale disaster.
A valid Cloudian software license is required to run HyperStore software; both evaluation and production licenses are available. Before using HyperStore software, you must obtain a license from Cloudian.
l Expiration date
l Maximum allowed on-premise storage volume
l Maximum allowed tiered storage volume
l Object lock functionality enabled or disabled
l HyperIQ license level
You can see the attributes of your particular HyperStore license by accessing the CMC's Cluster Information
page (Cluster -> Cluster Config -> Cluster Information).
The sections that follow describe these attributes and their enforcement in more detail.
If you reach the warning period preceding your license expiration, then when you use any part of the CMC, the
top of the interface displays a warning that your license expiration date is approaching.
If you reach your license expiration date you enter a grace period, per the terms of your contract. During the
grace period:
l In the CMC, the top of the screen displays a warning indicating that your license has expired and that
your HyperStore system will be disabled in a certain number of days (the number of days remaining in
your grace period).
l The system still accepts and processes incoming S3 requests, but every S3 response returned by the
S3 Service includes an extension header indicating that the system license has expired (header name:
x-gemini-license; value: Expired: <expiry_time>).
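The steps above mean an S3 client can detect license expiry from response headers during the grace period. The following is a hypothetical client-side helper, not part of HyperStore itself, showing how such a check might look; the header name and value format are taken from the description above.

```python
def license_status(headers: dict) -> str:
    """Inspect an S3 response's headers for the x-gemini-license
    extension header described above. Returns "ok" if the header is
    absent, otherwise the header value (e.g. "Expired: <expiry_time>").
    """
    # HTTP header names are case-insensitive, so normalize keys first.
    lowered = {k.lower(): v for k, v in headers.items()}
    return lowered.get("x-gemini-license", "ok")
```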
If you reach the end of your grace period after the license expiration date:
l No S3 service is available for end users. All incoming S3 requests will be rejected with a "503 Service
Unavailable" error response. The response also includes the expiration header described above.
l You can still log into the CMC to perform system administration functions (including applying an
updated license), but you will not be able to access users' stored S3 objects.
l The top of the CMC screen will display an error message indicating that your license has expired and
that your HyperStore system has been disabled. Also, in the CMC's Dashboard page ( ) the Cluster
Health panel will indicate that the system is disabled.
l If you stop the S3 Service on a node you will not be able to restart it. This applies also to the Admin Service, since that service stops and starts together with the S3 Service.
It's best to update your license well in advance of your license expiration date. See "License Updating" (page
27) below.
With a license based on Net storage, the limit is on total object storage bytes minus overhead from storage policies (object replication or erasure coding). For example, if a 1GB object is replicated three times in your system it counts as only 1GB toward a Net storage limit. A Net storage license is typically used if your cluster consists entirely of software-only nodes, with no HyperStore Appliance nodes.
With a license based on Raw storage, the limit is on the total raw storage capacity used in your system. All HyperStore object data and metadata count toward this limit, including storage overhead from replication or erasure coding. For example, if a 1GB object is replicated three times in your system it counts as 3GB toward a Raw storage limit. Likewise, all object metadata and system metadata count toward a Raw storage limit.
l Appliance-only environment. Each HyperStore Appliance has its own amount of licensed Raw storage capacity. If your system consists entirely of HyperStore Appliances, then the Raw licensed storage capacity for your whole system is simply the sum of the individual Appliance licensed capacities.
l Mixed environment of Appliances and software-only nodes. In a mixed environment, the Raw licensed storage capacity for your whole system is the sum of the individual Appliance licensed capacities plus an additional raw capacity allowance that Cloudian builds into your license to accommodate the software-only nodes.
If your license is based on Raw storage, then your total licensed Raw storage limit will be automatically increased if you add a new HyperStore Appliance node to the system. The amount of Raw storage added to your licensed system maximum depends on the particular HyperStore Appliance that you've added to your system. Conversely, if you remove an Appliance from your system this will reduce your total licensed system Raw storage maximum; and the Raw storage allowance associated with a particular Appliance machine cannot be transferred to other nodes in your system.
Note If your HyperStore licensed maximum storage is in terms of Net bytes, adding a HyperStore Appliance to your cluster will not change your Net storage limit. If you are interested in increasing your Net storage limit or converting to a Raw storage limit, consult with Cloudian Support.
In the CMC's Cluster Information page you can view the Net or Raw licensed usage maximum for your whole
system and also your current system-wide Net or Raw bytes usage count. If your current usage level exceeds
70% of your licensed maximum usage the CMC displays a warning message in both the Cluster Information
page and the Dashboard page. If your current usage level exceeds 90% of your licensed maximum usage the
CMC displays a critical message in both of those pages.
The system automatically enforces your total storage usage maximum by no longer allowing S3 clients to upload data to the system. This enforcement kicks in when your system stored byte count reaches 110% of your licensed maximum usage. At that point the system rejects S3 PUT and POST requests and returns an error to the S3 clients. This continues until one of the following occurs:
l You delete object data so that the system byte count falls below 100% of your licensed maximum.
HyperStore checks every five minutes to see if your storage usage has fallen below the licensed maximum, and if it does fall below the maximum then S3 PUTs and POSTs will again be allowed.
Note In the case of Raw usage, your deletions will not impact your system's raw usage count
until the hourly system cron job for processing the object deletion queue runs. By contrast, a
Net usage count is decremented immediately when you delete objects.
l You acquire and install a new license with a larger storage maximum (see "License Updating" (page
27)). Upon new license installation, S3 PUTs and POSTs will again be allowed.
If the system stored byte count reaches 110% of your licensed maximum usage the CMC will display a pop-up
warning message to the system administrator whenever he or she logs in. This will recur on each login event
until the system byte count falls below 100% of usage, or a new license with larger storage maximum has been
installed.
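The warning, critical, and enforcement thresholds described above can be restated as a small decision function. This is a hypothetical restatement for clarity, not HyperStore code; note it simplifies the resume behavior (writes actually resume once usage falls back below 100% of the licensed maximum, checked every five minutes, or when a larger license is installed).

```python
def license_usage_state(used_bytes: int, licensed_max: int) -> str:
    """Classify system storage usage against the licensed maximum:
    warning above 70%, critical above 90%, and S3 PUT/POST rejection
    once usage reaches 110% of the licensed maximum."""
    ratio = used_bytes / licensed_max
    if ratio >= 1.10:
        return "writes-rejected"   # S3 PUTs/POSTs return errors
    if ratio > 0.90:
        return "critical"          # critical message in the CMC
    if ratio > 0.70:
        return "warning"           # warning message in the CMC
    return "ok"
```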
IMPORTANT ! Regardless of whether your system has a Net storage license or a Raw storage license,
if the data disks on a node become 90% full then that node will stop accepting new S3 writes. This
is not a license enforcement mechanism but rather a system safety feature. For more information see
"Automated Disk Management Feature Overview" (page 281).
In regard to HyperStore licensing, the system treats auto-tiered data as a separate category from on-premise data. Data that's been auto-tiered out of your HyperStore system and is now stored in an external, third party system counts toward a Maximum Tiered Storage limit -- not toward your on-premise storage limit.
The enforcement of this separate tiered storage limit works in largely the same way as the enforcement of the
on-premise storage limit.
In the CMC's Cluster Information page you can view your tiered storage limit and also your current tiered storage usage level. If your tiered usage level exceeds 70% of your licensed tiered storage maximum the CMC displays a warning message in the Cluster Information page. This becomes a critical message if the usage exceeds 90% of the licensed maximum.
The system automatically enforces the tiered usage maximum by no longer allowing auto-tiering to any third party destination system. This enforcement kicks in when your tiered storage byte count reaches 110% of your licensed maximum. At this point auto-tiering to third party destinations will no longer work until one of the following occurs:
l Through the HyperStore interface -- i.e. the CMC or the HyperStore S3 API -- you delete auto-tiered
data so that the total tiered byte count falls below 100% of your licensed maximum. HyperStore checks
every five minutes to see if your tiered storage usage has fallen below the licensed maximum, and if it
does fall below the maximum then auto-tiering to non-HyperStore destinations is allowed again and
automatically resumes.
IMPORTANT ! If you're trying to reduce your tiered storage volume to below your licensed maximum, be sure to delete auto-tiered objects through a HyperStore interface and not directly through one of the tiering destination system's interfaces. If you do the latter, HyperStore will not detect that you've reduced your tiered storage volume. For more information see "Accessing Auto-Tiered Objects" (page 136).
l You acquire and install a new license with a larger tiered storage maximum (see "License Updating"
(page 27)). Upon new license installation, auto-tiering to third party destinations will be allowed again
and will automatically resume.
If the tiered byte count reaches 110% of your licensed maximum usage the CMC will display a pop-up warning
message to the system administrator whenever he or she logs in. This will recur on each login event until the
tiered byte count falls below the licensed maximum, or a new license with larger tiered storage maximum has
been installed.
If auto-tiering to third party destination systems stops because you've exceeded 110% of your licensed maximum, and then later resumes when you come back into compliance with your license, the system will auto-tier any objects that were flagged for auto-tiering during the period when auto-tiering was halted for license non-compliance. As long as you come back into compliance, the period of non-compliance will not result in any permanent failures to auto-tier objects that were supposed to be auto-tiered based on users' bucket lifecycle configurations.
For more information about this feature, including descriptions of the two types of licensed Object Lock support
-- "Compatible Object Lock" and "Certified Object Lock" -- see "Object Lock Feature Overview" (page 144).
If your current license does not support Object Lock and you want to use this feature, or if you want to switch
your licensed Object Lock type from "Compatible" to "Certified" or vice versa, contact Cloudian Support.
Your HyperStore license has a HyperIQ attribute that determines the level of HyperIQ functionality available to
you if you acquire and set up the HyperIQ virtual appliance:
l Basic -- HyperIQ dashboards for OS and service status monitoring are supported indefinitely. This is the
default.
l Enterprise -- HyperIQ dashboards for OS and service status monitoring are supported indefinitely, and an S3 analytics dashboard is also supported until a defined expiration date. The presence of the S3 analytics dashboard is what distinguishes Enterprise level HyperIQ support from Basic HyperIQ support.
Once you've obtained a new license file you can use the CMC's Information page to dynamically apply the
new license file to your HyperStore system. For instructions see Install a New License File.
1.3.7. Auditing
If you have a production license for HyperStore software, the system will regularly transmit auditing data to
Cloudian, Inc., using the system’s Smart Support functionality.
l Single data center constituting a single region. This is the simplest deployment topology, where the
whole HyperStore system consists of a single data center in which multiple HyperStore nodes are running. The one DC constitutes its own region. The number of nodes can scale from a minimum of three -- the smallest viable size of a HyperStore system -- up to dozens of nodes, all running in one DC. Adding more nodes within a DC is the most common way to add capacity to the system. For more information see "Capacity Monitoring and Expansion" (page 298).
l Multiple data centers in a single region. With this topology, HyperStore nodes running in multiple data centers comprise one unified storage cluster. Typically the motivation for this type of deployment is replication of data across DCs, for the purposes of data protection, service resilience, and/or disaster recovery. This cross-DC replication is configurable through the use of HyperStore storage policies.
l Multiple service regions. With this topology the HyperStore system spans multiple service regions, each of which consists of one or more data centers. Each region has its own separate S3 service endpoint (to which S3 clients submit requests), its own independent storage cluster, and its own separate inventory of stored objects. In a multi-region HyperStore system, the regions are in most respects separate S3-compatible object storage systems -- with the significant exceptions that the same population of authorized end users has access to all the service regions, and that HyperStore affords a substantial degree of unified administration across the multiple regions. Typically the motivation for having multiple service regions is to allow users to choose one geographic region or another for storing their data, for reasons of proximity or regulatory compliance.
For a diagram showing the relation between nodes, data centers, and service regions see "System Levels"
(page 42).
In a multi-region system, each region has its own set of storage policies, and each region's storage policies
operate only within that region. As noted previously, each region is essentially an independent storage cluster
with its own inventory of objects -- and its own storage policies for distributing data within the region.
For more information about storage policies, including details about multi-DC storage policies, see "Storage
Policies Feature Overview" (page 91).
For information about a supported option for asynchronously replicating data from a bucket in one region to a
different bucket in a different region, see "Cross-Region Replication Feature Overview" (page 138).
For a summary of services and how they are allocated, see "HyperStore Services Overview" (page 34).
For a diagram showing typical services distribution in a multi-DC region, see "Services Distribution -- Multi-DC, Single Region" (page 45).
Also, in a multi-DC region each DC has its own sub-set of HyperStore nodes that are automatically configured
to act as internal NTP servers. For more information see "NTP Automatic Set-Up" (page 511).
l One of the regions serves as the default region. The default region plays several roles in a multi-region
system. For example:
o If service users do not specify a region when they create a new S3 storage bucket, the system
will create the bucket in the default region.
o Only the Admin Service instances in the default service region support the full Admin API.
l Each region has its own S3 service endpoint (URI used by client applications for HTTP access).
l Each region has its own independent object storage cluster and by default there is no object replication
across regions (although there is an option for cross-region replication on a bucket-to-bucket basis).
l When users create a new bucket they choose which region the bucket will be created in.
l User access credentials are valid across the system as a whole. In support of user authentication, a
single, uniform Credentials DB serves the entire multi-region system. There is just one Credentials DB
master node for the whole system, and that node is located in your default region. Within each region,
there are two Credentials DB slave nodes per data center.
l Quality of service (QoS) controls are implemented separately in each region. The QoS limits that you
establish for a service region will be applied only to user activity in that particular region. In support of
QoS implementation, each region has its own independent QoS DB. Each regional QoS DB has its own
master node. In each region there is also one QoS DB slave node per data center.
l The Redis Monitor application will monitor the Credentials DB and QoS DBs in all the regions (and if necessary trigger failover of the master role within each database). One primary Redis Monitor application instance serves the whole multi-region system, and if the primary Redis Monitor instance goes down the backup instance takes over. The primary Redis Monitor instance and backup Redis Monitor instance are on separate nodes in your default region.
l Group and user profile information is stored only in the Metadata DB in the default region. HyperStore services in non-default regions access the Metadata DB in the default region to retrieve this group and user information as needed.
l Just one Configuration Master node is used to propagate system configuration settings throughout the whole multi-region system, during system installation and for ongoing system configuration management.
For a diagram showing typical services distribution in a multi-region system, see "Services Distribution --
Multi-Region" (page 46).
more new data centers and updating the system configuration to reflect the addition of the new service region
and data center[s]). For instructions see:
HyperStore also supports the option of installing HyperStore software to multiple DCs or regions from the outset, upon the initial system installation. From a HyperStore system configuration perspective the key here is the "survey file", which the HyperStore system_setup.sh tool helps you to create. By responding to that tool's interactive prompts, you create a survey file that identifies (among other things) the name of the data center and region that each node resides in. Subsequently the installer tool (cloudianInstall.sh) installs HyperStore software to each of those nodes and configures the system to be a single-DC, multi-DC, or multi-DC / multi-region system, in accordance with your survey file.
Below is an example of an installation survey file for a 3-node HyperStore system that will be configured as just
a single service region with just a single data center. Note that you must provide a data center name and a
region name even if you will have just one DC constituting just one region. Here the region name is "tokyo" and
the data center name is "DC1".
tokyo,cloudian-vm7,66.10.1.33,DC1,RAC1
tokyo,cloudian-vm8,66.10.1.34,DC1,RAC1
tokyo,cloudian-vm9,66.10.1.35,DC1,RAC1
Here is a second example, this time for a system that will be installed as a single-region system with two data
centers:
tokyo,cloudian1,66.1.1.11,DC1,RAC1
tokyo,cloudian2,66.1.1.12,DC1,RAC1
tokyo,cloudian3,66.1.1.13,DC1,RAC1
tokyo,cloudian4,67.2.2.17,DC2,RAC1
tokyo,cloudian5,67.2.2.18,DC2,RAC1
tokyo,cloudian6,67.2.2.19,DC2,RAC1
Below is a third example, this time for a system that will be installed as a two-region system. Note that in this
example, the "tokyo" region encompasses two data centers while the "osaka" region consists of just one data
center.
tokyo,cloudian1,66.1.1.11,DC1,RAC1
tokyo,cloudian2,66.1.1.12,DC1,RAC1
tokyo,cloudian3,66.1.1.13,DC1,RAC1
tokyo,cloudian4,67.2.2.17,DC2,RAC1
tokyo,cloudian5,67.2.2.18,DC2,RAC1
tokyo,cloudian6,67.2.2.19,DC2,RAC1
osaka,cloudian7,68.10.3.24,DC3,RAC1
osaka,cloudian8,68.10.3.25,DC3,RAC1
osaka,cloudian9,68.10.3.26,DC3,RAC1
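As the three examples show, each survey file line has the form region,hostname,ip,datacenter,rack. The sketch below parses such lines into a per-region, per-DC grouping; it is an illustration of the file format, not a HyperStore tool.

```python
import csv
import io
from collections import defaultdict

def parse_survey(text: str):
    """Parse survey file lines of the form
    region,hostname,ip,datacenter,rack
    into a nested {region: {datacenter: [node, ...]}} structure."""
    topology = defaultdict(lambda: defaultdict(list))
    for region, host, ip, dc, rack in csv.reader(io.StringIO(text)):
        topology[region][dc].append({"host": host, "ip": ip, "rack": rack})
    return topology

# Lines taken from the examples above (abbreviated).
survey = """\
tokyo,cloudian1,66.1.1.11,DC1,RAC1
tokyo,cloudian4,67.2.2.17,DC2,RAC1
osaka,cloudian7,68.10.3.24,DC3,RAC1
"""
topo = parse_survey(survey)
# topo now groups nodes under regions "tokyo" (DC1, DC2) and "osaka" (DC3).
```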
For more information about installation including node preparation, DNS, and load balancer requirements, see
the "Installing HyperStore" section of the HyperStore Help.
l The Capacity Explorer page (Analytics -> Capacity Explorer) lets you see how much free storage
capacity remains in each DC (as well as in each node, and also in the service region as a whole).
l The Data Centers page (Cluster -> Data Centers) shows you your node inventory in each DC and
provides summary status information for each node. From this page you can also add nodes to your
cluster on a per-DC basis.
l The Object Locator page (Analytics -> Object Locator) lets you see exactly where all of a specified object's replicas or erasure coded fragments are located (on which nodes, in which DC).
l For some Admin API calls you can optionally use a "region" URI parameter to indicate that you want the
operation applied to a particular region. For example, the syntax for retrieving a user’s rating plan is:
GET /user/ratingPlan?userId=string&groupId=string[&region=string] HTTP/1.1
For such API calls, if you do not specify a region then the default region is presumed.
l Certain Admin API calls are only supported by the Admin Service in the default region. If you submit
these calls to the Admin Service in a non-default region, you will receive a 403: Forbidden response.
For more information see the Introduction section in the Cloudian HyperStore Admin API Reference.
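For illustration, the optional region parameter is appended to the query string like any other field; a minimal sketch of building such a request URL (the host and port in `base` are placeholders, not documented HyperStore values):

```python
from urllib.parse import urlencode

def rating_plan_url(user_id, group_id, region=None,
                    base="https://admin.example.com:19443"):
    # Builds a URL following the GET /user/ratingPlan syntax shown above.
    params = {"userId": user_id, "groupId": group_id}
    if region is not None:
        params["region"] = region  # omit to target the default region
    return "%s/user/ratingPlan?%s" % (base, urlencode(params))
```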
For services distribution diagrams, see "Services Distribution -- 3 Nodes, Single DC" (page 44).
Specialized Support Services — Credentials DB Master: one node per entire HyperStore system.
Specialized Support Services — Crontab configuration and Monitoring Data Collector: one primary node and one backup node per service region.
Note Within your installation cluster, the HyperStore installer automatically chooses the hosts for ser-
vices that are not intended to run on every node. These host assignments are recorded to an install-
ation configuration file that the installer generates when it runs (CloudianInstallConfiguration.txt in your
installation staging directory). After your installation is completed, these host assignments can also be
viewed on the CMC's Cluster Information page (Cluster -> Cluster Config -> Cluster Information). If
you want to modify these assignments after install, see "Change Node Role Assignments" (page
259).
If you installed HyperStore software on only one node, then all these services will run on that node.
As a HyperStore system administrator, you can use the CMC to perform tasks such as:
Group administrators can perform a more limited range of admin tasks pertaining to their own group, and can
also perform S3 operations such as creating and configuring buckets and uploading and downloading objects.
Regular users can only perform S3 operations such as creating and configuring buckets and uploading and
downloading objects.
The CMC acts as a client to several of the API Services including the Admin Service and the S3 Service.
The "Cloudian Management Console (CMC) Service" (page 35) is a client to the Admin Service. You also
have the option of using a command line tool such as cURL to submit Admin API commands, or building your
own Admin Service client.
For more information about the Admin API see the Cloudian HyperStore Admin API Reference.
The HyperStore IAM, STS, and SQS Services provide partial support for the AWS Identity and Access Man-
agement API, the AWS Security Token Service API, and the AWS Simple Queue Service API, respectively.
For the S3 Service and IAM Service you can use the "Cloudian Management Console (CMC) Service" (page
35) as a client application, or use a third party or custom client application. For the STS and SQS Services, the
CMC does not provide client access and so you must use third party or custom client applications to access
these services.
For more information about HyperStore's implementation of these services see the Cloudian HyperStore
AWS APIs Support Reference.
1.5.4.1. Metadata DB
The HyperStore system stores object metadata, user metadata, and system metadata in the Metadata DB. The
Metadata DB is built on the Apache open source storage platform Cassandra. The HyperStore system creates
and uses several "keyspaces" within Cassandra:
Note There is one UserData_<policyid> keyspace for each storage policy in the system. For
information about storage policies see "Storage Policies Feature Overview" (page 91).
l The AccountInfo keyspace stores information about HyperStore S3 user accounts and group accounts
(including IAM user and group accounts).
l The Reports keyspace stores system-wide, per-group, and per-user S3 usage data, in support of the
HyperStore usage reporting functionality. It will also store per-bucket usage data if you enable per-
bucket usage tracking.
l The Monitoring keyspace stores system monitoring statistics in support of HyperStore’s system mon-
itoring functionality.
l The ECKeyspace keyspace does not actually store any erasure coded object data; rather, the Hyper-
Store system creates this keyspace so that the HyperStore erasure coding feature can leverage Cas-
sandra functions for token-based mapping of objects (erasure coded object fragments, in this case) to
nodes within the storage cluster.
l The Notification keyspace stores bucket notification messages. For more information see the SQS sec-
tion of the Cloudian HyperStore AWS APIs Support Reference.
S3 client applications do not access the Metadata DB directly. Instead, all S3 client access is to the S3 Service,
which in turn accesses the Metadata DB in support of S3 operations. The HyperStore Service and Admin Ser-
vice also access the Metadata DB.
The QoS DB stores user-level and group-level Quality of Service settings that have been established by sys-
tem administrators. The QoS DB is also used to keep count of user requests, so that Quality of Service limits
can be enforced by the system.
The Credentials DB and QoS DB are both built on Redis, an open source, in-memory key-value data store
optimized for fast performance.
The S3 Service, Admin Service, and HyperStore Service are the clients to the Credentials DB and QoS DB.
Communication is through a protocol called Redis Serialization Protocol (RESP).
Within a HyperStore system there is:
l Just one, universal Credentials DB which serves the entire HyperStore deployment.
l A separate, independent QoS DB in each service region.
Each Credentials DB and each QoS DB is implemented across two or more nodes, with the nodes playing dif-
ferent roles. These roles are:
l master — All write requests from DB clients are implemented on the master node. There is only one
master node for each Credentials DB and each QoS DB. In a multi-region HyperStore deployment, the
universal Credentials DB has one master node and each regional QoS DB has its own master node.
l slave — In each Credentials DB and each QoS DB, data from the DB master node is asynchronously
replicated on to one or more slave nodes (at least one slave node per data center). The slave nodes
support doing reads for DB clients but not writes. If a master node fails, the master role is automatically
failed over to a slave node. This fail-over process is managed by the Redis Monitor Service.
Credentials DB and QoS DB roles are assigned to your HyperStore nodes automatically during installation. For
more information on the distribution of these services across the cluster see "Services Distribution -- 3
Nodes, Single DC" (page 44).
1.5.4.3. Checksum DB
The Checksum DB stores digests (MD5 hashes) of the object data files stored in the HyperStore File System
(HSFS). For more information see "HyperStore Service and the HSFS" (page 38), particularly the section on
"File Digests".
The HyperStore system uses a hybrid storage solution where Cassandra is used for storing metadata while the
Linux filesystem on Cassandra nodes is used for storing object data. The area of the Linux file system where
S3 object data is stored is called the HyperStore File System (HSFS).
The general strategy is that Cassandra capabilities are used to determine the distributed data management
information such as the nodes that a specific object's metadata should be written to and the nodes that the
object's data should be written to. Then at the storage layer, the metadata is stored in Cassandra and the
object data is stored in the HSFS.
Within the HSFS, objects can be stored and protected in either of two ways:
l Replicated storage
l Erasure coded storage
For more information on data storage and protection options, see "Storage Policies Feature Overview" (page
91).
When the system stores S3 objects, the full path to the objects will be as indicated below:
o The <policyid> segment indicates the storage policy used by the S3 storage bucket with which
the object is associated.
o The two <000-255> segments of the path are based on a hash of the <filename>, normalized to
a 255*255 number.
o The <filename> is a dot-separated concatenation of the object’s system-assigned token and a
timestamp based on the object's Last Modified Time. The token is an MD5 hash (in decimal
format) of the bucket name and object name. The timestamp is formatted as <UnixTimeMil-
lis><6digitAtomicCounter>-<nodeIPaddrHex>. The last element of the timestamp is the IP
address (in hexadecimal format) of the S3 Service node that processed the object upload
request.
Note For objects last modified prior to HyperStore version 6.1, the timestamp is simply
Unix time in milliseconds. This was the timestamp format used in HyperStore versions
6.0.x and older.
o "hyperstore1" is one of the HyperStore data mount points configured for the system (as specified
by the configuration setting common.csv: hyperstore_data_directory)
o "hsfs" indicates that the object is a replicated object (not an erasure-coded object)
o "1L1tEZZCCQwdQBdGel4yNk" is the Base-62 encoding of the token belonging to the vNode to
which the object instance is assigned
o "c4a276180b0c99346e2285946f60e59c" is the system-generated identifier of the storage policy
used by the S3 storage bucket with which the object is associated.
o "109/154" is a hash of the file name, normalized to a 255*255 number.
o "55898779481268535726200574916609372181.1487608689783689800-0A320A15" is the
file name. The "55898779481268535726200574916609372181" segment is the object’s sys-
tem-assigned token, in decimal format. The "1487608689783689800-0A320A15" segment is the
object's Last Modified Time timestamp, in format <UnixTimeMillis><6digitAtomicCounter>-
<nodeIPaddrHex>.
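The token derivation described above can be sketched as follows; note that the exact way the bucket name and object name are concatenated before hashing is an assumption here (the guide states only that the token is a decimal-formatted MD5 hash of the bucket name and object name):

```python
import hashlib

def object_token(bucket, key):
    # Assumption: the hashed input is "<bucket>/<key>"; the precise
    # concatenation HyperStore uses is not specified in this excerpt.
    digest = hashlib.md5(("%s/%s" % (bucket, key)).encode("utf-8")).hexdigest()
    return int(digest, 16)  # decimal-formatted 128-bit token
```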
On each mount point, there is one Checksum DB for storing digests for replica data files, and one Checksum DB for storing digests for erasure coded data files. The Checksum DBs are built on RocksDB, an open source, high-performance persistent key-value store.
<mountpoint>/digest/hsfs/
<mountpoint>/digest/ec/
Within each Checksum DB, the key is a byte array consisting of the object's token and the file timestamp in bin-
ary format, and the value is the digest itself.
Note
• When an existing object is updated by an S3 client, the object’s token (a decimal formatted MD5
hash of the object key) remains the same but the object’s digest (including a hexadecimal formatted
MD5 hash of the object data) changes.
• For an erasure coded S3 object, each fragment has the same token (based on object key) but a dif-
ferent digest (based on fragment content).
• For multipart S3 objects — uploaded to the system through the S3 API methods for Multipart Uploads
— each part has a different token (since each part has a distinct object key incorporating a part num-
ber) and a different digest (based on part content).
The HyperStore system supports a JMX command for retrieving a digest from a particular node:
For example, using the command line JMX tool cmdline-jmxclient that comes bundled with your HyperStore
system:
For clarity, in the example above an empty line has been inserted between the JMX command and the
response. This example is for a replicated object named LocalInstallProcedure.docx from the bucket named
bucket1. In the response, 68855684469431950092982403183202182439.1477401010696032189-
0A0A1608 is the Checksum DB key for this entry (in format <objectToken>.<timestamp>). The subsequent
lines are the digest contents. 9c741e3e7bbe03e05510071055151a6e is the replica's MD5 hash in
hexadecimal; the /var/lib/cloudian/hsfs/... line is the replica file path and name; and 13164 is the replica's file
size in bytes.
Note In the command line example above, -:- is the USER:PASS value (indicating that the system is
not configured to require a userId and password for JMX access). For cmdline-jmxclient usage inform-
ation, enter the following command:
To check the current cmdline-jmxclient version number (replaced by the wildcard character in the com-
mand above), change to the /opt/cloudian/tools directory and list the directory contents. Look for the
cmdline-jmxclient-<version>.jar file.
l Configuration Master and Agents — HyperStore's cluster configuration management system is built
on the open source version of Puppet. For more information see "Pushing Configuration File Edits to
the Cluster and Restarting Services" (page 411). A Configuration Agent runs on every node. The
Configuration Master runs on one node and is also configured on a backup node. Manual failover to
the backup is supported if the primary Configuration Master instance goes down.
l Cloudian Monitoring Data Collector and Agents — The Cloudian Monitoring Data Collector runs on
one node in each of your service regions (the same node on which the system maintenance cron jobs
run), and regularly collects data from the Monitoring Agents that run on every node. The Monitoring Col-
lector writes its collected node health statistics to the Metadata DB's "Monitoring" keyspace. The Mon-
itoring Collector and the system maintenance cron jobs are also configured on a backup node, and
automatic failover to the backup occurs if the primary node goes offline or if crond goes down on the
primary.
l Redis Monitor — The Credentials DB and QoS DB are both built on Redis. The Redis Monitor mon-
itors Credentials DB and QoS DB cluster health and implements automatic failover of the master node
role within each of the DBs. If the Redis Monitor detects that a DB master node has gone down, it pro-
motes an available slave node to the master node role; and informs the DB’s clients (the S3 Service,
IAM Service, Admin Service, and HyperStore Service) of the identity of the new master. For redundancy,
the Redis Monitor runs on two HyperStore nodes, configured as primary on one node and as backup
on the other node.
l Pre-Configured ntpd — Accurate, synchronized time across the cluster is vital to HyperStore service.
When you install your HyperStore cluster, the installation script automatically configures a robust NTP
set-up using ntpd. In each HyperStore data center four of your HyperStore nodes are automatically con-
figured to act as internal NTP servers, which synchronize with external NTP servers (by default the serv-
ers from the pool.ntp.org project). Other HyperStore hosts in each data center are configured as clients
of the internal NTP servers. For more information see "NTP Automatic Set-Up" (page 511). To see
which of your HyperStore nodes are internal NTP servers and which external NTP servers they are syn-
chronizing with, log into the CMC and go to the Cluster Information page (Cluster -> Cluster Config ->
Cluster Information).
Note If a HyperStore data center has four or fewer nodes, then all of the nodes in that data center are configured to act as internal NTP servers.
l Dnsmasq — Dnsmasq is a lightweight domain resolution utility. This utility is bundled with Cloudian
HyperStore software. The HyperStore interactive installation wizard gives you the option to have dns-
masq installed and configured to resolve HyperStore service domains (specifically the S3 service
domain, the S3 website endpoint domain, and the CMC domain). The dnsmasq utility may be helpful if
you are evaluating a small HyperStore system but it is not appropriate for production use.
A HyperStore deployment is organized at the following levels:
l System
l Region (also known as a "Cluster")
l Data Center
l Node
l vNode
1.6. System Diagrams
Note The diagram excludes certain supporting services such as the Redis Monitor, the Monitoring
Data Collector, and the Configuration Master and Agents. For a complete list of HyperStore services
and the listening ports they use, see "HyperStore Listening Ports" (page 577).
l On every node in your cluster are the S3 Service, IAM Service, Admin Service, HyperStore Service,
Metadata DB, CMC, Configuration Agent, and Monitoring Agent. These collectively are labeled as
"COMMON SERVICES" in the diagram.
l For each specialized service that has a primary instance and a backup instance (such as Configuration
Master or Redis Monitor), the backup resides on a different node than the primary. Likewise the QoS
DB slave will reside on a different node than the QoS DB Master, and the two Credentials DB slaves
will reside on different nodes than the Credentials DB Master.
l If you have a larger cluster in a single DC, you will still have the same number of specialized service
instances as shown in the diagram (for example, one primary Configuration Master instance and one
backup instance) — the only difference is that this set of specialized service instances will be spread
across your cluster rather than concentrated among three nodes as shown in the diagram. For example
if you have five or more nodes in the DC, then the Credentials DB Master, the two Credentials DB
slaves, the QoS DB Master, and the QoS DB slave will all be on different nodes.
Note For information about the exact location of services in your HyperStore system, log into the CMC
and go to the Cluster Information page (Cluster -> Cluster Config -> Cluster Information). The sys-
tem allows you to move services from one host to another, if you wish to do so. For instructions see
"Change Node Role Assignments" (page 259).
Note If you have a very large cluster (25 nodes or more in a data center), consult with Cloudian Sup-
port about whether you should add more Credentials DB slaves. For instructions on adding Credentials
DB slaves, see "Move or Add a Credentials DB Slave or QoS DB Slave" (page 259).
l The "COMMON SERVICES" (S3 Service, IAM Service, Admin Service, HyperStore Service, Metadata
DB, CMC, Configuration Agent, and Monitoring Agent) run on every node in your multi-DC system.
l Each data center has its own QoS DB slave and its own two Credentials DB slaves, for read per-
formance optimization.
l The Configuration Master backup is placed in a different DC than the Configuration Master primary; and
the same is true for the Cronjobs backup and primary.
Note The Redis Monitor backup must remain in the same data center as the Redis Monitor primary,
and this should be the same data center as where the Credentials DB master is located.
Note To check the current location of specialized services within your multi-DC HyperStore system, go
to the CMC's Cluster Information page (Cluster -> Cluster Config -> Cluster Information).
l The "COMMON SERVICES" (S3 Service, Admin Service, HyperStore Service, Metadata DB, CMC, Con-
figuration Agent, and Monitoring Agent) run on every node in your multi-region system. The IAM Service
runs on every node in the default region only.
l The whole multi-region system is served by a single active Configuration Master, a single Credentials
DB Master, and a single active Redis Monitor.
l Each region has its own QoS DB Master and its own active Cronjob host.
The diagram below illustrates how the system ensures high availability of these specialized services by sup-
porting failover of each service type, from the primary instance to the backup instance. For nearly all service
types, the system automatically detects a failure of the primary instance and automatically fails over to the
backup instance. The one exception is the Configuration Master role (for managing system configuration) — in
the case of the Configuration Master you can manually implement failover if there’s a problem with the
primary instance.
The diagram shows six nodes, but the principles are the same regardless of how many nodes you have: spe-
cialized services are dispersed across the cluster, and the backup instance of any given service is deployed on
a different node than the primary instance.
Note The automatic failover of the Cronjobs and Monitoring Data Collector roles from the primary to
the backup instance invokes the cloudianInstall.sh script and will fail if cloudianInstall.sh is already run-
ning. When you occasionally use cloudianInstall.sh for system configuration tasks, remember to exit the
installer when you are done — do not leave it running.
Also, the automatic failover of the Cronjobs and Monitoring Data Collector roles from the primary to the
backup instance will not occur until the primary instance has been down for 10 minutes.
Consider a 3X replication scenario where QUORUM has been used as the write consistency level (which is the
default configuration for replication storage policies). Suppose an S3 PUT of an updated version of an object
has succeeded even though only two of three object data replica writes and only two of three object metadata
replica writes succeeded. We then can temporarily have a condition like that shown in the following diagram,
where "T2" indicates the timestamp of the new version of the data and metadata and "T1" indicates the out-
dated version. (For example, perhaps node5 was momentarily offline when the S3 write request came in; and
now it’s back online but proactive repair has not yet completed.)
If an S3 read request on the object comes into the system during this temporary period of data inconsistency,
the system works as follows:
l As long as the read consistency level is set to at least QUORUM (the default for replication storage
policies), the system will read at least two of the metadata replicas. Consequently it will read at least
one of the fresh metadata replicas, with timestamp T2. If it reads one T1 metadata replica and one T2
metadata replica, it works with the metadata that has the freshest timestamp. The system then tries to
retrieve an object data replica that has this same fresh timestamp.
l If object data replicas with the fresh timestamp are available, that object data is returned to the S3 client.
If nodes are down in such a way that the only available object data replica is the outdated one, then the
system fails the S3 request.
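The quorum read behavior described above can be modeled roughly like this (the dictionary shapes are illustrative, not HyperStore's internal data structures):

```python
def quorum_read(metadata_replicas, data_replicas, read_cl=2):
    """Model of a QUORUM read: consult read_cl metadata replicas, take the
    freshest timestamp, then return an available data replica matching it."""
    metas = metadata_replicas[:read_cl]        # any quorum of metadata replicas
    freshest = max(m["ts"] for m in metas)     # e.g. T2 wins over T1
    for d in data_replicas:
        if d["available"] and d["ts"] == freshest:
            return d["obj"]
    raise IOError("only stale replicas reachable; the S3 request fails")
```

Because a QUORUM write guarantees at least two of three replicas carry the new timestamp, any quorum of metadata reads must include at least one fresh copy.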
Note HyperStore allows you to configure storage policies that use a read CL of ONE rather than
QUORUM. This non-default configuration maximizes read availability and speed, but also increases
the chances of returning a stale replica to the client. This is because -- if your write CL is QUORUM (the
default) and your read CL is ONE (non-default) -- there is a chance of reading a stale metadata replica
and returning a stale object replica to the client.
For more information on S3 write and read availability under various consistency level configurations, see
"Storage Policy Resilience to Downed Nodes" (page 99).
The first flow chart below illustrates the standard consistency level logic when only one consistency level (CL)
is used per operation type. The second flow chart illustrates the logic for a two-tier dynamic consistency level
configuration. Following the flow charts is a detailed text description of how the dynamic consistency level fea-
ture works, which includes discussion of what constitutes a "qualified endpoint".
l HyperStore Service has been marked as down by the S3 Service (see "hss.bring.back.wait, hss.timeout.*, and hss.fail.*" (page 489))
l Node has been put into Maintenance Mode by operator (see Start Maintenance Mode)
l Node is in a StopWrite condition due to all data disks being 90% full or more (relevant only for write
requests)
l Disk on which requested data resides is disabled (relevant only for read requests)
Then, dynamic consistency levels can come into play at two different phases of S3 PUT or GET processing:
l If the number of qualifying endpoints meets the requirements of either the primary CL or the fallback CL, the system proceeds with trying to write to (or read from in the case of GET processing) those
endpoints. If the number of qualifying endpoints does not meet the requirements of either the primary
CL or the fallback CL, the system does not try to write to (or read from) the endpoints and the S3 request
fails and an error is returned to the S3 client.
l When trying to write to (or read from) qualifying endpoint nodes, if the number of successful writes/reads meets the requirements of either the primary CL or the fallback CL, a success response is returned to the S3 client. If the number of successful writes/reads does not meet the requirements of either the primary CL or the fallback CL, the S3 request fails and an error is returned to the S3 client.
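The two checkpoints above can be expressed as a rough sketch (the data shapes and function names are illustrative, not HyperStore's implementation):

```python
def meets_cl(n, primary_cl, fallback_cl):
    """True if n satisfies either the primary or the fallback consistency level."""
    return n >= primary_cl or n >= fallback_cl

def handle_put(endpoints, primary_cl, fallback_cl, write):
    # Phase 1: are enough endpoints qualified to possibly satisfy a CL?
    qualified = [e for e in endpoints if e["healthy"]]
    if not meets_cl(len(qualified), primary_cl, fallback_cl):
        return "error"                      # fail without attempting any writes
    # Phase 2: did enough of the attempted writes succeed?
    successes = sum(1 for e in qualified if write(e))
    return "ok" if meets_cl(successes, primary_cl, fallback_cl) else "error"
```

For example, with a primary CL of ALL (3) and a fallback of QUORUM (2), a PUT can still succeed with one replica endpoint unavailable.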
S3 object placement and replication within a HyperStore cluster is based on a consistent hashing scheme that
utilizes an integer token space ranging from 0 to 2^127 - 1. Traditionally, in a storage cluster based on consistent
hashing, each physical node is assigned an integer token from the token space. A given node is then respons-
ible for a token range that extends from the next-lower token assigned to a different node (excluding the token
number itself), up to and including the given node's own token. Then, an integer hash value is calculated for
each S3 object as it is being uploaded to storage. The object is stored to the node responsible for the token
range in which the object’s hash value falls. Replication is implemented by also storing the object to the nodes
responsible for the next-higher token ranges.
Advancing beyond traditional consistent hash based storage, the HyperStore system utilizes and extends the
"virtual node" (vNode) functionality originally introduced in Cassandra version 1.2. This optimized design
assigns multiple tokens to each physical node. In essence, the storage cluster is composed of very many "vir-
tual nodes", with multiple virtual nodes residing on each physical node. Each virtual node is assigned its own
token and has its own token range for which it is responsible.
The HyperStore system goes a significant step further by assigning a different set of tokens (virtual nodes) to
each HyperStore data disk on each host. With this implementation, each data disk on a host is responsible for
a set of different token ranges and -- consequently -- a different inventory of object data. If a disk fails it affects
only the object data on that one disk. The other disks on the host can continue operating and supporting their
own data storage responsibilities.
The number of tokens that the system assigns to each host is based on the total combined storage capacity of
the host's HyperStore data disks. Specifically, the system determines the number of tokens to assign to a
host by taking the total number of terabytes of HyperStore data storage capacity on the host, multiplying
by .7, and then rounding off to the nearest integer. Further, the system applies a lower bound of one token per
HyperStore data disk and an upper bound of 512 tokens per host.
For example:
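As a sketch of the rule just described (the host capacities and disk counts below are hypothetical):

```python
def tokens_for_host(total_capacity_tb, num_data_disks):
    # Tokens = TB of HyperStore data capacity x 0.7, rounded to the nearest
    # integer; bounded below by one token per data disk, above by 512 per host.
    tokens = round(total_capacity_tb * 0.7)
    return max(num_data_disks, min(tokens, 512))
```

So a host with 100 TB across 10 disks gets 70 tokens, a small 2 TB host with 4 disks is lifted to the 4-token floor, and a very dense 1000 TB host is capped at 512.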
Note The data disk sizes in the table above are only for simple illustration of how the algorithm works.
In calculations for actual hosts, the system uses not the disk raw size (the decimal-based size stated by
the disk manufacturer) but rather the disk usable size after the disks are formatted and the file system is
mounted (the binary-based size the operating system reports if you run the lsblk command on the host).
On each host, the host's assigned tokens -- and associated token ranges -- are automatically allocated to each
HyperStore data disk in a manner such that storage capacity utilization should be approximately balanced
among the disks on a given host.
In the HyperStore File System mounted to each HyperStore data disk there are sub-directories that demarc-
ate each vNode's data.
For illustration of how vNodes work to guide the distribution of data across a cluster, consider a cluster of six
HyperStore hosts each of which has four disks designated for S3 object storage. Suppose that each physical
host is assigned 32 tokens. And suppose for illustration that there is a simplified token space ranging from 0 to
960, and the values of the 192 tokens in this system (six hosts times 32 tokens each) are 0, 5, 10, 15, 20, and
so on up through 955.
The diagram below shows one possible allocation of tokens across the cluster. Each host’s 32 tokens are
divided evenly across the four disks (eight tokens per disk), and that token assignment is randomized across
the cluster.
Now further suppose that you’ve configured your HyperStore system for 3X replication of S3 objects. And say
that an S3 object is uploaded to the system and the hashing algorithm applied to the unique <buck-
etname>/<objectname> combination gives us a hash value of 322 (for this simplified example; in reality the sys-
tem uses MD5 hashing). The diagram below shows how three instances or "replicas" of the object will be
stored across the cluster:
l With its object name hash value of 322, the "primary replica" of the object is stored on the vNode
responsible for the token range that includes the value 322. This is the vNode assigned token 325 (high-
lighted in red in the diagram below) -- this vNode has responsibility for a token range spanning from
320 (exclusive) up to 325 (inclusive). A simple way of identifying where the primary replica will go is
that it's the vNode with the lowest token that's higher than the object's hash value. Note that the
"primary replica" has no functional primacy compared to other replicas; it’s called that only because its
placement is based simply on identifying the disk that’s responsible for the token range into which the
object hash falls.
l The secondary replica is stored to the vNode that’s assigned the next-higher token (330, highlighted in
orange), which is located at hyperstore4:Disk2.
l The tertiary replica is stored to the vNode that’s assigned the next-higher token after that (335, in yel-
low), which is at hyperstore3:Disk3.
Working with the same cluster and simplified token space, we can next consider a second object replication
example that illustrates an important HyperStore vNode principle: no more than one of an object’s replicas will
be stored on the same physical host. Suppose that an S3 object is uploaded to the system and the object name
hash is 38. The next diagram shows how the object’s three replicas are placed:
l The primary replica is stored to the vNode that's assigned token 40 — at hyperstore1:Disk3 (red high-
light in the diagram below).
l The vNode with the next-higher token — 45 (with white label) — is on a different disk (Disk1) on the
same physical host as token 40, where the HyperStore system is placing the primary replica. Because
it’s on the same physical host, the system skips over the vNode with token 45 and places the object’s
secondary replica where the vNode with token 50 is — at hyperstore5:Disk3 (orange highlight).
l The tertiary replica is stored to the vNode with token 55, at hyperstore2:Disk1 (yellow highlight).
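The skip-over-same-host behavior can be sketched as follows. The token-to-location mapping in the usage example mirrors the hash-38 example above; the function name and code structure are illustrative, not HyperStore's actual implementation:

```python
import bisect

def place_replicas(vnode_map, obj_hash, num_replicas):
    """Walk the ring clockwise from the object's hash, skipping any vNode
    whose physical host already holds a replica of this object.

    vnode_map maps a vNode token to its (host, disk) location."""
    tokens = sorted(vnode_map)
    start = bisect.bisect_left(tokens, obj_hash)
    placements, used_hosts = [], set()
    for k in range(len(tokens)):
        token = tokens[(start + k) % len(tokens)]
        host, disk = vnode_map[token]
        if host in used_hosts:
            continue  # never store two replicas on one physical host
        placements.append((token, host, disk))
        used_hosts.add(host)
        if len(placements) == num_replicas:
            break
    return placements

# The vNodes relevant to the hash-38 example above.
ring = {40: ("hyperstore1", "Disk3"), 45: ("hyperstore1", "Disk1"),
        50: ("hyperstore5", "Disk3"), 55: ("hyperstore2", "Disk1")}
print(place_replicas(ring, 38, 3))
# [(40, 'hyperstore1', 'Disk3'), (50, 'hyperstore5', 'Disk3'),
#  (55, 'hyperstore2', 'Disk1')]
```

vNode 45 is passed over because it lives on hyperstore1, which already holds the primary replica.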
1.6. System Diagrams
Assuming the cluster is configured for 3X replication, the following will be stored on hyperstore2:Disk2:
As noted previously, when placing secondary and tertiary replicas the HyperStore system may in some cases
skip over tokens so as not to store more than one replica of an object on the same physical host. This
dynamic can give hyperstore2:Disk2 additional responsibilities as a possible endpoint for secondary or
tertiary replicas.
In the event that Disk 2 fails, Disks 1, 3, and 4 will continue fulfilling their storage responsibilities. Meanwhile,
objects that are on Disk 2 persist within the cluster because they’ve been replicated on other hosts. (Whether
those objects will still be readable by S3 clients will depend on how you have configured consistency level
requirements.)
Consider an example of a HyperStore deployment that spans two data centers — DC1 and DC2 — each of
which has three physical nodes. As in our previous examples, each physical node has four disks; each host is
assigned 32 tokens (vNodes); and we’re supposing a simplified token space that ranges from 0 to 960. In this
multi-DC scenario, the token space is divided into 192 tokens — 32 for each of the six physical hosts — which
are randomly distributed across the six hosts.
Suppose also that S3 object replication in this deployment is configured at two replicas in each data center.
We can then see how a hypothetical S3 object with a hash value of 942 would be replicated across the two
data centers:
l The first replica is stored to the vNode that's assigned token 945 (in red in the diagram below) — which
is located in DC2, on hyperstore5:Disk3.
l The second replica is stored to vNode 950 (orange) — DC2, hyperstore6:Disk4.
l The next-higher vNode (955, with high-contrast label) is in DC2, where we’ve already met the con-
figured replication level of two replicas — so we skip that vNode.
l The third replica is stored to vNode 0 (yellow) — DC1, hyperstore2:Disk3. Note that after the highest-
numbered token (955) the token "ring" circles around to the lowest token (0). (In a more realistic token
space there would be a token range spanning from the highest vNode token (exclusive) through the top
of the token space and around to the lowest vNode token (inclusive).)
l The next-higher vNode (5, high-contrast label) is in DC2, where we’ve already met the configured rep-
lication level — so we skip that vNode.
l The fourth and final replica is stored to vNode 10 (green) — DC1, hyperstore3:Disk3.
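The multi-DC walk above can be sketched in the same style as the earlier single-DC example. The tokens, data centers, and the disk locations for tokens 945, 950, 0, and 10 come from the example; the hosts attached to the skipped tokens 955 and 5 are invented for illustration, and the function name is hypothetical:

```python
import bisect

def place_multi_dc(vnode_map, obj_hash, replicas_per_dc):
    """Walk the ring clockwise from the object's hash, skipping a vNode if
    its data center has already met its configured replica count or its
    physical host already holds a replica.

    vnode_map maps a vNode token to its (dc, host, disk) location."""
    tokens = sorted(vnode_map)
    start = bisect.bisect_left(tokens, obj_hash)
    placements, used_hosts, dc_counts = [], set(), {}
    total = sum(replicas_per_dc.values())
    for k in range(len(tokens)):
        token = tokens[(start + k) % len(tokens)]
        dc, host, disk = vnode_map[token]
        if host in used_hosts or dc_counts.get(dc, 0) >= replicas_per_dc[dc]:
            continue  # DC quota already met, or host already holds a replica
        placements.append((token, dc, host, disk))
        used_hosts.add(host)
        dc_counts[dc] = dc_counts.get(dc, 0) + 1
        if len(placements) == total:
            break
    return placements

ring = {945: ("DC2", "hyperstore5", "Disk3"), 950: ("DC2", "hyperstore6", "Disk4"),
        955: ("DC2", "hyperstore4", "Disk1"), 0: ("DC1", "hyperstore2", "Disk3"),
        5: ("DC2", "hyperstore1", "Disk2"), 10: ("DC1", "hyperstore3", "Disk3")}
print([t for t, dc, host, disk in place_multi_dc(ring, 942, {"DC1": 2, "DC2": 2})])
# [945, 950, 0, 10]
```

Tokens 955 and 5 are skipped because DC2 has already met its two-replica quota, matching the walkthrough above.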
Multi-Region Deployments
If you deploy a HyperStore system across multiple service regions, each region has its own independent
storage cluster -- with each cluster having its own 0 to 2^127 - 1 token space, its own set of vNodes, and its own
independent inventory of stored S3 objects. There is no per-object replication across regions.
Cassandra Data
In a HyperStore system, Cassandra is used for storing object metadata and system metadata. In a typical
deployment, on each HyperStore node Cassandra data is stored on the same RAID-mirrored disks as the OS.
Cassandra data is not stored on the HyperStore data disks (the disks whose mount points are specified by the
configuration setting common.csv: hyperstore_data_directory).
When vNodes are assigned to a host machine, they are allocated across only the host’s HyperStore data
mount points. vNodes are not allocated to the mirrored disks on which Cassandra data is stored.
Within a cluster, metadata in Cassandra is replicated in accordance with your storage policies. Cassandra data
replication leverages vNodes in this manner:
l When a new Cassandra object -- a row key and its associated column values -- is created, the row key
is hashed and the hash (token) is used to associate the object with a particular vNode (the vNode
responsible for the token range that contains the Cassandra object's token). The system checks to see
which host machine that vNode is located on, and the "primary" replica of the Cassandra object is then
stored on the Cassandra disk(s) on that host.
l For example, suppose a host machine is assigned 96 vNodes, allocated across its multiple HyperStore
data disks. Cassandra objects whose hash values fall into the token ranges of any of those 96 vNodes
will get written to the Cassandra disk(s) on that host.
l Additional replicas of the Cassandra object (the number of replicas depends on your configuration set-
tings) are then associated with next-higher-up vNodes and stored to whichever hosts those vNodes are
located on — with the condition that if necessary vNodes will be "skipped" in order to ensure that each
replica of the Cassandra object is stored on a different host machine.
vNode Benefits
vNodes provide several advantages over conventional one-token-per-host schemes, including:
l Token assignment is performed automatically by the system — there is no need to manually assign
tokens when you first set up a storage cluster, or when you resize a cluster.
l For cluster operations that involve transferring data across nodes — such as data repair operations or
replacing a failed disk or host machine — the operations complete faster because data is transferred in
small ranges from a large number of other hosts.
l The allocation of different vNodes to each disk means that failure of a disk affects only a known portion
of the data on the host machine. Other vNodes assigned to the host’s other disks are not impacted.
l The allocation of different vNodes to each disk, coupled with storage policies for replication or eras-
ure coding, enable you to efficiently and safely store S3 object data without the overhead of RAID.
Chapter 2. Accessing HyperStore Interfaces and Tools
1. Point your browser to the CMC at https://<CMC_host>:8443/Cloudian. Since the CMC runs on all of your
HyperStore nodes, for <CMC_host> you can use the fully qualified domain name (FQDN) or IP address of
any node.
2. The first time you log in to the CMC you will get an SSL certificate warning (since the CMC's default cer-
tificate is a self-signed certificate). Follow the prompts to add an exception for the certificate. You should
then see the CMC’s login screen.
3. Enter the system administrator user ID admin and -- if this is your first time logging in -- the default pass-
word public. When you do so, the login screen will display additional fields in which you must create a
new password for the admin user. After you create the new password and click Save you will be logged
into the CMC.
Note The first time you log into the CMC the system requires you to create a new password for the
admin user. On subsequent logins to the CMC as the admin user, use the password that you created.
For security purposes you should periodically change the admin user password. You can do so
through the CMC's Security Credentials page (accessible from a drop-down menu under the login
user name in the upper right corner of the CMC interface).
Note For information about managing SSL certificates in HyperStore -- including the option to replace
the CMC's default self-signed certificate with a CA-signed certificate -- see "Setting Up Security and
Privacy Features" (page 104).
./cloudianInstall.sh
hspkg install
Also located in the installation staging directory is the HyperStore host set-up tool system_setup.sh. You (or a
Cloudian representative working with you) would have used this tool to prepare individual host machines for
the installation of your HyperStore cluster, and -- if you are expanding your cluster -- you also use this tool to
prepare new hosts to be added to the cluster. To use this tool, change into the installation staging directory
and then launch the tool as follows:
2.3. Accessing Configuration Files
./system_setup.sh
$ hspkg setup
Another useful tool is the HyperStore node and cluster management tool hsstool. You can launch this tool from
any directory on any HyperStore node as follows:
hsstool
For more information about using this tool -- including the option to run hsstool commands from the
CMC interface rather than from the command line -- see "hsstool" (page 309).
HyperStore also includes several special-purpose tools that Cloudian Support might have you use in certain
circumstances. These tools are located in the directory /opt/cloudian/tools on each node.
/etc/cloudian-7.5-puppet/
/var/log/cloudian
Each of these APIs has a service endpoint (URI) at which it can be accessed by whichever third-party client
applications you use. The service endpoints are specific to your environment and are derived from information
that you provided during HyperStore installation.
To access any of these APIs with a client application, your users need S3 security credentials (access key
and secret key). These credentials are created automatically as part of the HyperStore user provisioning pro-
cess. For an overview of user provisioning see "Group and User Provisioning Feature Overview" (page
190). Once you've provisioned users in the system they can obtain their S3 security credentials from the CMC's
Security Credentials page (via the drop-down menu under their user name in the upper right of the CMC).
Programmatically, a user's security credentials can be obtained through the Admin API call
GET /user/credentials/list/active (for detail see the "user" section of the Cloudian HyperStore Admin API Reference).
2.6.1. S3 API
The HyperStore S3 Service provides comprehensive support for the AWS Simple Storage Service API. Third
party S3 client applications can access this service at your HyperStore system's S3 service endpoint. To find
the S3 service endpoint for your system go to the CMC's Cluster Information page (Cluster -> Cluster Config
-> Cluster Information).
For more information on HyperStore support for the S3 API see the S3 API section of the Cloudian HyperStore
AWS APIs Support Reference.
Note The CMC has a built-in S3 client through which your HyperStore users can perform many S3
operations such as creating and configuring buckets and uploading and downloading objects.
2.6. Accessing HyperStore Implementations of AWS APIs
To find the IAM service endpoint for your system go to the CMC's Cluster Information page (Cluster ->
Cluster Config -> Cluster Information).
For more information on HyperStore support for the IAM API see the IAM API section of the Cloudian Hyper-
Store AWS APIs Support Reference.
Note The CMC has a built-in IAM client through which your HyperStore users can perform many IAM
operations such as creating IAM groups and users under their HyperStore root account.
l http://sts.<your-organization-domain>:16080
l https://sts.<your-organization-domain>:16443
For more information on HyperStore support for the STS API see the STS API section of the Cloudian Hyper-
Store AWS APIs Support Reference.
For more information on HyperStore support for the SQS API, and for information about enabling and con-
figuring the SQS Service, see the Reference -> AWS APIs -> SQS API section of the HyperStore Help.
Chapter 3. Setting Up Administration Features
The HSH is present on each HyperStore node but by default the HSH is disabled. For information on enabling
the HSH, and provisioning HSH users, see "Enabling the HSH and Managing HSH Users" (page 70). That
section also describes an option to disable root password access to HyperStore nodes, so that administration
of the nodes is performed exclusively by way of the HSH.
For information on the HyperStore commands and Linux OS commands supported by the HSH, see "Using the
HSH" (page 74).
Each HyperStore node has its own HSH log, recording HSH logins and commands run on that node. The log is
configured as append only, and HSH users cannot modify this configuration.
For more information on HSH logging, see "HyperStore Shell (HSH) Log" (page 551).
For information about Object Lock, see "Object Lock Feature Overview" (page 144).
For information about enabling HSH and disabling the root password, see "Enabling the HSH and Managing
HSH Users" (page 70).
See also:
Note The exception is that HyperStore appliances delivered for new deployments in certain industry
sectors with stringent security requirements may arrive on site with HSH already enabled (and the root
password already disabled). If you're unsure whether this applies to you, consult with your Cloudian
representative.
The HSH is now enabled in your system, but by default no users are able to log into the HSH and use it. To
provision users who can use the HSH, see "Adding HSH Users" (page 71) below.
3.1. HyperStore Shell (HSH)
When HyperStore creates an HSH user corresponding to a CMC System Admin user:
l The HSH user name is the CMC user name prefixed by "sa_". For example, for CMC user "admin"
the HSH login name is "sa_admin"; and for CMC user "tom" the HSH login name is "sa_tom".
l The HSH user login password is the same as the CMC user login password. The HSH user login
password cannot be managed separately from -- or differ from -- the corresponding CMC user login
password. If an HSH user wants to change their password they should change their CMC password,
and their HSH password will automatically change to match their new CMC password.
Note If you have LDAP authentication enabled for the System Admin group, then when Hyper-
Store creates an HSH user corresponding to a CMC System Admin user the HSH user name
will not have an "sa_" prefix (instead it will be the same as the CMC user name); and when the
user logs into the HSH they will use their LDAP credentials and the system will verify those cre-
dentials against your LDAP service. Note that if a local Unix user account with that same user
name already exists, a new HSH user account will not be created (the existing local user
account will not be overwritten). For more information on LDAP authentication see "LDAP Integ-
ration" (page 194) and especially the sub-section "LDAP Authentication of System Admin-
istrators and HSH Users" (page 196).
Once an HSH user has been created, that user can use SSH to log into any HyperStore node. Upon login, the
user's shell will be the HyperStore shell. The prompt will appear as follows:
<username>@<hostname>$
For example:
sa_admin@hyperstore1$
You can confirm that you are in the HyperStore shell by typing help:
sa_admin@hyperstore1$ help
HyperStore Shell
Version: 1.0.1-2, d6b3c8d46ecaaaa69b62af25c4e7d4270f8b4d7e(otp protocol: 1)
Commands:
...
For information on using the HyperStore shell see "Using the HSH" (page 74).
Note HyperStore does not create corresponding HSH users for CMC Group Admin level users (or for
regular users). Only System Admin level users are allowed HSH access.
While regular HSH users can run most of the commands that the HSH supports, Trusted HSH users can run all
of the commands that the HSH supports. Put differently, there are some commands that only Trusted HSH
users can run. For information about supported commands and which ones are available only to Trusted users,
see "Using the HSH" (page 74).
The HSH user sa_admin -- the HSH user corresponding to the CMC default System Admin user -- is auto-
matically a Trusted HSH user. By contrast, HSH users corresponding to additional System Admin users that
have been created in the CMC are by default regular HSH users who do not have the Trusted role.
As a Trusted user the sa_admin user can elevate one or more regular HSH users so that they too are Trusted:
1. Log into the Configuration Master node as the sa_admin user. Upon login you will be in the HyperStore
shell.
2. Run this command to add a regular HSH user to the Trusted role:
$ hspkg role -a <username> trusted
Be sure to specify the HSH user name (including the sa_ prefix), not the CMC user name.
For example:
Note The response shows the current list of users who have been explicitly granted the Trusted
role (which is just one user in the example above). The sa_admin user, who is automatically
Trusted, will not appear in this list.
3. If you want to elevate any other regular HSH users to the Trusted role at this time, run the command
from Step 2 again, with another HSH user name.
4. Apply the configuration change to the cluster. (Here you are again using the hsctl tool that was mentioned
previously).
$ hsctl config apply hsh
Once additional HSH users have been elevated to the Trusted role, then those Trusted users are also able to
elevate regular HSH users to the Trusted role. That is, any Trusted HSH user has the ability to elevate regular
HSH users to the Trusted role.
To see the current list of HSH users who have been explicitly granted the Trusted role, any Trusted user
can run this command:
Note again that the response will exclude the sa_admin user, who is automatically Trusted.
To remove an HSH user from the list of Trusted users, a Trusted user can run these commands:
This does not delete the HSH user; it only demotes them to being a regular HSH user rather than a Trusted
user.
Note If an HSH user that you elevate to the Trusted role (or remove from the Trusted role) is logged in
to the HSH when you make the change, the change in their role status will not take effect until after the
user logs out and then logs back in again.
If a System Admin user is made inactive and then subsequently made active again, that user must change
their password in the CMC or through the Admin API in order to trigger the system to recreate a
corresponding HSH account for the user.
The system does not support deleting or disabling the HSH account of a CMC System Admin user while their
CMC account remains active.
Note Disabling the root password is required if you want to use the Object Lock feature.
Also before you disable the root account password, it's important to be aware that:
l If you disable the root password, and then you subsequently want root password access to your Hyper-
Store nodes again, you will need to contact Cloudian Support for assistance. Once you disable root
password access to HyperStore nodes you cannot regain root password access without assistance
from Cloudian Support.
l Disabling the root password will prevent users from logging in to a HyperStore node as root using a
password but it will not prevent a root user from accessing a HyperStore node using an SSH key, if
in your environment SSH key access has been set up for the root user.
l Disabling the root password will have no effect on any users who were granted sudo privileges
before you disabled the root password. Such users (if any) will continue to have the privileges granted
to them by a node's /etc/sudoers configuration. To help secure your HyperStore nodes, Cloudian, Inc.
recommends that you manually remove the sudo privileges of all users, on all HyperStore nodes
(including yourself if you have sudo privileges). Also if you have granted elevated permissions to any
user or application through any other mechanism, remove those permissions.
1. As root, log into the Configuration Master and then change into the installation staging directory.
Note You must be logged in as root to disable the root password -- you cannot do it while
logged in as an HSH user.
2. Launch the installer: ./cloudianInstall.sh
3. In the installer main menu enter 4 for "Advanced Configuration Options", then at the next menu enter m
for "Disable the root password".
4. Follow the prompts to disable the root password.
After exiting the installer, log out from the node. Then try to log back in as root, using the root password -- the
login attempt should fail. Then log in as sa_admin or another HSH user, with that user's password. The login
should succeed, and you should be in the HyperStore shell.
If you disable root password access to your HyperStore nodes as described above, the only way to regain root
password access is to contact Cloudian Support for assistance.
<username>@<hostname>$
For example:
sa_admin@hyperstore1$
To confirm that you are in the HyperStore shell and to view a list of HSH commands, type help:
$ help
HyperStore Shell
Version: 1.0.1-2, d6b3c8d46ecaaaa69b62af25c4e7d4270f8b4d7e(otp protocol: 1)
Commands:
...
To view HSH command inline help type <command> --help (or <command> -h):
$ hslog --help
View protected log files under the /var/log directory.
Usage:
hslog [flags] FILENAME
Flags:
-h, --help help for hslog
To end your HSH session and disconnect from the HyperStore node, type exit.
$ exit
Note On each node an HSH user's home directory is /home/<username> -- for example /home/sa_
admin.
If you are logged into the HSH on a node and your SSH connection gets cut:
l Any long-running commands that were in-progress in that HSH session will continue to run
l You can log back into the HSH on that same node and you will be given the choice to reattach to the
existing session or to start a new session
You can also manually detach from a session by using the keystroke sequence ctrl-g d. Then when you log in
again you can choose whether to reattach to the existing session or to start a new session.
Note If an HSH user gets disconnected -- or intentionally detaches -- from a session, and then logs in
and starts a new session, the user can have multiple concurrent sessions on a node. The system sets a
limit of 9 concurrent sessions for a single HSH user on a node.
To end an existing session from which you are detached, reattach to it and then type exit on the command
line.
Note Do not precede these commands with a path -- for example, run hsstool as simply hsstool not as
./hsstool.
cloudian_systeminfo.sh
Purpose: Serves the same purpose as the "Collect Diagnostics" function in the CMC.
More information: CMC Cluster -> Nodes -> Advanced

GDPR_decryption.py
Purpose: Decrypt encrypted log field values in log file copies that have been uploaded to Cloudian Support.
More information: "Decrypting an Encrypted Log Field Value" (page 85)

elasticsearchSync
Purpose: Synchronize object metadata in your Elasticsearch cluster to object metadata currently in your
HyperStore cluster.
More information: "Enabling Elasticsearch Integration for Metadata Search" (page 157)

hsctl (requires Trusted role)
Purpose: hsctl is a new HyperStore node management tool that is of limited use in the current HyperStore
release but will be more prominent in future releases.
More information: "Setting Up Security and Privacy Features" (page 104); "FIPS Support" (page 125);
"HTTP(S) Basic Authentication for Admin API Access" (section 11.1.6)

hslog
Purpose: View a HyperStore log file.
More information: In shell, hslog --help. Or for more detailed information see "Using the HSH to View Logs"
(page 566)

hspkg
Purpose: Manage HyperStore installation and configuration. Sub-commands include those listed below.
More information: In shell, hspkg --help

l hspkg setup (requires Trusted role)
Purpose: Launch the HyperStore host setup tool (system_setup.sh).
More information: In shell, hspkg setup --help. See also "Preparing Your Nodes For HyperStore Installation"
in the Cloudian HyperStore Installation Guide

l hspkg install (requires Trusted role)
Purpose: Launch the HyperStore installer (cloudianInstall.sh).
More information: In shell, hspkg install --help. See also "Upgrading Your HyperStore Software Version"
(page 203), "Installer Advanced Configuration Options" (page 407), and "Pushing Configuration File Edits
to the Cluster and Restarting Services" (page 411)

Note When launching the installer from the HyperStore Shell, the installer's "Uninstall Cloudian HyperStore"
menu option is disabled.

l hspkg config (requires Trusted role)
Purpose: View or edit a HyperStore configuration file.
More information: In shell, hspkg config --help. Or for more detailed information see "Using the HSH to
Manage Configuration Files" (page 416)

l hspkg role (requires Trusted role)
Purpose: Manage HSH "Trusted" role membership.
More information: In shell, hspkg role --help. Or for more detailed information see "Elevating a Regular HSH
User to the "Trusted" Role" (page 72)

l hspkg version
Purpose: Check the HyperStore software version.

l hspkg verify
Purpose: Verify that a file is cryptographically signed by Cloudian (such as a HyperStore release package file).
More information: In shell, hspkg verify --help

hsstool
Purpose: The HSH supports running any hsstool command, such as hsstool repair, hsstool cleanup, and so on.
More information: "hsstool" (page 309)

install_appliance_license.sh (requires Trusted role)
Purpose: Install a new license for a HyperStore Appliance.
More information: In shell, install_appliance_license.sh --help

jetty_password.sh
Purpose: Create a Jetty-obfuscated password, in connection with modifying the HTTP/S Basic Authentication
password for the Admin Service.
More information: Introduction section of the Cloudian HyperStore Admin API Reference

nodetool
Purpose: Cassandra's native node and cluster management tool. Typically you should use hsstool instead.

rebrand_cmc.sh (requires Trusted role)
Purpose: Copy image and resource files to or from the Puppet configuration directory, in connection with
rebranding the CMC interface.
More information: "Rebranding the CMC UI" (page 183)

reset_expired_password (requires Trusted role)
Purpose: This can be used if working with your Cloudian representative to set up a newly delivered
HyperStore appliance.
The following commands pertain only to HyperStore appliances and would typically be used only on instruc-
tions from Cloudian Support. All of these commands require the Trusted role.
l slot_disk_map_1500.sh
l slot_disk_map_4000.sh
l slot_disk_map_hsx.sh
l slot_disk_map_x3650.sh
Note For security reasons, for some commands the HyperStore shell forbids certain arguments / sub-
commands. If you try to run a command with one of its forbidden arguments the shell returns a mes-
sage indicating that the argument is not permitted.
blkid, cat, cd, chmod, cp, curl, hostname, iostat, ip, ipmitool (requires Trusted role), keytool,
kill (requires Trusted role), ps, pwd, rm, rmdir, scp, sftp

For example, using sftp within the HSH to transfer files to and from your home directory:
# sftp [email protected]
[email protected]'s password:
Connected to 127.0.0.1.
sftp> put survey.csv
Uploading survey.csv to /home/sa_admin/survey.csv
survey.csv 100% 28 6.3KB/s 00:00
sftp> get survey.csv
Fetching /home/sa_admin/survey.csv to survey.csv
/home/sa_admin/survey.csv 100% 28 13.1KB/s 00:00
sftp> put abc /etc/passwd
Uploading abc to /etc/passwd
remote open("/etc/passwd"): Permission denied
3.2. Smart Support
Note For the list of files that the HSH has permission to access, run this command:
For more information see "Using the HSH to Manage Configuration Files" (page 416).
l PATH is fixed for each command; you cannot specify a path when running a command.
l sudo is not allowed. Instead, commands that need root privilege (such as systemctl) are automatically
given that privilege inside the HSH. (The HSH accomplishes this by careful management of the effect-
ive UID, which is only elevated to root when running a command as root is required.)
l Entering more than one command on a line is not allowed. Characters such as ";", "&", and "|" are pro-
cessed as literal characters, not as special characters.
l Input and output redirection are not allowed.
l Job control commands such as ctrl-z are not allowed.
l When listing files, the use of the wildcard character "*" is not allowed.
l Tab completion for command names is supported but tab completion for file names is not.
HyperStore includes two automated features that help you collaborate with Cloudian Support to keep your sys-
tem running smoothly: Smart Support and on-demand Node Diagnostics.
IMPORTANT! Cloudian's Smart Support feature is not a comprehensive remote monitoring program.
Cloudian Support will notify you only in the case of critical issues, and not in real time. Primary respons-
ibility for monitoring and managing your HyperStore system lies with you, the customer.
Specifically, the Smart Support mechanism provides Cloudian Support the following system and node inform-
ation:
This information is collected daily on one of your nodes (the node identified as the "System
Monitoring/Cronjob Primary Host" in the CMC's Cluster Information page [Cluster -> Cluster Config -> Cluster
Information]). Under /var/log/cloudian on the node there are these files:
l diagnostics.csv -- This is the live file into which the current day's performance statistics are continuously
written.
l diagnostics_<date/time>_<version>_<region>.tgz -- Once a day the current diagnostics.csv file is pack-
aged -- together with application and transaction log files -- into a diagnostics_<date/time>_<version>_
<region>.tgz file. This file is transmitted to Cloudian Support once each day. By default a local copy of
each diagnostics_<date/time>_<version>_<region>.tgz file remains on the node for 15 days before
being automatically deleted.
The Smart Support feature is enabled by default. You have the option of disabling the feature, although this is
not recommended.
As part of the Smart Support feature, if a data disk on a HyperStore node fails (becomes disabled), information
about the failed disk is automatically sent to Cloudian Support within minutes. This triggers the automatic open-
ing of a Support case for the failed disk. This occurs in addition to the alerting functions that bring the failed disk
to your attention.
For HyperStore Appliances, automatic case creation is also performed for failed OS disks.
For more information on automated disk failure handling see "Automated Disk Management Feature Over-
view" (page 281).
l System log files and HyperStore log files for the target node(s)
l Outputs from various system commands and HyperStore application commands for the target node(s)
l MBean data for the Java-based HyperStore services for the target node(s)
l Configuration files from the target node(s)
l Puppet and Salt configuration files and logs from the system
On the target node(s), under the directory /var/log/cloudian/cloudian_sysinfo, the Node Diagnostics mech-
anism packages all this data into a file named <hostname>_<YYYYMMDDhhmm>.tar.gz. Optionally you can
have the system also automatically send a copy of the package(s) to Cloudian Support.
Note The system creates the /var/log/cloudian/cloudian_sysinfo directory on a node the first time you
use the Collect Diagnostics feature for that node.
See also:
l Have the daily Smart Support upload go to an S3 destination other than Cloudian Support, by setting
the phonehome_uri, phonehome_bucket, phonehome_access_key, and phonehome_secret_key set-
tings in common.csv. For more information see "phonehome_uri" (page 430) and the subsequent set-
ting descriptions.
l Have the daily Smart Support upload use a local forward proxy by setting the phonehome_proxy_host,
phonehome_proxy_port, phonehome_proxy_username, and phonehome_proxy_password settings in
common.csv. For more information see "phonehome_proxy_host" (page 429) and the subsequent set-
ting descriptions.
l Disable the daily Smart Support upload by setting phonehome.enabled in mts.properties.erb to false.
This is not recommended.
l Have on-demand Node Diagnostics uploads (triggered by your using the CMC's Collect Diagnostics
function [Cluster -> Nodes -> Advanced]) go to an S3 destination other than Cloudian Support, by set-
ting the sysinfo.uri, sysinfo.bucket, sysinfo.accessKey, and sysinfo.secretKey properties in mts.-
properties.erb. For more information see "sysinfo.uri" (page 486) and the subsequent property
descriptions.
Note Be sure to do a Puppet push and restart the S3 Service if you edit any of these configuration set-
tings. For instructions see "Pushing Configuration File Edits to the Cluster and Restarting Services"
(page 411).
When enabled, the daily diagnostics upload to an S3 URI is triggered by a HyperStore cron job. The timing of
the cron job is configured in /etc/cloudian-<version>-puppet/modules/cloudians3/templates/cloudian-
crontab.erb on the Configuration Master node. If you want to edit the cron job configuration, look for the job that
includes the string "phoneHome". If you edit the crontab, do a Puppet push. No service restart is necessary.
The deletion of old diagnostics packages is managed by Puppet, as configured in common.csv by the
"cleanup_directories_byage_withmatch_timelimit" (page 421) setting (for daily system diagnostics packages
associated with the Smart Support feature) and the "cleanup_sysinfo_logs_timelimit" (page 421) setting (for
on-demand Node Diagnostics packages that you've generated). By default these settings have Puppet delete
the diagnostics packages after they are 15 days old. This presumes that you have left the Puppet daemons
running in your HyperStore cluster, which is the default behavior. If you do not leave the Puppet daemons
running, the diagnostics logs will not be automatically deleted. In that case you should delete the old
packages manually, since otherwise they will eventually consume a good deal of storage space.
If you edit either of these settings in common.csv, do a Puppet push. No service restart is necessary.
3.2.2.1.3. Encrypting Sensitive Fields in Uploaded Log File Copies to Protect Data Privacy
If you need to comply with the European Union's General Data Protection Regulation (GDPR), or if it's in keep-
ing with your own organization's data privacy policies, you can have your HyperStore system encrypt per-
sonally identifiable user information and other sensitive values -- such as user IDs, email addresses, IP
addresses, hostnames, and role session information -- in the log copies that get uploaded to Cloudian Support
as part of the Smart Support and Node Diagnostics features. This feature uses an encryption key that is unique
to your HyperStore system and is generated automatically when you do a fresh installation of -- or upgrade to --
HyperStore version 7.4 or newer. Cloudian does not have access to the encryption key, so Cloudian cannot
decrypt the encrypted log field values. Only you can decrypt an encrypted log field value, using a tool that
comes with your HyperStore system. Your ability to decrypt an encrypted log field value will come into play if
Cloudian Support is analyzing logs while working with you to troubleshoot an issue, and Cloudian Support
asks you to provide the decrypted value of a particular field from a relevant log entry.
Note This feature encrypts sensitive log field values only in the log file copies that are uploaded to
Cloudian Support. It does not alter in any way the original log files.
This feature is disabled by default. To enable this feature, so that user IDs, email addresses, IP addresses, host-
names, and role session information are encrypted in the log file copies that are uploaded to Cloudian Support
as part of the Smart Support and Node Diagnostics features:
If you enable the feature that encrypts sensitive fields in log copies uploaded to Cloudian Support (as
described above), and the Cloudian Support team is analyzing log copies while working with you to
troubleshoot a system issue, the Cloudian Support team may ask you to decrypt an encrypted value from a rel-
evant log entry.
Here is an example of a log file entry with an encrypted value. An encrypted field appears as "Enc:<value>". In
the example below the encrypted value is shown in bold:
2021-09-03 00:00:05,503|Enc:<3Wi5Nacd9lMJgNQJ:kk2krSUIcNWiHRlY>|healthCheck|115|0|0|0|115
|95|200|977d68ed-1f8c-1fa3-a7d9-a81e84492d6f|0|0|||Apache-HttpClient/4.5.9 (Java/11.0.
In this example, if Cloudian Support needed you to decrypt the value to aid in resolving a support case, Cloud-
ian Support would provide you the encrypted string 3Wi5Nacd9lMJgNQJ:kk2krSUIcNWiHRlY through the Sup-
port case.
You would then log into your Configuration Master node and run this command to decrypt the encrypted string:
# /opt/cloudian/tools/GDPR_decryption.py encrypted-string
# /opt/cloudian/tools/GDPR_decryption.py 3Wi5Nacd9lMJgNQJ:kk2krSUIcNWiHRlY
10.50.40.125
You would then provide Cloudian Support with the decrypted string value -- 10.50.40.125 in this example.
Note The GDPR_decryption.py tool uses your system's unique encryption key to decrypt the string.
At Cloudian Support's direction, you may be asked to use some of these support tunnel related commands on
a HyperStore node:
l Enable the support tunnel service: systemctl enable support-tunnel
l Start the support tunnel service: systemctl start support-tunnel
l Check the support tunnel service status: systemctl status support-tunnel (this status check is one method of
obtaining a support "token" that Cloudian Support may transmit to you in some circumstances)
l Stop the support tunnel service: systemctl stop support-tunnel
Also, in some circumstances Cloudian Support may direct you to edit this support tunnel configuration file on a
HyperStore node:
/etc/cloudian/support-tunnel/support-tunnel.yaml
If you are using the HyperStore Shell as a trusted user you can perform any of the above tasks, at the direction
of Cloudian Support (you can run the systemctl commands and you can use hspkg config commands to view
or edit the support-tunnel.yaml file).
3.3. Setting Up Alerts and Notifications
By default, alerts are communicated to you only by appearing in the CMC's Alerts page. However, it is
strongly recommended that you supply a system administrator email address to which HyperStore will
send notification emails when alerts are triggered. You can supply the email information (system administrator
address and other required information such as your SMTP server's FQDN) in the CMC's Configuration
Settings page. For more information, while logged into the CMC's Configuration Settings page click Help. Once
you've supplied the system with the information required to send emails to a system administrator, the
pre-configured alert rules will include sending email to the administrator's address when an alert is triggered.
If you wish you can also have HyperStore send SNMP traps when alerts are triggered. To do so you need to
take two steps:
1. Supply the SNMP destination information in the CMC's Configuration Settings page. For more
information, while logged into the CMC's Configuration Settings page click Help.
2. In the CMC's Alert Rules page, modify each of the pre-configured alert rules so that they include send-
ing an SNMP trap. (Or modify a subset of the alert rules if you want SNMP traps sent for some alerts and
not for others.) The pre-configured alert rules do not include sending an SNMP trap by default.
While logged into the CMC as the admin user you can use the CMC's Manage Users page (Users & Groups -
> Manage Users) to create additional system admin users if you wish (with different user names). When cre-
ating a new user in that page you can select the user type from a drop-down list, and "System Admin" is one of
the options. Once created, a new system admin user will have the same capabilities as the admin user: they
will be able to log into the CMC and perform system monitoring and administration tasks, including user pro-
visioning (and including the ability to create additional system admin users).
System admin users can also delete other system admin users, with the exception that the pre-provisioned sys-
tem admin user named admin cannot be deleted.
If you have the HyperStore Shell enabled, each system admin user will be able to log into and use the Hyper-
Store Shell. For more information see "Enabling the HSH and Managing HSH Users" (page 70).
Note You can also create and manage system admin users through the Admin API, similar to creating
and managing other types of users. For more information see the "user" section of the Cloudian Hyper-
Store Admin API Reference.
IMPORTANT ! The central logging server must not be one of your HyperStore nodes.
Note rsyslog is included in the HyperStore Appliance and also in standard RHEL/CentOS dis-
tributions. This procedure has been tested using rsyslog v5.8.10.
To aggregate HyperStore application logs, request logs, and system logs to a central logging server follow the
instructions below.
# BEFORE EDITING
#$ModLoad imudp
#$UDPServerRun 514
# AFTER EDITING
$ModLoad imudp
$UDPServerRun 514
b) Still on the central logging server, create a file /etc/rsyslog.d/cloudian.conf and enter these con-
figuration lines in the file:
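The rule lines themselves are Cloudian-specific. Purely as an illustrative sketch -- not the official Cloudian-supplied rules -- rules that route incoming RFC5424 messages into per-application files matching the /var/log/cloudian-*.log pattern (which the logrotate example further below assumes) might look like this; the template name and the startswith match are assumptions for illustration:

```
# Illustrative sketch only (not the official Cloudian-supplied rules):
# route incoming messages into per-application files based on the
# RFC5424 APP-NAME field (e.g. S3APP, S3REQ).
$template CloudianLogFile,"/var/log/cloudian-%app-name%.log"
if $app-name startswith 'S3' then ?CloudianLogFile
& ~
```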
c) Still on the central logging server, restart rsyslog by "service rsyslog restart".
d) Still on the central logging server, enable rotation on the centralized HyperStore logs. For example, to
use logrotate for rotating the HyperStore logs, create a file /etc/logrotate.d/cloudian and enter the
following configuration lines in the file. (Optionally adjust the rotated file retention scheme — 14 rota-
tions before deletion in the example below — to match your retention policy.)
/var/log/cloudian-*.log
{
daily
rotate 14
create
missingok
compress
delaycompress
sharedscripts
postrotate
/bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
endscript
}
o log4j-admin.xml.erb
o log4j-hyperstore.xml.erb
o log4j-s3.xml.erb
In each of these three files search for and uncomment all sections marked with a "#syslog" tag.
When you are done uncommenting, there should be no remaining "#syslog" tags.
Also, in the first #syslog section at the top of each file — nested within the "Properties" block — in addition
to uncommenting the #syslog section, set the sysloghost property to the hostname or IP address of your
central syslog server and set the syslogport property to the central syslog server's UDP port (by default,
port 514).
Here is a before and after example for uncommenting and editing the #syslog section within a "Properties"
block. In the BEFORE EDITING text, the commenting boundaries (which need to be removed) are the
"<!-- #syslog" and "-->" lines. In the AFTER EDITING text, the commenting boundaries have been removed and
the central logging server hostname has been set to "regulus".
# BEFORE EDITING
<Properties>
<!-- #syslog
<Property name="sysloghost">localhost</Property>
<Property name="syslogport">514</Property>
-->
</Properties>
# AFTER EDITING
<Properties>
<Property name="sysloghost">regulus</Property>
<Property name="syslogport">514</Property>
</Properties>
Be sure to uncomment all of the "#syslog" tagged sections in each of the three files. Remember that
only the #syslog section within the "Properties" block at the top of each file requires editing an attribute
value. The rest of the #syslog sections only require uncommenting.
In this example of a #syslog section in log4j-s3.xml.erb you would remove the commenting boundaries
(the "<!-- #syslog" and "-->" lines).
<!-- #syslog
<Syslog name="SYSLOG-S3APP" format="RFC5424" host="${sysloghost}" port="${syslogport}"
protocol="UDP" appName="S3APP" mdcId="mdc" includeMDC="true" facility="USER" newLine="true">
</Syslog>
<Syslog name="SYSLOG-S3REQ" format="RFC5424" host="${sysloghost}" port="${syslogport}"
protocol="UDP" appName="S3REQ" mdcId="mdc" includeMDC="true" facility="USER" newLine="true">
</Syslog>
<Syslog name="SYSLOG-S3WORM" format="RFC5424" host="${sysloghost}" port="${syslogport}"
protocol="UDP" appName="s3-worm" mdcId="mdc" includeMDC="true" facility="USER" newLine="true">
</Syslog>
-->
Here is a second before and after example of a line that needs uncommenting and editing:
# BEFORE EDITING
#*.* @<loghost>:514
# AFTER EDITING
*.* @regulus:514
IMPORTANT ! The central log host must not be one of your HyperStore hosts. (The reason is that if
you have one of your HyperStore hosts acting as the central log host, then that host is sending logs to
itself which results in a loop and rapid proliferation of log messages.)
c) Still on your HyperStore Configuration Master node, use the installer to push your changes to the
cluster and to restart the HyperStore Service and the S3 Service. For instructions see "Pushing Con-
figuration File Edits to the Cluster and Restarting Services" (page 411).
b) Confirm that system messages from HyperStore nodes appear on the log host (for example in
/var/log/messages). If you want, you can proactively test this by running the command "logger test 1" on any
HyperStore node.
Chapter 4. Setting Up Service Features
Storage policies are ways of protecting data so that it’s durable and highly available to users. The HyperStore
system lets you pre-configure one or more storage policies. When users create a new storage bucket, they can
then choose which pre-configured storage policy to use to protect the data in that bucket. Users cannot create
buckets until you have created at least one storage policy.
For each storage policy that you create you can choose either of two data protection methods:
l Replication — With replication, a configurable number of copies of each data object are maintained in
the system, and each copy is stored on a different node. For example, with 3X replication 3 copies of
each object are stored, with each copy on a different node.
l Erasure coding — With erasure coding, each object is encoded into a configurable number (known as
the "k" value) of data fragments plus a configurable number (the "m" value) of redundant parity frag-
ments. Each of an object’s "k" plus "m" fragments is unique, and each fragment is stored on a different
node. The object can be decoded from any "k" number of fragments. To put it another way: the object
remains readable even if "m" number of nodes are unavailable. For example, in a 4+2 erasure coding
configuration (4 data fragments plus 2 parity fragments), each object is encoded into a total of 6 unique
fragments which are stored on 6 different nodes, and the object can be decoded and read so long as
any 4 of those 6 fragments are available.
In general, erasure coding requires less storage overhead -- the amount of storage consumption above and
beyond the original size of the stored objects, in order to ensure data persistence and availability -- than rep-
lication. Put differently, erasure coding is more efficient in utilizing raw storage capacity than is replication.
For example, while 3X replication incurs a 200% storage overhead, 4+2 erasure coding incurs only a 50% stor-
age overhead. Or stated in terms of storage capacity utilization efficiency, 3X replication is 33% efficient (for
instance with 12TB of available storage capacity you can store 4TB of net object data) whereas 4+2 erasure
coding is 67% efficient (with 12TB of available storage capacity you can store 8TB of net object data). On the
other hand, erasure coding results in somewhat longer request processing latency than replication, due to the
need for encoding/decoding.
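The overhead and efficiency figures above follow directly from the replica and fragment counts. As a quick sketch of the arithmetic (the function names are illustrative, not part of HyperStore):

```python
def replication_overhead_pct(n_replicas):
    # Extra storage beyond the original object size: 3X replication
    # stores 2 extra copies, i.e. 200% overhead.
    return (n_replicas - 1) * 100

def ec_overhead_pct(k, m):
    # Parity fragments add m/k of the object size: 4+2 EC -> 50% overhead.
    return m / k * 100

def replication_efficiency_pct(n_replicas):
    # Net object data stored per unit of raw capacity: 3X -> ~33%.
    return 100 / n_replicas

def ec_efficiency_pct(k, m):
    # k of every k+m stored fragments are data: 4+2 -> ~67%.
    return k / (k + m) * 100

print(replication_overhead_pct(3))           # 200
print(ec_overhead_pct(4, 2))                 # 50.0
print(round(replication_efficiency_pct(3)))  # 33
print(round(ec_efficiency_pct(4, 2)))        # 67
```

With 12TB of raw capacity this gives roughly 12 x 33% = 4TB of net object data under 3X replication versus 12 x 67% = 8TB under 4+2 erasure coding, matching the figures above.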
In light of its benefits and drawbacks, erasure coding is best suited to long-term storage of large objects that
are infrequently accessed.
Regardless of whether you use replication or erasure coding, if your HyperStore system spans multiple data
centers, for each storage policy you can also choose how data is allocated across your data centers — for
example, you could have a storage policy that for each S3 object stores 3 replicas of the object in each of your
data centers; and a second storage policy that erasure codes objects and stores them in just one particular
data center (for more information see "Multi-Data Center Storage Policies" (page 92)).
Also as part of the configuration options for each storage policy, you can choose whether to compress and/or
encrypt stored objects.
Individual storage policies are not confined to dedicated nodes or disks. Instead, all policies utilize all the
resources of your cluster, and data stored in association with a particular policy will tend to be spread fairly
evenly across the cluster (with the exception that you can limit a policy to a particular data center as noted
above). This helps to ensure that regardless of how many or what types of storage policies you configure, and
regardless of how much data is stored in association with particular policies, the physical resources of your
entire cluster — disks, CPU, RAM — will be used in an approximately even manner.
l 4+2
l 6+2
l 8+2
l 9+3
l 12+4
The choice among these supported EC configurations is largely a matter of how many HyperStore nodes you
have in the data center. For example, compared to a 4+2 configuration, 6+2 EC provides the same degree of
data availability assurance (objects can be read even if 2 of the involved nodes are unavailable), while deliv-
ering a higher level of storage efficiency (4+2 is 67% efficient whereas 6+2 is 75% efficient). So 6+2 may be
preferable to 4+2 if you have at least 8 HyperStore nodes in the data center.
Likewise, 9+3 EC provides a higher degree of protection and availability than 6+2 EC (since with 9+3 EC,
objects can be read even if 3 of the involved nodes are unavailable) while delivering the same level of storage
efficiency (both 6+2 and 9+3 are 75% efficient). So 9+3 may be preferable to 6+2 if you have at least 12 Hyper-
Store nodes in the data center.
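The trade-offs described above can be tabulated for the supported configurations. This sketch (the helper name is illustrative, not a HyperStore tool) computes, for each k+m option, the minimum node count, the number of unavailable nodes an object can survive, and the storage efficiency:

```python
def ec_summary(k, m):
    # Each object is encoded into k data + m parity fragments stored on
    # k+m different nodes; it stays readable with up to m nodes unavailable.
    return {
        "nodes_required": k + m,
        "node_failures_tolerated": m,
        "storage_efficiency_pct": round(k / (k + m) * 100),
    }

for k, m in [(4, 2), (6, 2), (8, 2), (9, 3), (12, 4)]:
    s = ec_summary(k, m)
    print(f"{k}+{m}: needs {s['nodes_required']} nodes, tolerates "
          f"{s['node_failures_tolerated']} down, {s['storage_efficiency_pct']}% efficient")
```

The output reflects the comparisons in the text: 6+2 needs 8 nodes and is 75% efficient versus 67% for 4+2, and 9+3 needs 12 nodes, tolerating 3 unavailable nodes at the same 75% efficiency.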
Note If you want to use a k+m configuration other than those mentioned above, contact Cloudian Sup-
port or your Cloudian Sales representative to see whether your desired configuration can be sup-
ported.
Note For detailed information on S3 write and read availability under various combinations of cluster
size and storage policy configuration, see "Storage Policy Resilience to Downed Nodes" (page 99).
For storage policies that use replication only, in a multiple data center environment you can choose how
many replicas to store in each data center -- for example, for each object store 3 replicas in DC1 and 2 replicas
in DC2.
4.1. Creating Storage Policies
For erasure coding storage policies, you have the option of replicating the k+m fragments in each of the par-
ticipating DCs (so that each participating DC stores k+m fragments), or distributing the k+m fragments across
the participating DCs (so that there are a combined total of k+m fragments across the participating DCs).
l 4+2
l 6+2
l 8+2
l 9+3
l 12+4
In each of the above configurations the k+m fragments can be replicated across multiple DCs.
For distributed erasure coding, the supported options depend on how many data centers you are using in the
storage policy. You must use at least 3 DCs for this type of policy, and by default your k+m options are as indic-
ated in the table below:
Note If you want to use a k+m configuration other than those mentioned above, contact Cloudian Sup-
port or your Cloudian Sales representative to see whether your desired configuration can be sup-
ported.
Note For any type of storage policy in a multiple data center environment, you have the option of con-
figuring the policy such that data is stored in some of your data centers and not others — for example,
you can create a policy that stores data in DC1 and DC2 but not in DC3. Note, however, that DC3 may
be involved in processing S3 requests associated with buckets that use this policy. By default there is
only one S3 service endpoint per region, and incoming S3 requests may resolve to any DC within the
region. If the S3 Service in DC3 receives an S3 PUT request in association with a policy that stores
data only in DC1 and DC2, it will transmit the uploaded object on to DC1 and DC2 (it will not be stored in
DC3). Likewise, if DC3 receives an S3 GET request in association with a policy that stores data only in
DC1 and DC2, then DC3’s S3 Service will get the object from DC1 or DC2 and pass it on to the client. If
you want more absolute barriers so that for example DC3 never touches DC2’s data and vice-versa,
you need to set up your system so those DCs are in different service regions.
See also:
Below is the list of consistency levels supported by the HyperStore system. Your consistency level options
when configuring a storage policy will be limited by the data distribution scheme (replication or erasure coding,
single DC or multi-DC) that you have selected for that policy.
l ALL
l QUORUM
l EACH QUORUM
l LOCAL QUORUM
l ANY QUORUM
l ONE
For descriptions of these consistency settings, while on the CMC's Storage Policies page (Cluster -> Storage
Policies) click Help.
For detailed information on S3 write and read availability under various combinations of cluster size, data dis-
tribution scheme, and consistency level settings, see "Storage Policy Resilience to Downed Nodes" (page
99).
Note In the case of writes, if the consistency requirement is met by something less than completing
writes of all replicas (or all erasure coded fragments), then after returning a success response to the cli-
ent the system continues to try to complete the remaining writes. If any of these writes fail they will later
be recreated by automatic data repair.
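As a rough sketch of how a quorum-based consistency requirement is evaluated for a replicated object, assuming the conventional majority definition of quorum (floor(N/2)+1) -- the exact per-level evaluation rules are described in the CMC's Storage Policies Help, and this simplified model covers only ALL, QUORUM, and ONE for a replication policy:

```python
def quorum(n):
    # Conventional majority quorum: more than half of n replicas.
    return n // 2 + 1

def write_meets_cl(total_replicas, replicas_written, cl):
    # Sketch of checking a write against a consistency level (CL)
    # for a replication storage policy (not erasure coding).
    if cl == "ALL":
        return replicas_written == total_replicas
    if cl == "QUORUM":
        return replicas_written >= quorum(total_replicas)
    if cl == "ONE":
        return replicas_written >= 1
    raise ValueError(f"unhandled consistency level: {cl}")

# 3X replication with one replica endpoint down: QUORUM passes, ALL fails.
print(write_meets_cl(3, 2, "QUORUM"))  # True
print(write_meets_cl(3, 2, "ALL"))     # False
```

This also illustrates the Note above: a QUORUM write can return success with one replica still unwritten, and the system then completes the remaining replica in the background (or automatic data repair recreates it later).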
Note As an advanced option you can also configure "dynamic" consistency levels, whereby the system
will try to achieve a "fallback" consistency level if the primary consistency level cannot be achieved. For
more information see "Dynamic Consistency Levels" (page 52).
If the read consistency requirements are met for an S3 GET operation -- for reading the required number of
object metadata replicas (in Cassandra) and the digests for the required number of object data replicas -- the
system then retrieves just one object data replica file in order to return the object data to the S3 client. For
example to meet a read consistency requirement of ALL, the system must be able to read all the object's
metadata replicas in Cassandra, and all the object's data replicas' file digests in RocksDB -- and then it
retrieves one object data replica and returns it to the client.
Metadata for objects is stored in two different types of record in the Metadata DB: object-level records (with one
such record for each object) and bucket-level records that identify the objects in a bucket (along with some
metadata for each of those objects). Both types of object metadata are replicated to the same degree. So for
example, in a 3X replication storage policy, for each object the object-level metadata record is replicated three
times in the cluster and for each bucket the bucket-level object metadata records are replicated three times in
the cluster.
A GetObject request requires reading the object's object-level metadata record and a List Objects request
requires reading the bucket's bucket-level object metadata records. Whatever read consistency requirements
you set for a storage policy apply not only to reads of individual objects but also to reads of bucket content lists.
So for example if you use a QUORUM read consistency requirement, then in order to successfully execute a
List Objects request the system must be able to read a QUORUM of the bucket-level object metadata records
for the bucket.
HyperStore object metadata is stored in Cassandra, and is protected by replication. The degree to which
object metadata is replicated depends on the type of storage policy being used.
Specifically, for erasure coding storage policies the system will store 2m-1 replicas of object metadata. For
example, with a 4+2 erasure coding storage policy, the object metadata is protected by 3X replication.
Both types of object metadata are replicated to the same degree. So for example, in a 3X replication storage
policy, for each object the "skinny row" object metadata is replicated three times in the cluster and the "wide
row" object metadata is replicated three times in the cluster.
The skinny row metadata has the same key format as the object data (bucketname/objectname) and so has the
same hash token and will be written to the same nodes as the object data -- or a subset of those nodes in the
case of an erasure coding storage policy. The wide row metadata has a different key format and hash token
and so may be written to different nodes than the object data if the cluster exceeds the minimum size required
for the storage policy.
Within Cassandra, both types of object metadata record are part of the UserData_<policyid> keyspaces.
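A small sketch of the 2m-1 rule described above (the function name is illustrative, not part of HyperStore):

```python
def ec_metadata_replicas(m):
    # For an erasure coding policy with m parity fragments, object
    # metadata is replicated 2m-1 times (per the rule above).
    return 2 * m - 1

# 4+2 EC (m=2) -> object metadata protected by 3X replication.
print(ec_metadata_replicas(2))  # 3
# 9+3 EC (m=3) -> object metadata protected by 5X replication.
print(ec_metadata_replicas(3))  # 5
```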
The table below shows examples of what the automatic system metadata replication level would be in different
system configuration scenarios.
Example 1:
l Single data center
l System metadata replication level configured during install = 2
l Just one storage policy created in system: a 3X replication policy
Resulting system metadata replication level: 3 (the highest replication level is yielded by matching the level
of the 3X replication policy)
Example 2:
l Single data center
l System metadata replication level configured during install = 5
l Just one storage policy created in system: a 3X replication policy
Resulting system metadata replication level: 5 (the highest replication level is yielded by using the level
configured by the operator during system installation)
lication policy at 2X per DC; and a 5+4 distributed erasure coding policy
The general logic behind this automated adjustment of system metadata replication level is that the greater
the resilience of your configured storage policies -- in terms of ability to read and write object data and
object metadata when a node or nodes are unavailable -- the greater will be the resilience built into the sys-
tem metadata storage configuration.
For details, while on the CMC's Storage Policies page (Cluster -> Storage Policies) click Help.
At all times you must have one and only one default storage policy defined in each of your HyperStore service
regions. The default policy is the one that will be applied when users create new buckets without specifying a
policy.
Note The system supports a configurable maximum number of storage policies (mts.properties:
"cloudian.protection.policy.max" (page 488), default = 25). After you have created this many storage
policies, you cannot create additional new policies until you either delete unused policies or increase
the configurable maximum.
Creating or modifying storage policies through the Admin API is not supported. However, you can use the API
to retrieve a list of buckets that use each storage policy, with the GET /bppolicy/bucketsperpolicy method. For
more information see the "bppolicy" section of the Cloudian HyperStore Admin API Reference.
IMPORTANT ! After a bucket is created, it cannot be assigned a different storage policy. The storage
policy assigned to the bucket at bucket creation time will continue to be the bucket’s storage policy for the
life of the bucket.
If a user does not explicitly select a policy when creating a new bucket, the system’s current default storage
policy is automatically applied to the bucket.
l hsstool whereis
l The storage policy applied to the objects -- particularly the data distribution scheme (such as 3X rep-
lication or 4+2 erasure coding) and the configured consistency level requirements.
l The number of nodes in the cluster.
l The number of nodes that are down.
The tables that follow below indicate HyperStore S3 write and read availability for common single-DC storage
policy configurations, in scenarios where either one or two nodes are down. For simplicity the tables refer to
nodes as being "down", but the same logic applies if nodes are unavailable for other reasons such as being
inaccessible on the network, or in a stop-write condition, or in maintenance mode.
S3 Operation Type | Configured Consistency Level (CL) | Number of Nodes Down | 3 Nodes in Cluster | 4 Nodes in Cluster | 5 or More Nodes in Cluster
Writes | ALL | 1 | All writes fail | Writes succeed for some objects and fail for others | Writes succeed for some objects and fail for others
Writes | ALL | 2 | All writes fail | All writes fail | Writes succeed for some objects and fail for others
Writes | QUORUM (default) | 1 | All writes succeed | All writes succeed | All writes succeed
Writes | QUORUM (default) | 2 | All writes fail | Writes succeed for some objects and fail for others | Writes succeed for some objects and fail for others
Reads | ALL | 1 | All reads fail | Reads succeed for some objects and fail for others | Reads succeed for some objects and fail for others
Reads | ALL | 2 | All reads fail | All reads fail | Reads succeed for some objects and fail for others
Reads | QUORUM (default) | 1 | All reads succeed | All reads succeed | All reads succeed
Reads | QUORUM (default) | 2 | All reads fail | Reads succeed for some objects and fail for others | Reads succeed for some objects and fail for others
Reads | ONE | 1 or 2 | All reads succeed | All reads succeed | All reads succeed
S3 Operation Type | Configured Consistency Level (CL) | Number of Nodes Down | k+2 Nodes in Cluster | k+3 Nodes in Cluster | k+4 or More Nodes in Cluster
Writes | ALL | 1 | All writes fail | Writes succeed for some objects and fail for others | Writes succeed for some objects and fail for others
Writes | ALL | 2 | All writes fail | All writes fail | Writes succeed for some objects and fail for others
Writes | QUORUM (default) | 1 | All writes succeed | All writes succeed | All writes succeed
Writes | QUORUM (default) | 2 | All writes fail | Writes succeed for some objects and fail for others | Writes succeed for some objects and fail for others
Reads | ALL | 1 or 2 | Reads succeed for some objects and fail for others | Reads succeed for some objects and fail for others | Reads succeed for some objects and fail for others
Reads | QUORUM (default) | 1 | All reads succeed | All reads succeed | All reads succeed
Reads | QUORUM (default) | 2 | Reads succeed for some objects and fail for others | Reads succeed for some objects and fail for others | Reads succeed for some objects and fail for others
S3 Operation Type | Configured Consistency Level (CL) | Number of Nodes Down | k+3 Nodes in Cluster | k+4 Nodes in Cluster | k+5 or More Nodes in Cluster
Writes | ALL | 1 | All writes fail | Writes succeed for some objects and fail for others | Writes succeed for some objects and fail for others
Writes | ALL | 2 | All writes fail | All writes fail | Writes succeed for some objects and fail for others
Writes | QUORUM (default) | 1 or 2 | All writes succeed | All writes succeed | All writes succeed
Reads | ALL | 1 or 2 | Reads succeed for some objects and fail for others | Reads succeed for some objects and fail for others | Reads succeed for some objects and fail for others
Reads | QUORUM (default) | 1 or 2 | All reads succeed | All reads succeed | All reads succeed
S3 Operation Type | Configured Consistency Level (CL) | Number of Nodes Down | k+4 Nodes in Cluster | k+5 Nodes in Cluster | k+6 or More Nodes in Cluster
Writes | ALL | 1 | All writes fail | Writes succeed for some objects and fail for others | Writes succeed for some objects and fail for others
Writes | ALL | 2 | All writes fail | All writes fail | Writes succeed for some objects and fail for others
Writes | QUORUM (default) | 1 or 2 | All writes succeed | All writes succeed | All writes succeed
Reads | ALL | 1 or 2 | Reads succeed for some objects and fail for others | Reads succeed for some objects and fail for others | Reads succeed for some objects and fail for others
Reads | QUORUM (default) | 1 or 2 | All reads succeed | All reads succeed | All reads succeed
Also, for each object larger than 10MB -- or for Multipart Uploads, for each object part larger than 10MB -- the HyperStore system breaks the object or part into multiple "chunks" of 10MB or smaller (for more detail on this configurable feature see System Settings). Each chunk is assigned its own hash value and is separately replicated or erasure coded within the cluster.
For S3 write and read availability for large objects, within HyperStore the write or read of each part and/or chunk must satisfy the object data consistency level requirement in order for the S3 object upload or object read operation as a whole to succeed. (The metadata requirements are not impacted by object size since the metadata records are for the object as a whole and not for each part or chunk.)
In terms of the Result categories in the tables above, the consequences of an object being large are as follows:
l "All writes/reads succeed" -- No effect. Just as writes or reads of all individual small objects succeed, so too do writes or reads of all large object parts and chunks.
l "All writes/reads fail" -- No effect. Large objects fail just as small ones do.
l "Writes/Reads succeed for some objects and fail for others." -- Here, the object data consistency criteria that determine which objects succeed (as displayed when you click the Result text in the tables) must be met by each part and/or chunk in order for the S3 object upload or object read operation as a whole to succeed. Consequently, in these scenarios, the write or read of a large object with multiple parts and/or chunks has a greater chance of failing than the write or read of a small object.
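The chunking arithmetic can be sketched as follows (a simple illustration, not HyperStore code; it assumes the default 10MB chunk size described above):

```shell
# Each object (or multipart part) larger than 10MB is split into chunks of
# at most 10MB; the S3 operation as a whole succeeds only if every chunk
# individually meets the consistency level requirement.
CHUNK=10485760   # 10 MiB

chunk_count() {
  size=$1
  echo $(( (size + CHUNK - 1) / CHUNK ))   # ceiling division
}

chunk_count 5242880    # 5 MiB object: 1 chunk
chunk_count 26214400   # 25 MiB object: 3 chunks
```

The more chunks an object has, the more individual writes or reads must all satisfy the consistency level, which is why large objects are more likely to fail in the partial-failure scenarios.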
For a given object, the per-object metadata record has the same key format as the object data (bucketname/objectname) and therefore is assigned the same hash token as the object data and will be written to the same endpoint nodes as the object data -- or a subset of those nodes, in the case of erasure coding storage policies. By contrast, the per-bucket object metadata record has a different key format and therefore a different hash token, and so may be written to different endpoint nodes than the object data. The per-bucket object metadata record is replicated to the same degree as the per-object metadata record -- for example, three times in a 3X replication storage policy -- and must meet the same configured write consistency requirements.
To limit the complexity of the tables above, in the Result descriptions the references to object metadata refer only to the per-object metadata record. In terms of the Result categories in the tables, the consequences of the system's need to also write per-bucket object metadata -- potentially to different endpoint nodes than the object data and per-object metadata -- are as follows:
l "All writes succeed" -- No effect. In these scenarios writes succeed for the per-bucket object metadata also.
l "All writes fail" -- No effect. In these scenarios S3 writes fail regardless of the per-bucket object metadata considerations.
l "Writes succeed for some objects and fail for others." -- In these scenarios, writes succeed for objects for which the consistency requirement can be met for the object data, the per-object metadata, and the per-bucket object metadata. Writes fail for objects for which the consistency requirement cannot be met for any one of those three. In such down-node scenarios where writes succeed for some objects and fail for others, whether the write of a given object succeeds or fails is determined not only by the hash token assigned to the object's data and per-object metadata record but also by the hash token assigned to the per-bucket object metadata record.
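The key-format point can be illustrated with any hash function (a sketch only -- the hash shown is not HyperStore's actual token function, and the per-bucket key prefix is hypothetical):

```shell
# Identical keys always hash to the same token, so object data and
# per-object metadata (same bucketname/objectname key) land on the same
# endpoint nodes; a record with a different key format gets a different
# token and may land elsewhere.
token() { printf '%s' "$1" | md5sum | cut -c1-8; }

token "mybucket/photo.jpg"        # object data
token "mybucket/photo.jpg"        # per-object metadata: identical token
token "PBM:mybucket/photo.jpg"    # different key format: different token
```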
Note Per-bucket object metadata is not used for S3 object reads and has no impact on any of the
read availability scenarios in the tables above. Only the per-object metadata record is relevant to
object reads.
Redis DB Access
The HyperStore system's S3 Service needs information from the Redis Credentials database and the Redis QoS database in order to process S3 write and read requests. In most cases -- particularly in larger clusters -- 1 or 2 nodes being down in your system will not impact the availability of these databases (which are implemented across multiple nodes in master-slave relationships). If problems within your system were to lead to either of these databases being completely offline -- such as if all the nodes running Redis Credentials are down, or all the nodes running Redis QoS are down -- the S3 layer can use cached Redis data for a while. But if the cached data expires and a Redis database is still completely offline, then S3 requests will start to fail.
Note HTTPS for the S3 Service is enabled by default only in fresh installations of HyperStore version 7.4 or newer. For systems that were originally installed as a version older than 7.4, and then subsequently upgraded, HTTPS for the S3 Service is disabled by default (the upgrade does not automatically enable it). If your original HyperStore installation was older than version 7.4 and you have not yet enabled HTTPS for the S3 Service, you can enable it by generating a self-signed certificate for the S3 Service or importing a CA-signed certificate for the S3 Service. Either of those operations automatically enables HTTPS for the S3 Service if it is not already enabled.
HyperStore provides tools that simplify several SSL certificate management tasks that you may wish to perform for the CMC, Admin, IAM, and/or S3 services:
l Generate a new self-signed certificate (see "Generating a New Self-Signed Certificate for a Service" (page 105))
l Generate a Certificate Signing Request (CSR) to submit to a Certificate Authority (CA) (see "Generating a Certificate Signing Request (CSR)" (page 107))
l Import a CA-signed certificate and associated intermediate and root certificates (see "Importing a CA-Signed Certificate" (page 110))
You may also wish to block regular HTTP access to these HyperStore services, so that only HTTPS access is allowed. For information on this task see "Disabling Regular HTTP Access to HyperStore Services" (page 113).
Because HyperStore services sometimes make outbound HTTPS connections -- when implementing cross-region replication or auto-tiering, for example -- you may also wish to import one or more root Certificate Authorities' certificates to the truststore used by HyperStore services, so that HyperStore services making outbound HTTPS connections can trust certificates signed by those CAs. This is necessary only for private root CAs, since the major public CAs are already in the truststore by default.
Adding a private root CA's certificate to HyperStore's truststore is also necessary if you import a CA-signed certificate to the S3 Service, the Admin Service, or the IAM Service, and the root CA in the certificate chain is a private CA. This is because the CMC acts as a client to those services.
For information on adding a root CA's certificate to the truststore, see "Setting Up Security and Privacy Features" (page 104).
Note
* HyperStore HTTPS listeners support TLS v1.2 and TLS v1.3 and will not accept client connections
that use TLS versions older than 1.2.
* The Admin Service, along with supporting HTTPS, also supports HTTP(S) Basic Authentication. For
more information -- including how to find or change the default password for authentication -- see the
Introduction section in the Cloudian HyperStore Admin API Reference.
* In the current HyperStore release, the Simple Queue Service (SQS) does not support HTTPS. Only
regular HTTP access is supported for SQS.
Note This task is supported by the HyperStore installer, and the procedure below describes how to use the installer to perform this task. If you prefer, you can use the HyperStore command line tool hsctl to perform this task rather than the installer. Using hsctl for this task provides you additional options not available through the installer: the option to generate self-signed certificates for all HTTPS-enabled services in one operation; the option to assign a non-default lifespan to the certificate(s) (the default lifespan is one year); and for the S3 Service in a multi-region deployment, the option to generate different certificates for each region. However, if you use hsctl to generate a self-signed certificate, afterwards you still must use the installer to push the change out to the cluster and restart the affected service (Steps 5 and 6 in the procedure below). For more information about using hsctl to generate self-signed certificates, log into the Configuration Master node and then on the terminal command line enter hsctl cert self-signed --help
Follow the steps below to use the HyperStore installer to generate a new self-signed certificate for an HTTPS-enabled HyperStore service. This operation will overwrite the certificate that the service is currently using.
1. Log into the Configuration Master node and change into /opt/cloudian-staging/7.5 (the installation staging directory). Then launch the installer:
# ./cloudianInstall.sh
Or, from the HyperStore Shell (HSH):
$ hspkg install
Once launched, the installer's menu options (such as referenced in the steps below) are the same regardless of whether it was launched from the HSH command line or the OS command line.
2. At the installer's main menu enter 4 for "Advanced Configuration Options". Then in the Advanced Configuration Options menu enter e for "Configure SSL for Admin, CMC, S3, and IAM/STS services". This takes you to the "SSL Certificate Management" menu.
3. From the SSL Certificate Management menu select the service for which you want to generate a new self-signed certificate. This takes you to the "<Service Name> SSL Certificate Management" menu. The example below is for the S3 Service, but regardless of which service you are working with you will see the same menu options.
4. Enter a for "Generate a new Self-Signed Certificate". Then at the prompt, confirm that you want to generate a new self-signed certificate for this service. The installer will then generate the certificate.
5. Navigate back to the installer's main menu, and then enter 2 for "Cluster Management". Then from the Cluster Management menu enter b for "Push Configuration Settings to Cluster", and follow the prompts to push to the cluster.
6. Navigate back to the Cluster Management menu, then enter c for "Manage Services". Then from the "Service Management" menu, restart the service for which you generated a new self-signed certificate.
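For reference, generating a self-signed certificate by hand with OpenSSL looks roughly like this (an illustration of what such a certificate contains, not a substitute for the installer or hsctl, which also install the certificate for the service; the endpoint name is an example):

```shell
# Create a self-signed certificate and key valid for one year (the same
# default lifespan noted above):
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -keyout s3.key -out s3.crt \
    -subj "/CN=s3-region1.mycloudianhyperstore.com"

# Inspect the subject and validity dates of the result:
openssl x509 -in s3.crt -noout -subject -dates
```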
Note This task is not supported by the HyperStore installer. You must use the HyperStore command
line tool hsctl to perform this task. The hsctl tool can be run by the root user or by a HyperStore Shell
user who has the Trusted role.
To generate a certificate signing request (CSR) for one or more HyperStore services: Log into the Configuration Master node, change into a working directory, and run the hsctl command described below. This will generate one or more CSR / private key pairs in the working directory in which you run the command.
l Generates a separate CSR and corresponding private key for each of the HTTPS-enabled HyperStore services -- the S3 service, the CMC service, the Admin service, and the IAM service. If you have multiple service regions in your system, then for the S3 service a separate CSR (and private key) will be generated for each service region.
l Uses the service endpoints configured in the HyperStore system, including a wildcarded endpoint for the S3 service in each region.
l Uses Cloudian and generic organization identifiers rather than identifiers for your organization.
You can modify this behavior by using any of the command options described below:
--combined
By default, separate CSRs will be generated for each of the HTTPS-enabled HyperStore services. If you prefer
to generate a single combined CSR for all of the HTTPS-enabled services, use the --combined option (and
omit the --service option described below).
Note This is a valid option only if the CA that you are using supports the use of the subject alternative
name (SAN) field. A combined CSR for HyperStore services will list multiple service endpoints in the
SAN field.
--service
Use the --service option if you want to generate a CSR for just one HyperStore service. Supported values are:
l s3
l cmc
l admin
l iam
If you do not use the --service option, a CSR will be generated for each of the services listed above. (Or, if you omit the --service option and use the --combined option, a single combined CSR will be generated for all of the services listed above.)
--domains
By default, CSRs for HyperStore services will be created using the service domains that are configured in the system. Use the --domains option if you want to explicitly specify the domain(s) to use in the CSR for a service.
You can check to see the default domains that will be used in CSRs by running the command hsctl cert list-domains. For example:
iam:
iam.mycloudianhyperstore.com
admin:
s3-admin.mycloudianhyperstore.com
s3-region1:
s3-region1.mycloudianhyperstore.com
*.s3-region1.mycloudianhyperstore.com
Note that by default, the CSR for the S3 service will place all endpoints in the Subject Alternative Name (SAN) field and will include wildcarded versions of the explicit endpoints. In the example above, the wildcarded endpoint is *.s3-region1.mycloudianhyperstore.com. The wildcarded S3 endpoint is needed if your S3 service is going to support Virtual Host Style Access by S3 clients -- an access method in which the bucket name is used as a sub-domain of the S3 endpoint. This is the S3 access method that AWS recommends.
The most likely use case for the --domains option is if you intend to submit the S3 service CSR to a CA that does not support SAN and/or does not support wildcards in endpoints. In that case you could generate a CSR for the S3 service like this, for example:
This command would generate an S3 service CSR in which there is only one endpoint -- s3-region1.mycloudianhyperstore.com -- and no wildcarding.
Note that the domain that you specify must match one of the domains configured for the service in the system (one of the domains returned by hsctl cert list-domains).
IMPORTANT ! If the certificate for your S3 service does not include a wildcarded endpoint, then S3 client applications using your service will have to use Path-Based Access (in which the bucket name is part of the URI path) rather than Virtual Host Style Access.
Note If you wish you can use the --domains option to specify multiple domains with comma separation (--domains=<domain1>,<domain2>,...). You might use this approach, for example, if your CA supports SAN but not wildcarding. If you list multiple domains, the first-listed domain will be placed in the Common Name (CN) field of the CSR and any additional domains will be placed in the SAN field.
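For comparison, a CSR whose SAN field lists both the explicit endpoint and its wildcarded version can be produced directly with OpenSSL (1.1.1 or newer for -addext; domain names are examples -- hsctl generates the equivalent for you):

```shell
# Generate a CSR with the explicit S3 endpoint as the CN and both the
# explicit and wildcarded endpoints in the SAN field (the wildcard is
# what enables Virtual Host Style Access):
openssl req -new -newkey rsa:2048 -nodes \
    -keyout s3.key -out s3.csr \
    -subj "/CN=s3-region1.mycloudianhyperstore.com" \
    -addext "subjectAltName=DNS:s3-region1.mycloudianhyperstore.com,DNS:*.s3-region1.mycloudianhyperstore.com"

# Confirm the SAN entries before submitting the CSR to a CA:
openssl req -in s3.csr -noout -text | grep -A1 "Subject Alternative Name"
```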
Note This task is supported by the HyperStore installer, and the procedure below describes how to use the installer to perform this task. If you prefer, you can use the HyperStore command line tool hsctl to perform this task rather than the installer. Using hsctl for this task provides you additional options not available through the installer: the option to import CA-signed certificates for all HTTPS-enabled services in one operation; and for the S3 Service in a multi-region deployment, the option to import different CA-signed certificates for each region. However, if you use hsctl to import a CA-signed certificate, afterwards you still must use the installer to push the change out to the cluster and restart the affected service (Steps 6 and 7 in the procedure below). For more information about using hsctl to import CA-signed certificates, log into the Configuration Master node and then on the terminal command line enter hsctl cert import --help
Follow the steps below to use the HyperStore installer to import a certificate signed by a public or private CA, for use by an HTTPS-enabled HyperStore service.
1. Log into the Configuration Master node, and in either /opt/cloudian-staging/7.5 (the installation staging directory) or a working directory, place the file(s) from which you are going to import a CA-signed certificate:
l The file(s) must be in PEM format or PKCS12 format (with filename extension .p12 or .pfx). No other formats are supported.
l The entire trust chain must be present in the file(s): the final CA-signed certificate, any intermediate certificates, and the CA root certificate. These may all be present in a single file, or may be divided into multiple files.
l The private key from which the CSR was generated must be present, either as a dedicated key file (such as a .key.pem file) or within one of the other files.
l If you are using the HyperStore Shell (HSH), you will not be able to save files into /opt/cloudian-staging/7.5. Instead, save the files into a working directory under your home directory.
2. Still logged into the Configuration Master node, change into /opt/cloudian-staging/7.5. Then launch the installer:
# ./cloudianInstall.sh
Or, from the HyperStore Shell (HSH):
$ hspkg install
Once launched, the installer's menu options (such as referenced in the steps below) are the same regardless of whether it was launched from the HSH command line or the OS command line.
3. At the installer's main menu enter 4 for "Advanced Configuration Options". Then in the Advanced Configuration Options menu enter e for "Configure SSL for Admin, CMC, S3, and IAM/STS services". This takes you to the "SSL Certificate Management" menu.
4. From the SSL Certificate Management menu select the service for which you want to import a CA-signed certificate. This takes you to the "<Service Name> SSL Certificate Management" menu. The example below is for the S3 Service, but regardless of which service you are working with you will see the same menu options.
5. Enter b for "Import CA-Signed Certificates (in PEM or PKCS12 format)". You will then be prompted to enter filenames:
l If the file(s) are in any directory other than /opt/cloudian-staging/7.5, include the full path to the file(s).
l At each successive "Enter filename" prompt, type one filename (including path if applicable) then press Enter.
l When there are no more files to enter, at the "Enter filename" prompt simply press Enter without typing a filename. This will trigger the import of the CA-signed certificate.
l If any of the files that you specified are password-protected you will be prompted to supply the password.
6. After the import operation completes, navigate back to the installer's main menu, and then enter 2 for "Cluster Management". Then from the Cluster Management menu enter b for "Push Configuration Settings to Cluster", and follow the prompts to push to the cluster.
7. Navigate back to the Cluster Management menu, then enter c for "Manage Services". Then from the "Service Management" menu, restart the service for which you imported a CA-signed certificate (for example, restart the S3 Service if you imported a CA-signed certificate for the S3 Service to use).
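To illustrate what "the entire trust chain" means, the following creates a throwaway private root CA, signs a server certificate with it, and verifies the result -- the same relationship that must hold among the file(s) you import (all names here are examples):

```shell
# A throwaway root CA (self-signed):
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -keyout ca.key -out ca.crt -subj "/CN=Example Private Root CA"

# A server key and CSR, then a certificate signed by the CA:
openssl req -new -newkey rsa:2048 -nodes \
    -keyout s3.key -out s3.csr -subj "/CN=s3-region1.mycloudianhyperstore.com"
openssl x509 -req -in s3.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
    -days 365 -out s3.crt

# The signed certificate must verify against the root (and any
# intermediates) that accompany it in the imported file(s):
openssl verify -CAfile ca.crt s3.crt
```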
Note This task is not supported by the HyperStore installer. You must use the HyperStore command
line tool hsctl to perform this task. The hsctl tool can be run by the root user or by a HyperStore Shell
user who has the Trusted role.
In some contexts HyperStore services may act as clients to other services that present SSL certificates -- for example, when implementing cross-region replication or auto-tiering, or when the CMC connects to an LDAP server.
By default HyperStore's truststore includes all the major public root CAs. If HyperStore services are connecting to services that are using a private CA as their root CA, you can add that CA's certificate to HyperStore's truststore so that HyperStore services trust certificates signed by that CA.
1. Log into the Configuration Master node, and copy the private root CA's certificate file into a working directory.
2. Change into the working directory and run this hsctl command:
# hsctl cert ca trust add <CA certificate filename>
This operation will import any PEM formatted private CA root certificates found inside the input file.
4. Change into /opt/cloudian-staging/7.5 (the installation staging directory) and launch the installer:
# ./cloudianInstall.sh
Or, from the HyperStore Shell (HSH):
$ hspkg install
Once launched, the installer's menu options (such as referenced in the step below) are the same regardless of whether it was launched from the HSH command line or the OS command line.
5. In the installer main menu, enter 2 for "Cluster Management". Then from the Cluster Management menu enter c for "Manage Services", and restart the services that you want to trust the private root CA that you added. Typically this would be the S3 Service (such as in the case of cross-region replication) and/or the CMC Service (such as if the CMC is accessing an LDAP server when authenticating users).
l Configure a firewall to block S3 HTTP access (port 80), CMC HTTP access (port 8888), IAM HTTP access (port 16080), and Admin Service HTTP access (port 18081). If you are using the HyperStore firewall, see "Customizing the HyperStore Firewall" (page 584) for instructions. (If you have not yet enabled the HyperStore firewall, and want to do so, see "HyperStore Firewall" (page 581).)
l Set the HyperStore system configuration so that the regular HTTP listeners are disabled for the Admin Service, for the IAM Service, and for the CMC. On the Configuration Master node, in the configuration file common.csv, there are settings that enable or disable the regular HTTP listeners for the Admin Service, for the IAM Service, and for the CMC:
o For the Admin Service, set admin_secure to true if it is not already set to true (it defaults to true if your original HyperStore install was version 6.0.2 or newer, and defaults to false if your original HyperStore install was older than 6.0.2)
o For the IAM Service, set iam_secure to true (it defaults to false)
o For the CMC, set cmc_web_secure to true if it is not already set to true (it defaults to true)
Note The S3 Service does not have a system configuration setting for disabling the regular HTTP listener. Use a firewall to block regular HTTP access to the S3 Service, as stated above.
After making any edits to common.csv, launch the installer and use the "Cluster Management" menu to first push the configuration changes out to the cluster and then go to the "Manage Services" submenu to restart each of the services for which you changed the configuration. (To restart the Admin Service, restart the S3 Service -- this has the effect of restarting the Admin Service also.) Then exit the installer.
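For illustration, after these edits the relevant entries in common.csv would read along these lines (only the three settings named above are shown; the file's surrounding content is omitted):

```
admin_secure,true
iam_secure,true
cmc_web_secure,true
```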
The HyperStore system supports server-side encryption (SSE) to protect data confidentiality. Several different
methods of server-side encryption are supported:
The selection of whether to use server-side encryption, and which method to use, can be made at the object level (as specified by headers in the object upload request), or the bucket level (so that a default encryption method is applied to all objects uploaded to the bucket), or the storage policy level (so that a default encryption method is applied to all objects uploaded to any bucket that uses the storage policy). When the system is processing a given object upload, the precedence ordering among these different configuration levels is as follows:
1. If a server-side encryption method is specified in the object upload request, the system uses that method. If not, then...
2. If a default server-side encryption method is specified in the configuration of the bucket to which the object is being uploaded, the system uses that method. If not, then...
3. If a default server-side encryption method is specified in the configuration of the storage policy used by the bucket to which the object is being uploaded, the system uses that method.
If no encryption method is specified in the object upload request, the configuration of the bucket into which the object is being uploaded, or the configuration of the storage policy used by the bucket, then no server-side encryption is applied to that object.
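The precedence ordering amounts to a simple first-match-wins check, sketched here (an illustration only, not HyperStore code):

```shell
# Resolve the effective SSE method for an upload: the request header wins,
# then the bucket default, then the storage policy default, else none.
resolve_sse() {
  request="$1"; bucket_default="$2"; policy_default="$3"
  if [ -n "$request" ]; then echo "$request"
  elif [ -n "$bucket_default" ]; then echo "$bucket_default"
  elif [ -n "$policy_default" ]; then echo "$policy_default"
  else echo "none"
  fi
}

resolve_sse "" "AES256" ""   # request omits SSE, so the bucket default applies
resolve_sse "" "" ""         # nothing specified at any level: no encryption
```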
When encryption is applied to an object, the encryption is applied at the "coordinator node" (the node that happens to receive the incoming S3 PUT request with that object) before the object data is transmitted to the endpoint nodes where the object data will be stored.
For more information about using the supported server-side encryption methods, see:
You cannot apply server-side encryption retroactively to objects that have already been uploaded to the system. For example, if you modify a bucket's configuration so that it includes server-side encryption, this will apply only to objects uploaded from that time forward -- not to objects that had been uploaded to the bucket previously. The same is true of adding server-side encryption to a storage policy's configuration: from that time forward, objects that get uploaded into buckets that use that storage policy will be encrypted, but objects that had already been uploaded previously will not be encrypted.
Conversely, if a bucket configuration or storage policy configuration uses server-side encryption and then you subsequently disable encryption for that bucket or storage policy, then from that time forward newly uploaded objects will not be encrypted -- but objects that had already been uploaded and encrypted prior to the configuration change will remain encrypted.
l Objects encrypted by the regular SSE method are replicated to the destination bucket.
l Objects encrypted by the SSE-C method or the AWS KMS method are not replicated to the destination
bucket.
Regular server-side encryption (SSE) can be configured at the object level, the bucket level, or the storage policy level. For information about the precedence ordering among these levels see "Setting Up Server-Side Encryption" (page 113).
When regular SSE is used the HyperStore system generates a unique encryption key for each object that the system encrypts, and the encryption key is stored as part of the object’s metadata.
With regular SSE, the HyperStore system generates the encryption keys using AES-128 by default. While AES-128 will work for regular SSE, and may be acceptable in a testing or evaluation environment, for greater security Cloudian, Inc. recommends using AES-256. You can enable AES-256 in your HyperStore system as described in "Enabling AES-256" (page 121).
Note Amazon uses AES-256 for its regular SSE implementation, and AES-256 is called for in Amazon's SSE specification.
Regular SSE can be set for specific objects as those objects are uploaded to the HyperStore system. This can be done either through the CMC or through a third party S3 client application that invokes the HyperStore implementation of the S3 API.
In the CMC’s Objects page, where a user can upload objects into the HyperStore system, the interface displays a "Store Encrypted" checkbox. If the user selects this checkbox, the HyperStore system applies regular server-side encryption to the uploaded object(s).
In compliance with the Amazon S3 REST API, the HyperStore system’s S3 API implementation supports regular server-side encryption by the inclusion of the x-amz-server-side-encryption: AES256 request header in any of these operations on objects:
l PutObject
l CreateMultipartUpload
l UploadPart
l POST Object
l CopyObject
For details of HyperStore's support for these S3 operations, see the S3 section of the Cloudian HyperStore
AWS APIs Support Reference.
Note If you have not enabled AES-256 in your HyperStore system, the system will use AES-128
encryption even though the x-amz-server-side-encryption request header specifies AES256.
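For example, an upload requesting regular SSE can be made with the AWS CLI pointed at a HyperStore S3 endpoint (the endpoint, bucket, and key names here are illustrative, and the command requires configured credentials for your service):

```shell
# Upload an object with regular SSE requested; the CLI sets the
# x-amz-server-side-encryption: AES256 header from this option.
aws s3api put-object \
    --endpoint-url https://s3-region1.mycloudianhyperstore.com \
    --bucket mybucket \
    --key report.pdf \
    --body report.pdf \
    --server-side-encryption AES256
```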
4.2.2.2.3. Configuring SSE as the Default for a Bucket (S3 API Only)
In compliance with the Amazon S3 REST API, the HyperStore system’s S3 API supports setting and managing
a bucket default server-side encryption method by the use of these operations:
l PutBucketEncryption
l GetBucketEncryption
l DeleteBucketEncryption
To configure a bucket for regular SSE, in the PutBucketEncryption request body set the SSEAlgorithm element
to AES256.
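For regular SSE the PutBucketEncryption request body follows the Amazon S3 schema, for example:

```
<ServerSideEncryptionConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Rule>
    <ApplyServerSideEncryptionByDefault>
      <SSEAlgorithm>AES256</SSEAlgorithm>
    </ApplyServerSideEncryptionByDefault>
  </Rule>
</ServerSideEncryptionConfiguration>
```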
For details of HyperStore's support for these S3 operations, see the S3 section of the Cloudian HyperStore
AWS APIs Support Reference.
Note If you have not enabled AES-256 in your HyperStore system, the system will use AES-128
encryption even though the SSEAlgorithm element specifies AES256.
Note The CMC does not support setting a bucket default server-side encryption method.
4.2.2.2.4. Configuring SSE as the Default for a Storage Policy (CMC Only)
When you use the CMC to create storage policies, one of the configurable policy attributes is server-side encryption. To configure a storage policy to use regular SSE, in the "Server-Side Encryption" field of the storage policy configuration interface, select "SSE".
For more information on storage policy configuration, while on the CMC's Storage Policies page (Cluster -> Storage Policies) click Help.
Server-side encryption with customer-provided encryption keys (SSE-C) can only be set at the per-object level, and only if you use a third party S3 client that supports requesting this type of encryption. Setting SSE-C as the default encryption method for a bucket or a storage policy is not supported.
When SSE-C is used, the HyperStore system does not store the customer-provided encryption key itself but rather stores a hash of the key (for purposes of verifying the key if it’s subsequently submitted in a GET Object request). The key hash is stored with the object metadata.
IMPORTANT ! When SSE-C is used, the user is responsible for managing the encryption key. If an object is uploaded to the HyperStore system and encrypted with a user-provided key, the user will need to provide that same key when later requesting to download the object. If the user loses the key, the encrypted object will not be downloadable. This is consistent with Amazon's implementation of the SSE-C feature. For more information on Amazon’s SSE-C feature see Protecting Data Using Server-Side Encryption with Customer-Provided Encryption Keys (SSE-C).
l Enable AES-256 in your HyperStore system. Like Amazon S3, HyperStore’s implementation of SSE-C requires AES-256 encryption. For instructions see "Enabling AES-256" (page 121).
l Set up HTTPS for your S3 Service. Like Amazon S3, HyperStore’s implementation of SSE-C requires that the relevant S3 API requests be transmitted over HTTPS rather than regular HTTP. For instructions see "Setting Up Security and Privacy Features" (page 104).
Note HyperStore supports a configuration for allowing a regular HTTP connection between a load balancer and your S3 servers for transmission of SSE-C requests over your internal network, while client applications use HTTPS in the requests that come into the load balancer. See the configuration parameter mts.properties.erb:"cloudian.s3.ssec.usessl" (page 488).
In compliance with the Amazon S3 REST API, the HyperStore system’s S3 API supports server-side encryption
with user-provided keys by the inclusion of the x-amz-server-side-encryption-customer-* request headers in
any of these operations on objects:
l PutObject
l CreateMultipartUpload
l UploadPart
l POST Object
l CopyObject (supporting also the x-amz-copy-source-server-side-encryption-customer-* request head-
ers)
l GetObject
l HeadObject
For the full list of supported x-amz-server-side-encryption-customer-* request headers for each of these oper-
ations, see the S3 section of the Cloudian HyperStore AWS APIs Support Reference.
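As an illustration of the header mechanics, the following sketch (Python, standard library only) computes the three core x-amz-server-side-encryption-customer-* headers a client would send alongside a 256-bit key. This is a hedged sketch of the general S3 SSE-C convention; the exact header set supported per operation is defined in the AWS APIs Support Reference.

```python
import base64
import hashlib
import os

def ssec_headers(key: bytes) -> dict:
    """Build the x-amz-server-side-encryption-customer-* headers
    for a customer-provided 256-bit key (SSE-C)."""
    if len(key) != 32:
        raise ValueError("SSE-C with AES-256 requires a 32-byte key")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        # The key itself, base64-encoded (sent over HTTPS only)
        "x-amz-server-side-encryption-customer-key": base64.b64encode(key).decode(),
        # MD5 of the raw key, so the server can verify the key on later GETs
        "x-amz-server-side-encryption-customer-key-MD5":
            base64.b64encode(hashlib.md5(key).digest()).decode(),
    }

key = os.urandom(32)  # the caller must retain this key to download the object later
headers = ssec_headers(key)
```

The same headers must accompany a later GetObject request for the object; as noted above, a lost key makes the encrypted object unrecoverable.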
Note The CMC does not support requesting SSE-C for objects as they are uploaded, and it does not
support downloading objects that have been encrypted by the SSE-C method.
4.2. Setting Up Security and Privacy Features
Server-side encryption using encryption keys managed by the Amazon Web Services Key Management
Service (AWS KMS) can be configured at the object level or the bucket default level, but only if you use a third
party S3 client that supports requesting this type of encryption. Setting the AWS KMS method as the default for
a storage policy is not supported. For information about the precedence ordering between these levels see
"Setting Up Server-Side Encryption" (page 113).
When AWS KMS based encryption is used, the HyperStore system triggers the creation of one "customer
master key" (CMK) per bucket in the remote AWS KMS. The CMK is stored exclusively in the AWS KMS. For
each such CMK, HyperStore stores (in Redis) a CMK ID that allows HyperStore to tell AWS KMS which CMK to
use for creating an encrypted "data key" for a given object (that is, the CMK for the bucket into which the object
is being uploaded). AWS KMS returns to HyperStore both the encrypted data key and a plain text version of the
data key. After using the plain text data key to encrypt the object, HyperStore deletes the plain text data key,
while the encrypted version of the data key is stored within the encrypted object data.
Subsequently when a client downloads the object, HyperStore submits the encrypted data key to the
AWS KMS, which uses the CMK to decrypt the data key. AWS KMS returns the decrypted data key to Hyper-
Store, which uses it to decrypt the object and then deletes the decrypted data key from memory.
Note In compliance with Amazon's implementation, all S3 requests involving AWS KMS encryption
must use SSL and Signature Version 4. For example, HyperStore will reject object upload requests that
specify AWS KMS encryption, or download requests for AWS KMS encrypted objects, if such requests
use Signature Version 2.
Note For more information on the AWS KMS, in the AWS online documentation see AWS Key Man-
agement Service (KMS).
To use the AWS KMS for HyperStore server-side encryption, you must have one or both of the following:
l AWS account access credentials for each HyperStore user group that will use the
AWS KMS encryption feature. These group account credentials will be used by HyperStore to access
the AWS KMS whenever a user within the group requests AWS KMS encryption for their bucket or for
specific objects, or downloads AWS KMS encrypted objects.
l Default AWS account access credentials for your HyperStore service as a whole. These default AWS
credentials will be used by HyperStore to access the AWS KMS on behalf of users who are in groups
that do not have group account credentials for AWS.
To enable AWS KMS usage in your HyperStore system, complete the following system configuration steps:
1. On the Configuration Master node, open the following file in a text editor:
/etc/cloudian-<version>-puppet/modules/cloudians3/files/awscredentials.properties
2. Edit the file to specify the system default AWS access credentials, and (optionally) any group-specific
AWS access credentials. Create a separate block for each group that has its own AWS access cre-
dentials, using the formatting shown in the example below. In the example there are credentials for just
one group, "CloudianTest1". HyperStore will use the CloudianTest1 group's AWS credentials when
accessing the AWS KMS on behalf of users in that group. For users in any other group, HyperStore will
use the system default AWS credentials when accessing the AWS KMS.
[default]
aws_access_key_id = AKIAJKVELYABCCEIXXMA
aws_secret_access_key = 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
[CloudianTest1]
aws_access_key_id = ABCAJKVELY6YXCEIMAXX
aws_secret_access_key = abceikWvRR/7A8916x9vUDEhV+C+LIDmFCOE8MgC
3. Still on the Configuration Master node, open the following file in a text editor:
/etc/cloudian-<version>-puppet/modules/cloudians3/templates/mts.properties.erb
4. Find the property util.awskmsutil.region and set it to the AWS service region of the AWS KMS that you
want HyperStore to use. The default is "us-east-1". Save your change and close the file.
5. Push your changes to the cluster and restart the S3 Service. If you need instructions see "Pushing Con-
figuration File Edits to the Cluster and Restarting Services" (page 411).
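Because the file uses INI-style sections, the group-to-credentials resolution described in step 2 can be sketched with Python's standard configparser. The sample values below are placeholders, not real credentials:

```python
import configparser

# Illustrative stand-in for awscredentials.properties
sample = """
[default]
aws_access_key_id = AKIA...DEFAULT
aws_secret_access_key = default-secret
[CloudianTest1]
aws_access_key_id = AKIA...GROUP
aws_secret_access_key = group-secret
"""

def credentials_for_group(text: str, group: str):
    """Return the AWS credentials for a user group, falling back to
    the [default] block when the group has no block of its own."""
    cp = configparser.ConfigParser()
    cp.read_string(text)
    section = group if cp.has_section(group) else "default"
    return cp[section]["aws_access_key_id"], cp[section]["aws_secret_access_key"]
```

For example, a user in "CloudianTest1" resolves to that group's keys, while a user in any other group resolves to the system default keys.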
4.2.2.4.2. Requesting AWS KMS Encryption for Specific Objects (S3 API Only)
In compliance with the Amazon S3 REST API, the HyperStore system’s S3 API supports AWS KMS based
server-side encryption by the inclusion of the x-amz-server-side-encryption: aws:kms request header in any of
these operations on objects:
l PutObject
l CreateMultipartUpload
l UploadPart
l POST Object
l CopyObject
For details about HyperStore support of these operations see the S3 section in the Cloudian HyperStore
AWS APIs Support Reference.
Note The CMC does not support requesting AWS KMS based encryption for objects as they are
uploaded. It does support downloading objects that have been encrypted by the AWS KMS method.
4.2.2.4.3. Configuring AWS KMS Encryption as the Default for a Bucket (S3 API Only)
In compliance with the Amazon S3 REST API, the HyperStore system’s S3 API supports setting and managing
a bucket default server-side encryption method by the use of these operations:
l PutBucketEncryption
l GetBucketEncryption
l DeleteBucketEncryption
To configure a bucket for server-side encryption with AWS KMS, in the PutBucketEncryption request body set
the SSEAlgorithm element to aws:kms.
For details of HyperStore's support for these S3 API operations, see the S3 section of the Cloudian HyperStore
AWS APIs Support Reference.
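For reference, a PutBucketEncryption request carries an XML body shaped like the following (element names per the Amazon S3 API; verify details against the AWS APIs Support Reference). The sketch parses the body to confirm the algorithm setting:

```python
import xml.etree.ElementTree as ET

# Request body that sets aws:kms as the bucket default encryption method
body = """<ServerSideEncryptionConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Rule>
    <ApplyServerSideEncryptionByDefault>
      <SSEAlgorithm>aws:kms</SSEAlgorithm>
    </ApplyServerSideEncryptionByDefault>
  </Rule>
</ServerSideEncryptionConfiguration>"""

ns = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}
algo = ET.fromstring(body).find(".//s3:SSEAlgorithm", ns).text
```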
Note The CMC does not support setting a bucket default server-side encryption method.
When you delete a HyperStore bucket that has used AWS KMS encryption -- either because AWS KMS encryp-
tion was the default encryption method for the bucket, or because certain objects within the bucket used
AWS KMS encryption -- the bucket's "customer master key" (CMK) is scheduled for deletion from the remote
AWS KMS system. The CMK deletion occurs 30 days after the deletion of the HyperStore bucket. During this 30
day period, if you do not wish the CMK to be deleted from the remote AWS KMS system you can execute a
CancelKeyDeletion operation using the AWS Console or the AWS CLI.
You must enable AES-256 in your HyperStore system if you want to do either of the following:
l Use regular SSE in a manner compliant with the Amazon SSE specification
l Use SSE-C
Note Enabling AES-256 is not necessary for -- and not relevant to -- server-side encryption using
AWS KMS.
2. Push your changes out to the cluster and restart the S3 Service. For instructions see "Pushing Con-
figuration File Edits to the Cluster and Restarting Services" (page 411).
All types of users require a password to access the CMC. For the pre-configured system admin user named
admin, the default password is public and you are required to change the password the first time you log into
the CMC as the admin user (see "Accessing the Cloudian Management Console" (page 63)). For all other
users, when you create new users in the CMC -- additional system admin users, or group admins, or regular
users -- you specify an initial CMC password for each user. Users can subsequently change their own pass-
word through the CMC's Security Credentials page (accessible by hovering the cursor over the user name in
the upper right corner of the CMC). And as a system admin you can change other users' passwords through
the CMC's Manage Users page.
Note The Admin API also supports provisioning users and creating user passwords. For more inform-
ation see the "user" section of the Cloudian HyperStore Admin API Reference.
As a system administrator you can configure several types of requirements for CMC passwords, as shown in
the table below. These settings are all in the configuration file common.csv. Note that some of the requirements
are disabled by default. For more detailed descriptions of these settings and how to change them (including
enabling requirements that are disabled by default), see "HyperStore Configuration Files" (page 418).
For more information about how a user can enable MFA on their CMC account, while on the CMC's Security
Credentials page click Help.
As a system administrator, you cannot enable MFA on other users' accounts. However, you can disable
MFA on the account of a user who currently has MFA enabled. You might need to do this if, for example, an
MFA-enabled user is unable to log into the CMC because of a problem with their virtual MFA device (MFA
application on their computer or mobile device).
For more information about disabling MFA on a CMC user's account, while on the CMC's Manage Users page
click Help.
Note The Admin API also supports provisioning users and creating users' S3 security credentials. For
more information see the "user" section of the Cloudian HyperStore Admin API Reference.
If a user employs the CMC as their S3 client application, the CMC automatically includes the user's S3 security
credentials when the CMC submits calls to the S3 Service. If a user employs a third party S3 client application
to access the HyperStore S3 Service, the user will need to supply their S3 security credentials to the third party
S3 client application (so the client application can include the credentials when submitting calls to the Hyper-
Store S3 Service). The user can retrieve their S3 security credentials from the CMC's Security Credentials
page, so that they can then supply the credentials to the third party S3 client application.
Users are allowed to have multiple S3 credentials (multiple access key / secret key pairs). They can create and
delete credentials -- and make credentials active or inactive -- in the CMC's Security Credentials page. By
default the system allows users to have a maximum of five total S3 security credentials (including inactive cre-
dentials as well as active credentials). This restriction is configurable by the credentials.user.max property in
the configuration file mts.properties.erb.
IMPORTANT ! Using secure delete impacts system performance for delete operations. Consult
with your Cloudian representative if you are considering using secure delete.
HyperStore supports a "secure delete" methodology for implementing object delete requests. By default this
feature is disabled.
For background, note that object data is stored in an ext4 file system on each HyperStore node, and the pro-
cess of writing, reading, and deleting object data from the file system is managed by the HyperStore Service
(for more information see "HyperStore Service and the HSFS" (page 38)). Also note that, depending on your
storage policies, each object is either replicated or erasure coded, and the replicas or the erasure coded frag-
ments are distributed across multiple nodes. Larger objects are broken into chunks first, before those chunks
are then replicated or erasure coded.
When secure delete is enabled, HyperStore implements the deletion of an object by first overwriting each byte
of the object data three times, and then deleting the object. Three successive overwriting passes are executed
over each of the object's bytes.
This overwriting occurs for every byte of every replica or fragment of the object, on every node on which
the object's data resides. After the three overwriting passes complete, the object data is then deleted.
If you enable secure delete, then all deletes -- for any bucket, by any user -- are implemented as secure
deletes. You cannot, for instance, apply this feature only to some buckets and not to others.
Note
* In the case of buckets that use versioning, to delete all versions of an object the S3 client application
must explicitly delete each object version.
* Secure delete does not apply to object metadata stored in the Metadata DB (in Cassandra). When
objects are deleted, the system deletes the corresponding object metadata in the normal way -- with no
overwriting passes -- regardless of whether or not you have the secure delete feature turned on. There-
fore you should limit any user-defined object metadata created by your S3 client applications to inform-
ation that does not require secure delete.
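Conceptually, the overwrite-then-delete sequence resembles the sketch below. The pass patterns shown (zeros, ones, random data) are common conventions used here only for illustration; this excerpt does not specify the patterns HyperStore actually uses, and the function name is hypothetical.

```python
import os

def secure_delete(path: str, passes: int = 3) -> None:
    """Illustrative sketch of multi-pass overwrite-then-delete.
    Pass patterns (zeros, ones, random) are assumptions, not
    HyperStore's documented implementation."""
    size = os.path.getsize(path)
    patterns = [b"\x00", b"\xff", None]  # None -> random bytes
    with open(path, "r+b") as f:
        for i in range(passes):
            f.seek(0)
            pat = patterns[i % len(patterns)]
            f.write(os.urandom(size) if pat is None else pat * size)
            f.flush()
            os.fsync(f.fileno())  # force the overwrite to disk before deleting
    os.remove(path)
```

As the surrounding text notes, in HyperStore this overwriting happens for every replica or erasure-coded fragment of the object, on every node holding the object's data.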
2. Use the installer to push your configuration change to the cluster and to restart the HyperStore Service.
If you need instructions see "Pushing Configuration File Edits to the Cluster and Restarting Ser-
vices" (page 411).
For more information about cloudian-hyperstore-request-info.log see "HyperStore Service Logs" (page 546).
Federal Information Processing Standard (FIPS) Publication 140-2 defines security requirements for cryp-
tographic modules. HyperStore in some respects meets the FIPS 140-2 standard by default. In particular,
HyperStore by default uses FIPS 140-2 prescribed cryptographic techniques. However:
l The cryptographic module that HyperStore uses by default -- the OpenSSL FIPS Object Module 2.0 --
entered "sunset" status in January 2022, and therefore is no longer strictly FIPS compliant.
l By default, the SSH service on HyperStore nodes -- OpenSSH -- is not FIPS 140-2 compliant because it
supports ciphers that are not FIPS 140-2 approved as well as ciphers that are FIPS 140-2 approved.
Therefore, if you need strict FIPS compliance you must enable HyperStore FIPS compliance as described
below. Once you enable HyperStore FIPS compliance:
l HyperStore will use the Bouncy Castle FIPS module rather than the sunsetted OpenSSL FIPS Object
Module 2.0.
l Only FIPS 140-2 approved ciphers will be used for SSH connections to HyperStore nodes.
1. On the Configuration Master node, in the common.csv file, set fips_enabled to true. (By default it is set
to false.) Save your change and close the file.
Note Setting fips_enabled to true will work only if you leave the sshdconfig_disable_override
setting (which is also in common.csv, directly below fips_enabled) at its default value of false.
3. Still on the Configuration Master node, use the installer to push your configuration change out to the
cluster and to restart the S3 Service. If you need instructions for this step see "Pushing Configuration
File Edits to the Cluster and Restarting Services" (page 411).
However, HyperStore also supports methods of server-side encryption for which the encryption keys come
from outside of HyperStore. With SSE-C, the keys are provided by the user. With server-side encryption using
AWS KMS, the keys are generated by an external key management system.
For these types of server-side encryption, although HyperStore executes the encryption and decryption of
object data using FIPS-compliant AES, the generation of the encryption keys is outside HyperStore control.
These types of server-side encryption are fully FIPS-compliant only if the keys are generated in a FIPS-com-
pliant manner.
4.3. Setting Up Auto-Tiering
The HyperStore system supports an "auto-tiering" feature whereby objects can be automatically moved from
local HyperStore storage to a remote storage system on a defined schedule. HyperStore supports auto-tiering
from a local HyperStore bucket to any of several types of destination systems:
l S3-compliant systems: Amazon S3, Amazon Glacier, Google Storage Cloud, a HyperStore region or
system, or a different S3-compliant system of your choosing
l Microsoft Azure
l Spectra Logic BlackPearl
Auto-tiering is configurable on a per-bucket basis. A bucket owner activating auto-tiering on a bucket can spe-
cify:
Note The HyperStore S3 API supports auto-tiering filtering by prefix or by object tags or by a
combination of prefix and tags, but the CMC currently only supports filtering by prefix.
Although there can only be one auto-tiering destination for a given HyperStore bucket, bucket owners have the
option of configuring different auto-tiering schedules for different sets of objects within the bucket, based on the
object name prefix (unless Bridge Mode is being used, which does not support prefix filtering).
For more information about configuring auto-tiering in the system and on individual buckets, see Setting Up
Auto-Tiering.
l With cross-region replication each replicated object is stored in both the source bucket and the des-
tination bucket. With auto-tiering, after an object is auto-tiered the object data is by default stored only in
the destination bucket (although there is an option to retain a local copy for a configurable period of
time).
l Cross-region replication replicates objects to the destination bucket immediately after they've been
uploaded to the source bucket. With auto-tiering, typically the tiering of the object occurs on a user-
defined schedule, after the objects have been in the source bucket for a period of time (although there
is an option to tier immediately).
l Cross-region replication replicates nearly all object metadata and user-defined tags along with the
object data, so that an object's full set of metadata resides both at the source and at the destination.
With auto-tiering only basic metadata accompanies the tiered object, while the object's full metadata
continues to be stored locally at the source.
l Cross-region replication requires that versioning be enabled on both the source bucket and the des-
tination bucket. Auto-tiering does not have this versioning requirement.
l With cross-region replication the destination bucket is typically within the same HyperStore system as
the source bucket (although replicating to an external system is supported). With auto-tiering the des-
tination bucket is typically in an external system (although tiering to a bucket within the same Hyper-
Store system is supported).
You cannot use cross-region replication and auto-tiering on the same source bucket.
l Bucket owners can supply their own destination account credentials, on a per-bucket basis. In this way
each bucket owner tiers to his or her own account at the destination system. This is the default method
for auto-tiering.
l You can supply system default tiering credentials. This is the appropriate approach if all users will be
tiering to the same account at the same tiering destination system. For more information see Setting Up
Auto-Tiering.
HyperStore encrypts supplied tiering account credentials and stores them in the Credentials DB, where they
can be accessed by the system in order to implement auto-tiering operations.
Auto-tiering moves objects from a local HyperStore source bucket into a tiering bucket at the destination sys-
tem. The source bucket owner when configuring auto-tiering can specify as the tiering bucket a bucket that
already exists in the destination system, or the source bucket owner can have HyperStore create a tiering
bucket in the destination system. If having HyperStore create a tiering bucket, the source bucket owner can
choose a tiering bucket name or have HyperStore automatically name the tiering bucket. When HyperStore
automatically names the tiering bucket it uses this format:
<origin-bucket-name-truncated-to-34-characters>-<random-string>
The HyperStore system appends a 28-character random string to the origin bucket name to ensure that the res-
ulting destination bucket name is unique within the destination system. If the origin bucket name exceeds 34
characters, in the destination bucket name the origin bucket name segment will be truncated to 34 characters.
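A hypothetical sketch of the naming rule follows. Only the 34-character truncation and the 28-character random suffix are documented here; the character set of the random string is an assumption for illustration.

```python
import secrets
import string

def tiering_bucket_name(origin: str) -> str:
    """Mimic the documented format:
    <origin-bucket-name-truncated-to-34-characters>-<28-char-random-string>.
    The suffix alphabet is an assumption, not documented."""
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(28))
    return origin[:34] + "-" + suffix
```

For example, a 50-character origin bucket name yields a 63-character destination name (34 truncated characters, a hyphen, and the 28-character suffix), which stays within S3's 63-character bucket name limit.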
After objects have been auto-tiered to the destination system they can be accessed directly through that sys-
tem's interfaces (such as the Amazon S3 Console), by persons having the applicable credentials. Auto-tiered
objects can also be accessed indirectly through the local HyperStore system interfaces. Tiered object access is
described in more detail in Accessing Auto-Tiered Objects.
Note In the case of auto-tiering to Amazon Glacier, the HyperStore system creates a bucket in Amazon
S3 and configures that remote bucket for immediate transitioning to Glacier. Objects are then auto-
tiered from HyperStore to Amazon S3, where they are immediately subject to Amazon’s automated
mechanism for transitioning objects to Glacier.
With Bridge Mode, filtering by prefix or tags is not supported -- if Bridge Mode is used, tiering applies to all
objects incoming to the bucket.
When the system implements Bridge Mode tiering, whichever S3 node processes the upload of a given object
into the source bucket also initiates the immediate transmission of the object to the destination system. In this
way, just as the workload for processing numerous incoming object uploads from S3 client applications is dis-
tributed across all the nodes in the cluster, so too the workload associated with Bridge Mode tiering is dis-
tributed across the cluster. If the initial attempt to transmit an object to the destination system fails with a
temporary error, the same node that performed the initial attempt will retry once every hour until either the
object is successfully transmitted or a permanent error is encountered (an example permanent error would be
if someone had deleted the tiering destination bucket). The local copy of the object will not be deleted until the
object is successfully transmitted to the destination system, as indicated by the destination system returning a
success status. All attempts are logged as described in "Auto-Tiering Logging" (page 130).
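The retry behavior can be sketched as follows. The function and exception names are illustrative, and the max_attempts cap exists only so the sketch terminates -- the documented behavior retries hourly until success or a permanent error.

```python
import time

class PermanentTieringError(Exception):
    """Illustrative: e.g. the tiering destination bucket has been deleted."""

def bridge_mode_tier(transmit, retry_interval=3600, max_attempts=5, sleep=time.sleep):
    """Sketch of the documented retry loop: retry hourly on temporary
    errors, stop on success or a permanent error. Names are hypothetical;
    max_attempts is an addition for this sketch only."""
    for attempt in range(1, max_attempts + 1):
        try:
            transmit()
            return True  # destination returned success; local copy may now be deleted
        except PermanentTieringError:
            raise        # no point retrying
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(retry_interval)  # the text above specifies one retry per hour
```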
Users have the option of choosing Bridge Mode when they configure auto-tiering rules for a bucket.
Note Bridge mode is not supported for tiering to Amazon Glacier or Spectra BlackPearl.
Users have the option of specifying a local retention period when they configure auto-tiering rules for a bucket.
This option to retain a local copy is supported for Bridge Mode (Proxy) tiering as well as for regular, schedule-
based tiering.
In the special case of "Bridge Mode" auto-tiering, whichever S3 node processes the upload of a given object
into the source bucket also initiates the immediate auto-tiering of the object to the destination system, and the
tiering request log entry for that is written locally on that node.
For example, if a 1MB object has been auto-tiered to Amazon then that object would count as:
l 8KB toward the bucket owner’s HyperStore count for Storage Bytes.
l 1 toward the bucket owner’s HyperStore count for Storage Objects.
If an auto-tiered object is temporarily restored to HyperStore storage, then while the object is restored the
object’s size is added back to the Storage Bytes count and the 8KB for the reference is subtracted from the
count. After the restore interval ends and the restored object instance is automatically deleted, the object size is
once again subtracted from Storage Bytes and the 8KB for the reference is added back. For more on temporary
restoration of tiered objects see Accessing Auto-Tiered Objects.
Auto-tiering does not impact HyperStore usage counts for Bytes-IN or Bytes-OUT.
Note If users select the Retain Local Copy option when configuring their buckets for auto-tiering,
objects that are temporarily retained in HyperStore after they've been auto-tiered will continue to count
toward the local Storage Bytes count until the local copy is deleted.
HyperStore's Quality of Service (QoS) and billing features make use of Storage Bytes and Storage Object
counts. So the impact of auto-tiering on QoS and billing is as described above. For example in the case of the
QoS restrictions applied to users, if a bucket owner's 1MB object has been auto-tiered to Amazon, then that
object counts as 8KB toward that user's QoS limit for Storage Bytes; and as 1 toward that user's QoS limit for
Storage Objects.
For more information on the QoS and billing features, see 4.12.1 Quality of Service (QoS) Feature Overview
and 4.8.1 Usage Reporting and Billing Feature Overview.
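The accounting described above can be expressed as a small helper (names are illustrative, and 8KB is assumed to mean 8 × 1024 bytes):

```python
KB = 1024

def storage_bytes_delta(object_size: int, tiered: bool, restored: bool) -> int:
    """What one object contributes to the owner's Storage Bytes count:
    full size if stored locally (or temporarily restored), or an
    8KB reference if it has been auto-tiered away."""
    if not tiered:
        return object_size
    return object_size if restored else 8 * KB
```

So a 1MB object counts 1,048,576 bytes while local, 8,192 bytes after tiering, and 1,048,576 again while temporarily restored.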
See also:
l "Enable and Configure the Auto-Tiering Feature in the CMC" (page 132)
l "Configure Tiering Destinations for Display in the CMC" (page 132)
l "Change the Multipart Upload Size Threshold, If Tiering to Google" (page 133)
l "Specify a Different HyperStore System as Tiering Destination, If Applicable" (page 134)
Note The configuration task described below is applicable only to using the auto-tiering feature
through the CMC. It is not applicable to using a third party S3 client application to invoke the Hyper-
Store S3 Service's auto-tiering feature.
By default auto-tiering functionality is disabled in the CMC, such that CMC users when configuring bucket life-
cycle properties will not see an option for auto-tiering. If you want to enable the auto-tiering feature in the CMC
-- so that CMC users can apply auto-tiering to their buckets -- do the following:
l If you want all CMC users to auto-tier to a single system-default tiering destination account for
which you are providing the account security credentials, set "Enable Per Bucket Cre-
dentials" to Disabled and then enter the "Default Tiering URL" and the account security cre-
dentials.
l If you want CMC users to be able to choose from a pre-configured list of tiering destinations
(AWS S3, AWS Glacier, Google, and Azure by default) and no other destinations, leave "Enable
Per Bucket Credentials" at Enabled (the default) and leave "Enable Custom Endpoint" at Dis-
abled (the default). With this approach users provide their own account security credentials for
the tiering destination. You can edit the list of tiering destinations that will display for users, and
the endpoints for those destinations, as described in "Configure Tiering Destinations for Dis-
play in the CMC" (page 132) below.
l If you want CMC users to be able to choose from a pre-configured list of tiering destinations
and also give users the option to specify a custom S3 tiering endpoint, leave "Enable Per
Bucket Credentials" at Enabled (the default) and set "Enable Custom Endpoint" to Enabled. With
this approach users provide their own account security credentials for the tiering destination. If
users specify a custom tiering endpoint, it must be an S3-compliant system and it cannot be a
Glacier, Azure, or Spectra BlackPearl endpoint.
4. After finishing your edits in the Configuration Settings page, click Save at the bottom of the page to
save your changes. These changes are applied dynamically and no service restart is required.
Note The configuration task described below is applicable only to using the auto-tiering feature
through the CMC. It is not applicable to using a third party S3 client application to invoke the Hyper-
Store S3 Service's auto-tiering feature.
Unless you configure the system so that all users tier to the same default tiering endpoint (as described
above in Step 3's first bullet point), users configuring auto-tiering for their buckets in the CMC will be
able to choose from a list of several common tiering destinations. By default the destinations are AWS S3, AWS
Glacier, Google Cloud Storage, and Azure; and by default the endpoints for those destinations are as follows:
Azure https://blob.core.windows.net
Note If your original HyperStore version was older than 7.1.4, then after upgrade to 7.1.4 or later the
default list of tiering destinations will also include Spectra BlackPearl with endpoint
https://b-plab.spectralogic.com.
If you want to change this list -- by changing the endpoint for any of the destinations above, or by adding or
removing destinations -- you can do so by editing the cmc_bucket_tiering_default_destination_list setting in
common.csv. The setting is formatted as a quote-enclosed list, with comma-separation between destination
attributes and vertical bar separation between destinations, like this:
"<name>,<endpoint>,<protocol>|<name>,<endpoint>,<protocol>|..."
This can be multiple destinations (as it is by default), or you can edit the setting to have just one destination in
the "list" if you want your users to only use that one destination.
The <name> will display in the CMC interface that bucket owners use to configure auto-tiering, as the auto-tier-
ing destination name. The <protocol> must be one of the following:
l s3
l glacier
l azure
l spectra
If you wish you can include multiple destinations of the same type, if those destinations have different end-
points. For example, "Spectra 1,<endpoint1>,spectra|Spectra 2,<endpoint2>,spectra". Each such destination
will then appear in the CMC interface for users configuring their buckets for auto-tiering.
If you make any changes to this setting, push your changes to the cluster and restart the CMC. If you need
instructions for this see Pushing Configuration File Edits to the Cluster and Restarting Services.
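A sketch of parsing and validating the setting's documented format (the function name is illustrative):

```python
VALID_PROTOCOLS = {"s3", "glacier", "azure", "spectra"}

def parse_tiering_destinations(setting: str):
    """Parse a cmc_bucket_tiering_default_destination_list value:
    comma-separated attributes, vertical-bar-separated destinations."""
    destinations = []
    for entry in setting.strip('"').split("|"):
        name, endpoint, protocol = (part.strip() for part in entry.split(","))
        if protocol not in VALID_PROTOCOLS:
            raise ValueError(f"unsupported protocol: {protocol}")
        destinations.append({"name": name, "endpoint": endpoint, "protocol": protocol})
    return destinations
```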
By default HyperStore uses the S3 multipart upload function when auto-tiering objects larger than 16MiB.
However, Google Cloud Storage does not support the multipart upload function. Therefore if your users will be
auto-tiering to Google Storage Cloud you should increase the size threshold that triggers HyperStore to use
multipart upload when auto-tiering objects. By setting the size threshold to a value larger than the largest
object that you expect your users to be tiering to Google, you can prevent HyperStore from trying to use
multipart upload when tiering to Google. HyperStore will upload all objects at or below the size threshold
as a single "part".
To increase the size threshold that triggers HyperStore to use multipart upload when tiering objects:
For example, to configure HyperStore's auto-tiering feature so that multipart upload is only used for
objects larger than 200MiB:
cloudian.s3.tiering.part.threshold=209715200
Note The multipart upload size threshold that you set will be used for all auto-tiering regardless
of the destination. This setting is not specific to tiering to Google.
3. Push your change to the cluster and then restart the S3 Service. If you need instructions for this see
Pushing Configuration File Edits to the Cluster and Restarting Services.
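The threshold value is expressed in bytes; the 200MiB example above is 200 × 1024 × 1024:

```python
def mib_to_bytes(mib: int) -> int:
    """Convert MiB to the byte value used by the threshold setting."""
    return mib * 1024 * 1024

# The example threshold above: 200 MiB
assert mib_to_bytes(200) == 209715200
```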
If your users will be tiering to a region in a different HyperStore system (i.e. users will be tiering from Hyper-
Store system A to a region in HyperStore system B), then for tiering to that region to work "out of the box" the
endpoint for that region must be in this format:
s3-<region>.<domain>
For example:
s3-boston.company.com
where "boston" is the actual region name in the destination HyperStore system's own system con-
figuration.
If the endpoint for the external HyperStore system is in any format other than the above -- if, for example, the
endpoint is simply boston.company.com rather than s3-boston.company.com -- then for tiering to work you
must first make the following configuration change in your local HyperStore system (the source system from
which the tiering will originate):
1. On the Configuration Master node, open the following file in a text editor:
/etc/cloudian-<version>-puppet/modules/cloudians3/templates/
tiering-regions.xml.erb
2. Copy the sample "Region" block at the top of the tiering-regions.xml.erb file and paste it toward the end
of the file after the existing "Region" blocks (but before the closing "</XML>" tag that’s at the very end of
the file).
3. Edit the block as follows:
l Use the "Name" element to specify the region name of destination system region that you will tier
to. Enter the region name exactly as it is defined in the destination HyperStore system.
l Leave the "ServiceName", "Http", and "Https" elements at their default values.
l Use the "Hostname" element to specify the service endpoint that you will tier to.
For example, if the region name is "boston" and the service endpoint is "boston.company.com", then
your edited Region block would look like this:
<Region>
<Name>boston</Name>
<Endpoint>
<ServiceName>s3</ServiceName>
<Http>true</Http>
<Https>true</Https>
<Hostname>boston.company.com</Hostname>
</Endpoint>
</Region>
Note Make sure that the service endpoint that you will tier to is resolvable in your DNS system.
Note Do not have two instances of "s3" in the endpoint, like s3-tokyo.s3.enterprise.com. This
may cause auto-tiering errors (and cross-region replication errors, if you use the cross-region
replication feature).
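The two notes above can be folded into a small pre-flight check before saving the configuration. This is an illustrative sketch (the function and its rule are our own, not part of HyperStore): it flags hostnames that contain more than one "s3" label, which the second note warns can cause auto-tiering and CRR errors:

```python
def endpoint_label_warning(hostname: str) -> bool:
    """Return True if the hostname contains more than one 's3' label,
    e.g. 's3-tokyo.s3.enterprise.com' -> True."""
    labels = hostname.lower().split(".")
    # Count labels that are exactly 's3' or start with 's3-'
    s3_like = [l for l in labels if l == "s3" or l.startswith("s3-")]
    return len(s3_like) > 1

print(endpoint_label_warning("s3-tokyo.s3.enterprise.com"))  # True
print(endpoint_label_warning("boston.company.com"))          # False
```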
Alternatively, as a system administrator you can set auto-tiering for a user’s bucket by retrieving the user in the
Manage Users page and then clicking "View User Data" to open a Bucket Properties page for the user's
bucket. (This is supported only if you've configured the system to allow system administrators to view and man-
age users' stored data -- see "Showing/Hiding CMC UI Functions" (page 177)).
With either approach, the bucket first must be created in the usual way, and then the bucket can be configured
for auto-tiering.
For details, while on the CMC's Bucket Properties page click Help.
Note Auto-tiering cannot be enabled for a bucket that has an underscore in its name. For this reason
it's best not to use underscores when naming buckets in HyperStore.
4.3.2.3. Configure Auto-Tiering Rules for Individual Buckets (S3 API and Admin API)
To configure auto-tiering rules on a bucket by using the S3 API, your S3 client application will need to use
HyperStore extensions to the S3 API method PutBucketLifecycleConfiguration. The extensions take the form of
request headers to specify the bucket's auto-tiering destination and mode (x-gmt-tieringinfo), whether to base
auto-tiering timing on object creation time or last access time (x-gmt-compare), and whether to retain a local
copy of the object after auto-tiering occurs (x-gmt-post-tier-copy). For details about these API extensions see
"PutBucketLifecycleConfiguration" in the S3 API section of the Cloudian HyperStore AWS APIs Support Refer-
ence.
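For illustration, the extension headers can be assembled into the request's header set before the PutBucketLifecycleConfiguration call is sent. This is a hedged sketch: the helper function is our own, and the header values shown are placeholders rather than the actual value syntax (see the AWS APIs Support Reference for that):

```python
def with_hyperstore_tiering_headers(base_headers, tieringinfo, compare,
                                    post_tier_copy):
    """Return a copy of base_headers with the HyperStore extension
    headers added. Values are passed through unchanged; see the AWS
    APIs Support Reference for their exact syntax."""
    headers = dict(base_headers)
    headers["x-gmt-tieringinfo"] = tieringinfo      # destination and mode
    headers["x-gmt-compare"] = compare              # creation vs. last access time
    headers["x-gmt-post-tier-copy"] = post_tier_copy  # retain local copy?
    return headers

hdrs = with_hyperstore_tiering_headers(
    {"Content-Type": "application/xml"},
    tieringinfo="<destination-and-mode-spec>",   # placeholder value
    compare="<timestamp-basis>",                 # placeholder value
    post_tier_copy="<local-copy-setting>",       # placeholder value
)
```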
If you plan to configure auto-tiering on a bucket by calling the PutBucketLifecycleConfiguration S3 API method,
you will first need to use the HyperStore Admin API to store into the system the tiering destination
account security credentials that HyperStore should use when auto-tiering objects from that bucket. For
example, if you want to use the S3 API method PutBucketLifecycleConfiguration to configure HyperStore
source bucket "my-bucket" to auto-tier to AWS S3, you (or your application) must first use the HyperStore
Admin API to post the security credentials that HyperStore should use when accessing AWS S3 on behalf of
source bucket "my-bucket". The same requirement applies to other destination types. For details about the rel-
evant Admin API methods, see the "tiering" section of the Cloudian HyperStore Admin API Reference.
# curl http://localhost:80/.system/stats/tiering/spectra
For example:
# curl http://localhost:80/.system/stats/tiering/spectra
{"totalInQueue":0,"totalTiered":51,"tieringFail":0,"restored":1,"restoreFail":0,
"bytesTiered":629145600,"bytesRestored":1073741824}
Note If you're not sure which node is your cron job node, you can check this in the CMC's Cluster
Information page (Cluster -> Cluster Config -> Cluster Information).
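The endpoint returns a small JSON document. A sketch of summarizing it, using the field names from the sample response above:

```python
import json

# Sample response taken from the example output above.
sample = ('{"totalInQueue":0,"totalTiered":51,"tieringFail":0,'
          '"restored":1,"restoreFail":0,'
          '"bytesTiered":629145600,"bytesRestored":1073741824}')

stats = json.loads(sample)
attempted = stats["totalTiered"] + stats["tieringFail"]
print(f"tiered {stats['totalTiered']}/{attempted} objects, "
      f"{stats['bytesTiered'] / 2**20:.0f} MiB")
# tiered 51/51 objects, 600 MiB
```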
First, for any auto-tiered object (regardless of bucket configuration or tiering destination), the object can be
retrieved by temporarily restoring a copy of the object into the local bucket. The CMC Buckets & Objects inter-
face supports temporarily restoring auto-tiered objects, for a length of time that you can specify.
Restoration of auto-tiered objects does not happen instantly. For example, for an object in Amazon S3, it
can take up to six hours before a copy of the object is restored to HyperStore storage. For an object in Glacier it
can be up to nine hours, factoring in the time it takes for an object to be restored from Glacier to Amazon S3,
before being restored to HyperStore. In the interim, the object is marked with an icon that indicates that the
object is in the process of being restored. During this stage you cannot download the object.
After a copy of an object has been restored, the icon next to the object name changes again and you can then
download the object through the Buckets & Objects interface in the usual way.
As a second option for retrieving auto-tiered objects, some objects may be directly downloadable through the
Buckets & Objects interface without any need for first restoring the objects. This is supported only if both of the
following are true:
l The tiered objects are in Amazon S3, Google Storage Cloud, Azure, or a Custom S3 destination (not
Amazon Glacier or Spectra BlackPearl)
l The bucket’s auto-tiering is configured to support Streaming or Caching (Stream & Restore) of auto-
tiered objects. These options are available (for most destination types) when the bucket's auto-tiering
policy is configured by the bucket owner.
If you’re uncertain whether an auto-tiered object meets these requirements, you can try directly downloading
the auto-tiered object by clicking on its name. If direct download is not supported for the object, a response mes-
sage will indicate that you need to temporarily restore a local copy of the object rather than directly down-
loading it.
If you want to delete an object that has been auto-tiered, you can do so by deleting the object through the
Buckets & Objects interface. You do not need to restore the object first. When the HyperStore system is delet-
ing an auto-tiered object, it first triggers the deletion of the object from the destination system, and then after
that succeeds it deletes the local reference to the object.
If you delete an empty local bucket that has been configured for auto-tiering, the system will also auto-
matically delete the tiering destination bucket.
Note As an alternative to accessing auto-tiered objects through the CMC, you can use a third party S3
client application to submit RestoreObject requests (for any auto-tiered objects) or GetObject requests
(for auto-tiered objects that support streaming or caching) to the HyperStore system’s S3 Service. You
can delete auto-tiered objects by submitting DeleteObject requests to the HyperStore system’s S3 Ser-
vice. For more information about these S3 API calls see the Cloudian HyperStore AWS APIs Support
Reference.
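As a sketch of the RestoreObject request mentioned in the note (the bucket and key names here are examples; the request shape follows the standard S3 API), the "Days" value sets how long the restored copy remains available:

```python
def restore_object_kwargs(bucket: str, key: str, days: int) -> dict:
    """Keyword arguments for an S3 client's restore_object call.
    'Days' is how long the temporarily restored copy is kept locally."""
    return {
        "Bucket": bucket,
        "Key": key,
        "RestoreRequest": {"Days": days},
    }

kwargs = restore_object_kwargs("my-bucket", "reports/2023.csv", days=7)
```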
For example, if a bucket owner supplied her own AWS credentials when configuring her HyperStore bucket for
auto-tiering to AWS S3, she can log into her AWS account and see the HyperStore auto-tiering destination
bucket (either named as she had specified or automatically named by HyperStore -- see "Tiering Destination
Accounts, Credentials, and Buckets" (page 128) for detail). After objects have been auto-tiered from Hyper-
Store to her AWS destination bucket, she can view the bucket content list and retrieve individual tiered objects
directly through AWS.
In the case of auto-tiering from one HyperStore region to another region in the same HyperStore system, the
tiered objects are accessible through the CMC’s Buckets & Objects interface, by selecting the destination
region.
IMPORTANT! Do not overwrite or delete tiered objects directly through the destination system's
interfaces. Doing so will cause a discrepancy between the local metadata in HyperStore and the
actual data in the destination bucket. If users want to overwrite or delete tiered objects they should do
so through HyperStore interfaces (such as the CMC or an S3 application accessing the HyperStore S3
Service). In the case of auto-tiering from one HyperStore region or system to another HyperStore
region or system, any overwriting or deleting of objects should be done through the source bucket, not
the destination bucket.
Like Amazon S3, HyperStore supports cross-region replication (CRR). This feature may be valuable if your
HyperStore system consists of multiple service regions. With cross-region replication, a bucket in one service
region can be configured so that all objects uploaded into the bucket are replicated to a chosen destination
bucket in a different service region within the same HyperStore system. This feature enables a bucket owner to
enhance the protection of data by having it stored in two geographically dispersed service regions. The feature
is also useful in cases where a bucket owner wants to have the same set of data stored in two different regions
in order to minimize read latency for users in those regions.
If you wish, you can also use the CRR feature within a single HyperStore service region, so that objects
uploaded into one bucket are replicated to a different bucket in the same service region.
As is the case with Amazon S3's implementation of this feature, with HyperStore both the source bucket and
the destination bucket must have "versioning" enabled in order to activate cross-region replication.
Object metadata — including any access permissions assigned to an object, and any user-defined object
metadata or object tags — is replicated to the destination bucket along with the object data itself. (The excep-
tion is that if an object in the source bucket has an x-amz-expiration header, HyperStore does not replicate this
header.)
As with Amazon S3, HyperStore’s implementation of the cross-region replication feature does not replicate:
l Objects that were already in the source bucket before the bucket was configured for cross-region rep-
lication (except as noted in "Handling of Pre-Existing Objects" (page 142))
l Objects that are encrypted with user-managed encryption keys (SSE-C) or AWS KMS managed encryp-
tion keys
l Objects that are themselves replicas from other source buckets. If you configure "bucket1" to replicate to
"bucket2", and you also configure "bucket2" to replicate to "bucket3", then an object that you upload to
"bucket1" will get replicated to "bucket2" but will not get replicated from there on to "bucket3". Only
objects that you directly upload into "bucket2" will get replicated to "bucket3".
l Deletions of specific object versions.
o In the case of an object deletion request that specifies the object version, the object version is
deleted from the source bucket but is not deleted from the destination bucket.
o In the case of an object deletion request that does not specify the object version, the deletion
marker that gets added to the source bucket is replicated to the destination bucket.
Note HyperStore currently supports only Version 1 of the Amazon S3 specification for
replication configuration XML, not Version 2. Those two versions differ in regard to
whether or not deletion markers are replicated. The behavior described above is the Ver-
sion 1 behavior, which is implemented by HyperStore.
l With cross-region replication each replicated object is stored in both the source bucket and the des-
tination bucket. With auto-tiering, after an object is auto-tiered the object data is by default stored only
in the destination bucket (although there is an option to retain a local copy for a configurable period of
time).
l Cross-region replication replicates objects to the destination bucket immediately after they've been
uploaded to the source bucket. With auto-tiering, typically the tiering of the object occurs on a user-
defined schedule, after the objects have been in the source bucket for a period of time (although there
is an option to tier immediately).
l Cross-region replication replicates nearly all object metadata and user-defined tags along with the
object data, so that an object's full set of metadata resides both at the source and at the destination.
With auto-tiering only basic metadata accompanies the tiered object, while the object's full metadata
continues to be stored locally at the source.
l Cross-region replication requires that versioning be enabled on both the source bucket and the des-
tination bucket. Auto-tiering does not have this versioning requirement.
l With cross-region replication the destination bucket is typically within the same HyperStore system as
the source bucket (although replicating to an external system is supported, as described in "Cross-Sys-
tem Replication" (page 140)). With auto-tiering the destination bucket is typically in an external system
(although tiering to a bucket within the same HyperStore system is supported).
You cannot use cross-region replication and auto-tiering on the same source bucket.
As with HyperStore cross-region replication generally, objects are not replicated if they are themselves
replicas from another source bucket. Only objects directly uploaded into a bucket by a client application will be
replicated. In the context of bi-directional replication this means that objects that are uploaded directly into one
bucket will be replicated to the other bucket, but they will not then be replicated back into the original bucket
and so on in an endless loop.
Note also that to enable cross-region replication the source bucket and destination bucket must both have "ver-
sioning" enabled. Once versioning is enabled, when objects are modified by users the system continues to
store the older version(s) of the object as well as storing the new version. In a cross-region replication context,
this means that over time multiple versions of an object may come to be stored in both the source bucket and
the destination bucket, with each version counting toward Stored Bytes counts in both buckets.
4.4.1.3.1. Impact on System-Wide Stored Byte Count and Licensed Max Storage Limit
When users use cross-region replication to replicate objects from one HyperStore bucket to another Hyper-
Store bucket, the objects in the source bucket and the object replicas in the destination bucket both count
toward your system's stored byte count -- the count that is used to determine whether you are in compliance
with your licensed maximum storage limit. In this respect bucket-to-bucket cross-region replication is different
from storage policy based replication or erasure coding of objects within a region, which is treated as overhead
and not counted toward your stored byte count.
l Objects encrypted by the regular SSE method are replicated to the destination bucket.
l Objects encrypted by the SSE-C method or the AWS KMS method are not replicated to the destination
bucket.
an external system. This is known as cross-system replication (CSR). The external system can be either of
the following:
l A different, independent HyperStore system (with its own user base and service regions and man-
agement controls)
l An external third party system with native S3 support, such as AWS S3
In most respects cross-system replication works just like cross-region replication, with the same behaviors and
limitations described in all the preceding sections of the Cross-Region Replication Feature Overview.
However, there are additional limitations and caveats specific to cross-system replication:
l Since cross-system replication -- like cross-region replication -- requires that versioning be enabled on
both the source bucket and the destination bucket, some partially S3 compatible third party systems
are not supported as cross-system replication destinations because they do not fully support S3's
versioning functionality. For example, Azure and Spectra BlackPearl are not supported as cross-system
replication destinations for this reason.
l Cross-system replication does not replicate object ACLs. This is because with cross-system rep-
lication the source system and destination system have different user bases. (By contrast, for cross-
region replication within a single HyperStore system, there is just one user base and ACLs applied to
objects in the source bucket can be meaningfully applied also to the replica objects in the destination
bucket.)
l With cross-system replication, bi-directional replication is not supported. Do not replicate data from
(for example) bucket1 in your local HyperStore system to bucket2 in an external system, while data in
bucket2 is also replicated to bucket1.
l With cross-system replication, replication from bucket-to-bucket-to-bucket is not supported. Do not
replicate data from (for example) bucket1 in your local HyperStore system to bucket2 in an external sys-
tem, while data in bucket2 is also replicated to bucket3 in any system.
Note Combining this CSR prohibition with the aforementioned prohibition against bi-directional
CSR, it can be said that if you set up CSR from a source bucket to a destination bucket in an
external system, the destination bucket should not use CSR or CRR to replicate its data to
any bucket.
l In the CMC, cross-system replication is disabled by default, such that CMC users who are configuring
bucket replication will not be presented with the additional fields necessary to configure cross-system
replication (such as the destination endpoint). For information about enabling cross-system replication
support in the CMC, see "Setting Up Cross-Region Replication" (page 143).
l If the destination system returns an HTTP 403 or 404 error when HyperStore tries to replicate an object
to the destination, this is treated as a permanent error. In the "Cross-Region Replication request log
(cloudian-crr-request-info.log)" (page 561) on the node that initiates the replication request, an entry
for the object replication attempt is written with status FAILED. The system does not retry replicating the
object. Examples of scenarios that could result in permanent errors like this are if the destination bucket
has been deleted, or if versioning has been disabled on the destination bucket, or if the source bucket
A permanent failure for replication of an object applies only to that object and does not impact the processing
of other objects subsequently uploaded into the same source bucket. The system will continue to replicate -- or
attempt to replicate -- other objects that subsequently get uploaded into the bucket. Those objects may also
encounter permanent errors, but the system will continue to try to replicate newly uploaded objects unless
you disable cross-region replication on the source bucket.
There is no limit on the number of replication retries for a given object or on the number of objects that are
queued for retry.
Note If an attempt to replicate an object to the destination -- either the original attempt or a retry attempt
-- results in a FAILED status in the Cross-Region Replication request log, so that there will be no further
retries, this triggers an Alert in the CMC's Alerts page. This type of alert falls within the alert rules for S3
service errors (there is not a separate alert rule category for CRR replication failures).
l If you wish you can trigger replication of the pre-existing objects in the source bucket by logging into
any HyperStore node and running this command:
curl -d "" http://localhost:80/.system/syncbucket?bucketName=string
For example:
This will replicate to the destination bucket any objects in the source bucket that have not already been
successfully replicated, including objects that were in the source bucket before you enabled cross-
region replication. (The exception is any objects that were in the bucket before you enabled versioning
on the bucket. Such objects will not be replicated. Only versioned objects will be replicated.)
l Once replication is enabled on a source bucket, all objects subsequently uploaded to the bucket will be
replicated. If at any time there are temporary replication errors for such objects, the cron job that pro-
cesses replication retries automatically triggers the syncbucket operation for that source bucket.
This results in the replication of any unreplicated versioned objects in the source bucket, including pre-
existing versioned objects. Thus for a given source bucket for which cross-region replication has been
enabled, if there are ever temporary errors in replicating objects uploaded to that source bucket, the sys-
tem's response to those errors will result in all unreplicated versioned objects in the bucket getting rep-
licated -- including any versioned objects that were in the bucket before cross-region replication was
enabled.
See also:
If you want CMC users to be able to use cross-system replication you can enable that feature. By default
cross-system replication is not enabled in the CMC.
1. Log into the Configuration Master node and in a text editor open the configuration file common.csv.
2. Set cmc_crr_external_enabled to true, then save your change and close the file.
3. Still on the Configuration Master node, use the installer to push your change out to the cluster and
restart the CMC service. If you need instructions for this step see "Pushing Configuration File Edits to
the Cluster and Restarting Services" (page 411).
Once you've made this change, CMC users configuring cross-region replication for a source bucket (as
described briefly below) will be presented with additional fields that allow for replicating to a destination bucket
in an external system.
Note Before configuring cross-region replication on a source bucket, be sure that both the source
bucket and the destination bucket have versioning enabled. This is required in order to use cross-
region replication. For information about enabling versioning on a bucket in HyperStore, while on the
CMC's Bucket Properties page (Buckets & Objects -> Buckets -> Bucket Properties) click Help.
In the CMC, bucket owners can enable and configure cross-region replication through the Buckets & Objects
page. For a given source bucket, cross-region replication is among the options that can be configured as
bucket properties. For detail, while on the CMC's Bucket Properties page click Help. Along with specifying a
destination bucket to which objects will be replicated from the source bucket, bucket owners can indicate
whether they want all newly uploaded objects to be replicated or only objects for which the full object name
starts with a particular prefix (such as /profile/images). If you've enabled CMC support for cross-system rep-
lication, bucket owners can also specify the destination system endpoint and their security credentials for the
destination system.
Also in the CMC, bucket owners can disable CRR on a source bucket for which it is currently enabled.
HyperStore's implementation of the PutBucketReplication call includes support for extension headers that can
be used for cross-system replication. Before using this API extension you should review "Cross-System Rep-
lication" (page 140) including the limitations and caveats noted in that section.
As with Amazon S3, the HyperStore implementation of the PutBucketReplication call requires that versioning
be enabled on both the source and the destination bucket before cross-region replication can be applied.
HyperStore supports the standard S3 PutBucketVersioning operation.
HyperStore also supports the GetBucketReplication method for retrieving a bucket's current CRR configuration,
and the DeleteBucketReplication method for disabling CRR on a source bucket for which it is currently
enabled. For detail see the S3 API section of the Cloudian HyperStore AWS APIs Support Reference.
HyperStore can implement WORM (Write Once Read Many) protection for stored objects by supporting the
standard AWS S3 "Object Lock" functionality. To use the Object Lock feature you must have a HyperStore
license that activates this feature. Two different types of HyperStore licensing are available for Object Lock func-
tionality:
l Certified Object Lock is appropriate for HyperStore customers who are subject to the data protection
mandates of U.S. SEC-17a or a comparable regulatory regime.
l Compatible Object Lock is appropriate for HyperStore customers who want to utilize the
WORM functionality provided by the AWS S3 Object Lock APIs, but who are not subject to the data pro-
tection mandates of U.S. SEC-17a or a comparable regulatory regime.
Note If in an earlier version of HyperStore you were licensed for Object Lock, upon upgrade to Hyper-
Store version 7.4 you are now licensed for Certified Object Lock (which behaves the same as Object
Lock from pre-7.4 versions). If you wish to change your licensed Object Lock type to Compatible Object
Lock, contact Cloudian Support.
For more information on applying Object Lock to buckets and objects in HyperStore, see "Setting Up Object
Lock" (page 147).
For a summary of protections that the S3 API provides to locked objects, see "Protections for Locked
Objects" (page 152).
For more information on deleting all objects from a locked bucket -- if your licensed Object Lock type is Com-
patible Object Lock -- see the "bucketops" section in the Cloudian HyperStore Admin API Reference.
If your licensed Object Lock type is Compatible Object Lock and you use the Admin API to purge the content of
a locked bucket, log messages regarding that purging operation are written to the Admin Service application
log (cloudian-admin.log) and the Admin Service request log (cloudian-admin-request-info.log). For more
information about these logs see "Admin Service Logs" (page 541).
Also, the system does support applying bucket lifecycle policies for auto-expiration on a source bucket that has
Object Lock enabled:
l For a lifecycle policy that specifies an expiration schedule for current versions of objects, Object Lock is
irrelevant because for any bucket with versioning this type of expiration action only results in creation of
a delete marker (and does not actually delete any object data from storage).
l For a lifecycle policy that specifies an expiration schedule for non-current object versions, when a non-current
object version reaches its expiration date the system checks for Object Lock and does not delete
the non-current object version if it is still within its lock retention period. When the non-current object ver-
sion's lock retention period ends, then the system deletes it at the next running of the auto-expiration
job.
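The retention check in the second bullet can be sketched as a time comparison. This is an illustrative sketch (the function and parameter names are our own, not HyperStore's internal code):

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def can_expire_noncurrent_version(expiry_due: datetime,
                                  retain_until: Optional[datetime],
                                  now: datetime) -> bool:
    """A non-current version is deleted only once it is past its
    lifecycle expiration date AND past any Object Lock retention."""
    if now < expiry_due:
        return False  # not yet due for expiration
    if retain_until is not None and now < retain_until:
        return False  # still locked; deleted on a later expiration run
    return True

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
locked_until = now + timedelta(days=30)
print(can_expire_noncurrent_version(now - timedelta(days=1), locked_until, now))  # False
print(can_expire_noncurrent_version(now - timedelta(days=1), None, now))          # True
```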
l Before such a user can be deleted, the user's Object Lock enabled bucket(s) must be deleted.
l Before the user's Object Lock enabled bucket(s) can be deleted, all objects in those bucket(s) must be
deleted. Depending on the type of Object Lock configuration used on those buckets and objects, it may
be that the objects cannot be deleted until after the end of their retention period. For a summary of
restrictions on deletion of "locked" objects through the S3 API, see "Protections for Locked Objects"
(page 152).
Note If your licensed Object Lock type is Compatible Object Lock, you can delete a locked
bucket and its contents through the Admin API after obtaining a temporary security token from
Cloudian Support. See the "bucketops" section of the Cloudian HyperStore Admin
API Reference.
See also:
4.5.2.1. Prerequisites
Before Object Lock can be set up on buckets and objects your HyperStore system must meet these pre-
requisites:
l You must have a HyperStore license that supports the Object Lock feature. To check whether your
license supports this feature, in the CMC's Cluster Information page see the "Object Lock License"
field: it will indicate either "Certified" (if your licensed Object Lock type is Certified Object Lock) or "Com-
patible" (if your licensed Object Lock type is Compatible Object Lock) or "Disabled" (if your license does
not support Object Lock). If you want to use the Object Lock feature but it is disabled by your current
license, contact your Cloudian representative to inquire about obtaining a different license. For a
description of how Certified Object Lock differs from Compatible Object Lock, see "Object Lock
Feature Overview" (page 144).
l If your licensed Object Lock type is Certified Object Lock, you must enable the HyperStore Shell
and disable the root account password on your HyperStore hosts. For instructions see:
o "Enabling the HSH" (page 70).
o "Disabling the root Password" (page 73)
These actions must be completed before Object Lock enabled buckets can be created. If your licensed
Object Lock type is Certified Object Lock and you have not yet enabled the HyperStore Shell and dis-
abled the root account password:
o A warning will display in the "Object Lock License" field in the CMC's Cluster Information page
o WARNING level log messages will be written to the Admin Service log cloudian-admin.log
o Any attempts to create an Object Lock enabled bucket through the CMC or directly through the
S3 API will fail and return an error response.
Note If your licensed Object Lock type is Compatible Object Lock, the requirement to enable
the HyperStore Shell and disable the root account password does not apply.
The sections that follow provide more information about each of these steps.
Note Object Lock can only be enabled on a new bucket, as the bucket is created. Object Lock cannot
be enabled on an already existing bucket.
l With a third party S3 client application, the client application submits a CreateBucket request that
includes the request header x-amz-object-lock-enabled: true. For HyperStore support of the S3
CreateBucket operation, see the S3 section of the Cloudian HyperStore AWS APIs Support Refer-
ence.
l With the CMC, as a user creates a new bucket in the CMC's Add Bucket interface she can select a
checkbox to enable Object Lock for the bucket. For more information, while on the CMC's Buckets
page (Buckets & Objects -> Buckets) click Help.
Note that:
l Enabling Object Lock on a new bucket (either through a third party S3 client or through the CMC) auto-
matically enables Versioning on that bucket as well. Object Lock can only be used in combination with
Versioning (which protects objects from being overwritten).
l If you create a bucket with Object Lock enabled, you cannot disable Object Lock or suspend ver-
sioning for the bucket (even if you never apply a default Object Lock configuration to the bucket).
l Enabling Object Lock on a new bucket does not by itself have the effect of locking objects that are
subsequently uploaded into that bucket. It only makes it possible to lock such objects, using the meth-
ods described below.
4.5.2.2.2. Setting a Default Object Lock Configuration for the Bucket (Optional)
Once a new bucket has been created with Object Lock enabled, a default Object Lock configuration can option-
ally be set on the bucket. This Object Lock configuration will then by default be applied to all objects that are
subsequently added to the bucket (it does not apply to any objects that were already in the bucket at the time
that the default Object Lock configuration was set for the bucket). The default configuration can be overridden
on a per-object basis, as described in "Setting Object Lock Attributes on Individual Objects (S3 API or
CMC)" (page 150). If a default Object Lock configuration is not set on the bucket, then objects uploaded to the
bucket will not be locked unless they have Object Lock attributes set on them on a per-object basis.
l With a third party S3 client application, the bucket owner (or a user with s3:PutBuck-
etObjectLockConfiguration permission on the bucket) submits a PutObjectLockConfiguration request
to the HyperStore S3 Service. For HyperStore support of this S3 API operation see the S3 section of the
Cloudian HyperStore AWS APIs Support Reference. Along with specifying a retention period in days
or years, the request specifies whether the bucket's default Object Lock implementation is to be in
Governance mode (which allows users who have the applicable permissions to change the object
retention period or delete objects through the S3 API before their retention period completes) or Com-
pliance mode (which does not allow any user to change the object retention period or delete objects
through the S3 API before their retention period completes).
l With the CMC, the bucket owner can use the Object Lock tab of the Bucket Properties interface to set
a default Object Lock configuration on the bucket. For more information, while on the CMC's Bucket
Properties page (Buckets & Objects -> Buckets -> Properties) click Help.
With a default Object Lock configuration set on the bucket, from that point forward whenever an object is
uploaded to the bucket -- and lacks object-specific lock attributes -- the default retention period is applied
to that object. For example, with a 90-day default retention period, every object is locked for 90 days starting
from the time of object upload. For objects with multiple versions, in this example each object version is locked
for 90 days from time of object version upload.
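Applying a bucket default retention period to a newly uploaded object version reduces to simple date arithmetic. The sketch below is illustrative (the function name is hypothetical, and treating a year as 365 days is an assumption of this sketch):

```python
from datetime import datetime, timedelta, timezone

def default_retain_until(upload_time, days=None, years=None):
    """Compute the retain-until date implied by a bucket's default Object
    Lock configuration (a retention period in days or in years) for a new
    object version uploaded at upload_time."""
    if days is not None:
        return upload_time + timedelta(days=days)
    # Assumption for this sketch: a retention "year" counts as 365 days.
    return upload_time + timedelta(days=365 * years)


uploaded = datetime(2023, 1, 1, tzinfo=timezone.utc)
# A 90-day default locks each new object version for 90 days from upload.
print(default_retain_until(uploaded, days=90).date())  # 2023-04-01
```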
When an object version is locked, HyperStore rejects an S3 DeleteObject request for that object version with
an HTTP 403 Forbidden response. The exception is if Governance mode is being used and the requesting
user is a user who has the applicable permissions; for more information see "Protections for Locked
Objects" (page 152).
A third party S3 client application can change or remove the default Object Lock configuration on a bucket
by submitting another PutObjectLockConfiguration request; or in the CMC a bucket's default Object Lock con-
figuration can be changed or removed through the Bucket Properties interface. However, the change applies
only from that time forward, to objects that are subsequently added to the bucket. Locks that are already in
place on existing objects -- in accordance with the prior default configuration on the bucket -- are not impacted.
Note
• Object Lock protects each version of an object individually and does not prevent the creation of new
versions of the object.
• For a locked object, an S3 DeleteObject request that does not specify an object version is allowed
and only results in the creation of a delete marker for that object. No object data is actually deleted from
storage.
• Once a bucket is assigned a default lock configuration, any S3 requests for uploading objects to that
bucket must use Signature Version 4 request authentication. This is consistent with Amazon's Object
Lock requirements.
l With a third party S3 client application, the bucket owner (or a user with s3:GetBuck-
etObjectLockConfiguration permission on the bucket) can submit a GetObjectLockConfiguration
request to the HyperStore S3 Service. For HyperStore support of this S3 operation see the S3 section of
the Cloudian HyperStore AWS APIs Support Reference. The request response will indicate whether
Object Lock is enabled on the bucket, and what the default Object Lock configuration is for the bucket (if
any).
l With the CMC, the bucket owner can view the list of buckets she owns, and any buckets for which
Object Lock is enabled will have a padlock icon beside the bucket name. The bucket owner can then
view the Bucket Properties interface for the bucket and check the Object Lock tab to see what the
default Object Lock configuration is for the bucket (if any).
4.5.2.3. Setting Object Lock Attributes on Individual Objects (S3 API or CMC)
In a bucket that has Object Lock enabled, there is the option to set Object Lock attributes on individual objects.
If the bucket has a default Object Lock configuration, then setting Object Lock attributes on individual objects
will -- for those objects only -- override the bucket's default Object Lock configuration. If the bucket does not
have a default Object Lock configuration, then setting Object Lock attributes on individual objects is the only
way that objects will become locked.
With a third party S3 application, HyperStore users with appropriate permissions can:
l Set Object Lock attributes on objects as they are uploaded to the bucket.
l Set Object Lock attributes on existing objects that are already in the bucket.
l Check an object's lock status.
Note The CMC does not currently support setting Object Lock attributes on objects as they are
uploaded to the bucket. Instead the user can upload the object to the bucket, and then set
Object Lock attributes on the object.
The sections that follow provide more information about these options.
With a third party S3 application, objects being uploaded to an Object Lock enabled bucket can be assigned
Object Lock attributes by the inclusion of the request headers x-amz-object-lock-mode, x-amz-object-lock-
retain-until-date, and/or x-amz-object-lock-legal-hold with any of the standard S3 API requests for uploading
objects:
l PutObject
l CopyObject
l POST Object
l CreateMultipartUpload
For HyperStore support of these S3 operations see the S3 section of the Cloudian HyperStore AWS APIs Sup-
port Reference.
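As a sketch of what those upload-time headers look like in practice, the helper below builds them from typed values. The header names are the standard S3 ones quoted above; the helper function itself is illustrative, not part of any SDK:

```python
from datetime import datetime, timezone

def object_lock_headers(mode=None, retain_until=None, legal_hold=None):
    """Build the optional Object Lock request headers for a PutObject-style
    upload. Any combination of the three may be supplied."""
    headers = {}
    if mode is not None:                      # "GOVERNANCE" or "COMPLIANCE"
        headers["x-amz-object-lock-mode"] = mode
    if retain_until is not None:              # ISO 8601 timestamp
        headers["x-amz-object-lock-retain-until-date"] = \
            retain_until.strftime("%Y-%m-%dT%H:%M:%SZ")
    if legal_hold is not None:                # "ON" or "OFF"
        headers["x-amz-object-lock-legal-hold"] = legal_hold
    return headers


print(object_lock_headers(mode="GOVERNANCE",
                          retain_until=datetime(2024, 6, 30, tzinfo=timezone.utc)))
```

Remember that any upload request carrying these headers must be authenticated with Signature Version 4, as noted below.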
Similarly to the bucket default configuration options, the individual Object Lock configuration options include
the choice between Governance retention mode and Compliance retention mode (as set by the x-amz-object-
lock-mode request header).
Unlike the bucket default configuration options, the individual object configuration attributes also include an
option for Legal Hold (as optionally set by the x-amz-object-lock-legal-hold request header). Legal Hold pre-
vents the object from being deleted by any user through the S3 API, for an indefinite period of time until the
Legal Hold is explicitly removed from the object (by a user who has the applicable permissions). Legal Hold
can be used instead of having a defined retention date for the object, or in combination with having a defined
retention date for the object. For example, if both a Governance retention date and a Legal Hold are set on an
object, and then the Legal Hold is removed before the retention date, the object will continue to be protected by
Governance mode retention until its retention date is reached. For a second example, if both a Compliance
retention date and a Legal Hold are placed on an object, and the Compliance retention date is reached while
the Legal Hold is still in place, the object continues to be protected by the Legal Hold until the Legal Hold is
explicitly removed.
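The interplay in the two examples above comes down to one rule: an object version stays protected while either control is still in effect. A minimal sketch (the function name is illustrative):

```python
from datetime import datetime, timezone

def is_protected(now, retain_until=None, legal_hold=False):
    """An object version is protected from deletion while its retention
    period has not yet expired OR a legal hold is in place."""
    under_retention = retain_until is not None and now < retain_until
    return under_retention or legal_hold


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
future = datetime(2024, 6, 1, tzinfo=timezone.utc)
# Legal hold removed before the retention date: still protected by retention.
print(is_protected(now, retain_until=future, legal_hold=False))  # True
# Retention date reached but legal hold still in place: still protected.
print(is_protected(future, retain_until=now, legal_hold=True))   # True
```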
Note Setting retention attributes on an object as the object is uploaded can be done by the bucket
owner, or by a user who has s3:PutObject and s3:PutObjectRetention permissions on the bucket. Set-
ting a legal hold on an object as the object is uploaded can be done by the bucket owner or by a user
who has s3:PutObject and s3:PutObjectLegalHold permissions on the bucket.
Note Object upload requests that include Object Lock headers must use Signature Version 4 request
authentication. This is consistent with Amazon's Object Lock requirements.
l With a third party S3 application, existing objects in an Object Lock enabled bucket can be assigned
Object Lock attributes by using the standard S3 requests PutObjectRetention and/or PutObjectLegalHold. For HyperStore support of these S3 operations see the S3 section of the Cloudian
HyperStore AWS APIs Support Reference.
Note Setting retention attributes on an existing object can be done by the bucket owner or by a
user who has s3:PutObjectRetention permission on the object. Setting or removing a legal hold
on an existing object can be done by the bucket owner or by a user who has s3:PutOb-
jectLegalHold permission on the object.
For an existing object that already has retention attributes, the PutObjectRetention request can be used
to increase the existing retention period on the object but not to reduce the existing retention period
(unless Governance mode retention is being used and the requesting user is a user who has the applic-
able permissions, as described in "Protections for Locked Objects" (page 152)).
l With the CMC, the bucket owner can assign Object Lock attributes to an existing object in the bucket by
accessing the object's Object Properties interface and using the Object Lock tab. For more information,
while logged into the CMC's Bucket Properties page (Buckets & Objects -> Buckets -> Properties)
click Help.
l With a third party S3 application, checking an object's lock status can be done by using the standard
S3 requests GetObject, HeadObject, GetObjectRetention, and/or GetObjectLegalHold. For Hyper-
Store support of these S3 operations see the S3 section of the Cloudian HyperStore AWS APIs Sup-
port Reference.
An object's lock status will reflect the Object Lock attributes that were set directly on that individual
object (if any), or otherwise will reflect the object's inheriting of the bucket's default Object Lock con-
figuration (if any).
Note Checking an object's lock status can be done by the bucket owner or by a user who has
s3:GetObject, s3:GetObjectVersion, s3:GetObjectRetention (for retention status), and s3:GetOb-
jectLegalHold (for legal hold status) permissions on the object.
l With the CMC, the bucket owner can view the list of objects in her bucket, and locked objects will have
a padlock icon beside the version ID of each version of the object. The bucket owner can then view an
object version's Object Properties interface and check the Object Lock tab to see what the lock attrib-
utes are for that object version.
Delete a locked object version?
l Governance Mode Retention: The bucket owner and users who have been granted both s3:DeleteObjectVersion and s3:BypassGovernanceRetention permission can use the DeleteObject request with an x-amz-bypass-governance-retention: true request header to delete a locked object version.
l Compliance Mode Retention: No user can delete a locked object version.
l Legal Hold: No user can delete a locked object version.
Remove the lock on an object version?
l Governance Mode Retention: The bucket owner and users who have been granted both s3:PutObjectRetention and s3:BypassGovernanceRetention permission can use the PutObjectRetention request with an x-amz-bypass-governance-retention: true request header to remove the retention lock on an object version.
l Compliance Mode Retention: No user can remove the retention lock on an object version.
l Legal Hold: The bucket owner and users who have been granted s3:PutObjectLegalHold permission can use the PutObjectLegalHold request to remove the legal hold on an object version.
Reduce the retention period for an object version?
l Governance Mode Retention: The bucket owner and users who have been granted both s3:PutObjectRetention and s3:BypassGovernanceRetention permission can use the PutObjectRetention request with an x-amz-bypass-governance-retention: true request header to reduce the retention period for a locked object version.
l Compliance Mode Retention: No user can reduce the retention period for an object version.
l Legal Hold: The bucket owner and users who have been granted s3:PutObjectLegalHold permission can use the PutObjectLegalHold request to remove the legal hold on an object version.
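The Governance-mode bypass path in the table can be sketched as a permission check. The permission names are the S3 ones shown above; the function itself is an illustration, not HyperStore code:

```python
def can_delete_locked_version(mode, permissions, bypass_header=False):
    """Decide whether a DeleteObject on a locked object version succeeds,
    per the table above. `permissions` is the set of S3 permissions held by
    the caller (the bucket owner implicitly holds them all)."""
    if mode == "COMPLIANCE":
        return False  # no user can delete a Compliance-locked version
    # GOVERNANCE: needs both permissions plus the explicit bypass header.
    return (bypass_header
            and "s3:DeleteObjectVersion" in permissions
            and "s3:BypassGovernanceRetention" in permissions)


perms = {"s3:DeleteObjectVersion", "s3:BypassGovernanceRetention"}
print(can_delete_locked_version("GOVERNANCE", perms, bypass_header=True))   # True
print(can_delete_locked_version("COMPLIANCE", perms, bypass_header=True))   # False
```

Note that holding the permissions is not enough by itself: the x-amz-bypass-governance-retention: true header must also be present on the request.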
Note In the CMC, no users other than the bucket owner can access a bucket or the objects that it con-
tains. So in the table above, the statements about users who have been granted various permissions
relating to locked objects apply only to users who are using third party S3 client applications -- not the
CMC.
4.5.3.1. Using the Admin API to Purge All Objects from a Locked Bucket
If your licensed Object Lock type is Compatible Object Lock, you can use a HyperStore Admin API call to
purge all of the objects in a locked bucket. Note that:
l This operation deletes all objects from the bucket, regardless of whether the objects are protected by
Governance mode, Compliance mode, or Legal Hold.
l There is no option to selectively delete objects from the bucket -- the operation only supports deleting
all versions of all objects.
l The operation allows you (optionally) to delete the bucket itself, as well as all of the objects in it.
l The operation requires that you obtain a temporary security token from Cloudian Support.
l The operation is not allowed if your licensed Object Lock type is Certified Object Lock.
For more information see the "bucketops" section in the Cloudian HyperStore Admin API Reference.
Like Amazon S3, HyperStore allows for rich metadata to be associated with each stored object. There are three
categories of object metadata:
l The system itself assigns certain metadata items to objects, having to do with object attributes and
status
l Optionally, users can assign metadata to objects by using the S3 x-amz-meta-* request headers
l Optionally, users can assign key-value "tags" to objects by using the S3 API methods and headers that
implement object tagging
If you wish, you can make object metadata searchable by setting up an Elasticsearch cluster and then having
HyperStore submit object metadata to Elasticsearch on an ongoing basis. (See "Enabling Elasticsearch Integ-
ration for Metadata Search" (page 157).)
l Creation time
l Last modified time
l Last accessed time
l Size
l ACL information
l Version, if applicable
l Public URL, if applicable
l Compression type, if applicable
l Encryption key, if applicable
l Auto-tiering state, if applicable
By default the maximum size limit for x-amz-meta-* based metadata is 2KB per object. It is possible to increase
this size limit by using hidden configuration settings, but to do so could have significant implications for the stor-
age space requirements on the SSDs on which you store the Metadata DB (in Cassandra). If you are interested
in increasing the size limit for x-amz-meta-* based object metadata, consult with Cloudian Support.
Object tags share some common use cases with x-amz-meta-* based object metadata -- for example, you
could use either a x-amz-meta-* header or an object tag to identify an object as belonging to a particular pro-
ject. But object tags support a wider range of capabilities, including being able to assign tags to an object that's
already in storage, and being able to use object tags as criteria in access control policies.
An object may have a maximum of 10 tags associated with it, and each tag must have a unique key (for
example you can't assign status:in-progress and status:complete to the same object). Each key can be a max-
imum of 128 unicode characters in length, and each value can be up to 256 unicode characters.
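Those limits can be checked client-side before submitting a tagging request. A sketch (the function is illustrative; the limits are the ones stated above, and using a dict naturally enforces key uniqueness):

```python
def validate_object_tags(tags):
    """Validate a dict of object tags against the limits described above:
    at most 10 tags per object, each key at most 128 characters, each
    value at most 256 characters."""
    if len(tags) > 10:
        raise ValueError("an object may have at most 10 tags")
    for key, value in tags.items():
        if len(key) > 128:
            raise ValueError("tag key exceeds 128 characters")
        if len(value) > 256:
            raise ValueError("tag value exceeds 256 characters for key " + key)
    return True


print(validate_object_tags({"project": "SEO-2018", "status": "Needs-Review"}))  # True
```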
When the auto-tiering feature transitions objects from a HyperStore bucket to an external destination system,
user-defined metadata in the form of x-amz-meta-* headers and object tags is sent along with the object data if
the tiering destination is an S3-compliant system such as Amazon Web Services or Google Cloud Storage.
Also, the user-defined object metadata is retained in the local HyperStore system as well.
If the tiering destination is not S3-compliant, such as Microsoft Azure or Spectra Pearl Logic, then the user-
defined object metadata is not sent along with the object data. The user-defined object metadata will exist only
in the HyperStore system, even after the corresponding object data is transitioned to the destination system.
Note Even if the destination system is S3-compliant, the auto-tiering feature does not send certain
types of object metadata such as ACL data (data regarding access permissions on objects).
See also:
Note In the current version of HyperStore, the CMC’s built-in S3 client does not support creating user-
defined object metadata. To create user-defined object metadata you will need to use an S3 client
application other than the CMC.
When uploading a new object, S3 clients can create user-defined object metadata by including one or more x-
amz-meta-* request headers along with the PutObject request. For example, x-amz-meta-topic: merger or x-
amz-meta-status: draft.
The S3 API method POST Object — for uploading objects via HTML forms — also allows for the specification of
user-defined metadata (through x-amz-meta-* form fields), and HyperStore supports this method as well.
For HyperStore support of these S3 API operations, see the S3 section of the Cloudian HyperStore AWS APIs
Support Reference.
When uploading a new object, S3 clients can create one or more user-defined object tags by including a single
x-amz-tagging request header along with the PutObject request. For example, x-amz-tagging: team=Marketing&project=SEO-2018&status=Needs-Review.
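The x-amz-tagging header value is URL-query-encoded, so it can be built and parsed with standard library helpers. A sketch (the tag values are example data):

```python
from urllib.parse import urlencode, parse_qsl

tags = {"team": "Marketing", "project": "SEO-2018", "status": "Needs-Review"}

# Build the header value; special characters in keys or values are
# percent-encoded automatically.
header_value = urlencode(tags)
print(header_value)  # team=Marketing&project=SEO-2018&status=Needs-Review

# The receiving side can recover the tags from the same encoding.
print(dict(parse_qsl(header_value)) == tags)  # True
```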
S3 clients can also create object tags when calling the POST Object method, by using the tagging form field.
The CopyObject method supports copying or replacing an object's tags as it is copied, by making use of the x-
amz-tagging-directive and x-amz-tagging request headers.
S3 clients can also create object tags for an object that is already stored in the HyperStore system, by using the
S3 API method PutObjectTagging.
For a high-level view of object tagging usage and methods, in the Amazon S3 online documentation see
Object Tagging.
For HyperStore support of these S3 API operations, see the S3 section of the Cloudian HyperStore AWS APIs
Support Reference.
Note In the current version of HyperStore, the CMC’s built-in S3 client does not support retrieving user-
defined object metadata. To retrieve user-defined object metadata you will need to use an S3 client
application other than the CMC.
S3 client applications can retrieve user-defined x-amz-meta-* based object metadata -- as well as system-
defined object metadata -- by using either of two standard S3 API methods, both of which the HyperStore sys-
tem supports:
l GetObject returns this type of object metadata as response headers, as well as returning the object
itself.
l HeadObject returns this type of object metadata as response headers, without returning the object itself.
For HyperStore support of these S3 API operations, see the S3 section of the Cloudian HyperStore AWS APIs
Support Reference.
The S3 methods GetObject and HeadObject will not return object tags. They will however return a count of the
object tags associated with the object (if any), in an x-amz-tagging-count response header.
To retrieve the object tags associated with an object, use the S3 API method GetObjectTagging.
For a high-level view of object tagging usage and methods, in the Amazon S3 online documentation see
Object Tagging.
For HyperStore support of these S3 API operations, see the S3 section of the Cloudian HyperStore AWS APIs
Support Reference.
Note To work with HyperStore your Elasticsearch cluster must be version 7.7.x (for example, 7.7.1).
For availability and performance Cloudian recommends that you have at least three nodes in your
Elasticsearch cluster.
After you configure the HyperStore system to integrate with an Elasticsearch cluster, you can then enable
object metadata search on a per storage policy basis (both of these tasks are described in "Enabling Elastic-
search Integration for Metadata Search" (page 157)). After you've enabled object metadata search for a stor-
age policy, the following will happen:
l For each object that subsequently gets uploaded into a HyperStore bucket that uses that storage policy,
HyperStore will retain the object metadata locally (as it always does) and will also transmit a copy of the
object metadata to your Elasticsearch cluster (using HyperStore's built-in Elasticsearch REST client).
o This includes HyperStore system-defined object metadata and user-defined metadata (for more
information on these types of metadata see "Object Metadata Feature Overview" (page 153) ).
It does not include object tags.
o HyperStore will create in your Elasticsearch cluster an index for each bucket that uses a
metadata search enabled storage policy. The indexes created in Elasticsearch are named as fol-
lows (using lower case only):
cloudian-<cloudianclustername>-<regionname>-<datacentername>-<bucketname>-<bucketcreationdatetime>
For example:
cloudian-cloudianregion1-region1-dc1-bucket2-2020-10-19t18:25:56.955z
Each index will be created in Elasticsearch with three shards and two replicas.
o For each object uploaded to HyperStore a "document" containing the object metadata will be
created in Elasticsearch. Each such document will have a name (ID) in this format:
<bucketname>/<objectname>
In the case of versioned objects there will be a separate Elasticsearch document for each object
version, named as:
<bucketname>/<objectname>\u0001u-<versionId>
The document names will be URL-encoded before transmission to the ES cluster and then URL-
decoded upon retrieval from the ES cluster.
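The naming scheme above can be reproduced with standard library calls. This is an illustrative sketch (the function names are hypothetical; the patterns, lower-casing, \u0001u- version separator, and URL-encoding are as described above):

```python
from urllib.parse import quote, unquote

def index_name(cluster, region, datacenter, bucket, created):
    """Elasticsearch index name for a bucket, lower-cased as described."""
    return f"cloudian-{cluster}-{region}-{datacenter}-{bucket}-{created}".lower()

def document_id(bucket, obj, version_id=None):
    """Document name for an object (or object version), URL-encoded for
    transmission to the Elasticsearch cluster."""
    name = f"{bucket}/{obj}"
    if version_id is not None:
        name += f"\u0001u-{version_id}"
    return quote(name, safe="")


print(index_name("CloudianRegion1", "region1", "dc1", "bucket2",
                 "2020-10-19t18:25:56.955z"))
doc = document_id("buser1", "abc")
print(doc)           # buser1%2Fabc
print(unquote(doc))  # buser1/abc
```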
Note Enabling object metadata search on a storage policy results in HyperStore copying into
Elasticsearch any object metadata associated with objects uploaded from that time forward. If
you also want to load into Elasticsearch the object metadata associated with objects that are
already in your HyperStore system, you can use the Elasticsearch synchronization tool that
comes with HyperStore -- as described in "Enabling Elasticsearch Integration for Metadata
Search" (page 157).
l When an object is updated (overwritten by a new S3 upload operation) in HyperStore, its metadata will
be updated in Elasticsearch; and when an object or object version is deleted in HyperStore (by an S3
delete operation or by execution of a bucket lifecycle policy), its metadata will be deleted in Elastic-
search.
l If objects that have metadata in Elasticsearch get auto-tiered from the local HyperStore system to a
remote destination, their metadata will remain in Elasticsearch.
l If a bucket is deleted from HyperStore, its corresponding index will be deleted in Elasticsearch.
l All insertions, updates, and deletes of HyperStore object metadata in your Elasticsearch cluster are
implemented by an hourly cron job in HyperStore. Note that this means that object metadata asso-
ciated with a new S3 upload will not immediately appear in Elasticsearch.
l If you enable metadata search for a storage policy, and then at a later time disable metadata search on
that storage policy, any object metadata that had been copied to Elasticsearch during the period when
metadata search was enabled will remain in Elasticsearch (it will not be deleted by HyperStore).
Note that when you enable Elasticsearch integration, HyperStore only writes to your Elasticsearch cluster. It
does not read from the cluster, and you cannot use HyperStore to search through or retrieve object
metadata in Elasticsearch. For that you must use Elastic Stack applications, such as Kibana.
SSL can be enabled only when X-Pack is enabled. To use SSL, in your Elasticsearch cluster you will first need to create a new keystore by running the following
command:
Then copy the truststore.jks file to each HyperStore node, in the directory path spe-
cified by mts.properties.erb: cloudian.elasticsearch.ssl.truststore.path (the default
setting for the path is /opt/cloudian/conf/certs/truststore.jks).
2. On the Configuration Master node, use the installer to push your configuration changes out to the
cluster, restart the S3 Service, and restart the CMC. For more detail see "Pushing Configuration File
Edits to the Cluster and Restarting Services" (page 411).
3. Log into the CMC and go to the Storage Policies page (Cluster -> Storage Policies). Now when you
either create a new storage policy or edit an existing storage policy, the interface will display an
"Enable metadata search" option. For each storage policy for which you select the "Enable metadata
search" option -- and Save your change -- object metadata from buckets that use that storage policy will
be sent to the Elasticsearch cluster starting from that time forward (not retroactively).
l All buckets for which Elasticsearch integration is enabled (that is, all buckets that use Elasticsearch-
enabled storage policies).
l A single bucket for which Elasticsearch integration is enabled (a bucket that uses an Elasticsearch-
enabled storage policy).
l A single object in a bucket for which Elasticsearch integration is enabled.
The primary use case for the elasticsearchSync tool is to populate Elasticsearch with object metadata that
was already in your HyperStore system before you enabled Elasticsearch integration. For example, sup-
pose you enable Elasticsearch integration for one or more of your existing storage policies that are already
being used by existing buckets. HyperStore will automatically copy into Elasticsearch the metadata associated
with objects that get uploaded into those buckets from that point forward. But if you want to copy into Elastic-
search all the metadata associated with objects that are currently in those buckets -- including objects that
were already in the buckets before Elasticsearch integration was enabled -- you need to use the elasticsearchSync tool.
Another use case for the elasticsearchSync tool is if your Elasticsearch cluster has been down or unreachable
for more than a few hours. HyperStore uses an hourly cron job to apply needed updates to object metadata in
Elasticsearch. Elasticsearch update requests that fail during one run of the cron job are retried during the next
run of the cron job. But if your Elasticsearch cluster is unreachable for more than a few hours, the Metadata DB
based queue for unprocessed Elasticsearch update requests can get filled up. Consequently if the Elastic-
search cluster has been unreachable for more than a few hours you should use the elasticsearchSync tool
when Elasticsearch comes back online. Using the elasticsearchSync tool for all Elasticsearch-enabled buckets
will bring the Elasticsearch cluster up to date with HyperStore object uploads and deletions that occurred while
Elasticsearch was unavailable.
The elasticsearchSync tool is located on each of your HyperStore nodes, under the /opt/cloudian/bin directory.
You can run the tool from any HyperStore node (and the operation will apply to the cluster as a whole, not just
the node on which you're running the tool). Once you change into the /opt/cloudian/bin directory, the tool syn-
tax is as follows:
# ./elasticsearchSync all
The same points apply to the other command options described below.
Note Along with performing your specified action, the running of the elasticsearchSync tool always trig-
gers the processing of any Elasticsearch update jobs that are currently in the retry queue.
Note The elasticsearchSync tool -- for transmitting metadata associated with any objects already in
the system -- only works with Elasticsearch as the destination. It does not work with an HTTP Server as
the destination.
2. On the Configuration Master node, use the installer to push your configuration changes out to the
cluster, restart the S3 Service, and restart the CMC. For more detail see "Pushing Configuration File
Edits to the Cluster and Restarting Services" (page 411).
3. Log into the CMC and go to the Storage Policies page (Cluster -> Storage Policies). Now when you
either create a new storage policy or edit an existing storage policy, the interface will display an
"Enable metadata search" option. For each storage policy for which you select the "Enable metadata
search" option -- and Save your change -- object metadata from buckets that use that storage policy will
be sent to the HTTP server from that time forward (not retroactively).
Here is an example of a POST request by which HyperStore submits object metadata to an HTTP server. In this
example the request body contains the metadata associated with an S3 PUT Object operation that was pro-
cessed in HyperStore.
POST / HTTP/1.1
Content-type: application/json
Content-Length: 209
Host: 10.20.2.64:9000
Connection: Keep-Alive
User-Agent: Apache-HttpClient/4.5.2 (Java/1.8.0_172)
Accept-Encoding: gzip,deflate

{"id":"buser1%2Fabc","bucketName":"buser1","objectName":"abc",
"lmt":"2019-06-26T08:23:16.224Z","objectSize":1,
"contentType":"application/octet-stream","esTime":"Wed Jun 26 16:23:16 CST 2019",
"userMetadata":{}
}
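A receiving HTTP server can parse that request body with any standard JSON library; note that the id field arrives URL-encoded. An illustrative sketch using the example body above:

```python
import json
from urllib.parse import unquote

# The JSON body from the example POST request above.
body = '''{"id":"buser1%2Fabc","bucketName":"buser1","objectName":"abc",
"lmt":"2019-06-26T08:23:16.224Z","objectSize":1,
"contentType":"application/octet-stream","esTime":"Wed Jun 26 16:23:16 CST 2019",
"userMetadata":{}}'''

record = json.loads(body)
print(unquote(record["id"]))  # buser1/abc  (id is URL-encoded on the wire)
print(record["bucketName"], record["objectName"], record["objectSize"])
```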
The Cloudian HyperStore system supports user-level and group-level Quality of Service (QoS) settings:
l User QoS settings place upper limits on service usage by individual users.
l Group QoS settings place upper limits on aggregate service usage by entire user groups.
The HyperStore system enforces QoS settings by rejecting S3 requests that would result in a user (or a user’s
group) exceeding the allowed service usage level.
Note You can set system-wide defaults for group QoS limits and individual user QoS limits before pro-
visioning any groups or users, if you wish. After provisioning groups and users you can set QoS limits
specific to those groups and users, which would override the system-wide defaults.
When configuring QoS controls, you have the option of limiting some of the usage types above while leaving
others unrestricted. For example, you could limit per-user and/or per-group storage volume (by KBs), while pla-
cing no restrictions on number of stored objects. Similarly, you could cap data upload rate while placing no cap
on data download rate.
When the system rejects a user request because of a storage quota, it returns an HTTP 403 response to the cli-
ent application. When the system rejects a user request due to rate controls, it returns an HTTP 503 response
to the client application.
For HTTP request rate and for upload and download rates, the system also supports a configurable warning
level -- which you can set to a lower threshold of usage than the threshold at which requests will be rejected. If
a user's request results in the warning threshold being exceeded, the request will succeed but the system will
log an INFO level message to the S3 Service application log. (Note that the system does not inform the user
that the warning threshold has been exceeded -- it only writes the aforementioned log message.)
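The warning-versus-rejection behavior for rate limits can be sketched as a simple threshold check (illustrative only; the threshold values here are made-up examples):

```python
def check_rate(usage, warn_at, reject_at):
    """Evaluate a per-minute usage figure against a QoS warning threshold
    and a rejection threshold (warn_at is set below reject_at)."""
    if usage > reject_at:
        return 503      # request rejected: HTTP 503 returned to the client
    if usage > warn_at:
        return "warn"   # request succeeds; an INFO message is logged
    return "ok"


print(check_rate(80, warn_at=100, reject_at=200))    # ok
print(check_rate(150, warn_at=100, reject_at=200))   # warn
print(check_rate(250, warn_at=100, reject_at=200))   # 503
```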
Note The storage overhead associated with replication or erasure coding does not count toward a
user’s storage quota. For example, a 1MiB object that is protected by 3X replication or by 4+2 erasure
coding counts as only 1MiB toward the storage quota.
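To illustrate the arithmetic in the note above: the quota charge is the logical object size, while the raw footprint depends on the protection scheme. A sketch (the function name is illustrative):

```python
def raw_footprint(size_bytes, replicas=None, ec_k=None, ec_m=None):
    """Raw bytes consumed by one object under N-way replication or k+m
    erasure coding. The QoS quota charge is always just size_bytes."""
    if replicas is not None:
        return size_bytes * replicas
    return size_bytes * (ec_k + ec_m) / ec_k


one_mib = 1024 * 1024
print(raw_footprint(one_mib, replicas=3) / one_mib)      # 3.0 MiB on disk
print(raw_footprint(one_mib, ec_k=4, ec_m=2) / one_mib)  # 1.5 MiB on disk
# Either way, only 1 MiB counts toward the user's storage quota.
```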
Note For information on how auto-tiering impacts the implementation of QoS controls, see "How Auto-
Tiering Impacts Usage Tracking, QoS, and Billing" (page 130).
l For the system as a whole, you can configure a system default for user QoS settings, applicable to all
users of your S3 service.
l For particular groups, you can configure a group-specific default for user QoS settings. If you do, then
for users in that group, the group-specific user QoS defaults will override the system-wide user QoS
defaults.
l For particular users, you can configure user-specific QoS settings. If you do, these settings will over-
ride any group-wide or system-wide defaults.
l For the system as a whole, you can configure a system default for group QoS settings, applicable to
all groups in your S3 service.
l For particular groups, you can configure group-specific group QoS settings. If you do, these settings
will override the system-wide defaults.
See also:
Note that in enabling QoS functionality as described below, you are merely "turning on" the S3 Service mech-
anisms that enforce whatever QoS restrictions you establish for users and groups. The creation of specific QoS
restrictions is a separate administrative task as described in "Setting QoS Limits for Users" (page 164) and
"Setting QoS Limits for Groups" (page 164).
To enable HyperStore QoS enforcement, in the CMC go to the Configuration Settings page and open the
Quality of Service panel. Then:
l To enable enforcement of storage quotas only (number of bytes and/or number of objects), set just the
"QoS Limits" setting to "enabled".
l To enable enforcement of storage quotas and also traffic rates (number of HTTP requests per minute,
or bytes uploaded or downloaded per minute), set both the "QoS Limits" setting and the "QoS Rate Lim-
its" setting to "enabled".
After you Save, your configuration changes are applied to the system dynamically — no service restart is
required.
Note Enforcing QoS for traffic rates but not for stored bytes and objects is not supported at the system
configuration level. If you want to use QoS in this way, set both "QoS Limits" and "QoS Rate Limits" to
enabled, then when you’re configuring QoS limits for groups and users set the stored bytes and objects
controls to unlimited and the rate controls to your desired levels.
Chapter 4. Setting Up Service Features
You can set group QoS settings through either the CMC or the Admin API. Group QoS settings limit the aggreg-
ate activity of all users in a group.
Next, you can set group QoS limits for a specific group. If you do so, these limits will override any system
defaults for group QoS. To do this, first retrieve the group in the Manage Groups page. Then click Group QoS
for the group. This opens the Group QoS Limits: Overrides panel, where you can configure group QoS limits
for that specific group.
Note For details about working with the CMC’s QoS configuration panels, while in the "Users &
Groups" section of the CMC click Help.
The GET /qos/limits and DELETE /qos/limits methods can be used to retrieve or delete group QoS settings, respectively.
For information about these Admin API calls see the "qos" section of the Cloudian HyperStore Admin
API Reference.
As with group QoS settings, you can set user QoS limits through either the CMC or the Admin API.
4.8. Setting Up Usage Reporting
First, click User QoS Default to open the User QoS Limits: Defaults panel, where you can configure default user QoS
limits for your system.
Next, decide whether you want to set default QoS limits for all users who belong to a particular user group. If
you do so, this will override the system-wide default, for users within that group. To do this, first retrieve the
group in the Manage Groups page. Then click User QoS Group Default for the group. This opens the User QoS
Limits: Group Defaults panel, where you can configure default user QoS limits for the group.
Finally, you also have the option of setting QoS limits for a specific individual user. If you do so, these limits will
override any system or group default limits, for that user. To do this, first retrieve the user in the Manage Users
page, then click Set QoS for the user. This opens the User QoS Limits: Overrides panel, where you can con-
figure QoS limits for that specific user.
Note For details about working with the CMC’s QoS configuration panels, while in the "Users &
Groups" section of the CMC click Help.
The Admin API supports:
l A GET /qos/limits method, for retrieving system-default, group-default, or user-specific QoS limits
l A DELETE /qos/limits method, for deleting system-default, group-default, or user-specific QoS limits
For information about these Admin API calls see the "qos" section of the Cloudian HyperStore Admin
API Reference.
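As a sketch of how these calls distinguish system-default, group-default, and user-specific limits, the helper below builds qos/limits request URLs. The endpoint path is from the text above, but the host, port, and query-parameter names (userId, groupId) are assumptions -- confirm them in the "qos" section of the Admin API Reference before use.

```python
# Sketch only: building Admin API qos/limits request URLs.
# Host, port, and parameter names are assumptions, not confirmed values.
from urllib.parse import urlencode

ADMIN_API_BASE = "https://admin.example.com:19443"  # hypothetical Admin API endpoint

def qos_limits_url(user_id=None, group_id=None):
    """Build a GET /qos/limits (or DELETE /qos/limits) URL.

    Passing only group_id targets group-level settings; passing both
    user_id and group_id targets a specific user's settings.
    """
    params = {}
    if user_id is not None:
        params["userId"] = user_id
    if group_id is not None:
        params["groupId"] = group_id
    return f"{ADMIN_API_BASE}/qos/limits?{urlencode(params)}"

# User-specific limits for user "jdoe" in group "engineering":
print(qos_limits_url(user_id="jdoe", group_id="engineering"))
```

The same URL would be used with the DELETE method to remove the corresponding settings.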
By default the HyperStore system keeps track of the following service usage metrics for each user group and
each individual user:
Optionally, you can configure the system to also track the following metrics for each group and user (these met-
rics are disabled by default):
Like Amazon S3, the HyperStore system attributes all service usage to bucket owners. If a bucket owner
grants permission (via ACL policies) for other users to use the bucket, the system attributes the service activity
of the grantees to the bucket owner. For example, if grantees upload objects into the bucket, the associated
Bytes IN activity and Storage Bytes impact is attributed to the bucket owner — not to the grantees.
If a HyperStore account root user creates IAM users, then storage activity associated with those IAM users'
buckets counts toward the usage of the account root user.
The HyperStore system’s tracking of service usage by groups and users serves two main purposes:
l Usage reporting. Based on service usage tracking data, you can generate service usage reports for
individual users, for user groups, for a whole service region, or for your entire HyperStore system.
l Billing. Usage tracking provides the foundation for billing users or groups on the basis of their service
usage level.
Note Cloudian HyperIQ is a product that provides advanced analytics and visualization of HyperStore
S3 Service usage by users and groups, as well as HyperStore system monitoring and alerting. For
more information about HyperIQ contact your Cloudian representative.
The bucket statistics features are disabled by default. For information on enabling these features see
"Enabling Additional Usage Reporting Features" (page 168).
These per-user and per-group Storage Bytes and Storage Objects counters are maintained in the Redis QoS database.
Every five minutes, freshly updated Redis QoS counts for Storage Bytes
and Storage Objects are written to the Raw column family in Cassandra’s
Reports keyspace, where the data is subjected to additional processing
in support of reporting and billing functionality. Each hour the Raw data
is automatically processed to derive hourly roll-up data which is written to
the RollupHour column family. The hourly roll-up data includes, for each
user and each group, the hour’s maximum value and weighted average
value for Storage Bytes and for Storage Objects.
For example, if during a given hour User1 has 10MB of Storage Bytes for
the first 20 minutes of the hour and then 15MB for the next 40 minutes of
the hour, her weighted average Storage Bytes for the hour is: (10MB X 20 minutes + 15MB X 40 minutes) / 60 minutes = 13.33MB.
This hourly data is in turn rolled up once each day to derive daily roll-up
values (including, for each user and group, the day’s maximum and
day’s average for Storage Bytes and Storage Objects); and the daily roll-
up values are rolled up once each month to derive monthly roll-up values
(including monthly maximums and averages for each user and group).
This data is stored in the RollupDay column family and RollupMonth
column family, respectively.
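The weighted-average calculation described above can be sketched as follows. This is illustrative arithmetic only, not HyperStore code; it just weights each Storage Bytes value by the fraction of the hour for which it was current.

```python
# Illustrative sketch of the hourly weighted-average roll-up calculation.
def weighted_average(samples):
    """samples: list of (value, minutes) pairs covering one hour."""
    total_minutes = sum(minutes for _, minutes in samples)
    return sum(value * minutes for value, minutes in samples) / total_minutes

# User1's example: 10MB for the first 20 minutes, then 15MB for the next 40 minutes.
avg_mb = weighted_average([(10, 20), (15, 40)])
print(f"{avg_mb:.2f} MB")  # 13.33 MB
```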
l HTTP Requests
l Bytes IN
l Bytes OUT
By default system configuration, HTTP Requests, Bytes IN, and Bytes OUT are not tracked.
If you enable usage tracking for HTTP Requests and Bytes IN/OUT, then for each S3 request the S3 Service writes transactional metadata directly to the Raw column family in Cassandra’s Reports keyspace. It does so asynchronously so that S3 request processing latency is not impacted.
Together with the Storage Bytes and Storage Objects data, the HTTP Request and Bytes IN/OUT data in the Raw column family is rolled up hourly. The hourly roll-up data is rolled up daily; and the daily roll-up data is rolled up monthly.
See also:
l Per-user and per-group traffic rates (HTTP request rates and data transfer rates)
l Per-bucket usage tracking
l Per-bucket object and byte counts
Once you enable this feature, this type of usage data will be tracked and available for reporting from that point
in time forward. There will not be any per-user and per-group traffic rate data from prior to the time that you
enabled this feature.
Note Enabling this feature results in additional data being stored in the Metadata DB, and additional
work for the cron jobs that roll up usage data into hourly, daily, and monthly aggregates.
Once you enable this feature, bucket usage data for storage consumption and traffic rates will be tracked and
available for reporting from that point in time forward, through the usage Admin API calls (for details see the
"usage" section of the Cloudian HyperStore Admin API Reference). There will not be any per-bucket usage
data from prior to the time that you enabled this feature.
Note Enabling this feature results in additional metadata being stored in the Metadata DB, and addi-
tional work for the cron jobs that roll up usage data into hourly, daily, and monthly aggregates.
Note Once enabled, the per-bucket usage tracking feature allows a view into bucket activity across a
specified period of time. By contrast, the per-bucket object and byte count feature -- described below --
allows just snapshots of current counts.
After restarting the S3 Service, run the Admin API call POST /usage/repair?groupId=ALL to bring the counters
up to date for buckets that already have objects in them. For details see the "usage" section of the Cloudian
HyperStore Admin API Reference.
Subsequently, the counters will automatically be updated for each bucket each time there is an S3 transaction
that impacts the byte count or object count for the bucket.
With this feature enabled, you can query the current stored-bytes and stored-objects counts for individual buck-
ets by using the Admin API calls GET /system/bytecount and GET /system/objectcount. For details see the
"system" section of the Cloudian HyperStore Admin API Reference.
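To illustrate the shape of these per-bucket counter queries, the sketch below builds the two request URLs for one bucket. The endpoint paths are from the text; the host, port, and the "bucket" query-parameter name are assumptions -- see the "system" section of the Admin API Reference for the authoritative form.

```python
# Sketch only: per-bucket counter query URLs for the Admin API.
# Host, port, and the "bucket" parameter name are assumptions.
from urllib.parse import urlencode

ADMIN_API_BASE = "https://admin.example.com:19443"  # hypothetical Admin API endpoint

def bucket_count_urls(bucket):
    """Return the (bytecount, objectcount) request URLs for one bucket."""
    qs = urlencode({"bucket": bucket})
    return (f"{ADMIN_API_BASE}/system/bytecount?{qs}",
            f"{ADMIN_API_BASE}/system/objectcount?{qs}")

byte_url, obj_url = bucket_count_urls("my-bucket")
print(byte_url)
print(obj_url)
```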
Note In some atypical circumstances the keeping of these per-bucket stored-bytes and stored-object
counts, which will be updated after every S3 transaction, may cause a minor-to-moderate decline in S3
write performance. Example circumstances would be if there are an exceptionally large number of
buckets in your HyperStore system, or if you have a multi-data center HyperStore deployment with
more than usual latency between the data centers. If you want to enable per-bucket stored-bytes and
stored-objects counters in your system but are concerned about potential performance impacts, consult
with Cloudian Support.
IMPORTANT ! The HyperStore system calculates monthly bills for service users by aggregating
hourly roll-up data. Once hourly data is deleted, you will not be able to generate bills for the ser-
vice period covered by that data. So be sure to have reports.rolluphour.ttl set to a value large
enough to accommodate your billing routine.
If you edit any of these settings, push your changes out to the cluster and restart the S3 Service. For instruc-
tions see "Pushing Configuration File Edits to the Cluster and Restarting Services" (page 411).
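For example, to retain hourly roll-up data for 90 days rather than the default 65, you would edit the TTL property in mts.properties.erb on the Configuration Master before pushing and restarting. The property name is from the text above; the exact file location and the unit of the value (seconds, which is typical for Cassandra TTLs) are assumptions -- verify both against your installed configuration template before editing.

```properties
# In mts.properties.erb (illustrative value -- 90 days expressed in seconds):
# retain hourly roll-up data long enough to cover your billing routine
reports.rolluphour.ttl=7776000
```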
Because the Redis QoS counters for Storage Bytes and Storage Objects can impact your billing of service
users (if you charge users based on volume of storage used), it’s important that the counts be accurate.
Various types of errors and conditions can on occasion result in discrepancies between the Redis QoS counts
and the actual stored bytes and objects owned by particular users. The HyperStore system provides mech-
anisms for detecting and correcting such discrepancies.
In normal circumstances, this automated mechanism should suffice for maintaining the accuracy of usage data
for Storage Bytes and Storage Objects.
The Admin API also supports a method for validating Storage Bytes and Storage Object counts for a whole
user group, a whole service region, or the whole system: POST /usage/repair. Depending on how many users
are in your system, this is potentially a resource-intensive operation. This operation should only be considered
in unusual circumstances, such as if the Redis QoS database has been brought back online after being
unavailable for some period of time.
For details on these API calls, see the "usage" section of the Cloudian HyperStore Admin API Reference.
You can display reports in the CMC in tabular format or dynamic graph format, or download report data in
comma-separated value (CSV) format.
The CMC does not support per-bucket usage reporting. For that you must use the Admin API.
Note To retrieve usage data for a whole region or the whole system, you must execute GET /usage
separately for each group.
l GET /system/bytecount and GET /system/objectcount calls that return the current total bytes count and
total objects count for the system, a group, a user, or a bucket (if bucket counts are enabled)
l A POST /usage/bucket call that can retrieve raw usage data for multiple specified buckets at once (if
bucket usage statistics are enabled)
For more detail on usage related Admin API calls see the "system" and "usage" sections of the Cloudian Hyper-
Store Admin API Reference.
The HyperStore system maintains comprehensive service usage data for each group and each user in the sys-
tem. This usage data serves as the foundation for HyperStore service billing functionality.
The system provides you the ability to create rating plans that specify charges for the various types of service
usage activity, and to assign each group and each user a rating plan. You can then generate bills for a user or
for a whole user group, for a selected service period. The CMC has a function for displaying a single user's bill
report in a browser, but in the more typical use case you will use the HyperStore Admin API to generate user or
group billing data that can be ingested by a third-party billing application.
Cloudian HyperStore also allows for special treatment of designated source IP addresses, so that the billing
mechanism does not apply any data transfer charges for data coming from or going to these "allowlisted"
addresses.
Note For information on how auto-tiering impacts billing calculations, see "How Auto-Tiering Impacts
Usage Tracking, QoS, and Billing" (page 130).
For more information about this configurable setting see "Setting Usage Data Retention Periods" (page 170).
See also:
4.9. Setting Up Billing
l Per GB of data in storage (based on a calculated average storage volume for the billing period)
l Per GB of data uploaded
l Per GB of data downloaded
l Per 10,000 HTTP GET or HEAD requests
l Per 10,000 HTTP PUT or POST requests
l Per 10,000 HTTP DELETE requests
A user’s bill can then be calculated by applying the user’s assigned rating plan to the user’s activity levels for
each of these activity types, and adding together the charges for each activity type to get a total charge for the
billing period.
You can create multiple, named rating plans, each of which applies different charges to the various service
activity types. Once you’ve created rating plans, those plans are then available for you to assign to users.
For example, you can create higher-priced and lower-priced rating plans and then assign different plans to dif-
ferent users based on the users' quality of service terms.
IMPORTANT ! If you want to bill for data upload or download volume, or for HTTP request volume, you
must enable the "Track/Report Usage for Request Rates and Data Transfer Rates" setting in the CMC’s
Configuration Settings page, Usage Tracking section. By default this setting is disabled and the sys-
tem does not maintain per-user HTTP request counts and data transfer byte counts.
You can create rating plans either through the CMC or through the HyperStore Admin API. The system also
comes equipped with an editable default rating plan.
The Rating Plan page also supports viewing and editing existing plans, including the default rating plan that
comes with the HyperStore system. For more information about the default rating plan, while on the CMC's Rat-
ing Plan page click Help.
For details of these API calls see the "ratingPlan" section of the Cloudian HyperStore Admin API Reference.
l Types: One type only. The average number of GBs of data stored for the month.
l Example:
o Unit: Dollars per GB-month.
o Pricing: From 0-1 TB at $0.14 per GB-month, 1-10 TB at $0.12 per GB-month, 10+ TB at $0.10
per GB-month.
o Usage: Store 0.1 TB for first 10 days of month, then 20 TB for remaining 21 days of month.
o Sum up usage over month: 0.1TB X 240 hours + 20TB X 504 hours = 10,104 TB-hours.
o Convert to GB-months: 10,104 TB-hours X (1024 GB/TB) X (1 month/744 hours) = 13,906.58
GB-months
o Apply tiered pricing: ($0.14 X 1 X 1024GB) + ($0.12 X 9 X 1024GB) + ($0.10 X 3666.58GB) =
$1615.94.
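The tiered GB-month calculation above can be worked as a short sketch. The prices, tier boundaries, and usage figures are taken from the example; this is illustrative arithmetic, not HyperStore code.

```python
# Illustrative sketch of the tiered storage-charge example above.
def storage_charge(gb_months):
    tier1 = min(gb_months, 1024) * 0.14                     # 0-1 TB at $0.14/GB-month
    tier2 = min(max(gb_months - 1024, 0), 9 * 1024) * 0.12  # 1-10 TB at $0.12/GB-month
    tier3 = max(gb_months - 10 * 1024, 0) * 0.10            # 10+ TB at $0.10/GB-month
    return tier1 + tier2 + tier3

tb_hours = 0.1 * 240 + 20 * 504       # 10,104 TB-hours over the month
gb_months = tb_hours * 1024 / 744     # ~13,906.58 GB-months (744 hours in the month)
print(f"${storage_charge(gb_months):.2f}")  # $1615.94
```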
l Types: Three types of requests: HTTP PUT/POST, HTTP GET/HEAD, and HTTP DELETE.
o Each request type has its own cost.
l Example:
o Unit: Dollars per 10,000 requests.
o Pricing: PUT/POST: $0.20 per 10,000 requests, GET/HEAD: $0.01 per 10,000 requests,
DELETE: $0.00 per 10,000 requests.
o Usage: For the month, 25,000 PUTs/POSTs, 300,000 GETs/HEADs, 1000 DELETEs.
o ($0.20 X 25,000 / 10,000) + ($0.01 X 300,000 / 10,000) + ($0.00 X 1000 / 10,000) = $0.80.
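The per-request-type arithmetic can be sketched as follows, using the prices per 10,000 requests from the example (illustrative arithmetic only, not HyperStore code).

```python
# Illustrative sketch of the request-charge example above.
def request_charge(puts, gets, deletes):
    """Charges per 10,000 requests: PUT/POST $0.20, GET/HEAD $0.01, DELETE $0.00."""
    return (0.20 * puts + 0.01 * gets + 0.00 * deletes) / 10_000

print(f"${request_charge(25_000, 300_000, 1_000):.2f}")
```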
In a multi-region HyperStore system, you have the option of assigning groups or individual users different rat-
ing plans for activities in different regions. For example, you might charge users more for stored data in buckets
that they’ve created in your North region than for stored data in buckets created in your South region.
If you do not explicitly assign a rating plan to a user, the user is automatically assigned the rating plan that’s
assigned to the user’s group. If you do not explicitly assign a rating plan to a group, then the system default rat-
ing plan is automatically used for that group.
You can assign rating plans to users through either the CMC or the Admin API.
To use the CMC to assign a rating plan to an individual user, use the Manage Users page (Users & Groups -
> Manage Users). From here you can create a new user, and while doing so assign the user a rating plan from
a drop-down list. By default the new user is assigned whichever plan is assigned to the user’s group. From the
Manage Users page you can also retrieve an existing user and change her rating plan assignment.
The approach is similar if you want to use the Admin API to assign a rating plan to an individual user. The user must have already
been created (with PUT /user), and then you can use POST /user/ratingPlanId to assign the user a rating plan
(and to subsequently update that assignment). The method GET /user/ratingPlanId lets you retrieve the iden-
tifier of the rating plan currently assigned to a specified user, and there’s also a GET /user/ratingPlan method
for retrieving the user’s rating plan in full.
For details of these API calls, see the "group" and "user" sections of the Cloudian HyperStore Admin
API Reference.
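As a sketch of the assignment call, the helper below constructs the POST /user/ratingPlanId request. The endpoint path and method are from the text; the host, port, and query-parameter names (userId, groupId, ratingPlanId) are assumptions -- confirm them in the "user" section of the Admin API Reference.

```python
# Sketch only: constructing the rating-plan assignment request.
# Host, port, and parameter names are assumptions, not confirmed values.
from urllib.parse import urlencode
import urllib.request

ADMIN_API_BASE = "https://admin.example.com:19443"  # hypothetical Admin API endpoint

def assign_rating_plan_request(user_id, group_id, plan_id):
    """Build a POST /user/ratingPlanId request for an existing user."""
    qs = urlencode({"userId": user_id, "groupId": group_id, "ratingPlanId": plan_id})
    return urllib.request.Request(f"{ADMIN_API_BASE}/user/ratingPlanId?{qs}",
                                  method="POST")

req = assign_rating_plan_request("jdoe", "engineering", "Gold")
print(req.method, req.full_url)
```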
Note that the allowlist does not have any impact on what users are charged for data storage. It allows only for
free traffic from the specified source IP addresses. For data storage billing, a user’s regular assigned rating plan pri-
cing is used, even if all of the user’s S3 requests originate from an allowlisted IP address.
You can create a billing allowlist through either the CMC or the Admin API.
IMPORTANT ! If your S3 Servers are behind a load balancer, the load balancer must be configured to
pass through request source IP addresses in order for the allowlist feature to work. Also, note that when
S3 requests are submitted via the CMC, the S3 Servers consider the CMC itself to be the source of the
request. The CMC does not pass to the S3 Servers the IP addresses of CMC clients.
1. On your Configuration Master node, open the following configuration file in a text editor:
/etc/cloudian-<version>-puppet/manifests/extdata/common.csv
2. Set admin_whitelist_enabled to true, then save your change. (If the setting is already set to true, close
the file without changing the setting.)
3. If you made a change to the setting, use the installer to push your change to the cluster and to restart
the S3 Service and the CMC. If you need instructions see "Pushing Configuration File Edits to the
Cluster and Restarting Services" (page 411).
Once created, the allowlist takes effect. From that time forward, the HyperStore billing system will no longer
apply traffic charges to users' traffic originating from the allowlisted IP addresses and subnets.
Note If you want to see a particular user’s allowlisted traffic volume during a given billing period, you
can do so as part of the HyperStore system’s functionality for "Generating Billing Data for a User or
Group" (page 177).
l Use the POST /allowlist method to post your allowlist as a JSON-encoded request payload.
l Use the POST /allowlist/list method to post your allowlist as a URI parameter.
4.10. Customizing the CMC
The Admin API also supports a GET /allowlist method for retrieving the contents of your current allowlist.
For details on these API calls, see the "allowlist" section of the Cloudian HyperStore Admin API Reference.
With the CMC you can generate a billing report for an individual user (but not for a whole group). With the
Admin API you can generate billing data for a specified user or for a whole user group.
IMPORTANT ! Billing calculation is derived from hourly rollup usage data. The retention period for
hourly rollup usage data is configured by mts.properties.erb: "reports.rolluphour.ttl" (page 478). The
default retention period is 65 days. Once this rollup data is deleted it can no longer be used to generate
users' bills.
The billing data that you can generate from this page displays in the form of a printable billing document that
includes the user’s name and user-ID, their user group, the billing period, and the bill generation date. The doc-
ument shows a summary of the user’s rating plan, the user’s service activity for the billing period, and the asso-
ciated charges.
The Account Activity page also provides you an option to view a user’s service traffic originating from allowlisted
addresses, if any.
The Admin API also supports a GET /billing method, which simply retrieves billing data that you’ve pre-
viously generated with the POST /billing method. Like the POST /billing method, the GET /billing method
returns billing data as a JSON-encoded response payload.
For details on these API calls, see the "billing" section of the Cloudian HyperStore Admin API Reference.
The third column indicates which user types have access to the functionality by default. The fourth column indicates the
configuration setting that controls access to the functionality -- all settings are in mts-ui.properties.erb on your Con-
figuration Master node unless otherwise noted. You can edit these settings if you want the availability to be
different than the default -- for example, any of the functionalities that by default display for system admins and
group admins can be reconfigured to display only for system admins.
Note After making changes to a configuration file, use the installer to push the changes out to your
cluster and restart the CMC service. For instructions see "Pushing Configuration File Edits to the
Cluster and Restarting Services" (page 411).
Function Area | Functionality | Default Availability | Controlling Setting
-- | Whole user management interface | System admins and group admins | "admin.manage_users.enabled" (page 496)
-- | Create groups | System admins | "admin.manage_groups.create.enabled" (page 499)
-- | Delete groups | System admins | "admin.manage_groups.delete.enabled" (page 499)
Billing whitelist | Set source IP addresses allowed free traffic | Hidden from all user types | "admin_whitelist_enabled" (page 447) in common.csv
User Account Self-Management | Edit own profile | System admins, group admins, and regular users | "account.profile.writeable.enabled" (page 500)
User Account Self-Management | Whole security credentials interface | System admins, group admins, and regular users | "account.credentials.enabled" (page 500)
User Account Self-Management | Change own CMC password | System admins, group admins, and regular users | "account.credentials.signin.enabled" (page 501)
Usage Reporting | Whole usage reporting interface | System admins, group admins, and regular users | "usage.enabled" (page 502)
Usage Reporting | Reporting on HTTP request rates and byte transfer rates | Hidden from all user types | Track/Report Usage for Request Rates and Data Transfer Rates in the CMC Configuration Settings page
Auto-Tiering | Allow buckets to be configured for auto-tiering | Hidden from all user types | Enable Auto Tiering in the CMC Configuration Settings page
See also:
l You can configure a text banner that displays at the top of the login page every time any user accesses
the CMC.
l You can configure an acknowledgment gate that displays as an overlay in front of the login page every
time any user accesses the CMC, and requires that the user acknowledge having read the gate text
before they can log into the CMC.
By default the CMC login page has no banner and no acknowledgment gate. You can enable either one of
these customizations, or enable both of them. This section describes how to configure a login page banner; for
instructions for configuring an acknowledgment gate see "Configuring a Login Page Acknowledgment Gate"
(page 181).
Here is an example in which a custom text banner has been added to the login page.
This creates under your working directory a backup sub-directory named web_backup_<timestamp>,
containing the relevant files.
4. Copy the custom_banner.jsp file from the backup sub-directory into the working directory.
5. In the working directory, use a text editor such as vi to make the following edits to the custom_ban-
ner.jsp file:
a. Uncomment the starting style tag.
Before uncommenting:
<!-- style>
After uncommenting:
<style>
b. Replace Title with your desired banner title text. Do not alter the h2 tags.
<h2>Title</h2>
c. Replace Message with your desired banner body text. Do not alter the p tags.
<p>Message</p>
6. After saving your changes and exiting the file, while still in the working directory run the following com-
mand:
/opt/cloudian/tools/rebrand_cmc.sh --custom_banner
This copies your edited custom_banner.jsp file to the appropriate Puppet configuration directory.
7. Use the installer to push your changes out to the cluster and then restart the CMC. If you need instruc-
tions see "Pushing Configuration File Edits to the Cluster and Restarting Services" (page 411).
To verify that the banner is displaying as desired, go to the CMC in your browser.
l You can configure a text banner that displays at the top of the login page every time any user accesses
the CMC.
l You can configure an acknowledgment gate that displays as an overlay in front of the login page every
time any user accesses the CMC, and requires that the user acknowledge having read the gate text
before they can log into the CMC.
By default the CMC login page has no banner and no acknowledgment gate. You can enable either one of
these customizations, or enable both of them. This section describes how to configure an acknowledgment
gate; for instructions for configuring a plain login page banner that does not require acknowledgment "Con-
figuring a Login Page Banner" (page 179).
Here is an example in which an acknowledgment gate has been added to the login page. The CMC imple-
ments the acknowledgment gate as a modal dialog.
Note You do not need to enclose any of the setting text in quotes.
l cmc_login_banner_size -- This setting controls the width of the acknowledgment dialog. The
valid values are 0, 1, 2, or 3, with 0 being the narrowest and 3 the widest. The default is 1. In the
screen shot above, the width is set to 1.
l cmc_login_banner_title -- Title text of the acknowledgment dialog.
l cmc_login_banner_message -- Body text of the acknowledgment dialog. If you want to apply
any formatting to this text, use HTML format tags within the text. For example, <br><br> to start a
second paragraph with a line break between paragraphs; or <b>text</b> to create bold text.
l cmc_login_banner_button_confirm -- Text of the "confirm" button in the acknowledgment dialog.
For example, you could use Confirm or Acknowledge or Agree or OK as the button text.
4. After saving your changes and exiting the file, push your changes to the cluster and then restart the
CMC. If you need instructions see "Pushing Configuration File Edits to the Cluster and Restarting
Services" (page 411).
To verify that the acknowledgment gate is working as desired, go to the CMC in your browser.
To rebrand any or all of these interface elements you will use a HyperStore tool that simplifies the process of
putting the relevant files into the proper location.
Before starting, decide whether you want a change of logos to be part of your rebranding of the CMC UI. If so,
you will need three images of your organization's logo, using the following file names and pixel sizes:
Before starting the rebranding procedure below, you should have the three image files on your local machine
(for instance a laptop computer from which you will connect to the Configuration Master node).
2. On the Configuration Master node create or choose a working directory from which you will manage the
process of rebranding the CMC UI. Then change into the working directory.
3. Run the following script command to back up the current CMC files that are relevant to the UI's brand-
ing.
/opt/cloudian/tools/rebrand_cmc.sh --backup
This action creates a web_backup_<timestamp> sub-directory under your working directory on the Con-
figuration Master node. The backup directory contains files, copied from the CMC’s Tomcat web server
configuration, that affect the CMC’s look and feel.
Replace logos
a. Copy your organization's logo image files -- as specified in the introduction to this procedure --
from your local machine to the working directory on the Configuration Master node (by using
scp, for example).
b. Change into the working directory on the Configuration Master node, if you are not already there.
Then run this script command:
/opt/cloudian/tools/rebrand_cmc.sh --images
This copies the image files from the working directory to the proper location within the Puppet
configuration module for the CMC, so that you can subsequently push the files out to the whole
cluster (as described in Step 6).
a. On the Configuration Master node, copy the resources.properties file (for English language) and
any of the resources_<language-code>.properties files (for other languages that the CMC sup-
ports, if applicable to your user population) from the backup sub-directory that you created in
Step 3 into the working directory.
b. Use a text editor to edit the copy of resources.properties in the working directory, as follows:
c. Make the same change in the resources_<language-code>.properties files that you've copied
into the working directory (if any).
d. After editing the file(s) and saving your changes, while still in the working directory run this script
command:
/opt/cloudian/tools/rebrand_cmc.sh --resources
This copies the resource file(s) from the working directory to the proper location within the Pup-
pet configuration module for the CMC, so that you can subsequently push the file(s) out to the
whole cluster (as described in Step 6).
a. On the Configuration Master node, copy the master.css file from the backup sub-directory that
you created in Step 3 into the working directory.
b. Use a text editor to edit the copy of the master.css file in the working directory, replacing all occurrences of the existing color codes with the desired new values.
c. After editing the file and saving your changes, while still in the working directory run this script
command:
/opt/cloudian/tools/rebrand_cmc.sh --css
4.10. Customizing the CMC
This copies the CSS file from the working directory to the proper location within the Puppet con-
figuration module for the CMC, so that you can subsequently push the file out to the whole
cluster (as described in Step 6).
Change the application name
a. On the Configuration Master node, use a text editor to open the configuration file /etc/cloudian-
7.5-puppet/manifests/extdata/common.csv and edit this setting:
cmc_application_name,Cloudian
Replace Cloudian with a different application name string of your choosing. Use only alphanumeric characters, with no spaces, dashes, or underscores.
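If you prefer to script this edit, here is a minimal sketch with sed, demonstrated on a temporary copy with "Acme" as a stand-in application name; on a real system you would target /etc/cloudian-7.5-puppet/manifests/extdata/common.csv:

```shell
# Demonstrated on a temporary copy; on a real system target
# /etc/cloudian-7.5-puppet/manifests/extdata/common.csv instead.
tmpcsv=$(mktemp)
echo 'cmc_application_name,Cloudian' > "$tmpcsv"

# Replace the application name ("Acme" is a stand-in value)
sed -i 's/^cmc_application_name,.*/cmc_application_name,Acme/' "$tmpcsv"

cat "$tmpcsv"
```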
5. To confirm that the rebranding script has successfully copied your customized image, resource, and/or
CSS files to the Puppet configuration module, run this script command:
/opt/cloudian/tools/rebrand_cmc.sh --list
This should list all the image, resource, and/or CSS files that you worked with in Step 4. (It will not list
the common.csv file.)
6. Use the installer to push your customized file(s) to the cluster and to restart the CMC to apply the
changes. If you need instructions see "Pushing Configuration File Edits to the Cluster and Restarting
Services" (page 411).
Note If you customize the branding of the CMC, and then subsequently upgrade your HyperStore sys-
tem to a newer version, only your customized logos and your customized application name will be
retained after the upgrade. After the upgrade you will need to re-implement any changes that you had
made to the browser tab title and/or the color scheme, by again following the instructions above.
Note The rebrand_cmc.sh script also supports adding a custom banner to the top of the CMC login
page. For instructions see "Configuring a Login Page Banner" (page 179).
Each of the three Help systems has a Cloudian logo. For each Help system the logo image file is named CloudianLogoFull_2.png and its size is 235 x 44 pixels.
To replace the Cloudian logo in the CMC Help with your organization's logo, replace these three instances of
the logo file on each of your HyperStore nodes:
l /opt/cloudian-packages/apache-tomcat-7.0.103/webapps/Cloudian/help/HyperStoreHelp/Skins/Default/Stylesheets/Images/CloudianLogoFull_2.png
l /opt/cloudian-packages/apache-tomcat-7.0.103/webapps/Cloudian/help/HyperStoreHelpGroupAdmin/Skins/Default/Stylesheets/Images/CloudianLogoFull_2.png
l /opt/cloudian-packages/apache-tomcat-7.0.103/webapps/Cloudian/help/HyperStoreHelpEndUser/Skins/Default/Stylesheets/Images/CloudianLogoFull_2.png
Note that:
l Changing this image file is not supported by the rebrand_cmc.sh script, and this image file is not under
Puppet control. To replace this image file you must do it manually on each of your HyperStore nodes.
l Changing this image file requires root access and cannot be performed through the HyperStore Shell.
l In the Help for group administrators and the Help for regular users, the text makes no reference to
"Cloudian" or "HyperStore" or "the CMC". Instead the text uses generic, non-branded terminology such
as "the storage system" and "the console".
If you wish you can change this setting's value in common.csv (so that your own organization's name is used
rather than "Cloudian"). To apply your change do a Puppet push and then restart the IAM Service.
User provisioning is beyond the scope of the provided SSO solution. HyperStore provides an Admin API for user provisioning (see the "user" section of the Cloudian HyperStore Admin API Reference), but the implementation of user mapping is left to the portal application integrating with the CMC.
The idea is that a portal application calculates a one-way hash (also known as a signature) based on Cloudian HyperStore user identification, a timestamp, and the shared key. The user’s browser then accesses ssosecurelogin.htm with the signature. The CMC checks this signature to determine whether the user is authenticated. If the signature is found valid, the client skips the login page and the user is taken directly to a CMC interior page such as the Buckets & Objects page.
IMPORTANT ! To use the Cloudian HyperStore SSO feature, the following system configuration set-
tings must be set to "true":
-- "cmc_web_secure" (page 448) in common.csv -- This is set to true by default. Leave it true if you
want to use SSO.
-- "cmc_sso_enabled" (page 452) in common.csv -- This is set to false by default. Change it to true if
you want to use SSO.
Also in common.csv, if you enable SSO functionality (by setting cmc_sso_enabled to true), then for
security reasons you should set cmc_sso_shared_key and cmc_sso_cookie_cipher_key to custom val-
ues. Do not use the default keys.
The following HTTP API, using a signature, prompts the CMC to create an authenticated session for the client
that submitted the request:
Note Submit this as a GET, not a POST. POST is not supported for CMC SSO login.
https://<cmc_FQDN>:<cmc_port>/Cloudian/ssosecurelogin.htm?user=USERID&group=GROUPID
&timestamp=TIMESTAMP&signature=SIG&redirect=RELATIVE_OR_ABSOLUTE_URL
Each value must be URL-encoded by the client. Order of the parameters does not matter.
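As an illustration of that client-side encoding (a sketch only; the host name and all parameter values below are placeholders), in Python each value can be encoded individually before assembling the request URL:

```python
# Sketch only: the host name and all parameter values are placeholders.
import urllib.parse

params = {
    'user': 'user1',
    'group': 'group1',
    'timestamp': '1700000000000',
    'signature': 'kPz+abc/123=',   # a previously computed signature
    'redirect': 'bucket.htm',
}

# URL-encode each value individually; parameter order does not matter
query = '&'.join('%s=%s' % (name, urllib.parse.quote(value, safe=''))
                 for name, value in params.items())
url = 'https://cmc.example.com:8443/Cloudian/ssosecurelogin.htm?' + query
print(url)
```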
If the signature is found valid, the CMC creates an authenticated session for the HyperStore user, allowing the client to skip the login page and access a CMC interior page.
The portal server can create the signature by the following steps:
1. Build the querystring:
querystring = "user=USERID&group=GROUPID&timestamp=TIMESTAMP"
Note When using the querystring to create the signature, do not URL-encode the querystring. Also do not reorder the items. (By contrast, when the client subsequently submits the SSO secure login request to the CMC, it’s desirable to URL-encode the request querystring.)
2. Calculate a one-way hash of the querystring using the standard HmacSHA1 algorithm and the CMC SSO shared key. The shared key is configured in common.csv by the setting cmc_sso_shared_key.
3. base64string = Base64Encode(hashresult)
4. signature = encodeURIComponent(base64string)
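The steps above can be sketched in Python as follows. This is a minimal illustration, not Cloudian's reference implementation: the key and the user/group identifiers are placeholder values, and in practice the key would be the cmc_sso_shared_key value from common.csv.

```python
# Placeholder values throughout -- substitute your own user, group,
# and the cmc_sso_shared_key value configured in common.csv.
import base64
import hashlib
import hmac
import time
import urllib.parse

SSO_KEY = 'example-shared-key'

def sso_signature(user, group, timestamp):
    # Step 1: build the querystring in exactly this order, NOT URL-encoded
    querystring = 'user=%s&group=%s&timestamp=%s' % (user, group, timestamp)
    # Step 2: HmacSHA1 over the querystring, keyed with the shared key
    digest = hmac.new(SSO_KEY.encode(), querystring.encode(),
                      hashlib.sha1).digest()
    # Step 3: Base64-encode the hash result
    base64string = base64.b64encode(digest).decode()
    # Step 4: URL-encode the Base64 string (encodeURIComponent equivalent)
    return urllib.parse.quote(base64string, safe='')

signature = sso_signature('user1', 'group1', str(int(time.time() * 1000)))
```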
For a sample of a Python script that uses the one-way hash login API, see "Cloudian HyperStore SSO Sample
Script" (page 189).
After creating the signature, the portal server can return an HTML page with a hyperlink to the CMC SSO
secure login API. The following example will display the CMC’s Buckets & Objects page (bucket.htm) embedded in an inline frame on the portal’s page.
<iframe src="https://<cmc_FQDN>:<cmc_port>/Cloudian/ssosecurelogin.htm
?user=USERID&group=GROUPID&timestamp=TIMESTAMP&signature=SIG
&redirect=bucket.htm"></iframe>
If redirect=RELATIVE_OR_ABSOLUTE_URL is given, the CMC’s SSO secure login API returns an HTTP redir-
ect response.
l If the request was successful, the redirect response will take the client to the URL specified by redir-
ect.
l If the request failed, the redirect response will take the client to the CMC’s Login panel.
If redirect=RELATIVE_OR_ABSOLUTE_URL is not given, the CMC’s SSO secure login API returns an HTTP
response with content-type "text/plain".
l If the request was successful, the HTTP response status is 200 OK.
l If the request failed, a 400 BAD REQUEST status is returned, along with a plain text status description.
Possible reasons for failure include:
This API method allows for immediately invalidating the CMC session:
https://<cmc_FQDN>:<cmc_port>/Cloudian/logout.htm?redirect=RELATIVE_OR_ABSOLUTE_URL
l redirect: This optional parameter can be used to redirect the client to the URL after logging out from the
CMC. It is typically set to a portal page. The URL must be URL-encoded by the client.
If redirect=RELATIVE_OR_ABSOLUTE_URL is not given, the CMC’s logout API returns an HTTP redirect
response to take the client to the CMC’s Login panel.
You may want the logout link on the portal page to also trigger logout from the CMC. You can achieve this by
using the redirect parameter.
For example, if the portal’s logout link is like this:
<a href="/auth/logout">Logout</a>
then you can replace it with a link that logs the user out of the CMC and then redirects to the portal’s own logout:
<a href="https://<cmc_FQDN>:<cmc_port>/Cloudian/logout.htm
?redirect=https:%2F%2F<portal_FQDN>:<portal_port>%2Fauth%2Flogout">Logout</a>
l The redirect URL must be an absolute URL including the protocol (e.g. https://) and portal’s FQDN.
l The redirect URL must be URL-encoded.
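As a small sketch of that encoding (the host names and ports below are placeholders), the redirect value can be produced in Python like this:

```python
# Host names and ports are placeholders.
import urllib.parse

portal_logout = 'https://portal.example.com:8443/auth/logout'
cmc_logout = ('https://cmc.example.com:8443/Cloudian/logout.htm?redirect='
              + urllib.parse.quote(portal_logout, safe=''))
print(cmc_logout)
```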
#!/usr/bin/python
import time
import hmac
import hashlib
import base64
import urllib
SSO_KEY = 'aa2gh3t7rx6d'
# Do Not Change
SSO_PROTO = 'https://'
SSO_PATH = 'Cloudian/ssosecurelogin.htm'
SSO_LOGOUT_PATH = 'Cloudian/ssologout.htm'
# The full sample script also defines the CMC's FQDN and port;
# placeholder values are shown here:
SSO_DOMAIN = 'cmc.example.com'
SSO_PORT = 8443
def sso_logout_url():
    url = '%s%s:%d/%s' % (SSO_PROTO, SSO_DOMAIN, SSO_PORT, SSO_LOGOUT_PATH)
    return url
Note The sample script hard-codes the SSO secret key, which is not advisable in practice. In production, keep the secret key safely on the server side.
4.11. Provisioning Groups and Users
l Each group can be configured for integration with an external LDAP system as a means of authen-
ticating users, if applicable to your environment. For more information see "LDAP Integration" (page
194).
l Each group can be assigned quality of service (QoS) limits that will enforce upper bounds on the ser-
vice usage levels of the group as a whole. Each group can also be assigned default user-level QoS con-
trols that will limit the service usage of individual users within the group. (Optionally, you can also
assign per-user QoS limits that will supersede this default.)
Note You can set system-wide defaults for group QoS limits and individual user QoS limits
before provisioning any groups or users, if you wish. After provisioning groups and users you
can set QoS limits specific to those groups and users, which would override the system-wide
defaults. For more information see "Quality of Service (QoS) Feature Overview" (page 161).
l You can generate service usage reports for groups (and also for individual users).
l Each group can be assigned a default rating plan which will determine how users in that group will be
charged for HyperStore service usage. (Optionally, you can also assign per-user rating plans that will
supersede this default.)
l You can create one or more users who have group administrator privileges. Under the default system configuration, group administrators are able to perform the following operations through the CMC:
o Create a user within the group
o Edit a user’s profile
o Retrieve a list of users in the group
o Assign user-specific QoS limits
o Provide user support by accessing a user’s data in the S3 object store
o Delete a user
o Generate a usage report for the group
o Generate a usage report for an individual user in the group
Note The set of privileges that you make available to group administrators is configurable in a
granular way (in the mts-ui.properties.erb file, see the admin.manage_users.enabled property
and those that follow it). Individual CMC UI functions and sub-functions can be displayed to or
hidden from group administrators depending on your configuration settings.
See also:
Note Optionally, when creating a group you can enable LDAP-based authentication of group mem-
bers. For more information see "LDAP Integration" (page 194).
l Add a group
l Set quality of service (QoS) for a group or its users
l Retrieve a group or a list of groups
l Edit a group
l Delete a group
For details, while on the CMC's Manage Groups page click Help.
For details on these API calls see the "group" section of the Cloudian HyperStore Admin API Reference.
Note The HyperStore system does not currently support bulk provisioning of users. Users must be
added one at a time.
For more details, while on the Manage Users page click Help.
For details on these API calls see the "user" section and the "qos" section in the Cloudian HyperStore Admin
API Reference.
l IAM users have only the permissions granted to them by IAM policies created by the account root user.
l IAM users cannot log into the CMC. To perform S3 actions, IAM users must use a third party S3 client
application to access the HyperStore S3 Service.
l HyperStore does not track service usage data specifically for IAM users. Instead, an IAM user's S3 stor-
age activity counts toward the HyperStore usage data of the account root user who created the IAM
user.
HyperStore account root users can create IAM groups, users, and policies in the CMC's IAM section. For more information, while in any of the pages in the CMC's IAM section click Help.
HyperStore account root users can also create IAM groups, users, and policies by using a third party IAM client
application that calls the HyperStore IAM Service. The HyperStore IAM Service supports all the IAM API calls
associated with creating and managing IAM groups, users, and policies. For more information see the IAM sec-
tion of the Cloudian HyperStore AWS APIs Support Reference.
l Have only the permissions granted to them by the IAM policies associated with the role.
l Cannot log into the CMC. To perform S3 actions, SAML users temporarily assuming an IAM role must
use a third party S3 client application to access the HyperStore S3 Service.
l Do not have their own service usage counts. Instead the S3 storage activity of such users counts toward
the HyperStore usage data of the account root user who created the IAM role.
For more information see "SAML Support" in the IAM section of the Cloudian HyperStore AWS APIs Support
Reference.
HyperStore supports integrating with Active Directory or other types of LDAP systems so that users can log into
the CMC with their LDAP-based login credentials. This feature is implemented on a per-group basis, so you
have the option of creating some groups that are LDAP-enabled and others that are not. The system also sup-
ports having different groups use different Active Directory or LDAP servers for authentication, or having all
LDAP-enabled groups use the same Active Directory or LDAP server.
Within an LDAP-enabled group, along with users who the CMC will authenticate against an Active Directory or
other LDAP system you can optionally also have local users who the CMC will authenticate by use of a CMC-
based password rather than LDAP.
Note
* Under no circumstances does the CMC try to write to your Active Directory or LDAP server — it only
reads from it, for the purpose of authenticating users.
* Only system administrators can enable Active Directory or LDAP authentication for a group. Group
administrators cannot enable Active Directory or LDAP authentication for their groups.
* For LDAP authentication of HyperStore Shell users to work (see "LDAP Authentication of System
Administrators and HSH Users" (page 196)), your LDAP server must use either TLS or START_TLS.
To use the Admin API to enable LDAP authentication for a group, use the PUT /group method (to create a new
group with LDAP authentication enabled) or the POST /group method (to enable LDAP authentication for an
existing group). When creating or editing the group, in the GroupInfo object in the request body set the ldapEn-
abled attribute to true and also set the other LDAP-related attributes. For details on these API calls see the
"group" section of the Cloudian HyperStore Admin API Reference.
Note If you enable LDAP Authentication for an existing group to which you have already added users
via the CMC's Add User function, those existing users will continue to be authenticated by reference to
their CMC-based passwords -- not by reference to an LDAP server. LDAP Authentication will apply
only for new users.
Note If you wish you can enable LDAP Authentication for the System Admin group, by editing the
group either through the CMC or the Admin API. For further information specific to this group, see
"LDAP Authentication of System Administrators and HSH Users" (page 196).
l For users who you want to be authenticated by Active Directory or LDAP, do not manually create
those users through the CMC (or the Admin API). Instead, simply have those users log into the CMC
using their LDAP credentials. If a user tries to log into the CMC as a member of an LDAP-authenticated
group and the user is not already registered in HyperStore as a member of the group, the CMC will
attempt to authenticate the user against the LDAP system. If the authentication succeeds, the CMC will
automatically provision the user into HyperStore. This includes automatic creation of security keys for
accessing the HyperStore S3 data store. Going forward whenever the user logs in the CMC will recog-
nize the user as a registered HyperStore user, but will continue to authenticate the user against the
LDAP system each time rather than by reference to a CMC-based password.
Note that for such users to be successfully provisioned, the user names that they use when logging into the CMC must be their user names from your Active Directory or LDAP system.
Note If you want the group administrator to be authenticated by LDAP, have the user log into
the CMC using their LDAP credentials. Once this occurs and the CMC automatically provisions
the user, you can subsequently edit the user’s profile (using the CMC’s Edit User function or the
Admin API's POST /user method) to promote them to the group admin role.
l For users who you want to be authenticated by a CMC-based password rather than by the LDAP sys-
tem, create those users through the CMC's Add New User interface (or the Admin API's PUT /user
method). The CMC will not use LDAP-based authentication for users created through the Add New
User interface or the PUT /user method.
l If you delete a user from the CMC but that user still exists in LDAP, the user will be able to log in to the
CMC as if they were a first-time user and the CMC will auto-provision the user once again. If you want
to prevent a user from accessing the CMC and HyperStore, but the user still exists in LDAP, the thing to
do is to deactivate the user in the CMC (through the CMC’s Edit User function), rather than deleting
them. This will prevent the user from logging into the CMC or accessing HyperStore storage, even
though they still exist in LDAP.
l If you delete a user from LDAP but do not delete them from the CMC, the user will not be able to log into
the CMC. However, they still have valid S3 credentials and can access the HyperStore storage layer
through a different S3 client. If you want a user who you’ve deleted from LDAP to not have access to the
HyperStore S3 system, you should delete them from CMC also (which prevents access and also
deletes the user’s stored data) or else deactivate them in the CMC (which prevents access but leaves
their stored data in place).
authenticated by reference to their CMC-based password. Just as with any other group, after enabling LDAP
authentication for the System Admin group, if you want a new system admin user to be authenticated by LDAP
do not manually create that user through the CMC or the Admin API -- instead, have that user log into the
CMC with his or her LDAP credentials, and the user will then be automatically be provisioned into HyperStore
(see "Provisioning of Users within LDAP-Enabled Groups" (page 195)).
l The default System Admin user -- with user ID admin -- is considered a pre-existing user and can only
be authenticated by reference to the CMC-based password for that user. LDAP authentication is not
supported for the admin user.
l To edit the System Admin group via the CMC or the Admin API you will need to know the group ID,
which is "0" (the number zero).
l When a new system admin user is auto-provisioned into HyperStore as an LDAP-authenticated user
(upon their first login to the CMC with their LDAP credentials), and the system automatically creates a
corresponding HyperStore Shell user -- so that the new system admin user can log into and use the
HyperStore Shell -- that user's logins to the HyperStore Shell will also be LDAP-authenticated. That
is, when logging into the HyperStore Shell the user will supply their LDAP username and password,
and the system will verify those credentials against your LDAP service.
o For "local" system admin users -- who have been manually added through the CMC or the
Admin API and who are not configured for LDAP authentication -- their HyperStore Shell user
name is their CMC login user name prefixed by "sa_" (such as "sa_admin2"). By contrast, for
LDAP-authenticated system admin users their HyperStore Shell user name is simply the same
user name that they use to log into the CMC (without any prefix).
o For LDAP authentication of HyperStore Shell users to work, along with enabling LDAP for the
System Admin group in the CMC's Edit Group interface (or through the Admin API's POST
/group method) you must perform this additional configuration step:
1. Log in to the Configuration Master node (as root or as a locally authenticated HyperStore
Shell user).
2. Set the Distinguished Name for binding to your LDAP service, and the password:
hsctl config set hsh.ldap.bindDN=<bind Distinguished Name>
hsctl config set hsh.ldap.bindPassword=<bind password>
hsctl config apply hsh
Note For LDAP authentication of HyperStore Shell users to work, your LDAP server must use
either TLS or START_TLS.
Note LDAP authentication for HyperStore Shell users works only for system admin users cre-
ated in HyperStore version 7.2.3 and later.
Chapter 5. Cluster and Node Operations
You can start, restart, or stop a service across all the nodes in your cluster by using the HyperStore installer. If
instead you want to start, restart, or stop a service on just one particular node, you can do so through the CMC
or by using a HyperStore service initialization script. Note also that HyperStore services are configured to start
automatically when you reboot a node.
Note The installation tool does not support managing a service on just one particular node.
1. On your Configuration Master node, change into the installation staging directory and then launch
the HyperStore installer.
# ./cloudianInstall.sh
Or, if you are using the HyperStore Shell (HSH):
$ hspkg install
Once launched, the installer's menu options (such as referenced in the steps below) are the same
regardless of whether it was launched from the HSH command line or the OS command line.
2. Choose "Cluster Management", then from the sub-menu that displays choose "Manage Services". This
displays the "Service Management" sub-menu:
a. At the prompt, enter a service number from the menu. To manage all services enter option (0).
Note The Admin Service is bundled with the S3 Service. Any operation that you apply to
the S3 Service (such as stopping or restarting) applies also to the Admin Service. Like-
wise, the STS Service is bundled together with the IAM Service, so any operation that
you apply to the IAM Service also applies to the STS Service.
b. At the prompt that appears after you make your service selection, enter a service command:
start, stop, status, restart, or version. (The "version" option is supported only for the S3 Service.)
5.1. Starting and Stopping Services
The service command you enter will be applied to all nodes on which the service resides. For example,
if you choose Cassandra (the Metadata DB) and then enter "start", this will start Cassandra on all nodes
on which it is installed. Likewise if you choose S3 and then "status", this will return the status of the S3
Service on each node on which it is installed. And if you choose "All services" and then "stop", this will
stop all services on all nodes.
Note From the "Service Management" menu all you can do for the Puppet service (the Con-
figuration Management service) is check its status. To stop or start the Puppet daemons (Con-
figuration Agents), from the installer’s main menu choose "Advanced Configuration Options".
From the advanced sub-menu that displays you can stop or start the Puppet daemons.
As an alternative to using the CMC for this task, you can use systemctl commands of the form systemctl start|stop|status|restart <servicename>. These commands can be run from any directory on the target node.
The table below shows all HyperStore services and their corresponding <servicename>.
l S3 Service and Admin Service (these two services start and stop together): cloudian-s3
l IAM Service and STS Service (these two services start and stop together): cloudian-iam
l Redis Monitor: cloudian-redismon
l Dnsmasq: cloudian-dnsmasq
To stop all of the services on a single node, stop each individual service in this order:
1. CMC
2. Cloudian Monitoring Agent
3. Redis Monitor
4. S3 Service
5. IAM Service
6. SQS Service (if being used)
7. HyperStore Service
8. Redis QoS
9. Redis Credentials
10. Cassandra
11. Dnsmasq
To start all of the services on a single node, start each individual service in this order:
1. Cassandra
2. Redis Credentials
3. Redis QoS
4. HyperStore Service
5. SQS Service (if being used)
6. IAM Service
7. S3 Service
8. Redis Monitor
9. Cloudian Monitoring Agent
10. CMC
11. Dnsmasq
To power off:
# systemctl poweroff
5.2. Upgrading Your HyperStore Software Version
To reboot:
# systemctl reboot
IMPORTANT ! If you are rebooting multiple nodes, make sure that each node is back up for at least
one second before moving on to reboot the next node.
Note ntpd is configured to automatically start on host boot-up. By design, the sequencing is such that
ntpd starts up before major HyperStore services do.
Cloudian Inc. recommends that after booting a HyperStore host, you verify that ntpd is running. You can
do this with the ntpq -p command. If ntpd is running this command will return a list of connected time
servers.
The instructions that follow are for upgrading to HyperStore version 7.5 from HyperStore version 7.4 or
newer. This upgrade procedure does not require S3 service interruption.
If you are already running HyperStore version 7.5 and are installing a patch release (with a 4-digit release num-
ber), you can jump directly to "Installing a Patch" (page 209).
IMPORTANT ! Contact Cloudian Support before upgrading if any of these conditions apply to your
existing HyperStore system:
* Your current HyperStore version is earlier than version 7.4.
* One or more of your existing HyperStore hosts has less than 128GB RAM.
* Your HyperStore hosts are using Xen or are in Amazon EC2.
* Your HyperStore data disks are using Logical Volume Manager (LVM).
Note Multi-factor authentication (MFA) for CMC logins is a new feature in HyperStore version 7.5. You
should advise service users not to use this new feature until the upgrade to HyperStore 7.5 is com-
pleted on all nodes. If users try to use this new feature during the upgrade -- while some nodes are
upgraded and others are not yet -- they may encounter errors.
Before upgrading, use the CMC's Data Centers page to make sure that all nodes and all services are up and that no node is in a restricted status. If a service is down or a node is in one of the restricted statuses noted below, the upgrade script will fail in the pre-check stage and will not make any changes to your system.
l If any services are stopped, start them. For instructions see "Starting and Stopping Services" (page 199).
l If any node is in Maintenance Mode, take it out of Maintenance Mode. For instructions see Stopping Maintenance Mode.
Also, if the Data Centers page indicates that any node has alerts, go
to the Node Status page and then select that node and review its
alerts. Resolve any serious issues before proceeding with the
upgrade.
Note The upgrade script will automatically disable the auto-repair and proactive repair features -- so
that no new repairs kick off during the upgrade -- and then after the upgrade completes the script will
automatically re-enable the auto-repair and proactive repair features.
5.2.1.1. Additional Upgrade Preparation If Your System Currently Has Failed Disks
By default the HyperStore upgrade script will abort if it detects that any disks on your HyperStore nodes are
failed or disabled. If you want to perform a HyperStore version upgrade while there are failed or disabled disks
in your current system, take the following preparation steps before doing the upgrade:
1. On the Configuration Master node, in the staging directory for your current HyperStore system, open
this file in a text editor:
CloudianInstallConfiguration.txt
1. Download the HyperStore product package (CloudianHyperStore-7.5.bin file) from the Cloudian Sup-
port portal into a working directory on the Configuration Master node (such as /tmp or your home dir-
ectory -- do not use the installation staging directory from your existing HyperStore system). You
can also download from the Support portal the signature file (.sig file) corresponding to the product
package file -- you will need the signature file if you are using the HyperStore Shell to perform the
upgrade.
2. Copy your current Cloudian license file into the same working directory as the product package and
signature file. Your current license file is located in the /opt/cloudian/conf directory and the file name
ends with suffix .lic. If there are multiple .lic files in this directory, use the most recent one. Copy this file
to the working directory in which you've placed the new HyperStore product package.
Note The license file must be your cluster-wide license that you obtained from Cloudian, not a
license for an individual HyperStore Appliance machine (not a cloudian_appliance.lic file).
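If several .lic files are present, one way to identify and copy the newest is sketched below, simulated here in temporary directories; on a real system the source directory is /opt/cloudian/conf and the destination is your working directory:

```shell
# Simulated in temp dirs; on a real system use /opt/cloudian/conf as the
# source and your actual working directory as the destination.
conf=$(mktemp -d)
work=$(mktemp -d)
touch -d '2022-01-01' "$conf/older.lic"
touch -d '2023-06-01' "$conf/newer.lic"

# ls -t sorts newest-first; head -n 1 picks the most recent .lic file
newest=$(ls -t "$conf"/*.lic | head -n 1)
cp "$newest" "$work/"
ls "$work"
```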
3. In the working directory run the commands below to unpack the HyperStore package:
# chmod +x CloudianHyperStore-7.5.bin
# ./CloudianHyperStore-7.5.bin <license-file-name>
This creates a new installation staging directory named /opt/cloudian-staging/7.5, and extracts the
HyperStore package contents into the staging directory.
Or, if you are using the HyperStore Shell (HSH), run:
$ chmod +x CloudianHyperStore-7.5.bin
$ hsrun --root CloudianHyperStore-7.5.bin <license-file-name>
Note To perform the upgrade using the HSH you must be an HSH Trusted user.
4. Change into the new installation staging directory and then launch the installer:
# cd /opt/cloudian-staging/7.5
# ./cloudianInstall.sh
Or, from the HyperStore Shell (HSH):
$ hspkg install
Once launched, the installer's menu options (such as referenced in the steps below) are the same
regardless of whether it was launched from the HSH command line or the OS command line.
5. From the installer main menu enter "3" for "Upgrade from <your version number> to 7.5". Then at the
prompt, confirm that you wish to continue with the upgrade.
The upgrade script will first check the configuration template files (*.erb files) from your existing HyperStore system to determine whether you have made any customizations to the settings in those files (changes from the default values):
l If you have not made any such changes, the upgrade proceeds.
l If you have made such changes, then in the new installation staging directory the installer creates a text
file that lists those changes -- a "diff" file -- and prompts you to review the file, before the upgrade pro-
ceeds:
o Open a second terminal instance and in that terminal go to the new installation staging directory
(/opt/cloudian-staging/7.5) and view the "diff" file that the installer created -- while the upgrade
process remains paused in the first terminal instance. The "diff" file will be named as cloudian-
merge-conflicts-<timestamp>.txt.
o Then, to carry forward your existing *.erb file customizations that are identified in the diff file, in the second terminal instance manually make those same customizations to the new HyperStore version's *.erb files (under /etc/cloudian-7.5-puppet/modules). For example, if in your existing HyperStore system you had set a custom value for a hyperstore-server.properties.erb setting, edit that same setting in /etc/cloudian-7.5-puppet/modules/cloudians3/templates/hyperstore-server.properties.erb.
Note You do not need to manually carry forward customizations to the sso.enabled, sso.shared.key, and sso.cookie.cipher.key properties in mts-ui.properties.erb. If you've customized these properties, the upgrade will carry forward these customized property values automatically. Starting in HyperStore 7.5 these properties are controlled by corresponding settings in common.csv.
o After saving your changes return to the original terminal instance in which you are running the
upgrade, and at the installer prompt continue with the upgrade. Note that you do not need to do
a configuration push, since the upgrade will apply your configuration edits.
Note Customizations that you have previously made to the configuration file common.csv are
handled differently. The installer detects such customizations and automatically applies the
same customizations to the new version's common.csv file, without you having to do anything.
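As an illustration of the kind of "diff" the installer reports, the following sketch compares a customized old template against a new default template. The file names and setting names here are hypothetical, not actual HyperStore template settings:

```shell
# Create two small sample templates (hypothetical names and settings).
cat > old-settings.erb <<'EOF'
repair.session.count=2
disk.check.interval=60
EOF
cat > new-settings.erb <<'EOF'
repair.session.count=2
disk.check.interval=30
EOF

# Lines prefixed "<" are from the old (customized) file; lines prefixed ">"
# are from the new default file. diff exits non-zero when files differ.
diff old-settings.erb new-settings.erb || true
```

In the real upgrade workflow you review the installer-generated cloudian-merge-conflicts-<timestamp>.txt file rather than running diff yourself; the point is simply how a changed setting appears in diff-style output.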
Chapter 5. Cluster and Node Operations
When the upgrade process proceeds it upgrades one node at a time -- by shutting down the node, updating
the packages on the node, and then restarting the node and the services on the node -- until all nodes are
upgraded. Messages in the terminal will indicate the upgrade progress.
After the upgrade successfully completes, proceed to "Verifying Your System Upgrade" (page 208).
Note
* Once you've started the upgrade, you cannot <ctrl>-c out of it.
* If you have initiated the upgrade through a remote terminal, and the connection between the terminal
and the Configuration Master node is subsequently lost, the upgrade will continue.
IMPORTANT ! After the upgrade, do not delete the staging directory that was created when you unpacked the product package file (/opt/cloudian-staging/7.5). HyperStore will continue to require certain files in this directory throughout the time that you are using this HyperStore version.
The upgrade process also generates an upgrade-logNconfig*.tgz "S.O.S" tar file (which packages together multiple upgrade-related files) in the staging directory. You can provide this file to Cloudian Support in the event that you need assistance in resolving any upgrade problems.
1. After the automated upgrade completes, you should be taken back to the main menu of the HyperStore installer. The first post-upgrade step is to confirm that all your HyperStore services are up and running:
All services on all nodes should then indicate that they are running.
a. Still on the Service Management menu, at the “Select a service to manage:” prompt select the S3
Service.
b. At the “Enter command” prompt, type version.
l On the Node Advanced page, select command type "Info" then execute the "repairqueue" command to verify that auto-repair is enabled for replica, EC, and Cassandra data. (The installer disables auto-repair prior to executing the upgrade and then re-enables auto-repair at the end of the upgrade process.)
l On the Manage Users page, confirm that you can retrieve users.
l Log out of the CMC as system admin and log back in as a regular user, and then confirm that
you can successfully download and upload objects.
4. If prior to the upgrade you had made any customizations to the branding of the CMC interface, only your
customized logos and customized application name will be retained after the upgrade. You will need to
re-implement any changes that you had made to the browser tab title and/or the color scheme, by again
following the instructions for "Rebranding the CMC UI" (page 183).
5. If you have been using ElasticSearch for search of HyperStore object metadata, you should have upgraded your ES cluster to version 7.7.x before upgrading HyperStore to version 7.5 (as noted in "Preparing to Upgrade Your System" (page 204)). Now, after having upgraded HyperStore, run this command from any HyperStore node to verify that a sync-up of the object metadata in your ES cluster against the object metadata in HyperStore can still be performed without error:
# /opt/cloudian/bin/elasticsearchSync all
Or, if you are using the HyperStore Shell (HSH):
$ elasticsearchSync all
Note If you disabled the failed disk check before performing your upgrade (as described in "Additional
Upgrade Preparation If Your System Currently Has Failed Disks" (page 205)), note that after you've
completed the upgrade, in the new instance of CloudianInstallConfiguration.txt in the staging directory
for your new HyperStore version the INSTALL_SKIP_DRIVES_CHECK setting is set back to its default
of false. So the next time you upgrade your HyperStore version the check for failed drives will be
executed, unless you once again disable the check by changing that setting to true.
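If you do want to skip the failed disk check again for a future upgrade, edit the setting in the new staging directory's CloudianInstallConfiguration.txt before launching the upgrade. Assuming the file's simple key=value format, the relevant line would be:

```
INSTALL_SKIP_DRIVES_CHECK=true
```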
It may happen that there are multiple patches in between full releases -- for example, if you are running 7.3.1 there may be a 7.3.1.1 patch release and then later a 7.3.1.2 patch release. In this case you can install each patch when it comes out; or, if you miss the first patch for some reason, you can install just the second patch, which will include the fixes from the first patch.
Cloudian Support will announce patch releases when they come out, and you can download the patch from the Support portal. A patch is released as a self-extracting binary file, named S3Patch-<version>.bin (for example S3Patch-7.3.3.1.bin). You can also download from the Support portal the signature file (.sig file) corresponding to the patch -- you will need the signature file if you are using the HyperStore Shell to apply the patch.
Before installing a patch, perform the same system readiness checks as you would before a full version upgrade. For more information on performing these checks, see "Preparing to Upgrade Your System" (page 204).
To install a patch:
1. Place the patch binary file in a working directory on your Configuration Master node (such as /tmp or
your home directory).
2. Change into that directory, and then run the following commands to run the patch file:
# chmod +x S3Patch-<version>.bin
# ./S3Patch-<version>.bin
Or, if you are using the HyperStore Shell (HSH):
$ chmod +x S3Patch-<version>.bin
$ hsrun --root S3Patch-<version>.bin
When you run the patch file, you will be prompted to confirm that you want to install the patch -- enter y to do so. The script then automatically takes all the actions necessary to apply the patch to each of your HyperStore nodes. Specifically, the following actions are automatically executed:
l The S3Patch-<version>.bin file content -- including a patch installation script -- is extracted into a /s3patch/<patch-version>/ sub-directory under your HyperStore system's current installation staging directory
l The patch installation script is automatically launched. The script performs a non-disruptive, rolling install of the patch to each of your HyperStore nodes one at a time -- including automatically restarting the affected services on one node at a time.
l The status of the patch installation process is written to the console, and log messages pertaining to the patch installation are written to <current-staging-directory>/installs3patch.log. Also, a backup copy of the original, unpatched version of the main .jar file (Java archive file) from your existing HyperStore version is written to <current-staging-directory>/s3patch/backup/.
After a successful patch upgrade of all nodes, you can launch the main HyperStore installer (./cloudianInstall.sh in your current staging directory), go to the Manage Services menu, and for the S3 Service check the version. The results should show that on all nodes, the S3 Service now has the version number of the patch that you installed. You should also log into the CMC and check some of the main status reporting pages -- such as the Data Centers page and the Alerts page -- to confirm that your patched HyperStore system is healthy. You may also want to exercise the system by, for instance, uploading some objects into a bucket.
Note Do not delete the S3Patch-<version>.bin file from the working directory in which you placed it.
You may need to use the file again, as described in the sections below.
If the patch did not successfully install on all nodes, run the patch file again from the same working directory:
# ./S3Patch-<version>.bin
You will again be prompted to confirm that you want to install the patch -- enter y to do so. The patch script will then check each HyperStore node and install the patch only on nodes on which it has not already been successfully installed. The status will be written to the console, and also logged to <current-staging-directory>/installs3patch.log.
You may want to revert a patch in either of these circumstances:
l You are unable to successfully install the patch to all nodes -- that is, you are in a condition where some nodes were successfully patched while errors prevented patching of the other nodes, and you are unable to correct the errors.
l You successfully patch all nodes, but subsequently you encounter negative behavior in your system
that you had not encountered prior to the installation of the patch.
If you are reverting a patch, contact Cloudian Support (either before or after reverting the patch).
To revert a patch:
1. On the Configuration Master node, change into the working directory in which the S3Patch-<version>.bin file is located.
2. Run the patch bin file using the -r option:
# ./S3Patch-<version>.bin -r
You will be prompted to confirm that you want to revert the patch -- enter y to do so. The patch script will then
revert your HyperStore system to the preceding 3-digit release version. For example, reverting a 7.5.1 patch
will revert your system to 7.5; and reverting a 7.5.2 patch will also revert your system to 7.5 (not to 7.5.1).
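The version rule described above -- a revert always lands on the release that preceded the patches, because patches are cumulative -- amounts to dropping the final dot-separated version component. As an illustration only (this is not a Cloudian tool), the rule can be sketched in shell:

```shell
# Return the release version that reverting a patch would restore,
# by stripping the last dot-separated component of the patch version.
revert_target() {
  echo "${1%.*}"
}

revert_target 7.5.1    # prints 7.5
revert_target 7.5.2    # prints 7.5 (not 7.5.1)
revert_target 7.3.1.2  # prints 7.3.1
```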
IMPORTANT !
* Within a data center, if your HyperStore system is configured with multiple rack names and you are
adding nodes, make sure to position the new nodes so that when you are all done adding nodes,
each rack has the same number of nodes. Having an imbalance in the number of nodes per rack will
result in data load imbalance, such that on racks with fewer nodes there will be more stored data per
node than on racks with more nodes.
* Each node that you add must have the same networking configuration as the existing nodes in your HyperStore system -- for example, if the existing nodes each have two NICs, one for front-end network and one for back-end network, then the new nodes must have the same configuration.
* Once you have added a node to the cluster you cannot simply "undo" the process. If after adding a
new node to the cluster you were to change your mind about keeping the node in the cluster, you
would still be required to rebalance data within the cluster and then afterwards you would need to
decommission the node.
5.3. Adding Nodes
4. From your Configuration Master node use the system_setup.sh tool's Prep New Node to Add to Cluster function to complete network interface configuration, time zone set-up, prerequisites installation, and data disk formatting for each new node.
More detail:
a. On your Configuration Master node change into the installation staging directory and then
launch the system_setup.sh tool.
# ./system_setup.sh
Or, if you are using the HyperStore Shell (HSH):
$ hspkg setup
Once launched, the setup tool's menu options (such as those referenced in the steps below) are the same regardless of whether you launched the tool from the HSH command line or the OS command line.
b. In the tool's main menu select "8" for Prep New Node to Add to Cluster. When prompted
provide the IP address of a new node, and then the password for logging into the node. A menu
of node preparation tasks will then display.
c. Use the node preparation task menu to prepare the node:
o Complete the configuration of network interfaces for the node, if you haven't already.
o Set the timezone for the node.
o Install and configure HyperStore prerequisites on the node.
o Set up data disks on the node with ext4 file systems, if you haven't already. Make sure to
format and mount all available data disks on the node.
o After completing the setup tasks for the node choose the "Return to Master Node" option,
which returns you to the tool's main menu.
d. Repeat steps "b" and "c" above for each new node that you're adding. When you're done, exit
the system_setup.sh tool.
IMPORTANT ! If the new node(s) are not HyperStore Appliances and if you do not use system_setup.sh to format the data disks on the new node(s), then in the installation staging directory on your Configuration Master node you must for each new node create a text file named <hostname>_fslist.txt that specifies the new node’s data mount points, in this format:
<devicename> <mountpoint>
<devicename> <mountpoint>
etc...
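For example, for a new node named hyperstore4 with two data disks, the file hyperstore4_fslist.txt might contain lines like the following. The device names and mount points here are illustrative only -- use the actual devices and mount points on your node:

```
/dev/sdb1 /cloudian1
/dev/sdc1 /cloudian2
```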
a. If there are any operations in-progress in the CMC's Operation Status page, wait for them to finish (or otherwise, the Add Node operation will automatically stop them).
More detail:
When you initiate the Add Node operation in the CMC (as described in "Adding Nodes" below),
the system will automatically stop any in-progress repair, repairec, repaircassandra, cleanup, or
cleanupec operations in the service region. If there is an in-progress operation that you do not
want the system to stop, wait until the operation completes before you add nodes to your cluster.
Note When you initiate the Add Node operation in the CMC, the system will also automatically disable the auto-repair feature and the proactive repair feature -- so that no new repairs kick off while you're expanding your cluster -- and then after you've completed the rebalance operation the system automatically re-enables auto-repair and proactive repair.
b. In the Data Centers page, make sure that all the existing nodes and services are up and running in the service region.
More detail:
The Add Node function will not let you add nodes if any existing nodes or services in the region
are down.
More detail:
Clicking the light green cube opens the Add Node interface.
Note If the data center has multiple racks for HyperStore -- which is possible only if your original HyperStore version installed was earlier than version 7.2 -- and if you want the new node(s) to be assigned to a new rack rather than one of the existing racks, just click on the light green cube icon for any rack in the display for the correct data center. If (and only if) the data center has multiple racks for your existing HyperStore nodes, you will have an opportunity to specify a new rack name in the next step below.
2. In the Add Node dialog that displays, complete the node information fields for the new node(s). After
completing all the fields for one new node, click Add More Nodes and complete the fields for another
new node; and repeat this process until you've completed the fields for all the new nodes that you want
to add to the data center.
More detail:
Hostname (required)
Hostname of the new node. This must be just a hostname, not an FQDN. Do not use the same
hostname for more than one node in your entire HyperStore system. Each node must have a
unique hostname, even if the nodes are in different domains.
IP Address (required)
Service network IP (v4) address that the hostname resolves to. Do not use IPv6.
Rack Name
l If your original HyperStore system is version 7.2 or later, or if your original HyperStore system was earlier than 7.2 and you have only been using one rack name for your existing nodes in the data center, this field value is fixed to the rack name that you have been using for your existing nodes. You cannot edit this field.
l If your original HyperStore system was earlier than 7.2 and you have been using multiple
rack names for your existing nodes in the data center, you can select any of those rack
names from the drop-down list or create a new rack name.
IMPORTANT ! If your system is configured with multiple rack names and you are
adding nodes, make sure to position the new nodes so that when you are all done
adding nodes, each rack has the same number of nodes (for example, three
nodes on each rack). Having an imbalance in the number of nodes per rack will
result in data load imbalance, such that on racks with fewer nodes there will be
more stored data per node than on racks with more nodes.
l If the new node is a "Secure Appliance" (a HyperStore Appliance for which the HyperStore Shell [HSH] was enabled and the root password disabled at the factory), select the "New node is a Secure Appliance" checkbox. Then enter the sa_admin user's password for the new node.
Note Before the new node is added to your HyperStore cluster, the sa_admin user's password on the new node may be different than the sa_admin user's password in your existing cluster. If so, after the new node is added to the cluster, the sa_admin user's password on the new node will be automatically changed to match the sa_admin user's password in the cluster.
l If the new node is not a "Secure Appliance" -- that is, if the new node is a standard
Appliance or a software-only node on commodity hardware -- then you can use either
one of these authentication methods (use one method or the other -- not both):
o Enter the root user's password for the new node.
OR
o Select the "Private Key Authentication" checkbox. In this case the installer will use
the same private key as was used to install the existing cluster. Distribution of the
corresponding public key to the new node depends on how you handled SSH key
set-up during installation of the existing HyperStore cluster:
n If during installation of the cluster you let the installer generate an SSH key
pair for you, or you used your own existing SSH key pair and you copied
both the private and the public key into the installation staging directory on
the Configuration Master, then distribution of the public key to the new
node will be taken care of automatically by the installer.
n If during installation of the cluster you used your own existing SSH key pair
and you copied only the private key into the installation directory -- and you
copied the public key to the target installation nodes manually -- then you must also copy the public key to the new node manually, before executing the Add Node operation.
3. Click Execute. The system will first run a pre-check to confirm that each of your specified new nodes
can be reached through the network. The system then performs a simulation that shows how token
ranges and data will be distributed across your cluster if you proceed with adding the new nodes.
The simulation results are displayed in an overlay over the Add Node dialog, including summary assessments of whether the data associated with each of your existing storage policies will be well-balanced across the cluster if you add the new nodes. Scroll through the simulation results and carefully review the summary assessments. Typically the assessments will indicate a good data distribution balance, and you can then proceed with adding the new nodes by clicking Install.
Note If the assessments project an imbalance for a storage policy that's being used extensively in your production service, close out of the simulation overlay and the Add Node dialog and contact Cloudian Support for assistance.
4. When you click Install in Step 3 the system then adds the new node(s) to your cluster, one node at a
time. As part of adding a new node to the cluster, the system automatically streams to the new node its
proper share of system and object metadata. Depending on how many nodes you are adding and how
much data is in your system, this process can take anywhere from several hours up to several days or
more. When this streaming process completes successfully for all the new nodes, for each new node an
orange node icon with a gear inside of it should display in the Data Centers page.
More detail:
The Add Node operation entails verifying that the new host meets HyperStore requirements, installing
software, updating system configuration, starting services, joining the new node into the cluster, and
streaming system and object metadata to the new node (in the Cassandra database).
There are two CMC locations where you can monitor the Add Node operation progress:
l The Data Centers page. As each node is added to the cluster a new node icon appears, with
color-coding to indicate the node's status. You can hold your cursor over the node icon for a text
description of the status.
Grey node with gear -- The node has been added to the cluster and Cassandra
repair is in progress (which streams system and object metadata to the new node).
Orange node with gear -- Cassandra repair has completed successfully, and the
node is ready for you to run a "rebalance" operation to stream object data to it (as
described later in this procedure). If all the new nodes show this status you can pro-
ceed to Steps 5 and 6.
Red node with gear -- Cassandra repair has failed for the node.
l The Operation Status page. In this page you will see a one-line status summary for the
addNode operation (even if multiple nodes are being added there is just one summary status
line for the operation as a whole). If you click View to the right of the summary status line you can
display detailed progress information.
Handling Failures:
l Failure in adding the node to the cluster: If in the Data Centers page the grey node icon (indicating that the node has been added and Cassandra repair is underway) never appears for one or more nodes, go to the Operation Status page and view the status detail. Try to resolve the problem reported in the status detail -- in some cases a reboot of the new node may suffice. Then go back to the CMC's Data Centers page and again initiate the Add Node operation. This time enter the node information only for the node(s) that failed to be added to the cluster, then Execute the operation. Be sure to include only the node(s) that failed to be added (do not include the successfully added nodes, and do not add entirely new nodes that you haven't tried to add yet -- this will cause the operation to fail.) Note that you will not be shown a data distribution simulation again on this second attempt at adding the nodes.
Note Do not change the disk configuration of the new nodes before trying again to add the new nodes (since this could result in sub-optimal token distribution, and consequently a sub-optimal data distribution). If for some reason you must change the disk configuration before trying again to add the new nodes, contact Cloudian Support.
l Failure in Cassandra repair (as indicated by red node icon with gear): If this status occurs, first check the Data Center page's Service Status section to make sure that Cassandra is up and running on all nodes (if it's down on any node, go to the Node Status page for that node and start Cassandra). After making sure Cassandra is up on all nodes go to the Node Advanced page, and from the Maintenance command type menu run repaircassandra on the new node. Periodically check the progress. In the Data Centers page the new node's icon should turn orange when the repair completes successfully.
5. After the Add Node operation has completed successfully for all of the new nodes -- as indicated by
there being an orange node icon for each new node in the Data Centers page -- update your DNS
and/or load balancer configurations to include the new nodes, so that the new nodes can participate in
servicing S3 user request traffic. For more information see "DNS Set-Up" and "Load Balancing" in the
Cloudian HyperStore Installation Guide.
6. In the Node Advanced page, from the Maintenance command type group, execute hsstool rebalance on each new node. When launching rebalance on each new node, use the cleanupfile option. Rebalance is a long-running background operation that you can run concurrently on multiple new nodes that you've added. When all new nodes have completed rebalancing, for each node a green check-marked cube icon will display in the Data Centers page.
More detail:
The rebalance operation populates the new node(s) with their appropriate share of S3 object data. The rebalance is a background operation that may take up to several days or more for each new node to complete, depending on factors such as data volume and network bandwidth. When rebalance is performed with the cleanupfile option, as soon as a replica or fragment is successfully copied to the new node, it is deleted from the older node on which it no longer belongs -- thereby freeing up storage space on the older nodes as the rebalance operation progresses. For more information see "hsstool rebalance" (page 344).
Note During in-progress rebalance operations the affected data remains readable by S3 client
applications. Meanwhile the new nodes are immediately available to support writes of new data,
even before any rebalancing occurs.
Note When you have rebalance running on a new node or multiple new nodes, you cannot add any additional nodes to the service region (using the CMC's Add Node feature) until rebalance has completed successfully on all of the current new nodes.
Use the Data Centers page to periodically check the progress of the rebalance operation on each of
the new nodes. The icon for each of the new nodes will be color-coded, and you can hold your cursor
over the icon for a text description of the status.
Grey node with gear -- The rebalance operation is in progress. If you have added multiple
nodes and have started rebalance operations on the nodes, each node's status icon will
remain in this status until rebalance completes on all of the nodes in the batch.
Red node with gear -- The rebalance operation has failed for one or more token ranges. If
this status displays, go to the Nodes Advanced page and run rebalance on the node again,
using the retry option this time (select the retry checkbox when you run rebalance on the
node), as well as the cleanupfile option. This will try the rebalance again, just for the failed
token range(s).
Green node with check mark -- The rebalance operation has completed successfully.
Another source of status information for the rebalance operation is the Operation Status page.
For status detail click View to the right of the summary status line.
Note: If you removed a dead node prior to performing the Adding Nodes procedure and deferred the node
repair operations that are necessary after you remove a dead node, perform those repairs now.
More detail:
a. From the Node Advanced page, run hsstool repair on each node in the service region except for the
new node(s), just one node at a time. When repairing each node, use the allkeyspaces option and
also the -pr option. Leave the -l and -m options selected, as they are by default. Use the Operation
Status page to track the progress of each repair. After repair of a node is complete, repair another node
-- until all nodes except for the new node(s) have been successfully repaired.
b. If you have erasure coded object data in your system, from the Node Advanced page run hsstool
repairec on one node in each HyperStore data center in the region. It doesn't matter which node you
run it on, as long as you do it for one node in each DC in the region. Use the Operation Status page to
track repair progress.
5.4. Adding a Data Center
Note Each node that you add must have the same networking configuration as the existing nodes in your HyperStore system -- for example, if the existing nodes each have two NICs, one for front-end network and one for back-end network, then the new nodes must have the same configuration.
1. The HyperStore nodes in each data center will need to be able to communicate with the HyperStore
nodes in the other data center(s). This includes HyperStore services that listen on the internal interface.
Therefore, if you haven't already done so you must configure your inter-DC networking so that the
DCs' internal networks are connected to each other (for example, by using a VPN).
More detail:
a. In your existing cluster, on your Configuration Master node change into the installation staging
directory and then launch the system_setup.sh tool.
# ./system_setup.sh
Or, if you are using the HyperStore Shell (HSH):
$ hspkg setup
Once launched, the setup tool's menu options (such as those referenced in the steps below) are the same regardless of whether you launched the tool from the HSH command line or the OS command line.
b. In the tool's main menu select "8" for Prep New Node to Add to Cluster. When prompted
provide the IP address of a new node, and then the password for logging into the node. A menu
of node preparation tasks will then display.
c. Use the node preparation task menu to prepare the node:
o Complete the configuration of network interfaces for the node, if you haven't already.
o Set the timezone for the node.
o Install and configure HyperStore prerequisites on the node.
o Set up data disks on the node with ext4 file systems, if you haven't already. Make sure to
format and mount all available data disks on the node.
o After completing the setup tasks for the node choose the "Return to Master Node" option,
which returns you to the tool's main menu.
d. Repeat steps "b" and "c" above for each new node that you're adding. When you're done, exit
the system_setup.sh tool.
IMPORTANT ! If the new node(s) are not HyperStore Appliances and if you do not use system_setup.sh to format the data disks on the new node(s), then in the installation staging directory on your Configuration Master node you must for each new node create a text file named <hostname>_fslist.txt that specifies the new node’s data mount points, in this format:
<devicename> <mountpoint>
<devicename> <mountpoint>
etc...
6. In the CMC's Data Centers page, make sure that all the existing nodes and services are up and running in the region in which you are adding a data center. The Add DC function will not let you add nodes if any existing nodes or services in the region are down.
Note When you initiate the Add DC function (as described below) the system automatically disables Cassandra auto-repair throughout the service region, and then when the Add DC operation has completed the system automatically re-enables Cassandra auto-repair throughout the service region.
2. In the Add DC interface that displays, complete the top two fields for the new data center as a whole, then complete the remaining fields for each node in the data center, clicking Add More Nodes to display fields for additional nodes as needed.
More detail:
Hostname (required)
Note
* This must be just a hostname, not an FQDN.
* Do not use the same hostname for more than one node in your entire HyperStore system. Each node must have a unique hostname within your entire HyperStore system, even in the case of nodes that are in different domains.
IP Address (required)
Service network IP (v4) address that the hostname resolves to. Do not use IPv6.
Rack Name
Note This is an internal value used by HyperStore. It does not need to correspond to any actual rack name in your data center.
l If the new node is a "Secure Appliance" (a HyperStore Appliance for which the HyperStore
Shell [HSH] was enabled and the root password disabled at the factory), select the "New node is
a Secure Appliance" checkbox. Then enter the sa_admin user's password for the new node.
Note Before the new node is added to your HyperStore cluster, the sa_admin user's
password on the new node may be different than the sa_admin user's password in your
existing cluster. If so, after the new node is added to the cluster, the sa_admin user's
password on the new node will be automatically changed to match the sa_admin user's
password in the cluster.
l If the new node is not a "Secure Appliance" -- that is, if the new node is a standard Appliance or a software-only node on commodity hardware -- then you can use either one of these authentication methods (use one method or the other -- not both):
o Enter the root user's password for the new node.
OR
o Select the "Private Key Authentication" checkbox. In this case the installer will use the same private key as was used to install the existing cluster. Distribution of the corresponding public key to the new node depends on how you handled SSH key set-up during installation of the existing HyperStore cluster:
n If during installation of the cluster you let the installer generate an SSH key pair for
you, or you used your own existing SSH key pair and you copied both the private
and the public key into the installation staging directory on the Configuration
Master, then distribution of the public key to the new node will be taken care of
automatically by the installer.
n If during installation of the cluster you used your own existing SSH key pair and you copied only the private key into the installation directory -- and you copied the public key to the target installation nodes manually -- then you must also copy the public key to the new node manually, before executing the Add DC operation.
3. Click Execute. This initiates a background operation that will take anywhere from several minutes to several hours to complete depending on your environment. When it completes successfully the Data Centers page will display an additional block representing the newly added DC, with a green, check-marked cube icon for each of the new DC's nodes.
More detail:
The Add DC operation entails verifying that the new hosts meet HyperStore requirements, installing software, updating system configuration, starting services, joining the new nodes into the cluster, and streaming system metadata to the new nodes (in Cassandra).
You can use the Operation Status page to monitor progress of the operation.
For status detail click View to the right of the summary status line.
When the operation is complete, to see the new nodes in the Data Centers page you may need to
refresh the page in your browser.
If you hold your cursor over each cube in the new data center the node host names will display.
Note If the Operation Status page indicates that the Add DC operation has failed, click "View"
for detail. Then for more information to support troubleshooting efforts, grep for "ERROR" level
messages in the cloudian-installation.log file under the installation staging directory on your
Configuration Master node.
4. Update your DNS and load balancing configurations to include the new data center and its nodes, if
you have not already done so. For more information see "DNS Set-Up" and "Load Balancing" in the
Cloudian HyperStore Installation Guide.
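A minimal sketch for spot-checking DNS resolution of the new data center's hostnames from a client machine. The hostname shown is illustrative -- substitute your new DC's node names and the region's S3 service endpoint:

```shell
# Spot-check that a hostname resolves via the system resolver.
check_dns() {
  getent hosts "$1" > /dev/null
}

# Illustrative hostname -- use your new DC's node names and endpoints.
check_dns "dc2-node1.example.com" && echo "resolves" || echo "no DNS entry yet"
```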
5. Go to the CMC's Storage Policies page (Cluster -> Storage Policies) and create one or more storage policies that utilize the new DC. Until you create storage policies that use the new DC and users subsequently create buckets that use those storage policies, no S3 object data will be stored in the new DC.
Note Because your previously existing storage policies do not include the new DC, none of the
data already stored in your system in association with those storage policies will be migrated
into the new DC. Accordingly, there is no need for you to run a rebalance operation after adding
a new DC.
This completes the procedure for adding a data center to your cluster.
NOTE: If you removed a dead node prior to performing the Adding a Data Center procedure and deferred
the node repair operations that are necessary after you remove a dead node, perform those repairs now.
More detail:
a. From the Node Advanced page, run hsstool repair on each node in the service region except for the
new nodes, just one node at a time. When repairing each node, use the allkeyspaces option and also
the -pr option. Leave the -l and -m options selected, as they are by default. Use the Operation Status
page to track the progress of each repair. After repair of a node is complete, repair another node -- until
all nodes except for the nodes in the new data center have been successfully repaired.
b. If you have erasure coded object data in your system, from the Node Advanced page run hsstool
repairec on one node in each HyperStore data center in the region except for the new data center. It
doesn't matter which node you run it on, as long as you do it for one node in each DC in the region,
except for the new DC. Use the Operation Status page to track repair progress.
Note Each node that you add must have the same networking configuration as the existing nodes in your HyperStore system -- for example, if the existing nodes each have two NICs, one for front-end network and one for back-end network, then the new nodes must have the same configuration.
1. The HyperStore nodes in each region will need to be able to communicate with the HyperStore nodes
in the other region(s). This includes HyperStore services that listen on the internal interface. Therefore,
if you haven't already done so you must configure your networking so that the internal networks of
all of your data centers in all of your regions are connected to each other (for example, by using a
VPN).
5.5. Adding a Region
More detail:
a. In your existing cluster, on your Configuration Master node change into the installation staging
directory and then launch the system_setup.sh tool.
# ./system_setup.sh       (if launching from the OS command line)
$ hspkg setup             (if launching from the HSH command line)
Once launched, the setup tool's menu options (such as referenced in the steps below) are the
same regardless of whether it was launched from the HSH command line or the OS command
line.
b. In the tool's main menu select "8" for Prep New Node to Add to Cluster. When prompted
provide the IP address of a new node, and then the password for logging into the node. A menu
of node preparation tasks will then display.
c. Use the node preparation task menu to prepare the node:
o Complete the configuration of network interfaces for the node, if you haven't already.
o Set the timezone for the node.
o Install and configure HyperStore prerequisites on the node.
o Set up data disks on the node with ext4 file systems, if you haven't already. Make sure to
format and mount all available data disks on the node.
o After completing the setup tasks for the node choose the "Return to Master Node" option,
which returns you to the tool's main menu.
d. Repeat steps "b" and "c" above for each new node that you're adding. When you're done, exit
the system_setup.sh tool.
IMPORTANT ! If the new node(s) are not HyperStore Appliances and if you do not use system_setup.sh to format the data disks on the new node(s), then in the installation staging directory on your Configuration Master node you must create, for each new node, a text file named <hostname>_fslist.txt that specifies the new node’s data mount points, in this format:
<devicename> <mountpoint>
<devicename> <mountpoint>
etc...
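As an alternative sketch, if the data disks are already formatted and mounted on the new node, a file in the required format can be generated from the node's current mounts. The /cloudian mount-point prefix is an assumption -- adjust it to match your actual data mount points, and review the output before proceeding:

```shell
# Generate <hostname>_fslist.txt from the node's currently mounted
# data disks (run on the new node, then copy the file to the staging
# directory on the Configuration Master).
host_short=$(hostname -s 2>/dev/null || uname -n)
# Keep only mounts whose mount point starts with /cloudian (assumed).
awk '$2 ~ /^\/cloudian/ {print $1, $2}' /proc/mounts > "${host_short}_fslist.txt"
# Review the result before proceeding:
cat "${host_short}_fslist.txt"
```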
2. In the Add Region interface that displays, enter the name of the new region then complete the fields for
each node in the data center, clicking Add More Nodes to display fields for additional nodes as
needed.
More detail:
Note
* This must be just a hostname, not an FQDN.
* Do not use the same hostname for more than one node in your entire HyperStore system.
Each node must have a unique hostname within your entire HyperStore system, even in the
case of nodes that are in different domains.
IP Address (required)
Service network IP (v4) address that the hostname resolves to. Do not use IPv6.
cluster are using "eth1" for internal traffic — enter the interface name in this field. If the new node will
use the same internal network interface as your existing nodes you can leave this field empty.
Note This is an internal value used by HyperStore. It does not need to correspond to any actual
rack name in your data center(s).
l If the new node is a "Secure Appliance" (a HyperStore Appliance for which the HyperStore
Shell [HSH] was enabled and the root password disabled at the factory), select the "New node is
a Secure Appliance" checkbox. Then enter the sa_admin user's password for the new node.
Note Before the new node is added to your HyperStore cluster, the sa_admin user's
password on the new node may be different than the sa_admin user's password in your
existing cluster. If so, after the new node is added to the cluster, the sa_admin user's
password on the new node will be automatically changed to match the sa_admin user's
password in the cluster.
l If the new node is not a "Secure Appliance" -- that is, if the new node is a standard Appliance or a software-only node on commodity hardware -- then you can use either one of these authentication methods (use one method or the other -- not both):
o Enter the root user's password for the new node.
OR
o Select the "Private Key Authentication" checkbox. In this case the installer will use the same private key as was used to install the existing cluster. Distribution of the corresponding public key to the new node depends on how you handled SSH key set-up during installation of the existing HyperStore cluster:
n If during installation of the cluster you let the installer generate an SSH key pair for you, or you used your own existing SSH key pair and you copied both the private and the public key into the installation staging directory on the Configuration Master, then distribution of the public key to the new node will be taken care of automatically by the installer.
n If during installation of the cluster you used your own existing SSH key pair and you copied only the private key into the installation directory -- and you copied the public key to the target installation nodes manually -- then you must also copy the public key to the new node manually, before executing the Add Region operation.
3. Click Execute. This initiates a background operation that will take anywhere from several minutes up to
an hour to complete depending on your environment. When it completes successfully the Data Centers
page will display an additional tab representing the newly added region, with a green, check-marked
cube icon for each of the new region's nodes.
More detail:
The Add Region operation entails verifying that the new hosts meet HyperStore requirements, installing software, updating system configuration, starting services, and joining the new nodes into a new Cassandra ring.
For status detail click View to the right of the summary status line. (The page may at one point indicate
that it cannot retrieve status information -- this is due to an S3 Server / Admin Server restart which is an
automatic part of the Add Region operation. Wait a few minutes then refresh the page.)
When the operation is complete, to see the new nodes in the Data Centers page you may need to
refresh the page in your browser.
If you hold your cursor over each cube in the new region the node host names will display.
Note If the Operation Status page indicates that the Add Region operation has failed, click
"View" for detail. Then for more information to support troubleshooting efforts, grep for "ERROR"
level messages in the cloudian-installation.log file under the installation staging directory on
your Configuration Master node.
4. Update your DNS and load balancing configurations to include the new service region and its nodes, if
you have not already done so. Note that name server configurations in each of your existing region(s)
and the new region must have entries for all your regions' S3 service endpoints, as well as for the
global Admin service and CMC service endpoints. For more information see "DNS Set-Up" and "Load
Balancing" in the Cloudian HyperStore Installation Guide.
5. Go to the Storage Policies page (Cluster -> Storage Policies) and select the new region. Then create
a storage policy for the new region. This first storage policy will become the default storage policy for
the region. Later if you've created more than one storage policy in the region you can change which
policy is the default policy if you wish. Until you create one or more storage policies in the new region
and users subsequently create buckets that use those storage policies, no S3 object data will be stored
in the new region.
6. Optionally, set Quality of Service (QoS) limits for users' and groups' activity in the new service region.
Each service region has its own QoS configuration. By default, in each region no QoS limits are
enforced. For more information, while logged into the CMC's Manage Groups or Manage Users page
click Help. When using the QoS configuration interfaces be sure to select the new service region from
the interfaces' drop-down list of regions.
This completes the procedure for adding a service region to your system.
l Verify that your existing system is in a proper condition to successfully support the removal of a node
l Remove the node
l If the node you removed was "dead" -- the Cassandra Service on the node is down or unreachable
when you remove the node -- repair the remaining nodes in your cluster (this is not necessary if the
node you removed was live)
Note If you are removing a node after having added a new node to your cluster, you must complete
the rebalance operation for the new node before removing a node. For more information on rebalance
see "Adding Nodes" (page 212).
IMPORTANT ! Removing a node should be something that you do only if absolutely necessary.
When you remove a live node from your cluster, during the decommissioning process the data replicas and erasure coded fragments on that node are unavailable to help meet your configured read consistency requirements when servicing read requests from client applications. Once data has been streamed from the decommissioning node to the other nodes that data is again available to contribute to meeting read consistency requirements. Depending on the data volume and network bandwidth, it may take several days or more until the decommissioning process has completed for all of the node's data.
Note For instructions on uninstalling an entire HyperStore system, from all nodes, including deleting all data, see "cloudianInstall.sh Command Line Options" in the Cloudian HyperStore Installation Guide.
1. Make sure that removing the node won't leave your cluster with fewer nodes than your configured
storage policies require.
5.6. Removing a Node
More detail:
For example, if 4+2 erasure coding is being used in your system you cannot reduce your cluster size to fewer than 6 nodes, even temporarily. As another example, if you have a storage policy that for each object places 3 replicas in DC1 and 3 replicas in DC2, do not reduce the number of nodes in either data center to fewer than 3.
If you're not certain what storage policies currently exist in your system, check the CMC's Storage
Policies page (Cluster -> Storage Policies).
The CMC's Uninstall Node function checks and enforces this requirement and will not let you remove a
node if doing so would leave fewer than the required number of nodes in your cluster. If your cluster is
currently at the minimum size required by your storage policies and you want to remove a node:
l If the node you want to remove is live you can first add a new node to your cluster by following
the complete procedure for "Adding Nodes" (page 212) (including rebalancing); and then after
that you will be able to remove the node that you want to remove.
l If the node you want to remove is dead contact Cloudian Support for guidance.
2. Make sure that you have sufficient available storage capacity on the other nodes in your cluster.
The data from the removed node will be redistributed to the remaining nodes that are encompassed by
the same storage policies as the removed node.
More detail:
Each remaining node must have available storage capacity to take on some of the data from the removed node. You can review your cluster and per-node storage space availability in the CMC's Capacity Explorer page (Analytics -> Capacity Explorer).
Note If any of the other nodes are in the "stop-write" condition -- or if any disks on a node are
in stop-write condition -- at a time when you decommission a different node from your system,
the decommissioning process overrides the stop-write restriction on the node(s) or disk(s)
where it exists in order to stream to all nodes and disks in the cluster their share of data from the
decommissioned node. Consequently, disks that were nearly full before a decommissioning
operation may become completely full during the decommission operation, resulting in stream
job failures once the disks can accept no more data.
To avoid this, before removing a node from your cluster make sure there is plenty of space on all
the remaining nodes and disks to absorb the data from the node you intend to remove. If you
want to remove a node from your cluster at a time when some disks on the other nodes are
nearly full, consult with Cloudian Support.
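A quick way to spot-check per-disk headroom from a node's command line before starting the removal. The /cloudian mount-point prefix is an assumption -- adjust it to your actual data mount points:

```shell
# Show the df header plus usage for HyperStore data mount points
# (mount points assumed to start with /cloudian).
usage=$(df -h | awk 'NR==1 || $6 ~ /^\/cloudian/')
echo "$usage"
```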
3. In the Node Advanced page, from the Maintenance command type group, execute the autorepair command with the Disable option to temporarily disable the automated repair feature in the service region in which you are removing a node.
More detail:
The target node for the command can be any node in the region. Leave the "Type" option unselected so
that all automated repair types are disabled. This will prevent any new schedule-based auto-repairs or
proactive repairs from launching during the node removal process.
4. In the Operation Status page, make sure there are no rebalance operations or other operations currently in progress in the service region.
More detail:
If any operations are in progress, wait until the operations complete before removing a node from your
cluster. The CMC's Uninstall Node function will not let you remove a node if a major operation such as
rebalance, repair, or cleanup is running in the service region.
Note If you don't want to wait for an in-progress repair or cleanup of a node to complete you
have the option of terminating the operation. To do so, go to the Node Advanced page and for
that node execute hsstool repair or hsstool repairec or hsstool cleanup or hsstool cleanupec
with the "stop" option selected. Note that an hsstool rebalance operation does not support a stop
option and cannot be terminated while in progress.
5. In the Repair Status page, make sure there are no proactive repairs currently in progress in the service region.
More detail:
If any proactive repairs are in progress, wait until they complete before removing a node from your cluster. The CMC's Uninstall Node function will not let you remove a node if a proactive repair is in progress in the service region.
Note If you don't want to wait for an in-progress proactive repair of a node to complete you have
the option of terminating the repair. To do so, go to the Node Advanced page and for that node
execute hsstool proactiverepairq with the "stop" option selected.
6. If the node you want to remove is the active Configuration Master node, manually fail over the Configuration Master role from the primary instance to the backup instance.
More detail:
If you're not sure whether the node you want to remove is the Configuration Master (Puppet master)
host, check the CMC's Cluster Information page. That page also shows which host is the Configuration
Master backup host.
For instructions on failing over the Configuration Master from primary to backup, see "Manually Fail
Over the Configuration Master Role from the Primary to the Backup" (page 268).
Note Any other specialized services on the node -- such as Redis or Redis Monitor -- will be
automatically moved to other nodes in the cluster by the CMC's Uninstall Node function.
7. In the Node Advanced page, from the Info command type group, submit the status command to any
healthy node to retrieve status information for all nodes in the region. The HyperStore Service must
be up on all nodes including the node that you are going to remove. The Cassandra Service must be
up on all nodes other than the one you are going to remove. If Cassandra is down or unreachable on
the node you are removing -- if the node is "dead" in terms of its relation to the Cassandra cluster -- you
can still proceed with removing the node, but an extra step will be required toward the end of the node
removal procedure (as described in "Removing a Node" (page 241)).
More detail:
The target node for the command can be any healthy node in the same service region as the node you
want to remove -- the command returns status information for the cluster as a whole.
a. In the command response, for the node that you want to remove check the status of the Cassandra Service (in the "Cassandra" column) and the HyperStore Service (in the "Hss" column).
l If the Cassandra status and the HyperStore Service status are both Up, the data redistribution that will be required in the cluster when you remove the node will be executed by a decommissioning process that will be automatically invoked by the CMC's Uninstall Node function.
l If the Cassandra status is Down or unreachable ("?"), data redistribution by decommissioning is not supported. Instead, at the end of the remove node procedure you will need to run repair on all the remaining nodes in order to redistribute data in the cluster. Details are in the procedure below.
Note If possible, start Cassandra on the node that you want to remove, so that the
Uninstall Node function can automatically implement a decommissioning process
and you won't have to perform repairs.
b. For all the other nodes confirm that Cassandra and the HyperStore Service are Up. The CMC's Uninstall Node function will abort if Cassandra or the HyperStore Service are Down or unreachable on any other node in the cluster. If those services are down on any of the other nodes, start the services (or resolve the network access problem if there is one) before trying to remove a node.
1. If the CMC instance that you are currently using is running on the node that you want to remove, log into the CMC on a different node:
https://<IP_address_of_node_other_than_removal_node>:8443/Cloudian
2. In the CMC's Node Advanced page, from the Command Type drop-down list select Start Maintenance Mode. For the target node select the node that you want to remove from the cluster.
Then click Execute. This directs the rest of the cluster to stop sending S3 requests to the specified node.
3. In the Node Advanced page, from the Command Type drop-down list select Uninstall Node. For the "Node to Uninstall" list select the node that you want to remove from the cluster. This operation will recreate the selected node's data onto other nodes remaining in the cluster (if the node you are removing is a "live" node), remove the node from the cluster, and remove HyperStore software from the node. If you also want the HyperStore data and logs to be removed from the node, select the "o" option.
Then click Execute. After you confirm that you want to proceed, the operation is initiated.
4. Use the Operation Status page to periodically check on the progress of the Uninstall Node operation.
To pop up a detailed status report click View next to the summary status line.
If the node you are removing is live, the Uninstall Node operation will include decommissioning the node -- streaming copies of its data to the remaining nodes in the cluster -- and this may take up to several days or more to complete. If the node you are removing is dead, the Uninstall Node operation will be much briefer.
Note If you want to remove multiple nodes, wait until the Uninstall Node operation completes for one node before you start to uninstall the next node. The system does not support uninstalling multiple nodes concurrently.
5. After the Operation Status page shows that the status of the Uninstall Node operation is Completed,
go to the Node Status page (Cluster -> Nodes -> Node Status) and confirm that the removed node no
longer appears in the "Host" drop-down list.
6. If the node you removed was "dead" -- meaning that the node's Cassandra Service was down or
unreachable when you checked it (as described in "Preparing to Remove a Node" (page 236)) -- you
must repair each of the remaining nodes in the region.
More detail:
If the node you removed was "dead", take the following actions to recreate the removed node's data on
the remaining nodes in the cluster. (If the node you removed was "live" -- in which case the Uninstall
Node operation executed a decommissioning process -- these actions are not needed and you can
jump down to Step 7.)
Note If you have removed a dead node from your cluster as a precursor to adding a new node to your cluster -- if you wish you can defer the time-consuming repair operations called for below until after you have added the new node(s). If that's the case, you can now perform the "Adding Nodes" (page 212) procedure, and that procedure indicates the point at which you should initiate the deferred repairs. If you have removed a dead node and are not adding a new node, perform the repair now as described below.
From the Node Advanced page, run hsstool repair on each of the remaining nodes in the service region. When repairing each node, use the allkeyspaces option (so as to repair Cassandra metadata as well as S3 object data) and also the -pr option (this makes for more efficient repairs when repairing multiple nodes). Leave the -l and -m options selected, as they are by default. Since you are using the -pr option you can run repair on multiple nodes concurrently (in the GUI you will have to execute the repair command for each node individually, but you do not need to wait for the repair operation on one node to complete before you execute the command on the next node).
Also, if you have any erasure coded object data in your system, from the Node Advanced page run
hsstool repairec on one node in each HyperStore data center in the region. It doesn't matter which
node you run it on, as long as you do it for one node in each DC in the region. (Note that you do not
need to wait for the in-progress hsstool repair operations to finish before launching hsstool repairec --
it's OK to run hsstool repairec and hsstool repair concurrently.)
Use the Operation Status page to track the progress of all repairs. After all repairs have completed proceed to Step 7.
5.7. Replacing a Node
Note Because the repairec operation repairs erasure coded data on all hosts in a data center,
it's potentially a very long running operation. In a large cluster with high data volume it may take
multiple weeks to complete.
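If you prefer to stage the repairs from a command line rather than from the CMC, the per-node commands described above can first be assembled into a reviewable plan. The node names and the exact hsstool invocation below are assumptions based on the options named in this section -- verify against the "hsstool repair" reference before running anything:

```shell
# Build a reviewable plan of the per-node repair commands (dry run --
# nothing is executed). NODES is an assumption: list every remaining
# node in the service region.
NODES="node1 node2 node3"
: > repair_plan.txt
for n in $NODES; do
  # allkeyspaces and -pr are the options named in this section.
  echo "hsstool -h $n repair allkeyspaces -pr" >> repair_plan.txt
done
# Review the plan before executing any of the commands:
cat repair_plan.txt
```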
7. In the Node Advanced page, from the Maintenance command group execute the autorepair command
with the Enable option to re-enable the HyperStore automated repair features in the service region.
The target node can be any node in the region. Leave the "Type" option unselected so that all repair
types are enabled. This also re-enables proactive repair.
This completes the procedure for removing a node from your cluster.
Note that the CMC's Uninstall Node function deletes the node from the HyperStore cluster configuration and
removes all HyperStore software from the node. It does not delete the Cassandra metadata or HyperStore
object data from the node.
IMPORTANT ! If the node was "live" when you removed it -- so that the Uninstall Node operation
included a decommissioning process -- make sure that in the Operation Status page the Uninstall
Node operation shows as having completed successfully with no errors, before you consider manually
deleting the data that remains on the removed node. If the node was "dead" when you removed it -- so
that you had to subsequently run repair operations on the remaining nodes in the cluster -- make sure
that in the Operation Status page the Uninstall Node operation and also all those repair operations
show as having completed successfully with no errors, before you consider manually deleting the data
that remains on the removed node.
If you want to replace a dead node with a new node:
l First perform the procedure for "Removing a Node" (page 236) (for the dead node).
Note If your current number of nodes is at the minimum required by your storage policies, the
system will not allow you to remove the node in the standard way. If this is your circumstance --
you have the minimum number of nodes required by your storage policies and one of those
nodes is dead -- please contact Cloudian Support for guidance.
l Then perform the procedure for "Adding Nodes" (page 212) (for the new node).
If you want to replace a live node (a node on which the Cassandra Service and HyperStore Service are running and reachable) with a new node:
l First perform the procedure for "Adding Nodes" (page 212) (for the new node).
l Then perform the procedure for "Removing a Node" (page 236) (for the live node that you want to
remove).
l HyperStore will use proactive repair to automatically populate the node with any data that the node is
responsible for storing but that is missing due to the node having been offline.
IMPORTANT ! The longest node outage that a proactive repair can cover for is four hours (by
default configuration). If a node is down for more than four hours you need to take manual
steps to fully repair it. For detail see "Repairing a Node That's Been Down for Longer than
the Proactive Repair Limit" (page 247).
l If the node that you are restoring is a Redis slave node (for either the Redis Credentials DB or the
Redis QoS DB), when you bring the node back online it will automatically sync with the Redis master
node to get the most current data.
If you made any configuration changes to your cluster while the node was down, from the Configuration
Master node push the changes out to the cluster after the node is back up.
Optionally, you can run a cleanup on the node in order to remove from the node any data that should no
longer be there. This would apply, for example, if service users deleted some of their stored objects from the
system while the node was down. In this case after being brought back into service the node will still be storing
replicas or erasure coded fragments of those objects, resulting in wasted use of disk space. Cleaning the node
removes this "garbage" data. For cleanup command details see "hsstool cleanup" (page 310) (if you have
erasure coded data in your system use the command's -a option so that it cleans erasure coded data as well as
replica data).
5.9. Restoring a Node That Has Been Offline
Note Before cleaning a node you should wait until any proactive repair that’s automatically run on the
node has completed. You can check this on the CMC's Repair Status page (Cluster -> Repair
Status). Wait until the node’s status displays as "All Clear", and then you can clean the node.
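For reference, a command-line sketch of the cleanup described above. The hostname is illustrative, and the hsstool invocation is an assumption based on the cleanup options named in this section -- verify against "hsstool cleanup" (page 310) before running it:

```shell
# Assemble the cleanup command for the restored node (echoed for
# review, not executed). -a also cleans erasure coded data, as
# recommended above when the system holds erasure coded objects.
NODE="restored-node1"
CLEANUP_CMD="hsstool -h $NODE cleanup -a"
echo "$CLEANUP_CMD"
```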
Also optionally, you can return to the node any specialized service role that the node was playing before it went down. If the node had been acting as a "master" or "primary" within one of the HyperStore system’s specialized services, then when the node went offline that role would have failed over to a different node. If you want you can return the master or primary role to the restored node after it’s back online, though it is not necessary to do so.
How going down and then being brought back up affects a node's specialized service roles...
The table below shows how having been down impacts a node’s specialized service role(s).
If before going down the node was…    Then while the node was down that role…                   And when brought back online the node is now…
Credentials DB master                 Automatically failed over to a Redis Credentials slave    Redis Credentials slave
To see what specialized service role(s) a restored node is currently playing, go to the CMC's Cluster Inform-
ation page (Cluster -> Cluster Config -> Cluster Information).
If you want to change the node’s current role assignment(s), see the instructions for "Change Node Role
Assignments" (page 259).
5.9.1. Repairing a Node That's Been Down for Longer than the Proactive
Repair Limit
In mts.properties.erb the setting "hyperstore.proactiverepair.queue.max.time" (page 490) sets the max-
imum time for which proactive repair jobs can be queued for a node that is unavailable. The default is 4 hours.
This time limit prevents Cassandra from being over-loaded with metadata relating to proactive repair, and
ensures that proactive repair is used only for its designed purpose, which is to repair object data from a rel-
atively brief time period.
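The decision this limit implies can be sketched as a simple time comparison. The helper below is illustrative only, not a HyperStore tool: it takes Unix-epoch timestamps for when the node went down and came back up, and the 4-hour default comes from the setting named above.

```shell
# Illustrative only -- not a HyperStore utility. Decide whether a node's
# downtime exceeded the proactive repair queueing limit (default 4 hours,
# per hyperstore.proactiverepair.queue.max.time). Arguments are Unix
# timestamps for down time and up time, plus an optional limit in hours.
exceeded_pr_limit() {
    down_at=$1; up_at=$2; limit_hours=${3:-4}
    [ $(( up_at - down_at )) -gt $(( limit_hours * 3600 )) ]
}

# Example: a node down for 5 hours exceeds the 4-hour default.
if exceeded_pr_limit 1700000000 1700018000; then
    echo "downtime exceeded limit: run a full repair after proactive repair"
else
    echo "downtime within limit: proactive repair alone is sufficient"
fi
```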
1. Monitor the automatic proactive repair that initiates on the node when the node starts up, until the proactive repair completes. You can check the CMC's Repair Status page periodically to see whether proactive repair is still running on the node that you've brought back online. This proactive repair will repair the new and changed objects from the period when proactive repair metadata was still being written to Cassandra for the node.
2. After proactive repair on the node completes, manually initiate a full repair of the node. For information
on manually initiating a repair see hsstool repair and hsstool repairec. This will repair the new and
changed objects from the period after the proactive repair queueing time maximum was reached and
before the node came back online.
1. For all "Replication Across Data Centers" and "Replicated EC" storage policies that you have con-
figured in your system and that include the DC that you want to put into maintenance, use the CMC's
Storage Policies page (Cluster -> Storage Policies) to change the read and write consistency require-
ment to "Local Quorum" (if that is not already the consistency requirement that you are using).
2. Use the CMC's Cluster Information page (Cluster -> Cluster Config -> Cluster Information) to see
whether any of the following specialized service roles are being performed by nodes in the DC that you
want to put into maintenance; and if so, move each such role to a node in a different DC. For instruc-
tions see "Change Node Role Assignments" (page 259).
3. Use the CMC's Alert Rules page (Alerts -> Alert Rules) to disable alerts related to the DC that you
want to put into maintenance. Toward the bottom of the page, in the DC Alert Configuration section, set
the DC to Disabled.
To confirm that alerts for the DC are disabled, go to the Data Centers page (Cluster -> Data Centers)
and next to the DC name you should see a gray triangle with an exclamation mark.
IMPORTANT ! Do not change the IP addresses of the nodes in the DC for which you are per-
forming maintenance.
5. After completing maintenance on the DC, verify network connectivity from your operational DC(s) to the
HyperStore nodes in the DC for which you performed maintenance.
6. Use the CMC's Alert Rules page to re-enable alerts related to the DC for which you performed maintenance. Toward the bottom of the page, in the DC Alert Configuration section, set the DC to Enabled.
To confirm that alerts for the DC are enabled, go to the Data Centers page and next to the DC name
you should no longer see a gray triangle with an exclamation mark.
7. For all "Replication Across Data Centers" and "Replicated EC" storage policies that you have con-
figured in your system and that include the DC for which you performed maintenance, use the CMC's
Storage Policies page to change the read and write consistency requirement back to your desired set-
tings.
8. Use the CMC's Node Advanced page to execute hsstool repair on each node in the DC for which you
performed maintenance, one node at a time, allowing the repair to complete on one node before start-
ing a repair on the next node. These are long-running operations. (This step is applicable only if you
are using one or more replication [not erasure coding] storage policies in the DC.)
9. Use the CMC's Node Advanced page to execute hsstool repairec on any one node in the DC for which
you performed maintenance. This repairs all erasure coded data in the DC. This is a long-running oper-
ation. (This step is applicable only if you are using one or more erasure coding storage policies in the
DC.)
5.11. Backing Up and Restoring a Cluster
This procedure is for backing up and restoring an entire HyperStore cluster (i.e. an entire HyperStore service
region). This procedure should not be used for partial backups and restores.
Throughout this procedure, it’s assumed that you have used the default installation directories for HyperStore
binaries. If not, adjust the command paths stated in the procedure.
IMPORTANT ! Make sure you have enough space for the backup. If you back up on to nodes in your
HyperStore cluster, this will double disk usage within the cluster.
Note This procedure for backing up and restoring a cluster is not supported if you have the Hyper-
Store Shell enabled and the root password disabled.
Redis backup files are generated daily at 11PM (more precisely, the backup starts at 11PM and may take some time to complete), so you always have access to those files. However, if you want a more current Redis backup you can follow steps 1a and 1b below to re-generate the dump-credentials.rdb and dump-qos.rdb files.
Logging output from the redisBackup script runs is written to /var/log/redis/redisBackup.log. (A separate log file
cloudian-redisBackup.log merely records information regarding the launching of the backup script by cron.d.)
a. On the Redis Credentials master node, use the Redis CLI to perform a BGSAVE operation:
# /opt/redis/redis-cli -h <hostname> -p 6379 BGSAVE
The above command will update the dump-credentials.rdb file in the Redis data directory
(/var/lib/redis by default).
Note The BGSAVE operation saves the database in the background. To check when the
save is done you can use the Redis CLI LASTSAVE command, which returns the Unix
time of the most recently completed save.
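The LASTSAVE check described in the note above can be scripted. The sketch below is illustrative and not part of HyperStore: it takes the redis-cli invocation to use as its arguments (for example `/opt/redis/redis-cli -h <hostname> -p 6379`), triggers BGSAVE, and polls LASTSAVE until the timestamp advances, which happens only when a background save completes.

```shell
# Illustrative sketch, not a HyperStore tool. "$@" is the redis-cli
# command to use, e.g.:
#   bgsave_and_wait /opt/redis/redis-cli -h <hostname> -p 6379
bgsave_and_wait() {
    before=$("$@" LASTSAVE)          # Unix time of last completed save
    "$@" BGSAVE > /dev/null          # start a background save
    until [ "$("$@" LASTSAVE)" != "$before" ]; do
        sleep 2                      # LASTSAVE advances only when the save finishes
    done
    echo "background save complete"
}
```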
b. On the Redis QoS master node, use the Redis CLI to perform a BGSAVE operation:
# /opt/redis/redis-cli -h <hostname> -p 6380 BGSAVE
The above command will update the dump-qos.rdb file in the Redis data directory.
2. Back up Cassandra and HyperStore data on each of your HyperStore nodes, using the third party backup tool of your choice:
l For Cassandra data, on each node: With your third party tool, back up the Cassandra data directory. (To check which directory is the Cassandra data directory, see the configuration setting common.csv: "cassandra_data_directory" (page 445).)
l For HyperStore data, on each node: With your third party tool, back up the HyperStore data dir-
ectories. (To check which directories are the HyperStore data directories, see the configuration
setting common.csv: "hyperstore_data_directory" (page 423)).
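As one concrete example of a "third party backup tool of your choice", the sketch below archives a list of data directories with plain tar. The paths shown in the usage comment are illustrative; confirm your actual Cassandra and HyperStore data directories via the common.csv settings cited above.

```shell
# Illustrative sketch using tar as the backup tool. First argument is the
# destination directory for the archives; remaining arguments are the data
# directories to back up (check common.csv for your actual paths).
backup_dirs() {
    dest=$1; shift
    for dir in "$@"; do
        name=$(echo "$dir" | tr '/' '_')   # e.g. /var/lib/cassandra -> _var_lib_cassandra
        tar -czf "$dest/backup${name}.tar.gz" -C / "${dir#/}"
    done
}
# Hypothetical usage (example paths, not necessarily yours):
#   backup_dirs /mnt/backup /var/lib/cassandra /hyperstore1
```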
2. Restore Cassandra and HyperStore data directories on each of your HyperStore nodes.
3. Restore the Redis Credentials and Redis QoS databases.
5.12. Data Repair and Data Cleanup
Through its storage policies feature, HyperStore provides you the option of using eventual consistency for
writes of S3 object data and metadata. For example, in the context of a 3X replication storage policy you can
configure a policy such that the system returns a success response to an S3 client’s PUT Object request so
long as two of the three replicas can be written at the time of the request. As a second example, in the context
of a 4+2 erasure coding storage policy you can configure a policy to return a success response to a PUT
Object request so long as five of the six erasure coded fragments can be written at the time of the request.
Eventual consistency can reduce S3 write request latency and increase S3 write availability while still provid-
ing a high degree of data durability assurance. Eventual consistency also means that for a given object, there
may be times when not all of the object’s intended replicas or EC fragments exist in the system. For example, in
a 3X replication context there may be times when only two replicas of an object exist in the system, rather than
the intended three replicas.
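For the replication example above, the 2-of-3 threshold is the standard quorum rule, floor(n/2) + 1. A quick sketch of that arithmetic follows; note that the erasure coding example's 5-of-6 threshold is a separately configured fragment count, not an instance of this formula.

```shell
# Write quorum for n replicas: floor(n/2) + 1. Purely illustrative
# arithmetic -- not a HyperStore utility.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

echo "3X replication write quorum: $(quorum 3) of 3"   # 2 of 3
echo "5X replication write quorum: $(quorum 5) of 5"   # 3 of 5
```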
HyperStore automatically implements several mechanisms to detect and replace missing replicas or EC frag-
ments: repair-on-read, proactive repair, and scheduled auto-repair.
5.12.1.1. Repair-On-Read
Whenever a read request is processed for a particular replicated object, all replicas of the object are checked
and any missing or out-of-date or corrupted replicas are automatically replaced or updated asynchronously.
This repair process occurs whether the read succeeds in meeting consistency requirements or not, so long
as at least one valid replica exists in the system.
Repair-on-read is also performed for erasure coded object reads, in the event that there are enough fragments
to decode the object but one or more of the object's fragments are missing. For example, if 4+2 erasure coding
is being used and the system when reading an object finds only 4 fragments for the object, the read will suc-
ceed and also the system will asynchronously replace the object's 2 missing fragments.
Proactive repair covers several circumstances in which a write succeeds in the system as a whole but fails to be written to a particular endpoint node.
The maximum time for which proactive repair jobs can be queued for a node that is unavailable is 4 hours by
default, and is configurable. Also configurable is the regular interval at which each node in the cluster checks
whether there are any queued proactive repair jobs for itself, and executes those jobs if there are any (default =
1 hour). For more information see "Configuring Automatic Data Repair" (page 254) .
l To repair replicated object data, each node is scheduled to have hsstool repair automatically run on it
once every 30 days.
l To repair erasure coded object data, each node is scheduled to have hsstool repairec automatically
run on it once every 29 days.
l To repair metadata in Cassandra, each node is scheduled to have hsstool repaircassandra auto-
matically run on it once every 7 days.
The repairs are scheduled and launched in such a way that within a service region only one repair of each
type is running at a time. For each repair type the system maintains a queue of nodes scheduled for auto-
repair -- a replicated data repair queue, an erasure coded data repair queue, and a Cassandra metadata
repair queue. For each repair type, repair of the node that is next in queue will not start until the repair operation completes on the node on which a repair is currently running. Consequently, if you have many nodes and/or a high volume of data in your system, the actual time between repairs of a given node may be longer than the scheduled interval. This effect is most pronounced with erasure coded data repair (which can be very long-running) and least pronounced with Cassandra repair (which is relatively fast).
Note that when a repair operation is running on a target node, the scope of repair activity will extend to other
nodes as well. In the case of repair of replicated object data, repair of a target node will also make sure that for
objects that fall within the target node's primary token range, the objects' replicas also are present on the other
nodes where they are supposed to be. That same repair dynamic is true also for repair of replicated Cassandra
metadata.
In the case of erasure coded object data, for single data center storage policies and for multi-data center replicated EC storage policies, repair of a target node has the effect of assessing and repairing all erasure coded data on all nodes within the data center where the target node resides. And for multi-data center distributed EC storage policies, repair of a target node has the effect of assessing and repairing all erasure coded data in all of the participating data centers. In a multi-data center HyperStore service region, the erasure coded data auto-repair queue is ordered in such a way that the target nodes alternate among the data centers -- for example after a repair completes on a target node in DC1, then the next target node will be from DC2, and then after that completes the next target node will be from DC3, and then after that completes the next target node will be from DC1 again, and so on.
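The alternating queue order described above can be illustrated with a small round-robin sketch. This is toy code, not HyperStore's implementation: given pending target nodes tagged by data center, it emits one node per data center in turn.

```shell
# Toy illustration of DC-alternating ordering. stdin: "<dc> <node>" lines
# (one pair per line); stdout: the nodes, taking one from each data
# center in rotation until all queues are empty.
alternate_by_dc() {
    awk '{ q[$1] = q[$1] " " $2; if (!seen[$1]++) dcs[++n] = $1 }
         END {
             remaining = NR
             while (remaining > 0)
                 for (i = 1; i <= n; i++) {
                     cnt = split(q[dcs[i]], a)   # pop head of this DC queue
                     if (cnt == 0) continue
                     print a[1]
                     remaining--
                     s = ""
                     for (j = 2; j <= cnt; j++) s = s " " a[j]
                     q[dcs[i]] = s
                 }
         }'
}
# e.g. feeding DC1 n1, DC1 n2, DC2 n3, DC2 n4, DC3 n5 yields
# n1 n3 n5 n2 n4 -- the DC1/DC2/DC3 alternation described above.
```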
Note The intervals for scheduled auto-repair are configurable as described in "Configuring Auto-
matic Data Repair" (page 254).
Note The system allows repair operations of different types -- such as an hsstool repair operation and
an hsstool repairec operation -- to run concurrently within the same service region.
l After removing a dead node from your cluster. See "Removing a Node" (page 236).
l When a node is brought back online after being unavailable for longer than the configurable maximum
time that proactive repair can handle. See "Restoring a Node That Has Been Offline" (page 246).
For repair command syntax see hsstool repair, hsstool repairec, and hsstool repaircassandra.
The Scheduled Auto-Repair feature is configurable by these settings in the CMC’s Configuration Settings
page (Cluster -> Cluster Config -> Configuration Settings):
For important details about how these auto-repair schedule settings are applied, while on the CMC's
Configuration Settings page click Help.
Note If you wish, you can have some or all of the auto-repairs of replica data and erasure coded data use the "computedigest" option to combat bit rot. This feature is controlled by the "auto_repair_computedigest_run_number" (page 424) setting in common.csv. By default "computedigest" is not used in auto-repair runs.
Note If for some reason you want to trigger proactive repair on a particular node immediately,
you can do so by running the hsstool proactiverepairq command with the "-start" option.
If you edit properties file settings, be sure to push your changes to your cluster and to restart the HyperStore
Service (for a hyperstore-properties.erb edit) and/or the S3 Service (for an mts.properties.erb edit). For instruc-
tions see "Pushing Configuration File Edits to the Cluster and Restarting Services" (page 411).
Repairs that you have initiated via the CMC can also be tracked in the Operation Status page (Cluster ->
Operation Status). If you initiate a repair via the command line, you can track its progress by executing hsstool opstatus (either on the command line or via the CMC's Node Advanced page [Cluster -> Nodes -> Advanced]).
To view the cluster-wide schedule for auto-repairs, execute the hsstool repairqueue command on any node
(either on the command line or via the CMC's Node Advanced page).
You can customize these alerts — including an option for having SNMP traps sent — in the CMC's Alert Rules
page (Alerts -> Alert Rules).
For detailed repair status information for a recently finished repair run — for example, to get more information
about a FAILED repair run that you’ve been alerted to — in the CMC go to the Node Advanced page and from
the "Info" command group execute the hsstool opstatus command for the node on which the repair ran.
HyperStore allows you to temporarily disable its automatic data repair features, and also to stop data repairs
that are in progress.
IMPORTANT ! The scheduled auto-repair feature and the proactive repair feature are both important
for maintaining data integrity in your system. Do not permanently disable either of these features.
Note The system automatically disables the auto-repair and proactive repair features when you per-
form a HyperStore version upgrade and when you add nodes to your cluster; and the system auto-
matically re-enables the auto-repair and proactive repair features at the conclusion of those operations.
So the steps below are only needed if you want to temporarily disable these features in circumstances
other than system upgrade or system expansion.
To disable all automatic data repairs in a service region, go to the CMC's Node Advanced page and
execute the maintenance command autorepair with the Disable option. This will prevent any new schedule-
based auto-repairs or proactive repairs from launching. The target node can be any node in the service region;
automatic data repairs will be disabled throughout the service region regardless of which node receives the
command. Leave the "Type" option unselected.
Note Disabling automatic data repairs prevents new scheduled auto-repairs and new proactive
repairs from launching, but it does not stop repairs that are currently in-progress. For information
about doing the latter see "Stopping In-Progress Data Repairs" (page 258).
After completing the system operation that you are undertaking, be sure to re-enable automatic data repairs.
If you wish you can disable just a certain type of automatic data repair, rather than disabling all automatic data
repairs.
To disable just scheduled auto-repair for replicated object data, or just scheduled auto-repair for erasure
coded object data, or just scheduled auto-repair for Cassandra metadata, go to the CMC's Node Advanced
page and execute the maintenance command autorepair with the Disable option and with a "Type" specified
(Replicas or EC or Cassandra).
To disable just proactive repair go to the CMC's Node Advanced page and execute the maintenance com-
mand proactiverepair with the Disable option.
Be sure later to re-enable whichever type of automatic data repair you disabled, by using the same inter-
face.
l To stop an in-progress repair of replicated object data, use hsstool repair with the "-stop" option.
l To stop an in-progress repair of erasure coded object data, use hsstool repairec with the "-stop"
option.
l To stop an in-progress repair of Cassandra metadata, use hsstool repaircassandra with the "-stop"
option.
l To stop an in-progress proactive repair, use hsstool proactiverepairq with the "-stop" option.
It does not matter whether the repair in progress was initiated automatically by the system or initiated by an
operator -- in either case you can stop it with the "-stop" option.
Note A typical Cassandra metadata repair would take minutes, and a typical proactive repair would
take minutes or hours, but a replicated object data repair or erasure coded object data repair for a node
may take days (depending on the amount of data involved and on network bandwidth). If a replica
repair or EC data repair is in progress you can check either the CMC's Operation Status page (Cluster
-> Operation Status) or the Repair Status page (Cluster -> Repair Status) to see how far along the
repair is and approximately how much longer it will take, before deciding whether to stop it. The Repair
Status page also has progress information for proactive repairs.
5.13. Changing Node Role Assignments
For a summary of where specialized service roles are currently assigned in your cluster, go to the CMC's
Cluster Information page (Cluster -> Cluster Config -> Cluster Information).
If you wish you can change node role assignments, by using the HyperStore installer on the Configuration
Master node. The installer supports the role assignment operations listed below.
Note Each of the role assignment change operations listed below entails doing a configuration push and a restart of the affected service (as described in the instructions for each operation). If you need to perform several of these role assignment change operations, do them one at a time, with a configuration push and affected-service restart at the end of each operation. Do not perform multiple role assignment change operations while deferring the configuration push and service restart.
l "Move the Credentials DB Master Role or QoS DB Master Role" (page 270)
l "Move or Add a Credentials DB Slave or QoS DB Slave" (page 259)
l "Move the Cassandra Seed Role" (page 261)
l "Reduce or Change the List of CMC Hosts" (page 262)
l "Move the Redis Monitor Primary or Backup Role" (page 273)
l "Move the Cron Job Primary or Backup Role" (page 263)
l "Move the Configuration Master Primary or Backup Role" (page 266)
l "Change Internal NTP Servers or External NTP Servers" (page 265)
1. On the Configuration Master node, change into the installation staging directory and then launch the
HyperStore installer.
# ./cloudianInstall.sh
If you are using the HyperStore Shell (HSH) as a Trusted user, from any directory on the Configuration Master node you can launch the installer with this command:
$ hspkg install
Once launched, the installer's menu options (such as referenced in the steps below) are the same
regardless of whether it was launched from the HSH command line or the OS command line.
2. From the installer’s main menu, select "Advanced Configuration Options". Then from the Advanced Con-
figuration Options menu select "Change server role assignments". This displays the Change Server
Role Assignments menu.
3. From the Change Server Role Assignments menu select the option for Credentials slave nodes or the
option for QoS slave nodes.
4. The installer prompts you to specify a comma-separated list of all the hosts on which you want slaves to
run. The installer’s prompt text indicates [in brackets] which hosts are the current slaves. You can use
your entry at the prompt to either move a slave or add a slave.
Note If your HyperStore system has multiple data centers, the installer will prompt you sep-
arately for each data center’s slave host list. If for some data centers you want to continue using
the same slave(s) you can just press enter at the prompt rather than entering a host list.
5. After completing the interaction for specifying the slave locations, return to the Change Server Role
Assignments menu and select "Review cluster configuration". Then at the prompt confirm that you want
to apply the updated configuration to the Configuration Master.
6. Go to the installer’s main menu, then choose "Cluster Management" → "Push Configuration Settings to
Cluster" and follow the prompts.
7. Return to the "Cluster Management" menu, then choose "Manage Services" and restart the following
services:
l Redis Credentials or Redis QoS (whichever service you moved or added a slave for)
l Redis Monitor
l S3 Service (restarting this service also results in a restart of the Admin Service)
After these services have successfully restarted you can exit the installer.
To verify your change, log into the CMC and go to the Node Status page (Cluster -> Nodes -> Node Status).
Review the service status information for the node(s) on which you’ve located the slave(s). Among the listed
services on the node(s) should be "Redis Cred" (for a Redis Credentials slave) or "Redis QoS" (for a Redis
QoS slave). The absence of "(Master)" appended to the service name indicates it’s a slave instance, not a mas-
ter.
If you wish you can change the list of Cassandra seed nodes for your system, by following the steps below.
Note Any of your nodes can play the "seed" role. But bear in mind the recommendation that you have
three seed nodes per data center. For performance reasons it's not advisable to have more seed nodes
than this.
1. On the Configuration Master node, change into the installation staging directory and then launch the
HyperStore installer.
# ./cloudianInstall.sh
If you are using the HyperStore Shell (HSH) as a Trusted user, from any directory on the Configuration Master node you can launch the installer with this command:
$ hspkg install
Once launched, the installer's menu options (such as referenced in the steps below) are the same
regardless of whether it was launched from the HSH command line or the OS command line.
2. From the installer’s main menu, select "Advanced Configuration Options". Then from the Advanced Configuration Options menu select "Change server role assignments". This displays the Change Server Role Assignments menu.
3. From the Change Server Role Assignments menu select "Cassandra seed nodes".
4. At the prompt enter a comma-separated list of hosts that you want to serve as Cassandra seed nodes. If
you want to keep the same host just press Enter rather than specifying a different host. (If you have a
multi-region system, you will be prompted for a list of Cassandra seed nodes for each region.)
5. After completing the interaction for specifying Cassandra seed nodes, return to the Change Server Role Assignments menu and select "Review cluster configuration". Then at the prompt confirm that you want to apply the updated configuration to the Configuration Master.
6. Go to the installer’s main menu and choose "Cluster Management" → "Push Configuration Settings to
Cluster" and follow the prompts.
7. Return to the "Cluster Management" menu, then choose "Manage Services" and restart the Cassandra
service. After the Cassandra service has successfully restarted you can exit the installer.
To verify your change, you can look at the /opt/cassandra/conf/cassandra.yaml configuration file on any one of
your nodes and confirm that the "seeds:" parameter is set to the list of hosts that you specified.
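The verification described above can be scripted. The sketch below is illustrative; the sed pattern assumes the common cassandra.yaml style in which the seed list appears as `- seeds: "host1,host2"`.

```shell
# Print the value of the first "seeds:" entry in a cassandra.yaml-style
# file. Illustrative sketch; the default file path cited above is
# /opt/cassandra/conf/cassandra.yaml.
seeds_of() {
    sed -n 's/.*seeds:[[:space:]]*"\{0,1\}\([^"]*\)"\{0,1\}.*/\1/p' "$1" | head -n 1
}
# e.g.: seeds_of /opt/cassandra/conf/cassandra.yaml
```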
If you wish you can reduce or change the list of hosts on which the CMC runs, by following the steps below.
1. On the Configuration Master node, change into the installation staging directory and then launch the
HyperStore installer.
# ./cloudianInstall.sh
If you are using the HyperStore Shell (HSH) as a Trusted user, from any directory on the Configuration
Master node you can launch the installer with this command:
$ hspkg install
Once launched, the installer's menu options (such as referenced in the steps below) are the same
regardless of whether it was launched from the HSH command line or the OS command line.
2. From the installer’s main menu, select "Advanced Configuration Options". Then from the Advanced Con-
figuration Options menu select "Change server role assignments". This displays the Change Server
Role Assignments menu.
3. From the Change Server Role Assignments menu select "CMC nodes".
4. At the prompt enter a comma-separated list of hosts on which you want the CMC to run.
5. After completing the interaction for specifying your CMC host list, return to the Change Server Role
Assignments menu and select "Review cluster configuration". Then at the prompt confirm that you want
to apply the updated configuration to the Configuration Master.
6. Go to the installer’s main menu and choose "Cluster Management" → "Push Configuration Settings to
Cluster" and follow the prompts.
7. Return to the "Cluster Management" menu, then choose "Manage Services" and restart the CMC ser-
vice. After the CMC service has successfully restarted you can exit the installer.
The HyperStore Monitoring Data Collector resides on the same primary host and backup host as the cron jobs. If the cron job role automatically fails over from the primary host to the backup host, the Monitoring Data Collector role also fails over to the backup.
The system supports moving the primary cron job role (and with it, the primary Monitoring Data Collector role)
to a different host as described below. The same procedure also supports moving the backup cron job role
(and with it, the backup Monitoring Data Collector role) to a different host.
Note Do not assign the primary cron job role to the same host as your Configuration Master role. If the
cron job primary and the Configuration Master are on the same host and that host goes down, auto-
mated fail-over from the cron job primary to the cron job backup will not work.
1. On the Configuration Master node, change into the installation staging directory and then launch the
HyperStore installer.
# ./cloudianInstall.sh
If you are using the HyperStore Shell (HSH) as a Trusted user, from any directory on the Configuration Master node you can launch the installer with this command:
$ hspkg install
Once launched, the installer's menu options (such as referenced in the steps below) are the same
regardless of whether it was launched from the HSH command line or the OS command line.
2. From the installer’s main menu, select "Advanced Configuration Options". Then from the Advanced Con-
figuration Options menu select "Change server role assignments". This displays the Change Server
Role Assignments menu.
3. From the Change Server Role Assignments menu select "Cronjob/Cluster Monitor node".
4. At the prompts specify your desired primary host and backup host for the cron jobs / cluster monitor.
5. After completing the interaction for specifying cron job hosts, return to the Change Server Role Assign-
ments menu and select "Review cluster configuration". Then at the prompt confirm that you want to
apply the updated configuration to the Configuration Master.
6. Go to the installer’s main menu and choose "Cluster Management" → "Push Configuration Settings to
Cluster" and follow the prompts. When Puppet pushes the current configuration settings to the cluster it
will also automatically restart cron.d on the affected nodes. You do not need to manually restart any ser-
vices. When the Puppet push completes you can exit the installer.
To verify your change, log into the CMC and go to the Cluster Information page (Cluster -> Cluster Config ->
Cluster Information). Review the service information section to confirm that your System Monitoring/Cronjob
primary and backup hosts are what you want them to be.
By default the internal NTP servers synchronize with the following public external NTP servers:
l 0.centos.pool.ntp.org
l 1.centos.pool.ntp.org
l 2.centos.pool.ntp.org
l 3.centos.pool.ntp.org
Note For more on how HyperStore automatically configures an NTP set-up for the cluster, see "NTP Automatic Set-Up" (page 511).
As described below, the system supports changing the list of internal NTP servers (for data centers in which
you have more than four HyperStore nodes) and/or changing the list of external NTP servers.
1. On the Configuration Master node, change into the installation staging directory and then launch the
HyperStore installer.
# ./cloudianInstall.sh
If you are using the HyperStore Shell (HSH) as a Trusted user, from any directory on the Configuration Master node you can launch the installer with this command:
$ hspkg install
Once launched, the installer's menu options (such as referenced in the steps below) are the same
regardless of whether it was launched from the HSH command line or the OS command line.
2. From the installer’s main menu, select "Advanced Configuration Options". Then from the Advanced Con-
figuration Options menu select "Change server role assignments". This displays the Change Server
Role Assignments menu.
3. From the Change Server Role Assignments menu select "NTP Server Configuration".
4. At the first prompt specify the HyperStore hosts that you want to act as the internal NTP servers, as a
comma-separated list. You must use host names, not IP addresses. If you want to keep the same hosts
that the system is currently using, just press Enter rather than specifying different hosts.
Note If you have multiple HyperStore data centers you will be prompted separately for the
internal NTP host list for each data center.
5. At the next prompt specify the external NTP servers with which you want the internal NTP servers to syn-
chronize, as a comma-separated list. For these external servers you can use either FQDNs or IP
addresses. If you want to keep the same external servers that the system is currently using, just press
Enter rather than specifying a different server list.
6. After completing the interaction for specifying NTP Server Configuration, return to the Change Server
Role Assignments menu and select "Review cluster configuration". Then at the prompt confirm that you
want to apply the updated configuration to the Configuration Master.
7. Go to the installer’s main menu and choose "Cluster Management" → "Push Configuration Settings to
Cluster" and follow the prompts. When Puppet pushes the current configuration settings to the cluster it
will also automatically restart ntpd on the affected nodes. You do not need to manually restart ntpd.
When the Puppet push completes you can exit the installer.
To verify your change, log into the CMC and go to the Cluster Information page (Cluster -> Cluster Config ->
Cluster Information). Review the service information section to confirm that your internal NTP hosts and
external NTP hosts are what you want them to be.
5.13. Changing Node Role Assignments
that node to be the primary Configuration Master, and also configures a second node to be the Configuration
Master backup. Any edits that you make to configuration templates on the Configuration Master primary are
automatically sync’d to the Configuration Master backup. If the primary goes down, you can manually fail over
the active Configuration Master role to the backup host.
There are two different operations that you can perform in regard to the Configuration Master role:
Note You can find out which host is currently the Configuration Master backup host in the CMC's
Cluster Information page (Cluster -> Cluster Config -> Cluster Information).
1. On the Configuration Master primary host, change into the installation staging directory and then
launch the HyperStore installer.
# ./cloudianInstall.sh
$ hspkg install
Once launched, the installer's menu options (such as referenced in the steps below) are the same
regardless of whether it was launched from the HSH command line or the OS command line.
2. From the installer’s main menu, select "Advanced Configuration Options". Then from the Advanced Con-
figuration Options menu select "Change server role assignments". This displays the Change Server
Role Assignments menu.
3. From the Change Server Role Assignments menu select "Installer/Config Manager backup node".
("Installer/Config Manager" is what this menu calls the Configuration Master.)
4. At the prompt specify the host to which you want to move the Configuration Master backup role.
5. After completing the interaction for specifying the Configuration Master backup host, return to the
Change Server Role Assignments menu and select "Review cluster configuration". Then at the prompt
confirm that you want to apply the updated configuration to the Configuration Master.
6. Go to the installer’s main menu and choose "Cluster Management" → "Push Configuration Settings to
Cluster" and follow the prompts.
7. Exit the installer, wait at least one minute, then log into the CMC and go to the Cluster Information
page (Cluster -> Cluster Config -> Cluster Information). Review the service information section to con-
firm that your Configuration Master primary and backup hosts are what you want them to be.
Manually Fail Over the Configuration Master Role from the Primary to the Backup
In this scenario there is a problem with your primary Configuration Master, and you want the backup Con-
figuration Master to become active.
IMPORTANT ! For this procedure you log into the current Configuration Master backup host and
implement the whole procedure from that host. You can find out which host is currently the Con-
figuration Master backup host in the CMC's Cluster Information page (Cluster -> Cluster Config ->
Cluster Information).
1. On the Configuration Master backup host, change into the installation directory and then launch the
HyperStore installer.
# ./cloudianInstall.sh
2. From the installer’s main menu, select "Advanced Configuration Options". Then from the Advanced Con-
figuration Options menu select "Start or stop Puppet daemon".
3. At the prompt specify that you want to stop Puppet. This stops any Puppet daemons that are currently
running in your cluster. (Puppet is the underlying technology that HyperStore uses for cluster Con-
figuration Management.)
4. After the installer indicates that Puppet has been stopped, return to the Advanced Configuration
Options menu and select "Remove existing Puppet SSL certificates". This will remove existing Puppet
SSL certificates, with no further prompts.
5. Return to the Advanced Configuration Options menu and select "Change server role assignments".
This displays the Change Server Role Assignments menu.
6. From the Change Server Role Assignments menu select "Installer/Config Manager backup node".
7. At the prompt specify the host to which you want to move the Configuration Master (config manager)
backup role. You need to specify a new backup because running through this procedure from the
original backup converts that original backup into the new primary.
8. After completing the interaction for specifying the Configuration Master backup host, return to the
Change Server Role Assignments menu and select "Review cluster configuration". Then at the prompt
confirm that you want to apply the updated configuration to the Configuration Master.
9. Go to the installer’s main menu and choose "Cluster Management" → "Push Configuration Settings to
Cluster" and follow the prompts.
10. Go to "Cluster Management" → "Manage Services" and restart the CMC.
11. Optionally, if you want to leave the Puppet daemons running, from the installer’s main menu select
"Advanced Configuration Options". Then from the Advanced Configuration Options menu select "Start
or stop Puppet daemon", and choose to start the daemons. After the daemons have successfully started
you can exit the installer.
To verify your change, log into the CMC and go to the Cluster Information page (Cluster -> Cluster Config ->
Cluster Information). Review the service information section to confirm that your Configuration Master primary
and backup hosts are what you want them to be. The former backup (from which you implemented the above
procedure) should now be the primary and the new backup should be as you specified during the procedure.
Note If the Configuration Master primary is now on the same host as the cron job primary role you
should move the cron job primary role to a different host. If the cron job primary and the running Con-
figuration Master are on the same host and that host goes down, automated fail-over from the cron job
primary to the cron job backup will not work.
If you wish, you can move the Credentials DB Master role to a current Credentials DB slave node, or move the
QoS DB Master role to a current QoS DB slave node, by following the procedure in this section. If you do so,
the node that had been the Master will automatically become a slave.
Note The system does not support moving the Master role to a node that is not currently a slave.
The above command will update the dump-credentials.rdb file in the Redis data directory (/var/lib/redis by
default).
The above command will update the dump-qos.rdb file in the Redis data directory (/var/lib/redis by default).
Note The BGSAVE operation saves the database in the background. To check whether the save has
completed, use the Redis CLI LASTSAVE command, which returns the Unix time of the most recently
completed save. Do not move the Master role until the saving of the database backup
has completed.
l role is slave
l master_host is the current Master
l master_link_status is up
l master_sync_in_progress is 0
When submitting the Redis CLI command use port 6379 if the target node is a Credentials DB
slave, or use port 6380 if the target node is a QoS DB slave.
For example, if the Credentials DB Master role is currently on host "cloudian-node2" and you
want to move it to host "cloudian-node7" where there is currently a Credentials DB slave:
role:slave
master_host:cloudian-node2
master_link_status:up
master_sync_in_progress:0
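These checks can be scripted. The sketch below greps a captured replication status for the required fields. Here the status text is hard-coded sample output matching the example above; in practice you would capture it from the target node with a standard Redis CLI command such as redis-cli -h cloudian-node7 -p 6379 INFO replication (port 6380 for a QoS DB slave):

```shell
# Check a captured "INFO replication" result for slave readiness.
# The sample text below mirrors the example output above; in practice,
# capture it from the target node with redis-cli.
info='role:slave
master_host:cloudian-node2
master_link_status:up
master_sync_in_progress:0'

if echo "$info" | grep -q '^role:slave' \
   && echo "$info" | grep -q '^master_link_status:up' \
   && echo "$info" | grep -q '^master_sync_in_progress:0'; then
  echo "slave is in sync: safe to promote"
else
  echo "slave not ready"
fi
```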
Log into the CMC and go to Cluster → Nodes → Advanced. Select Command Type "Redis Monitor
Operations", then select a Cluster type (Credentials or QoS) and select Command "setClusterMaster".
For "Hostname" select the host to which you want to move the Master role. The drop-down list will show
only nodes that are currently slaves within the cluster type (Credentials or QoS) that you selected.
After making your selections, click Execute. The chosen slave will become the Master, while the Master
becomes a slave. The change happens immediately upon command execution.
3. Make the switch of the Master and slave permanent by updating your system configuration:
a. On the Configuration Master node (not the Credentials DB Master or QoS DB Master but rather
the Configuration Master from which cluster configuration is managed) change into the install-
ation staging directory and then launch the HyperStore installer.
# ./cloudianInstall.sh
$ hspkg install
Once launched, the installer's menu options (such as referenced in the steps below) are the
same regardless of whether it was launched from the HSH command line or the OS command
line.
b. From the installer’s main menu, select "Advanced Configuration Options". Then from the
Advanced Configuration Options menu select "Change server role assignments". This displays
the Change Server Role Assignments menu.
c. From the Change Server Role Assignments menu select the option for the master that you want
to move — Credentials DB Master or QoS DB Master.
d. At the prompt indicate the host to which you want to move the Master role. This should be the
same host that you checked in Step 1.
e. After completing the interaction for specifying the new Master host, return to the Change Server
Role Assignments menu and select "Review cluster configuration". Then at the prompt confirm
that you want to apply the updated configuration to the Configuration Master.
f. Return to the installer’s main menu, then choose "Cluster Management" → "Push Configuration
Settings to Cluster" and follow the prompts.
g. Return to the "Cluster Management" menu, then choose "Manage Services" and restart the fol-
lowing services, one service at a time:
l The DB service for which you’ve changed the master role (either Redis Credentials or
Redis QoS)
l HyperStore Service
l S3 Service
l Redis Monitor
l IAM Service
After these services have successfully restarted you can exit the installer.
To verify your change, log into the CMC and go to the Cluster Information page (Cluster -> Cluster Config ->
Cluster Information). Review the service information section to confirm that the Master that you moved is now
where you want it to be.
Now, the former Credentials DB or QoS DB slave has been promoted to Master and the former Master has
been demoted to slave.
IMPORTANT ! If you want to remove the demoted host (the former Master that’s now a slave) from
your cluster, you must first move its slave role to a different host in your cluster. For instructions see
"Move or Add a Credentials DB Slave or QoS DB Slave" (page 259).
IMPORTANT ! In a multi- data center HyperStore system, the Redis Monitor backup must remain in the
same data center as the Redis Monitor primary, and this must be the same data center as where the
Credentials DB Master is located.
The system supports moving the primary or backup Redis Monitor to a different host as described below.
1. On the Configuration Master node, change into the installation staging directory and then launch the
HyperStore installer.
# ./cloudianInstall.sh
$ hspkg install
Once launched, the installer's menu options (such as referenced in the steps below) are the same
regardless of whether it was launched from the HSH command line or the OS command line.
2. From the installer’s main menu, select "Advanced Configuration Options". Then from the Advanced Con-
figuration Options menu select "Change server role assignments". This displays the Change Server
Role Assignments menu.
3. From the Change Server Role Assignments menu select the option for the Redis Monitor instance that
you want to move (Primary Redis Monitor or Backup Redis Monitor).
4. At the prompt specify the host to which you want to move the Redis Monitor instance.
5. After completing the interaction for specifying the new Redis Monitor location, return to the Change
Server Role Assignments menu and select "Review cluster configuration". Then at the prompt confirm
that you want to apply the updated configuration to the Configuration Master.
6. Go to the installer’s main menu and choose "Cluster Management" → "Push Configuration Settings to
Cluster" and follow the prompts.
7. From the "Cluster Management" menu choose "Manage Services" and restart the Redis Monitor ser-
vice. After the Redis Monitor successfully restarts you can exit the installer.
To verify your change, log into the CMC and go to the Cluster Information page (Cluster -> Cluster Config ->
Cluster Information). Review the service information section to confirm that your Redis Monitor primary and
backup hosts are what you want them to be.
5.14. Cron Jobs and Automated System Maintenance
jobs, go to the CMC's Cluster Information page (Cluster -> Cluster Config -> Cluster Information). In the Ser-
vice Information section you will see the identity of the System Monitoring / Cronjob Primary Host, from which
the cron jobs are run.
Note You will also see the identity of the System Monitoring / Cronjob Backup Host. If the primary cron
job instance goes down (due to the host going down or crond going down on the host) and remains
down for 10 minutes, the backup cron job instance will automatically take over the primary role and
start running the system cron jobs.
The cron jobs themselves are configured in the /etc/cron.d/cloudian-crontab file on the host node. If you want to
adjust the scheduling of these cron jobs you should do so via Puppet configuration management, by editing
the configuration template file /etc/cloudian-<version>-puppet/modules/cloudians3/templates/cloudian-
crontab.erb on the Configuration Master node. For general information on how to configure cron job schedul-
ing, refer to any reputable resource on the topic.
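As a generic illustration of /etc/cron.d entry syntax (the fields are minute, hour, day-of-month, month, day-of-week, user, and command; the script path here is hypothetical, not an actual HyperStore job):

```
# min hour dom mon dow user command
15 * * * * root /opt/example/hourly-task.sh > /dev/null 2>&1
```

This would run the hypothetical task at 15 minutes past every hour, as root, discarding all of its output.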
Note that most of the cron jobs configured in cloudian-crontab.erb have "> /dev/null 2>&1" appended to them
and therefore they direct all output to /dev/null.
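The effect of that redirection can be seen with a minimal shell sketch (illustrative only; the function name is made up):

```shell
# "> /dev/null 2>&1" sends stdout to /dev/null, then points stderr at
# the same destination, so the command produces no visible output.
produce_output() { echo "to stdout"; echo "to stderr" >&2; }

captured=$( { produce_output > /dev/null 2>&1; } 2>&1 )
echo "captured output: [${captured}]"
```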
For information on the Admin API calls referenced in this section, see the Cloudian HyperStore Admin
API Reference.
System cron jobs are implemented for the following system tasks:
This cron job invokes a process that logs "snapshot" system statistics each minute in support of the Cloudian
Management Console’s system monitoring functionality.
This cron job uploads a daily diagnostics file to a configurable S3 URI, if the HyperStore "Phone Home" feature
(also known as "Smart Support") is enabled. Typically this would be the S3 URI of Cloudian Support.
Note The upload will occur within an hour of the time specified in the crontab. A random wait time is
built into the upload process so that not all Cloudian customer environments are uploading to Cloudian
Support at the same time.
Note For an overview of how the HyperStore system tracks service usage by groups and users, see
"Usage Reporting Feature Overview" (page 165).
This cron job writes snapshots of per-user and per-group counts for stored bytes and number of stored objects
from the Redis QoS database over to the "Raw" column family in the Metadata DB's "Reports" keyspace. The
operation is applied only to users and groups that have uploaded or deleted objects in the time since the oper-
ation was last executed.
This cron job writes snapshots of per-user and per-group counts for stored bytes and number of stored objects
from the Redis QoS database over to the "Raw" column family in the Metadata DB's "Reports" keyspace. The
operation is applied to all users and groups.
This cron job repairs Redis QoS stored bytes and stored object counts for up to 1000 active or "dirty" users.
See the Admin API method description for more detail.
Scope: One job per region for each roll-up granularity (hour, day, month)
These three cron jobs create summary (or "roll-up") usage reports data from more granular reports data. The
hourly roll-up data is derived from the raw data (the transactional data and stored bytes/stored object count
snapshot data). The daily roll-up data and monthly roll-up data are both derived from the hourly roll-up data.
Note Hourly rollup jobs that fail -- such as when a relevant service is down or unreachable -- will be retried.
The time span for which failed rollup jobs are eligible for retry is configurable via the
"usage.rollup.hour.maxretry" (page 480) property in mts.properties.erb. The default is 6 hours.
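For illustration, the setting takes a simple key=value form along these lines (check the template itself before editing, as the exact syntax in the .erb file may differ):

```
usage.rollup.hour.maxretry=6
```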
This cron job moves bucket logging data from the Metadata DB's BucketLog column family into the S3 storage
system, in support of the S3 Bucket Logging feature.
This cron job performs two tasks in support of S3 bucket lifecycle policies:
l Transitions (auto-tiers) objects to a tiering destination system, if the objects have reached their sched-
uled auto-tiering interval or date.
l Deletes objects from HyperStore storage (or from the remote tiered storage system if they’ve been auto-
tiered), if the objects have reached their scheduled expiration interval or date.
The cron job distributes auto-tiering and auto-expiry processing workload across the cluster. Specifically, the
auto-tiering and auto-expiry processing workload associated with a given bucket is spread out across all the
nodes that participate in the storage policy used by that bucket. For example, in a HyperStore system with two
data centers, if a bucket uses a storage policy that stores data only in one of the two data centers, the pro-
cessing workload of that bucket's auto-tiering and auto-expiry will be spread among the nodes in that one data
center. By contrast, for a bucket that uses a storage policy that stores data in both data centers, the processing
workload associated with that bucket's auto-tiering will be spread among all nodes in both data centers.
Note For more information on the HyperStore auto-tiering feature, see "Auto-Tiering Feature Over-
view" (page 126).
This cron job executes queued object-restore jobs. Restore jobs are placed in the queue when S3 clients invoke
the S3 API method RestoreObject, to restore a local copy of objects that have been auto-tiered to a remote stor-
age system. The cron job executes queued restore jobs every six hours.
This cron job generates bucket content inventories, for any buckets that have bucket inventories configured.
For the relevant S3 API see "PutBucketInventoryConfiguration" in the Cloudian HyperStore AWS APIs Sup-
port Reference. For CMC support of this feature, while on the CMC's Bucket Properties page click Help.
This hourly operation cleans (removes) Cassandra "tombstones", which are markers that indicate that a cell or
row has been deleted.
Note The issue described below applies only to buckets created prior to the release of HyperStore ver-
sion 7.4. Buckets created in HyperStore 7.4 and later use a different metadata structure than older buck-
ets. Starting in HyperStore 7.5, all buckets created in HyperStore 7.4 and later use an improved
tombstone monitoring and cleanup mechanism that leverages the new metadata structure. Such buck-
ets are able to support mass deletions without negative service impact (they are not subject to the
100,000 deletes per hour limit that older buckets are). If you have buckets that were created prior to
HyperStore 7.4 and that need to be able to support mass deletes, contact Cloudian Support for assist-
ance in upgrading those buckets to the newer metadata structure (known as "rules based partitioning").
Under normal circumstances the hourly running of the /.system/cleantombstones cron job should ensure that
there is no excessive build-up of tombstones in Cassandra (the Metadata DB). However, it is possible to
encounter TombstoneOverwhelmingException errors in Cassandra logs and an inability to successfully
execute an S3 ListObjects operation against a specific bucket, in either of these unusual circumstances:
l An S3 client application has attempted to delete more than 100,000 objects from the bucket in less than
an hour.
l Over the course of multiple hours an S3 client application has attempted to delete more than 100,000
objects from the bucket and during that time the hourly /.system/cleantombstones cron job has failed to
purge tombstones for one reason or another.
In such circumstances you can trigger tombstone removal by connecting to any S3 Service’s JMX port (19080)
and submitting a purgeTombstone command with the bucket name as input. If you are using JConsole, after
connecting to port 19080 on an S3 node select the MBeans tab, then select com.cloudian.ss.cassandra.cl →
BatchJobs → Operations → purgeTombstone. On the right side of the console enter the bucket name as the p1
value and then execute the operation (by clicking purgeTombstone on the right side of the console).
Note This purgeTombstone operation will clean up all buckets' tombstones (not just the specified tar-
get bucket's tombstones). Specifically, the operation implements a Cassandra repair followed by a Cas-
sandra compaction, for each UserData_<policyId> keyspace. These keyspaces contain bucket
metadata and object metadata, and there is one such keyspace for each of your storage policies.
Here is the purgeTombstone operation triggered by using the jmxclient tool (which exists on each HyperStore
node). Replace <version> with the jmxclient version (you can check this under /opt/cloudian/tools), <IP
address> with the node's IP address, and <bucketname> with the target bucket name.
Operation invoked: Admin API method POST /system/deletedUserCleanup (for system use only; this Admin API
method is not covered in the Admin API documentation)
This cron job attempts to complete the user deletion process for users for whom the deletion process failed to
complete on the original attempt. These are users who are stuck in "Deleting" status for one reason or another.
This cron job processes any pending storage policy delete jobs. System operators can initiate the deletion of
an unused storage policy (a storage policy that is not assigned to any buckets) through the CMC's Storage
Policies page. This operator action marks the policy with a "DELETED" flag and makes it immediately unavail-
able to service users. However, the full deletion of the storage policy from the system (specifically, the deletion
of the Metadata DB keyspace associated with the policy) is not processed until this cron job runs.
This cron job also processes any pending storage policy creation jobs, in the event that multiple storage policy
creation requests have been initiated in a short amount of time -- which can result in queueing of storage policy
creation jobs. More typically, storage policy creation completes shortly after the creation is initiated through the
CMC.
This cron job processes any pending cross-region replication jobs. These pending jobs result when an
attempt to replicate an object from the source bucket to the destination bucket results in a temporary error such
as a connection failure or an HTTP 5xx error. This cron job retries the replication of such objects. For such
objects, the retries will recur once every four hours until either the objects are successfully replicated to the des-
tination system or a permanent error is encountered.
All insertions, updates, and deletes of HyperStore object metadata in your Elasticsearch cluster (if you have
one) are implemented by this cron job. Until the cron job runs, the Elasticsearch update requests are queued in
the Metadata DB. Any requests that fail when this cron job runs are retried at the next running of the cron job.
Note In the event that your Elasticsearch cluster is unavailable for more than a few hours, the request
queue will become full and when the cluster is available again you should use the elasticsearchSync
tool to update the metadata in Elasticsearch. For more information about this tool, and about the
Elasticsearch integration feature in general, see "Enabling Elasticsearch Integration for Metadata
Search" (page 157).
This cron job retries S3 Bucket Notification messages that failed to be successfully sent on previous attempts.
For more information about the S3 Notification feature see the SQS section in the Cloudian HyperStore
AWS APIs Support Reference.
Cassandra regularly implements a "minor" compaction process that occurs automatically. This automatic com-
paction process is sufficient in the context of the HyperStore system. For the HyperStore system, you do not
need to perform "major" compactions using the nodetool utility.
You can monitor the compaction process by using JConsole or another JMX client to connect to a Cassandra
node’s JMX listening port (7199 by default). By accessing the CompactionManagerMBean through the JMX
console, you can check the progress of any compactions that are currently executing, and also check on the
number of completed and pending compactions.
5.15. Automated Disk Management
HyperStore has mechanisms for automatically correcting imbalances of data disk utilization on each node, and
automatically discontinuing new writes to disks or nodes that are near capacity.
The same imbalance-correcting logic applies if the system detects that a particular disk's usage is lower than
the node's average disk utilization by more than the configured delta. The system automatically migrates one
or more storage tokens from the node's more heavily utilized disks to the under-utilized disk.
You do not need to run a repair operation in connection with this disk usage balancing feature. The disk usage
rebalancing mechanism is not intended to move existing data between disks. It is designed to affect only
objects uploaded after the automated token migration was executed. If you were to perform a
repair, existing objects on the over-used disk would not follow a migrated token to the token’s new home on a
less-used disk. For more about how the token migration works see "Dynamic Object Routing" (page 284).
Note that the disk usage balancing feature applies only to HyperStore data disks (where S3 object data is
stored). It does not apply to disks that store only Cassandra, Redis, or the OS.
Chapter 5. Disk Operations
The object data on the 90% full disk remains readable and the disk will still support S3 get requests and delete
requests, but any S3 writes associated with the token ranges that used to be on the disk are now directed to the
new locations of those token ranges, on other disks on the same node. This disk-level "stop-write" condition
triggers a warning message in the HyperStore application log, which in turn results in an alert being generated
in the CMC's Alerts page (Alerts -> Alerts).
For a disk that reaches stop-write condition, if its capacity usage level later falls back below 90% it becomes eli-
gible to receive tokens from any other disks on the node that subsequently reach 90% utilization and go into a
stop-write condition. Recall though that when the system moves tokens away from a 90% full disk it prioritizes
allotting tokens to disks that have relatively low usage. So a disk that is only a little below 90% usage is less
likely to receive tokens during this process than disks that are at lower usage levels on the node.
Note When a disk at 90% usage level enters stop-write condition it stops receiving new S3 writes, but
object data writes associated with hsstool repair, hsstool repairec, hsstool rebalance, and hsstool
decommission operations continue to be supported until the disk reaches 95% usage. If the disk
reaches 95% usage, then subsequent writes that hsstool operations target to that disk will fail (and in
the operation's status metrics these failures will increment the "Failed count").
Note If you want to customize the disk usage check interval (default 30 minutes), the "stop-write"
threshold (default 90%), the hsstool operation write threshold (default 95%), or the "start-write"
threshold (default 85%), consult with Cloudian Support.
l The S3 Service is not able to write object data to the token ranges assigned to that node. This may
result in the failure of some S3 PUT requests from client applications. Whether failures occur or not
depends on the consistency requirements that you have configured for your storage policies, and on
the availability of the other nodes in your system. For example if you are using 3X replication with a
write consistency level of ALL, then an S3 PUT of an object will fail if one of the three endpoint token
ranges for the object (as determined automatically by a hash of the bucket name and object name) is a
token range on the node that's in stop-write condition. With 3X replication and a write consistency level
of QUORUM, then an S3 PUT of an object will fail if one of the object's three endpoint token ranges is
on the stop-write node and either of the nodes hosting the object's other two endpoint token ranges is
also unavailable (such as if one of those other nodes is down, or is in stop-write condition). Note that
HyperStore does not reallocate tokens from a node that's in stop-write condition to other nodes in the
system. Dynamic token reallocation is supported only between the disks on a node -- not between
nodes.
Note After detecting that a node is in stop-write condition, the S3 Service will mark that node as
unavailable for object data writes and will stop sending object data write requests to that node
(rather than continuing to send object data write requests to that node and having all those
requests fail).
l The S3 Service is still able -- while implementing requests from S3 client applications -- to read object
data from that node and to delete object data from that node.
l Object metadata can still be written to the Cassandra database on that node, since this is on the OS
and metadata disk(s) not the data disks. The stop-write feature applies only to the data disks.
l The hsstool repair, hsstool repairec, and hsstool decommission operations cannot write to a node
that's in stop-write condition. If a node is in stop-write condition, then writes that hsstool operations dir-
ect to that node will fail (and in the operation's status metrics these failures will increment the "Failed
count").
l When a node goes into stop-write condition a critical message appears in the CMC's Dashboard page;
an alert is generated in the Alerts page (Alerts -> Alerts); and the node is marked by a red
disk-stack icon in the Data Centers page (Cluster -> Data Centers).
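The consistency-level arithmetic in the 3X replication examples above can be sketched as follows (a generic illustration, not a HyperStore tool):

```shell
# For replication factor RF, consistency level QUORUM requires
# floor(RF/2)+1 replica writes to succeed; ALL requires all RF.
rf=3
quorum=$(( rf / 2 + 1 ))

# With RF=3 and CL=QUORUM, up to (rf - quorum) endpoint token ranges
# may be on unavailable or stop-write nodes and a PUT still succeeds.
tolerable=$(( rf - quorum ))
echo "RF=$rf QUORUM=$quorum tolerates=$tolerable"
```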
IMPORTANT ! To avoid having disks and nodes go into a stop-write condition, closely monitor your
system's current and projected disk space usage and expand your cluster well in advance of disks
and nodes becoming nearly full. See "Capacity Monitoring and Expansion" (page 298).
A node will exit the "stop-write" condition -- and the S3 layer will resume sending object data write requests to
the node -- if enough data is deleted from the node's data disks to reduce the node's average disk utilization
to 85% usage or lower. At this point all of the node's tokens -- and consequently all of the S3 object data writes
on the node -- will be allocated to the data disks that are currently at 85% usage or lower. Disks that are still
above 85% utilization will not be assigned any tokens and will not support writes.
When you have a node in stop-write condition there are two ways in which disk space utilization on the node
can be reduced such that the node starts accepting writes again:
l You can delete objects from your HyperStore system. In HyperStore each object's replicas or erasure
coded fragments are stored on multiple nodes, and there is no way to target only the replicas and
fragments on a specific node for deletion. Rather, you delete objects from the system as a whole so
that the overall disk space utilization in the system is reduced. This will have the effect of also reducing
disk space utilization on the node that's in stop-write condition. You can delete objects either through
the S3 interface or through the special Admin API call that lets you efficiently delete all the objects in a
specified bucket (see bucketops). You can use the CMC's Node Status page (Cluster -> Nodes ->
Node Status) to periodically check the node's disk space usage.
Note When you delete objects using the S3 interface or Admin API, the objects are immediately
flagged for deletion but the data is not actually removed from disk until the running of the hourly
batch delete cron job.
l You can add nodes to your cluster, and execute the associated rebalancing and cleanup operations
(for instructions see "Adding Nodes" (page 212)). This will have the effect of reducing data utilization
on your existing nodes, including the node that's in the stop-write condition. Note that rebalance and
cleanup are long-running operations and so this approach to getting a node out of stop-write condition
may take several days or more depending on how much data is in your system.
Dynamic Object Routing allows for low-impact automated token migrations in circumstances such as disk
usage imbalance remediation, disk stop-write implementation, and automated disk failure handling.
Note This automated token migration occurs only between disks on a node -- not between nodes.
To detect disk problems, HyperStore does the following:
l Continuously monitors the HyperStore Service application log for error messages indicating a failure to
read from or write to a disk (messages containing the string "HSDISKERROR").
l At a configurable interval (default is once each hour), tries to write one byte of data to each HyperStore
data disk. If any of these writes fail, /var/log/messages is scanned for messages indicating that the file
system associated with the disk drive in question is in a read-only condition (message containing the
string "Remounting filesystem read-only"). This recurring audit of disk drive health is designed to pro-
actively detect disk problems even during periods when there is no HyperStore Service read/write activ-
ity on a disk.
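The proactive audit in the second bullet above can be sketched as follows. This is an illustrative model, not HyperStore internals; the probe file name is invented, and in a real audit the log scan would be limited to messages for the device in question.

```python
# Sketch of the proactive disk drive audit described above: write one byte
# to each data disk, and on failure check the system log for evidence that
# the filesystem was remounted read-only. Illustrative only.
import os

READ_ONLY_MARKER = "Remounting filesystem read-only"

def probe_write(mount_point):
    """Try to write one byte under the mount point; True on success."""
    try:
        probe = os.path.join(mount_point, ".disk_probe")
        with open(probe, "wb") as f:
            f.write(b"\x00")
        os.remove(probe)
        return True
    except OSError:
        return False

def filesystem_read_only(syslog_text, device):
    """After a failed probe, look for a read-only remount of the device
    in the system log."""
    return any(READ_ONLY_MARKER in line and device in line
               for line in syslog_text.splitlines())

log = "Jan 01 12:00:00 node1 kernel: EXT4-fs (sdb1): Remounting filesystem read-only\n"
print(filesystem_read_only(log, "sdb1"))  # True
```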
If HyperStore Service application errors regarding a drive occur in excess of a configurable error rate
threshold, or if the proactive audit detects that a drive is in read-only condition, then HyperStore by default auto-
matically disables the drive.
When a drive is automatically disabled, the system will no longer direct writes or reads to that drive. The stor-
age tokens from the disabled disk are automatically moved to the other disks on the host, so that new writes
associated with those token ranges can be directed to the other disks. The disabled disk's data is not recreated
on the other disks, and so that data is unreadable on the host. For more detail on the configuration options and
the disk disabling behaviors see HyperStore Disk Failure Action.
Note You can tell that a disk is disabled by viewing its status in the "Disk Detail Info" section of the
CMC's Node Status page (Cluster -> Nodes -> Node Status).
Note As part of the Smart Support feature, if a data disk on one of your HyperStore nodes fails
(becomes disabled), information about the failed disk is automatically sent to Cloudian Support within
minutes. This triggers the automatic opening of a Support case for the failed disk. For HyperStore Appli-
ances, automatic case creation is also performed for failed OS disks.
The automatic disk disabling feature works only if you have multiple HyperStore data disks on the host. If there
is only one HyperStore data disk on the host, the system will not automatically disable the disk even if errors
are detected.
Also, the automatic disk disabling feature works only if you are using ext4 file systems mounted on raw disks
(which is the only officially supported configuration). If you've installed HyperStore on nodes with unsupported
technologies such as Logical Volume Manager (LVM) or XFS file systems, the automatic disk disabling feature
will be deactivated by default and the HyperStore system will not take any automatic action in regard to disk
failure. Also, the automatic disk disabling feature does not work in Xen or Amazon EC2 environments. Contact
Cloudian Support if you are using any of these technologies.
By default, each node is checked for disk imbalance every 72 hours, and token migration is triggered if a disk’s
utilization percentage differs from the average disk utilization percentage on the node by more than 10%. For
example, if the average disk space utilization on a node is 35%, and the disk space utilization for Disk4 is 55%,
then one or more tokens will be migrated away from Disk4 to other disks on the node (since the actual delta of
20% exceeds the maximum allowed delta of 10%). For another example, if the average disk utilization on a
node is 40%, and the disk utilization for Disk7 is 25%, then one or more tokens will be migrated to Disk7 from
the other disks on the node.
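The balance check described above can be sketched as follows, using the default 10% maximum delta. The function is illustrative only; it classifies which disks would give up or receive tokens, using the same figures as the first example in the text.

```python
# Sketch of the disk imbalance check described above (default maximum
# delta of 10% between a disk and the node average). Illustrative only.

MAX_DELTA_PCT = 10

def imbalance_actions(disk_utils):
    """Disks more than MAX_DELTA_PCT above the node average give up
    tokens; disks more than MAX_DELTA_PCT below it receive tokens."""
    avg = sum(disk_utils.values()) / len(disk_utils)
    actions = {}
    for disk, util in disk_utils.items():
        if util - avg > MAX_DELTA_PCT:
            actions[disk] = "migrate tokens away"
        elif avg - util > MAX_DELTA_PCT:
            actions[disk] = "migrate tokens to"
    return actions

# Matches the example in the text: node average 35%, Disk4 at 55%
# (actual delta of 20% exceeds the maximum allowed delta of 10%).
print(imbalance_actions({"Disk1": 30, "Disk2": 30, "Disk3": 25, "Disk4": 55}))
```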
The settings for adjusting the frequency of the disk balance check or the delta that triggers token migration are:
After editing either of these settings, be sure to push your changes to the cluster and restart the HyperStore Ser-
vice. For instructions see "Pushing Configuration File Edits to the Cluster and Restarting Services" (page
411).
Note In connection with the HyperStore "stop-write" feature, if you want to customize the disk usage
check interval (default 30 minutes) or the "stop-write" threshold (default 90%) or the "start-write"
threshold (default 85%), consult with Cloudian Support. These settings are not in HyperStore con-
figuration files by default.
To manually trigger the disk balance check on a particular node:
1. Use JConsole to access the HyperStore Service's JMX port (19082 by default) on the host for which
you want the balance check to be performed.
2. Access the com.gemini.cloudian.hybrid.server.disk → VirtualNodePartitioner MBean, and under "Oper-
ations" execute the "shuffletoken" operation.
The operation will run in a background thread and may take some time to complete. If the space utilization for
any disk is found to differ from the node's average disk utilization by more than the configured maximum delta
(10% by default), then tokens will automatically be migrated between disks.
IMPORTANT ! The automatic disk failure handling feature does not work correctly in Xen, Logical
Volume Manager (LVM), or Amazon EC2 environments. Contact Cloudian Support if you are using any
of these technologies.
Several aspects of the HyperStore automated disk failure handling feature are configurable.
In the CMC's Configuration Settings page (Cluster -> Cluster Config -> Configuration Settings), in the "Sys-
tem Settings" section, you can configure a "HyperStore Disk Failure Action". This is the automated action for the
system to take in the event of a detected disk failure, and the options are "Disable Disk + Move Its Tokens" (the
default) or "None". (For more detail, while on the CMC's Configuration Settings page click Help.) If you change
this setting in the Configuration Settings page, your change is dynamically applied to the cluster — no service
restart is necessary.
In hyperstore-server.properties.erb you can configure the error rate threshold at which a disk is automatically disabled. By default the threshold is 100 errors in the space of 30 minutes, for a particular disk.
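The default error-rate rule (100 errors within 30 minutes for a particular disk) can be sketched as a sliding window. The class and names below are illustrative, not HyperStore internals.

```python
# Sliding-window sketch of the default disk error-rate rule: 100 errors
# within a 30-minute window trigger automatic disabling. Illustrative only.
from collections import deque

ERROR_THRESHOLD = 100
WINDOW_SECONDS = 30 * 60

class DiskErrorWindow:
    def __init__(self):
        self.timestamps = deque()

    def record_error(self, now):
        """Record one disk error event (timestamp in seconds) and report
        whether the disk should be disabled."""
        self.timestamps.append(now)
        # Drop events that fell out of the 30-minute window.
        while self.timestamps and now - self.timestamps[0] > WINDOW_SECONDS:
            self.timestamps.popleft()
        return len(self.timestamps) >= ERROR_THRESHOLD

w = DiskErrorWindow()
disable = False
for i in range(100):            # 100 errors within a few seconds
    disable = w.record_error(float(i))
print(disable)  # True: threshold reached inside the window
```

An error every minute, by contrast, never accumulates 100 events inside any 30-minute window and so never trips the rule.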
Also in hyperstore-server.properties.erb is this setting for the interval at which to conduct the proactive disk
drive audit:
After editing any of these hyperstore-server.properties.erb settings, push the changes to the cluster and restart
the HyperStore Service. For instructions see "Pushing Configuration File Edits to the Cluster and Restarting
Services" (page 411).
Note As part of the Smart Support feature, if a data disk on one of your HyperStore nodes fails
(becomes disabled), information about the failed disk is automatically sent to Cloudian Support within
minutes. This triggers the automatic opening of a Support case for the failed disk. For HyperStore Appli-
ances, automatic case creation is also performed for failed OS disks.
The "Disk Detail Info" section of the CMC's Node Status page (Cluster -> Nodes -> Node Status) shows:
l The current space utilization information for each disk on the selected node.
l The current health status of each disk on the selected node. Each data disk’s status is communicated
by a color-coded icon, with the status being one of OK, Error, or Disabled.
Note that such an alert does not necessarily indicate that the disk has been automatically disabled. This is
because the alert is triggered by the appearance of a single "HSDISKERROR" message in the HyperStore Ser-
vice application log, whereas the automatic disabling action is triggered only if such messages appear at a rate
exceeding the configurable threshold.
l Re-enable the same disk. You might choose this option if, for example, you know that some cause
other than a faulty disk resulted in the drive errors and the automatic disabling of the disk.
HyperStore provides a highly automated method for bringing the same disk back online. For instructions
see "Enabling a HyperStore Data Disk" (page 289).
l Replace the disk. You would choose this option if you have reason to believe that the disk is bad.
HyperStore provides a highly automated method for replacing a disk and restoring data to the new disk.
For instructions see "Replacing a HyperStore Data Disk" (page 291).
With either of these methods, any tokens that were migrated away from the disabled disk will be auto-
matically migrated back to it (or its replacement). Object data that was written in association with the
affected tokens while the disk was disabled — while the tokens were temporarily re-assigned to other disks on
the host — will remain on those other disks and will be readable from those disks (utilizing HyperStore
"Dynamic Object Routing" (page 284)).
Note As part of the Smart Support feature, if a data disk on one of your HyperStore nodes fails
(becomes disabled), information about the failed disk is automatically sent to Cloudian Support within
minutes. This triggers the automatic opening of a Support case for the failed disk. For HyperStore Appli-
ances, automatic case creation is also performed for failed OS disks.
HyperStore supports a method for temporarily disabling a HyperStore data disk drive so that you can per-
form planned maintenance such as a firmware upgrade.
Note If you are replacing a disk, follow the instructions for "Replacing a HyperStore Data Disk"
(page 291) rather than the instructions below.
When you execute the disableDisk function, the system:
l Unmounts the disk's file system, comments out its entry in /etc/fstab, and marks the disk as unavailable
for HyperStore reads and writes.
l Moves the disk’s assigned storage tokens to the remaining HyperStore data disks on the same host, in
a way that’s approximately balanced across those disks. This is a temporary migration of tokens which
will be reversed when you later re-enable or replace the disk. While the tokens are on the other disks,
writes of new or updated S3 object data that would have gone to the disabled disk will go to the other
disks on the host instead.
The existing object data on the disabled disk is not recreated on the other disks and therefore that data will be
unreadable on the host. Whether the system as a whole can still provide S3 clients with read access to the
affected S3 objects depends on the availability of other replicas or erasure coded fragments for those objects,
elsewhere within the cluster.
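The temporary token move described above can be sketched as follows: the disabled disk's tokens are spread approximately evenly across the remaining data disks, and the original assignment is remembered so it can be reversed on re-enable. This is an illustrative model only; HyperStore's actual token management is internal to the HyperStore Service.

```python
# Sketch of the temporary token migration on disableDisk and its reversal
# on enableDisk. Illustrative only.

def disable_disk(token_map, disk):
    """Move `disk`'s tokens round-robin onto the other disks.
    Returns the saved token list so enable_disk() can restore it."""
    moved = token_map.pop(disk)
    others = sorted(token_map)
    for i, token in enumerate(moved):
        token_map[others[i % len(others)]].append(token)
    return moved

def enable_disk(token_map, disk, saved_tokens):
    """Reverse the migration: give the disk back exactly its old tokens."""
    for tokens in token_map.values():
        for t in saved_tokens:
            if t in tokens:
                tokens.remove(t)
    token_map[disk] = list(saved_tokens)

tokens = {"/cloudian1": [101, 102], "/cloudian2": [201], "/cloudian3": [301]}
saved = disable_disk(tokens, "/cloudian1")
print(tokens)                # 101 and 102 now live on the remaining disks
enable_disk(tokens, "/cloudian1", saved)
print(tokens["/cloudian1"])  # [101, 102]
```

As the text notes, only the tokens move; object data already written to the disabled disk stays where it is and becomes unreadable on that host until the disk is re-enabled or replaced.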
1. In the CMC's Node Advanced page, select command type "Disk Management" and then select the
"disableDisk" command.
2. Choose the Target Node (the node on which the disk resides), and enter the Mount Point of the disk
that you want to disable (for example /cloudian6).
3. Click Execute.
After the disableDisk operation completes, go to the CMC's Node Status page (Cluster -> Nodes -> Node
Status). In the "Disk Detail Info" section, the device that you disabled should now have this red status icon
(indicating that it’s disabled and its tokens have been migrated to other disks on the same host):
Later, after completing your maintenance work on the disk, follow the instructions for "Enabling a HyperStore
Data Disk" (page 289). When you re-enable the disk, the tokens that had been moved away from the disk will
be moved back to it.
HyperStore supports a method for enabling an existing HyperStore data disk that is currently disabled. You
can tell that a disk is disabled by viewing its status in the "Disk Detail Info" section of the CMC's Node Status
page (Cluster -> Nodes -> Node Status). A disk can go into a disabled state either because you disabled it by
using the HyperStore disableDisk function (as described in "Disabling a HyperStore Data Disk" (page 288))
or because the HyperStore system automatically disabled it in response to disk errors (as described in "Auto-
matic Disk Failure Handling" (page 284)).
You can enable a disk if you know that the disk problem was only temporary and that the disk is still healthy
enough to use.
Note If you are replacing a disk, follow the instructions for "Replacing a HyperStore Data Disk"
(page 291) rather than the instructions below.
When you execute the enableDisk function, the system:
l Remounts the disk (using the same mount point that the disk previously had), uncomments its entry in
/etc/fstab, and marks the disk as available for HyperStore reads and writes.
l Moves back to the disk the same set of storage tokens that were automatically moved away from the
disk when it was disabled.
After a disk is re-enabled in this way, writes of new or updated S3 object data associated with the returned
tokens will go to the re-enabled disk. And the existing object data that was already on the disk will once again
be readable. Meanwhile object data that was written to the affected token ranges while the disk was disabled
— while the tokens were temporarily re-assigned to other disks on the host — will remain on those other disks
and will be readable from those disks. That data will not be moved to the re-enabled disk.
Note For information on how HyperStore tracks token location over time so that objects can be written
to and read from the correct disks, see "Dynamic Object Routing" (page 284).
1. In the CMC's Node Advanced page, select command type "Disk Management" and then select the
"enableDisk" command.
2. Choose the Target Node (the node on which the disk resides), and enter the Mount Point of the disk
that you want to enable.
3. Click Execute.
After the enableDisk operation completes, go to the CMC's Node Status page (Cluster -> Nodes -> Node
Status). In the "Disk Detail Info" section, the device that you enabled should now have this green status icon
(indicating that its status is OK):
If instead the disk icon is displaying in red (indicating an "Error" status), click the Clear Error History button.
Doing so should return the disk to OK status.
Note If the CMC continues to show status information for the disk's old device address (as well as a
new device address), and if clicking the Clear Error History button fails to clear the old information,
ssh into the node on which you enabled the disk and run the command systemctl restart cloudian-
agent. Then wait at least one minute, and check the CMC again. If the old information is still displaying,
click the Clear Error History button again.
HyperStore supports a method for activating a replacement HyperStore data disk and restoring data to it.
This procedure applies to either of these circumstances:
l You are replacing a disk that is currently disabled. You can tell that a disk is disabled by viewing its
status in the "Disk Detail Info" section of the CMC's Node Status page (Cluster -> Nodes -> Node
Status). A disk can go into a disabled state either because you disabled it by using the HyperStore dis-
ableDisk function (as described in "Disabling a HyperStore Data Disk" (page 288)) or because the
HyperStore system automatically disabled it in response to disk errors (as described in "Automatic
Disk Failure Handling" (page 284)).
l You are replacing a disk that is not currently disabled. In this case, it is not necessary for you to use the
disableDisk function before replacing the disk. When you pull the disk from the host machine Hyper-
Store will automatically disable the associated mount point, and you can proceed to replace the disk.
After you’ve pulled the bad disk and physically installed the replacement disk, HyperStore will take care of
all the remaining set-up and restoration tasks when you follow the steps in "Replacing a HyperStore Data
Disk" (page 292).
When you execute the replaceDisk function, the system:
l Creates a primary partition and an ext4 file system on the new disk.
l Establishes appropriate permissions on the mount.
l Remounts the new disk (using the same mount point that the prior disk had), uncomments its entry in
/etc/fstab, and marks the disk as available for HyperStore reads and writes.
l Moves back to the new disk the same set of storage tokens that were automatically moved away from
the prior disk when it was disabled.
l Performs a data repair for the new disk (populating the new disk with its correct inventory of object rep-
licas and erasure coded object fragments).
Going forward, writes of new or updated S3 object data associated with the returned tokens will go to the new
disk. Meanwhile object data that was written to the affected token ranges while the mount point was disabled
— while the tokens were temporarily re-assigned to other disks on the host — will remain on those other disks
and will be readable from those disks. That data will not be moved to the new disk.
For information on how HyperStore tracks token location over time so that objects can be written to and read
from the correct disks, see "Dynamic Object Routing" (page 284).
IMPORTANT ! Do not replace multiple disks concurrently within a HyperStore service region -- even
if the disks are in different data centers. Replace just one disk at a time, within a service region. When
replacing a disk as described below, wait until the automatically triggered data repair for the disk is
completed -- you can monitor the repair status in the Operation Status page (Cluster -> Operation
Status) -- before replacing any other disk. Replacing multiple disks concurrently can potentially result
in data loss.
Note HyperStore supports replacing a bad disk on an existing node while you are in the midst of
adding new nodes to your system. Wait until the "Add Node" operation completes successfully, and
then you can follow the steps below either before you trigger a rebalance operation on the new node(s)
or while the rebalance operation(s) are in progress. You do not need to wait until the rebalance oper-
ation(s) complete. However, see "Extra Step After Disk Replacements Performed During Cluster
Expansion" (page 293).
After you’ve physically installed the replacement disk, follow these steps to activate the replacement disk and
restore data to it:
1. In the CMC's Node Advanced page, select command type "Disk Management" and then select the
"replaceDisk" command.
2. Choose the Target Node (the node on which the disk resides), and enter the Mount Point of the replace-
ment disk. This must be the same as the mount point of the disk that you replaced.
3. Click Execute.
After the replaceDisk operation completes, go to the CMC's Node Status page (Cluster -> Nodes -> Node
Status). In the "Disk Detail Info" section, the replacement disk should now have this green status icon (indic-
ating that its status is OK):
The system then automatically runs repair and repairec on the disk mount point. You can monitor the repair pro-
gress in the Operation Status page (Cluster -> Operation Status).
Note
* If in the Node Status page the disk icon is displaying in red (indicating an "Error" status), click the
Clear Error History button. Doing so should return the disk to OK status.
* If the CMC continues to show status information for the old disk (as well as the new disk), and if click-
ing the Clear Error History button fails to clear the old information, ssh into the node on which you
replaced the disk and run the command systemctl restart cloudian-agent. Then wait at least one
minute, and check the CMC again. If the old information is still displaying, click the Clear Error History
button again.
5.18.2.1. Extra Step After Disk Replacements Performed During Cluster Expansion
If you executed the replaceDisk during the rebalance operations associated with a cluster expansion (or after
the Add Node[s] operation completed and before you kicked off the rebalance operations), then after the
replaceDisk operation and all rebalance operations have completed, run hsstool repairec on any one of the
new nodes. This will restore any EC data that is supposed to be on the new nodes but that may be missing
because of the disk having gone bad on an existing node during the cluster expansion.
If you added new nodes to multiple data centers, run hsstool repairec on one of the new nodes in each data
center.
Note For guidance about HyperStore capacity management and cluster resizing, see "Capacity Mon-
itoring and Expansion" (page 298).
If an individual disk or a node as a whole is running low on available capacity, HyperStore alerts admin-
istrators:
l Alerts are triggered if an individual disk drops below 15% available capacity or if a node as a whole
drops below 10% available capacity. When such alerts are triggered, they appear in the CMC’s Node
Status page (Cluster -> Nodes -> Node Status) and Alerts page (Alerts -> Alerts) as well as being
sent to the system administrator email address(es). Optionally, alerts can also be transmitted as SNMP
traps. Alert thresholds and options are configurable in the Alerts Rules page (Alerts -> Alert Rules).
l If a node as a whole reaches 80% utilization of its data disk capacity, a Warning is displayed on the CMC's
Dashboard page.
If a HyperStore data disk (a disk storing S3 object data) is nearing capacity, the first two things to try are:
l Use hsstool cleanup (and hsstool cleanupec if you use erasure coding in your system) on the node to
clear it of any data that’s no longer supposed to be there. Such "garbage data" may be present if, for
example, S3 objects have been deleted from the system as a whole but the deletion operations on the
node in question failed.
l Delete S3 objects. Note that the associated files will not be deleted from the disk immediately since
HyperStore uses batch processing for deletion of S3 object data. The batch processing is triggered
hourly by a cron job (see "System cron Jobs" (page 274)).
Note For additional guidance on removing data to free up disk space, consult with Cloudian
Support.
If these measures do not free up sufficient disk space, the solution is to increase system capacity by adding
one or more new nodes to your cluster. For the procedure see "Adding Nodes" (page 212). When you add a
node, a portion of the data on your existing nodes is copied to the new node and then (when you subsequently
run a cleanup operation) deleted from the existing nodes — thereby freeing up space on the existing nodes.
The degree to which space will be freed up on existing nodes depends on the number of new nodes that you
add in proportion to the size of your existing cluster — for example, adding two nodes to a four node cluster
would free up a larger percentage of the existing nodes' disk space than would adding two nodes to a twenty
node cluster.
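The rough arithmetic behind that example can be sketched as follows, assuming data ends up evenly distributed across nodes after rebalance and cleanup (an idealization; actual results depend on your storage policies and token distribution).

```python
# Rough arithmetic for the expansion example above, assuming data is
# evenly distributed across nodes after rebalance and cleanup.

def fraction_freed(existing_nodes, new_nodes):
    """Approximate fraction of each existing node's data that moves to
    the new nodes (and is then removed by cleanup)."""
    return new_nodes / (existing_nodes + new_nodes)

print(round(fraction_freed(4, 2), 2))   # 0.33: about a third freed
print(round(fraction_freed(20, 2), 2))  # 0.09: under a tenth freed
```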
For information about managing a Cassandra data disk (a disk storing system and object metadata stored in
the Metadata DB) that is nearing capacity see "Responding to Cassandra Disks Nearing Capacity" (page
294).
If a Cassandra disk is nearing capacity, the first things to try are:
l Use hsstool cleanup on the host node, using the allkeyspaces option. This will clear any Cassandra
data that is no longer supposed to be on the node.
l Selectively delete Cassandra data. To do this, consult with Cloudian Support.
If these measures do not free up sufficient space, the solution is to increase system capacity by adding one or
more new nodes. For the procedure see "Adding Nodes" (page 212). When you add a node, a portion of the
Cassandra data on your existing nodes is copied to the new node and then (when you subsequently run a
cleanup operation) deleted from the existing nodes — thereby freeing up space on the existing nodes.
Note For guidance about HyperStore capacity management and cluster resizing, see "Capacity Mon-
itoring and Expansion" (page 298).
Chapter 6. Monitoring
The table below shows the system monitoring and alert management tasks that are supported by the CMC.
l Check how your storage capacity usage has changed over the past 30 days, per service region: Analytics -> Cluster Usage
l For each data center, see which nodes have any alerts and whether any HyperStore services are down on any node: Cluster -> Data Centers
HyperStore is horizontally scalable, allowing you to gain additional storage and request processing capacity
by adding more nodes to your cluster. When you add new nodes to your cluster, the storage capacity asso-
ciated with the new nodes becomes immediately available to the system. However, the automated processes
that re-distribute data from existing nodes to newly added ones -- thereby reducing storage capacity usage on
the existing nodes -- may take up to several days or more to complete, depending on factors such as data
volume and network bandwidth. Therefore it's important to closely monitor your current and projected system
capacity usage, plan ahead for needed cluster expansions, and implement such expansions well before
you've filled your current capacity.
Use the CMC Dashboard to monitor your system's current and projected storage utilization level (for more
information see "Monitoring Cluster Storage Capacity Utilization" (page 299), further below). As best prac-
tices for cluster expansion timing, Cloudian recommends the following:
l Start expansion planning and preparation when either of the following occur (whichever occurs
first):
o The Dashboard shows your utilization of system storage capacity has reached 70%.
o The Dashboard shows your utilization of system storage capacity is projected to reach 90%
within 120 days. If your system has a high rate of ingest relative to capacity, this projection may
occur even if your current usage has not yet reached 70%.
o The minimum unit of expansion is a node. HyperStore does not support adding disks to existing
nodes.
o You need to allow time to acquire host machines and prepare them for being added to your
cluster.
o Preferably, cluster expansions should be substantial enough that the expanded cluster will
allow you to meet your projected storage needs for at least an additional six months after the
expansion. In this way you can avoid having to frequently add nodes to your system.
o Cloudian Support is available to review and provide feedback on your expansion plan.
l Execute your expansion when either of the following occur (whichever occurs first):
o The Dashboard shows your utilization of system storage capacity has reached 80%.
o The Dashboard displays a Warning that your utilization of system storage capacity is pro-
jected to reach 90% within 90 days. If your system has a high rate of ingest relative to capacity,
this projection may occur even if your current usage has not yet reached 80%.
IMPORTANT ! Each HyperStore node is designed to reject new writes if it reaches 90% storage
capacity utilization. Allowing your system to surpass 80% capacity utilization poses the risk of having
to rush into an urgent cluster expansion operation.
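The expansion-timing rules above can be sketched as a simple linear projection. This is illustrative only; the CMC's actual projection method may differ, and the growth-rate input here is an assumed constant percentage-points-per-day figure.

```python
# Sketch of the expansion-timing best practices above: plan at 70% used
# or 90% projected within 120 days; execute at 80% used or 90% projected
# within 90 days. Illustrative linear projection only.

def days_until(target_pct, current_pct, growth_pct_per_day):
    if growth_pct_per_day <= 0:
        return float("inf")
    return (target_pct - current_pct) / growth_pct_per_day

def expansion_advice(current_pct, growth_pct_per_day):
    d90 = days_until(90, current_pct, growth_pct_per_day)
    if current_pct >= 80 or d90 <= 90:
        return "execute expansion"
    if current_pct >= 70 or d90 <= 120:
        return "plan expansion"
    return "ok"

print(expansion_advice(72, 0.05))  # plan: already past 70% utilization
print(expansion_advice(60, 0.40))  # execute: 90% projected in 75 days
print(expansion_advice(50, 0.05))  # ok: 90% is 800 days away
```

The second call shows the high-ingest case called out above: a cluster at only 60% utilization can already warrant an immediate expansion if its growth rate projects 90% within 90 days.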
In both the current capacity utilization graphic and the projected utilization graphic, color-coding is used to high-
light utilization levels of 70% or higher and 90% or higher.
The Dashboard also displays:
l A Warning message if the cluster is projected to reach 90% storage utilization in 90 days or less
l A Critical message if the cluster has reached 90% storage utilization
In the CMC's Capacity Explorer page (Analytics -> Capacity Explorer) you can view your system's remaining
free storage capacity broken down by service region (cluster), data center, and node. If less than 30% storage
space remains at any one of these levels -- that is, if more than 70% of capacity is utilized in a given node, data
center, or region -- this is highlighted in the interface.
In the CMC's Node Status page (Cluster -> Nodes -> Node Status) you can view each node's storage capa-
city utilization as well as the utilization level for each disk on each node.
To add a new data center (with new nodes) to an existing HyperStore service region, follow the documented
procedure "Adding a Data Center" (page 222).
To add a new service region (with new nodes) to an existing HyperStore system, follow the documented pro-
cedure "Adding a Region" (page 230).
IMPORTANT ! With the addition of a new DC or service region you will need to create new storage
policies that utilize the new DC or region. Adding a new DC or region does not create additional stor-
age capacity for existing buckets that use existing storage policies. Only new buckets that utilize
the new storage policies will make use of the additional storage capacity created by adding a new DC
or region. In the current HyperStore version, you cannot revise an existing storage policy or reassign a
new storage policy to an existing bucket.
To create additional storage capacity for your existing storage policies you must add nodes to your
existing data center(s).
For more information see the "monitor" section of the Cloudian HyperStore Admin API Reference.
Other useful Linux utilities for monitoring system resource usage include:
l vmstat
l iostat
l dstat
The S3 Service, Admin Service, HyperStore Service, and Cassandra Service support monitoring via Java Man-
agement Extensions (JMX). You can access JMX statistics using the graphical JMX client JConsole, which
comes with your Java platform. By default the full path to the JConsole executable is /usr/java/latest/bin/jconsole.
After launching JConsole, in the JConsole GUI specify the <host>:<jmx-port> that you want to connect to. Each
of the HyperStore system’s Java-based services has a different JMX listening port as indicated in the sections
that follow. The statistics that you view will be only for the particular node to which you are connected via
JConsole.
Note By default a JConsole connection does not require a user name and password, so in the JCon-
sole GUI these fields can be left empty. For general information about using JConsole, including pass-
word protection and security options, see the JConsole Help.
Note This section on JMX statistics presumes that you are using JConsole, but there are other JMX cli-
ents available. Your HyperStore system comes with two command-line JMX clients — cmdline-jmxclient and jmxterm — which are in the /opt/cloudian/tools directory.
In JConsole's MBeans tab, timing performance statistics for S3 Service operations are available under the metrics MBean. For each operation type there are timing statistics and also rate stats including:
The statistics are initialized at each restart of the S3 Service. Statistics will only be available for operations that
have been performed since the last S3 Service restart — for example, if no deleteObject operations have been
performed since the last restart, then no deleteObject statistics will be available.
Note These timing and rates stats are implemented with the Metrics Core library.
For example, the statistic "DeleteBucket" would have a value such as 1.0 or 0.92, indicating a 100% or 92%
success rate for S3 "DELETE Bucket" operations since the last start-up of the S3 Service.
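As an illustration only (this is not Cloudian code, and the counter names below are hypothetical), such a success-rate gauge can be derived from two counters accumulated since the last service restart:

```python
# Illustration: how a success-rate statistic such as "DeleteBucket" can be
# computed from operation counters accumulated since the last service restart.
# The function and counter names are hypothetical, not actual JMX attributes.

def success_rate(successful_ops: int, total_ops: int) -> float:
    """Return the fraction of operations that succeeded, e.g. 1.0 or 0.92."""
    if total_ops == 0:
        return 0.0  # no operations since restart: no statistic available
    return successful_ops / total_ops

print(success_rate(92, 100))   # 0.92 -- a 92% success rate
print(success_rate(25, 25))    # 1.0  -- a 100% success rate
```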
l ExhaustedPoolNames
l KnownHosts
l NumActive
l NumBlockedThreads
l NumConnectionErrors
l NumExhaustedPools
l NumIdleConnections
l NumPoolExhaustedEventCount
l NumPools
l NumRenewedIdleConnections
l NumRenewedTooLongConnections
l ReadFail
l RecoverableErrorCount
l RecoverableLoadBalancedConnectErrors
l RecoverableTimedOutCount
l RecoverableTransportExceptionCount
l RecoverableUnavailableCount
l SkipHostSuccess
l StatisticsPerPool
l SuspendedCassandraHosts
l WriteFail
l threads
l idleThreads
l queueSize
In JConsole’s MBeans tab, timing performance statistics for Admin Service operations are available under the metrics MBean. Under metrics there is com.cloudian.admin.stats.<operation>, where <operation> is an Admin API operation such as getUser, createUser, and so on. For each operation type, under Attributes there is a set of timing statistics including:
For each operation type there are also rate stats including:
The statistics are initialized at each restart of the S3 Service (the Admin Service stops and starts together with
the S3 Service). Statistics will only be available for operations that have been performed since the last S3 Ser-
vice restart — for example, if no createUser operations have been performed since the last restart, then no cre-
ateUser statistics will be available.
Note These timing and rates stats are implemented with the Metrics Core library.
For the HyperStore Service, these categories of JMX statistics are supported:
HyperStore Service operation such as put, getBlob, getDigest, or delete. For each operation type, under Attrib-
utes there is a set of timing statistics including:
For each operation type there are also rate stats including:
The statistics are initialized at each restart of the HyperStore Service. Statistics will only be available for oper-
ations that have been performed since the last HyperStore Service restart — for example, if no delete oper-
ations have been performed since the last restart, then no delete statistics will be available.
Note These timing and rates stats are implemented with the Metrics Core library.
l NumberOfSuccessfulDeleteOperations
l NumberOfSuccessfulReadOperations
l NumberOfSuccessfulWriteOperations
l TotalNumberOfDeleteOperations
l TotalNumberOfReadOperations
l TotalNumberOfWriteOperations
The list of supported statistics is the same as indicated for the S3 Service’s Hector statistics. Descriptions of
these statistics are available in the Health Check Attributes Available for Hector section of the online Hector
user guide.
l threads
l idleThreads
l queueSize
In JConsole’s MBeans tab, under org.apache.cassandra.metrics, Cassandra supports a wide range of stat-
istics.
Note If the number of pending compaction tasks grows over time, this is an indicator of a need to
increase cluster capacity.
The ColumnFamily MBean exposes statistics for active, pending, and completed tasks for each of Cassandra’s
thread pools.
Note A significant and sustained increase in the pending task counts for the Cassandra thread pools is
an indicator of a need to increase cluster capacity.
Note A sustained increase in read and write latencies may indicate a need to increase cluster capa-
city.
<hostname or IP address>:<port>/.healthCheck
If the service is up and running and listening on its assigned port, you will receive back an HTTP 200 OK
status. If not, your request will time out.
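The check described above can be sketched in a few lines. This is an illustrative sketch, not Cloudian tooling; for demonstration it runs against a local stand-in server that answers 200 on any path, but in practice you would point it at a real service host and port.

```python
# Sketch: an HTTP health check against a HyperStore service port, as described
# above (HEAD <host>:<port>/.healthCheck -> 200 OK if the service is up,
# timeout otherwise). The stand-in server below exists only to make the
# example runnable.
import http.client
import http.server
import threading

def health_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if <host>:<port>/.healthCheck answers HTTP 200 within the timeout."""
    try:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        conn.request("HEAD", "/.healthCheck")
        status = conn.getresponse().status
        conn.close()
        return status == 200
    except OSError:
        return False

# Local stand-in that answers 200 on any HEAD request.
class _Stub(http.server.BaseHTTPRequestHandler):
    def do_HEAD(self):
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()
    def log_message(self, *args):
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), _Stub)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

ok = health_check("127.0.0.1", port)
print(ok)  # True: service responding
server.shutdown()
server.server_close()
```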
This feature is supported for the following HTTP(S) based services and ports:
6.7. Checking HTTP(S) Responsiveness
Note
* To do health checks against an HTTPS port, the client executing the check must support TLS/SSL.
Note also that health checks against HTTPS ports are a more expensive operation (in terms of
resource utilization) than health checks against HTTP ports.
* HTTP is disabled by default for the Admin Service, so that only HTTPS is supported. See the Intro-
duction section of the Cloudian HyperStore Admin API Reference.
* The HTTP(S) health check of the Admin API service does not require Basic Authentication cre-
dentials. Basic Authentication is required for all other HTTP(S) requests to the Admin API, but is not
required for health check requests.
* The SQS Service is disabled by default. See the introduction to the SQS section of the Cloudian
HyperStore AWS APIs Support Reference.
The example below shows a health check of an S3 Service instance that is responsive.
HEAD http://192.168.2.16:80/.healthCheck
Status Code: 200 OK
Content-Length: 0
Date: Wed, 25 Aug 2021 12:51:50 GMT
Server: CloudianS3
In the case of health checks of the S3 Service, each health check request results in a special entry in the S3
Request Log, such as in this example entry:
2021-08-16 15:01:33,757|127.0.0.1||healthCheck|||81|0|0|0|81|11820||200|
544cdd90-822f-1c98-b780-525400e89933|0|0|
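As a sketch, a few obvious fields can be pulled out of that pipe-delimited entry. Only the timestamp, client IP, operation name, HTTP status, and request ID positions are evident from the example above; the meanings of the remaining fields are not documented here, so they are left uninterpreted.

```python
# Sketch: splitting the pipe-delimited S3 Request Log entry shown above.
# Field positions are inferred from that single example entry; treat any
# other interpretation of the remaining fields as unverified.

entry = ("2021-08-16 15:01:33,757|127.0.0.1||healthCheck|||81|0|0|0|81|11820||200|"
         "544cdd90-822f-1c98-b780-525400e89933|0|0|")

fields = entry.split("|")
record = {
    "timestamp": fields[0],
    "client_ip": fields[1],
    "operation": fields[3],
    "http_status": fields[13],
    "request_id": fields[14],
}
print(record["operation"], record["http_status"])  # healthCheck 200
```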
https://<host>:8443/Cloudian/login.htm
Note Sending a GET or HEAD request to the CMC login URL will result in the CMC sending a GET
/group/list call to the Admin Service which in turn sends a request to Cassandra. To avoid this, the more
lightweight way to check CMC responsiveness is to send an OPTIONS request to the CMC login URL.
Chapter 7. Reference
7.1.1. hsstool
The HyperStore system includes its own cluster management utility called hsstool. This tool has functionality
that in many respects parallels the Cassandra utility nodetool, with the important distinction that hsstool applies
its operations to the HyperStore File System (HSFS) as well as to the Cassandra storage layer.
The hsstool utility is in the /opt/cloudian/bin directory of each HyperStore node. Because the HyperStore install-
ation adds /opt/cloudian/bin to each host's $PATH environment variable, you can run the hsstool utility from
any directory location on any HyperStore host.
l The <host> is the hostname or IP address of the HyperStore node on which to perform the operation.
Specify the actual hostname or IP address -- do not use "localhost". For most commands that only
retrieve information, the -h <host> argument is optional and (if not supplied) defaults to the hostname of
the host on which you are executing hsstool. For commands that impact system data or processes --
such as a repair or cleanup operation -- the -h <host> argument is mandatory.
l The <port> is the HyperStore Service’s JMX listening port. If you do not supply the port number when
using hsstool, it defaults to 19082. There is no need to supply the port when using hsstool unless
you've configured your system to use a non-default port for the HyperStore Service's JMX listener. The
syntax summaries and examples in the documentation of individual commands omit the -p <port>
option.
l The hsstool options -h <host> and (if you use it) -p <port> must precede the <command> and the <com-
mand-options> on the command line. For example, do not have -h <host> come after the <command>.
l For best results when running hsstool on the command line, run the commands as root. While some
commands may work when run as a non-root user, others will return error responses.
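The option-ordering rule above can be illustrated with a small helper that assembles an hsstool command line. This is a sketch, not Cloudian code; the node name "cloudian-node1" is a hypothetical example, and only options documented in this section (-h, -p, and the cleanup -x flag) are used.

```python
# Sketch: assembling an hsstool command line that respects the documented
# ordering rule: -h <host> (and -p <port>, if used) must precede the command
# and its options on the command line.

def build_hsstool_argv(host, command, command_options=(), port=None):
    """Build an argv list with global options before the command, as hsstool requires."""
    argv = ["hsstool", "-h", host]
    if port is not None:            # only needed for a non-default JMX port (default 19082)
        argv += ["-p", str(port)]
    argv.append(command)            # the command comes after the global options
    argv += list(command_options)   # command options come last
    return argv

argv = build_hsstool_argv("cloudian-node1", "cleanup", ["-x"])
print(" ".join(argv))  # hsstool -h cloudian-node1 cleanup -x
```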
All hsstool operations activity is logged in the cloudian-hyperstore.log file on the node to which you sent the
hsstool command.
For usage information type hsstool help or hsstool help <command>. The usage information that this returns
is not nearly as detailed as what's provided in this documentation, but it does provide basic information about
syntax and command options.
The table below lists the commands that hsstool supports. For command options and usage information,
click on a command name.
Note All hsstool commands can be executed either on the command line or through the CMC’s Node
Advanced page (Cluster -> Nodes -> Advanced). The documentation for these commands shows the
CMC interface for each command as well as the command line syntax.
madd        This is for internal use by the install script, or for use as directed by Cloudian Support.

Maintenance Commands:

cleanup     Clean a node of replicated data that doesn’t belong to it
cleanupec   Clean a node of erasure coded data that doesn’t belong to it
rebalance   Shift data load from your existing nodes to a newly added node
opctl       List or stop all repair and cleanup operations currently running in the region
7.1. Command Line Tools
Options
l Use allkeyspaces to clean up replicated S3 objects in the HSFS and also clean up all the Cassandra
keyspaces. Cassandra cleanup will be completed first, then HSFS replica cleanup. The Cassandra key-
spaces that will be cleaned are: UserData_<storage-policy-ID> keyspaces; AccountInfo; Reports; Mon-
itoring; and ECKeyspace. (For more information see the overview of Cassandra keyspaces for
HyperStore)
l Use nokeyspaces to clean up only replicated objects in the HSFS, and not any Cassandra keyspaces
l If you specify neither allkeyspaces nor nokeyspaces then the default behavior is to clean up replicated
objects in the HSFS and also to clean the Cassandra UserData_<storage-policy-ID> keyspaces (which
store object metadata). Cassandra cleanup will be completed first, then HSFS replica cleanup.
l Use the nokeyspaces option. The ability to do a dry run without actually deleting data is supported only
for HSFS replica data. Therefore you must select the nokeyspaces option or else the cleanup run will
actually clean Cassandra keyspace data.
l Have cleanup operation logging turned on (as it is by default). See the description of the -l option
below.
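The three keyspace-option behaviors above can be restated as a small lookup. This is purely an illustration of the documented behavior, not Cloudian code:

```python
# Illustration of the documented cleanup scope per keyspace option:
# which Cassandra keyspaces are cleaned in addition to HSFS replica data.
# In all three cases HSFS replica cleanup is performed (after any Cassandra
# cleanup completes).

def cleanup_scope(option=None):
    """Return (cleans_hsfs_replicas, cassandra_keyspaces) for a cleanup keyspace option."""
    if option == "allkeyspaces":
        return True, ["UserData_<storage-policy-ID>", "AccountInfo",
                      "Reports", "Monitoring", "ECKeyspace"]
    if option == "nokeyspaces":
        return True, []
    # Default (neither option given): HSFS replicas plus object-metadata keyspaces.
    return True, ["UserData_<storage-policy-ID>"]

print(cleanup_scope("nokeyspaces"))  # (True, [])
```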
Note If you use the -n option when you run the cleanup command, one of the response items will be
"Number of files to be deleted count". The specific objects identified for deletion will be listed in the
cleanup log file as in the -l option description below.
If you use logging without using the -n option, then the list in the log file is a list of all objects that were deleted
by the cleanup operation.
If you use logging in combination with the -n option, then the list in the log file is a list of HSFS replica objects
that will be deleted if you run cleanup again without the -n option.
The log is named cloudian-hyperstore-cleanup.log and is written into the Cloudian HyperStore log directory of
the target host. Activity associated with a particular instance of a cleanup command run is marked with a
unique command number.
Note that the -no option is the opposite of the -x option, and therefore you cannot use those two options in com-
bination.
Note The CMC interface does not support the -no option. This option is supported only on the com-
mand line.
Do not use this option if you are using either the -d <mount-point> option or the -vnode <token> option.
This safety feature guards against the possibility of a cleanup operation deleting an incorrectly placed replica
when no other replicas of the object exist in any of the correct locations within the cluster. The trade-off is that
the cleanup operation will take longer if this approach is used.
This safety feature defaults to true, so there's no need to use the -c option unless you want to specify -c false in
order to skip this safety check.
Note Before deleting a storage policy you are required to delete any buckets and objects that are
stored under that policy. However, the way that object deletion works in HyperStore is that the system
deletes the object metadata immediately but does not delete the actual object data until the next hourly
run of the object deletion cron job. Meanwhile, for storage policy deletion, the full deletion of the
metadata associated with the policy is implemented by a daily cron job. Depending on when you
delete buckets and objects associated with a storage policy and when you delete the policy itself, the
timing may be such that the daily storage policy deletion cron job executes before some of the object
data associated with the policy gets deleted by the hourly object deletion cron job. It's this residual
"garbage data" that will be detected and removed if you use the -policy option when running cleanup
on a node.
Note The CMC interface does not support the -policy option. This option is supported only on the com-
mand line.
You can subsequently use the "hsstool opstatus" (page 332) command to confirm that the cleanup has been
stopped (status = TERMINATED) and to see how much cleanup progress had been made before the stop.
If you terminate an in-progress cleanup operation you will not subsequently be able to resume that operation
from the point at which it stopped. Instead, when you want to clean the node you can run a regular full cleanup
operation.
Use this hsstool command on a node when you want to identify and delete replica data that does not belong
on the node. Broadly, hsstool cleanup removes two classes of "garbage" data from a target node:
l Data that belongs to a token range that the target node is no longer responsible for, as a result of a mod-
ified token range allocation within the cluster (as occurs when you add a new node).
l Data that should not be on the node even though the data falls within the token ranges that the node is
responsible for. This can occur, for example, if data from objects that have been deleted through the S3
interface (or the Admin API's POST bucketops/purge operation) has not yet been removed from disk by
the hourly batch delete job; or if an object delete request through the S3 interface succeeds for some
but not all of the object’s replicas.
By default hsstool cleanup performs both types of cleanups, but the command supports an -x option to perform
only the first type and a -no option to perform only the second type.
By default you can only run hsstool cleanup on one node at a time per data center. This limit is configurable by
the "max.cleanup.operations.perdc" (page 461) setting in hyperstore-server.properties.erb. If you want to
raise this limit, consult with Cloudian Support first.
The system will not allow you to run hsstool cleanup on a node on which hsstool repair is currently running.
Note The hsstool cleanup operation will only clean objects whose Last Modified timestamp is older than the interval set by the system configuration property hyperstore-server.properties: cleanup.session.delete.graceperiod. By default this interval is one day. So by default no objects with Last Modified timestamps within the past 24 hours will be deleted by hsstool cleanup.
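The grace-period rule can be sketched as a simple timestamp comparison. This is an illustration of the documented behavior, not Cloudian code, and the timestamps below are made up:

```python
# Sketch of the grace-period rule: an object is eligible for cleanup only if
# its Last Modified timestamp is older than cleanup.session.delete.graceperiod
# (default: one day).
from datetime import datetime, timedelta

def eligible_for_cleanup(last_modified: datetime, now: datetime,
                         graceperiod: timedelta = timedelta(days=1)) -> bool:
    return last_modified < now - graceperiod

now = datetime(2021, 8, 15, 15, 0, 0)
print(eligible_for_cleanup(datetime(2021, 8, 13, 12, 0, 0), now))  # True: older than 24h
print(eligible_for_cleanup(datetime(2021, 8, 15, 3, 0, 0), now))   # False: modified 12h ago
```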
The operational procedures during which you would use hsstool cleanup are:
Please refer to those procedures for step-by-step instructions, including the proper use of hsstool cleanup
within the context of the procedure.
You might also use hsstool cleanup at the end of the procedure for "Adding Nodes" (page 212). However,
using hsstool cleanup at the end of the procedure for Adding Nodes is necessary only if you do not use one of
the options that integrates cleanup tasks into the rebalance operation (the rebalance -cleanupfile option or
the rebalance -cleanup option). For more information on these rebalance options, see "hsstool rebalance"
(page 344).
If you do run hsstool cleanup at the end of the Adding Nodes procedure, use the cleanup options allkeyspaces, -l, -x, -a, and -c. Note that the -a option applies the cleanup operation to erasure coded data as well as replicated data. The system by default only allows you to run cleanup on one node at a time per data center. After cleanup completes on one node, initiate cleanup on the next node, and continue in this way until all of your previously existing nodes have been cleaned.
As an alternative to running hsstool cleanup on the command line, you can run the command through the CMC UI:
Note If you launch the operation through the CMC UI, you can track the operation progress through the
CMC's Operation Status page (Cluster -> Operation Status). This way of tracking operation progress
is not supported if you launch the operation on the command line. However, regardless of how you
launch the operation you can periodically check on its progress by using the hsstool opstatus com-
mand.
Note This is typically a long-running operation and the command response will not return until the
operation completes. In the meanwhile you can track operation progress as described in "CMC Sup-
port For This Command" (page 314).
Response Items
optype
The type of hsstool operation.
cmdno#
Command number of the run. Each run of a command is assigned a number.
status
Status of the command run: INPROGRESS, COMPLETED, FAILED, or TERMINATED
A COMPLETED status means only that the operation did not error out and prematurely end. It does not mean
that the operation succeeded in respect to every object checked by the operation. For high-level information
about object cleanup successes and failures (if any), see the other fields in the cleanup response.
A FAILED status means that the operation ended prematurely due to errors. For additional status detail see the
other fields in the cleanup response. For details on any FAILED operation you can also scan cloudian-hyper-
store.log for error messages from the period during which the operation was running.
A TERMINATED status means that the cleanup run was terminated by an operator, using cleanup -stop.
arguments
Value of the command arguments used for the run, if any. The status results use internal system names for the
arguments which may not exactly match the command-line arguments that are defined in a command’s syntax,
but the relationships should be clear.
operation ID
Globally unique identifier of the cleanup run. This may be useful if Cloudian Support is helping you
troubleshoot a cleanup failure.
Note The "cmdno#" (described further above) cannot serve as a globally unique identifier because that counter resets to zero -- and subsequently starts to increment again -- when the HyperStore Service is restarted.
start
Start time of the operation.
end, duration
End time and duration of a completed operation.
progress percentage
Of the total work that the operation has identified as needing to be done, the percentage of work that has been
completed so far.
task count
The number of object replicas on the node, which the cleanup operation must evaluate to determine whether
they correctly belong on the node.
completed count
The number of object replicas that the cleanup operation evaluated to determine whether they correctly belong
on the node.
Note If you use the -n option when you run the cleanup command, this response item will be "Number
of files to be deleted count" rather than "Number of files deleted count".
failed count
The number of object replicas that the cleanup operation tried to delete (because they don't belong on the
node), but failed.
skipped count
The number of object replicas that the cleanup operation left on the node because they belong on the node.
Options
Note If you use the -n option when you run the cleanupec command, one of the response items will be
"Number of files to be deleted count". The specific objects identified for deletion will be listed in the
cleanup log file as in the -l option description below.
If you use logging without using the -n option, then the list in the log file is a list of all objects that were deleted
by the cleanupec operation.
If you use logging in combination with the -n option, then the list in the log file is a list of objects that will be
deleted if you run cleanupec again without the -n option.
The log is named cloudian-hyperstore-cleanup.log and is written into the Cloudian HyperStore log directory of
the target host. Activity associated with a particular instance of a cleanupec command run is marked with a
unique command number.
Note that the -no option is the opposite of the -x option, and therefore you cannot use those two options in com-
bination.
Note The CMC interface does not support the -no option. This option is supported only on the com-
mand line.
fragments of the object exist on the correct endpoint nodes within the cluster. If fewer than k+m fragments
exist, then the out-of-range fragment will be left in place on the node that's being cleaned and an ERROR level
message will be written to the HyperStore Service application log (which will also result in the triggering of an
Alert in the CMC, if you are using the default alert rules).
This safety feature guards against the possibility of a cleanup operation deleting an incorrectly placed fragment
when fewer than k+m fragments of the object exist within the cluster. The trade-off is that the cleanup operation
will take longer if this approach is used.
This safety feature defaults to true, so there's no need to use the -c option unless you want to specify -c false in
order to skip this safety check.
Note Before deleting a storage policy you are required to delete any buckets and objects that are
stored under that policy. However, the way that object deletion works in HyperStore is that the system
deletes the object metadata immediately but does not delete the actual object data until the next hourly
run of the object deletion cron job. Meanwhile, for storage policy deletion, the full deletion of the
metadata associated with the policy is implemented by a daily cron job. Depending on when you
delete buckets and objects associated with a storage policy and when you delete the policy itself, the
timing may be such that the daily storage policy deletion cron job executes before some of the object
data associated with the policy gets deleted by the hourly object deletion cron job. It's this residual
"garbage data" that will be detected and removed if you use the -policy option when running cleanup
on a node.
Note The CMC interface does not support the -policy option. This option is supported only on the com-
mand line.
You can subsequently use the "hsstool opstatus" (page 332) command to confirm that the cleanupec oper-
ation has been stopped (status = TERMINATED) and to see how much progress had been made before the
stop.
If you terminate an in-progress cleanupec operation you will not subsequently be able to resume that operation
from the point at which it stopped. Instead, when you want to clean the node you can run a regular full cleanu-
pec operation.
Note If you want to clean replica data and also erasure coded data on a node, use hsstool cleanup
with the -a option. If you want to clean only erasure coded data on a node, use hsstool cleanupec as
described below.
Use this hsstool command on a node when you want to identify and delete erasure coded data that does not
belong on the node. Broadly, hsstool cleanupec removes two classes of "garbage" data from a target node:
l Data that belongs to a token range that the target node is no longer responsible for, as a result of a mod-
ified token range allocation within the cluster (as occurs when you add a new node).
l Data that should not be on the node even though the data falls within the token ranges that the node is
responsible for. This can occur, for example, if data from objects that have been deleted through the S3
interface (or the Admin API's POST bucketops/purge operation) has not yet been removed from disk by
the hourly batch delete job; or if an object delete request through the S3 interface succeeds for some
but not all of the object’s replicas.
By default hsstool cleanupec performs both types of cleanups, but the command supports an -x option to per-
form only the first type and a -no option to perform only the second type.
By default you can only run hsstool cleanupec on one node at a time per data center. This limit is configurable
by the "max.cleanup.operations.perdc" (page 461) setting in hyperstore-server.properties.erb. If you want
to raise this limit, consult with Cloudian Support first.
The system will not allow you to run hsstool cleanupec on a node on which hsstool repairec is currently run-
ning.
Note The hsstool cleanupec operation will only clean objects whose Last Modified timestamp is older than the interval set by the system configuration property hyperstore-server.properties: cleanup.session.delete.graceperiod. By default this interval is one day. So by default no objects with Last Modified timestamps within the past 24 hours will be deleted by hsstool cleanupec.
If you have erasure coded data in your HyperStore system, the operational procedures during which you would
use hsstool cleanupec are:
Please refer to those procedures for step-by-step instructions, including the proper use of hsstool cleanupec
within the context of the procedure.
As an alternative to running hsstool cleanupec on the command line, you can run the command through the CMC UI:
Note If you launch the operation through the CMC UI, you can track the operation progress through the
CMC's Operation Status page (Cluster -> Operation Status). This way of tracking operation progress
is not supported if you launch the operation on the command line. However, regardless of how you
launch the operation you can periodically check on its progress by using the hsstool opstatus com-
mand.
failed count: 0
skipped count: 190000
rcvd connection count: 0
open connection count: 0
job threadpool size: 10
message threadpool active: 0
timer: batch.copies.find.timer: count=0 mean=NaN; batch.fragments.find.timer: count=68
mean=85069.69420885573; batch.cleanup.file.timer: count=68 mean=5.230084050540501E9;
Note This is typically a long-running operation and the command response will not return until the
operation completes. In the meanwhile you can track operation progress as described in "CMC Sup-
port For This Command" (page 321).
Response Items
optype
The type of hsstool operation.
cmdno#
Command number of the run. Each run of a command is assigned a number.
status
Status of the command run: INPROGRESS, COMPLETED, FAILED, or TERMINATED
A COMPLETED status means only that the operation did not error out and prematurely end. It does not mean
that the operation succeeded in respect to every object checked by the operation. For high-level information
about object cleanup successes and failures (if any), see the other fields in the cleanup response.
A FAILED status means that the operation ended prematurely due to errors. For additional status detail see the
other fields in the cleanup response. For details on any FAILED operation you can also scan cloudian-hyper-
store.log for error messages from the period during which the operation was running.
A TERMINATED status means that the cleanup run was terminated by an operator, using cleanupec -stop.
arguments
Value of the command arguments used for the run, if any. The status results use internal system names for the
arguments which may not exactly match the command-line arguments that are defined in a command’s syntax,
but the relationships should be clear.
operation ID
Globally unique identifier of the cleanupec run. This may be useful if Cloudian Support is helping you
troubleshoot a cleanupec failure.
Note The "cmdno#" (described further above) cannot serve as a globally unique identifier because that counter resets to zero -- and subsequently starts to increment again -- when the HyperStore Service is restarted.
start
Start time of the operation.
end, duration
End time and duration of a completed operation.
progress percentage
Of the total work that the operation has identified as needing to be done, the percentage of work that has been
completed so far.
task count
The number of erasure coded fragments on the node, which the cleanup operation must evaluate to determine
whether they correctly belong on the node.
completed count
The number of erasure coded fragments that the cleanup operation evaluated to determine whether they cor-
rectly belong on the node.
Note If you use the -n option when you run the cleanupec command, this response item will be "Num-
ber of files to be deleted count" rather than "Number of files deleted count".
failed count
The number of erasure coded fragments that the cleanup operation tried to delete (because they don't belong
on the node), but failed.
skipped count
The number of erasure coded fragments that the cleanup operation left on the node because they belong on
the node.
This hsstool command returns virtual node (token range) information and data load information for a specified
physical node within a storage cluster. The return includes a list of virtual nodes (tokens) assigned to the phys-
ical node.
As an alternative to running hsstool info on the command line, you can run it through the CMC UI:
The example below shows an excerpt from a response to the info command. The command returns information
about a specific node, "cloudian-node1". The node’s token (vNode) list is sorted in ascending order.
Note To see which tokens are on which disks on a node, use hsstool ls.
Response Items
Cloudian
Cloudian HyperStore software version installed on the node.
Cloudian Load
The total volume of S3 object data (replicas and/or erasure coded fragments) stored in the HyperStore File Sys-
tem on the node, across all HyperStore data disks combined.
This field also shows the total volume of disk space allocated for S3 object storage on the node (the total capa-
city of HyperStore data disks combined); and the percentage of used volume over total capacity.
Uptime
The number of seconds that the HyperStore Service has been running since its last start.
Token
A storage token assigned to the node. Tokens are randomly generated from an integer token space ranging from 0 to 2^127 - 1, and distributed around the cluster. Each token is the top of a token range that constitutes a virtual
node (vNode). Each vNode's token range spans from the next-lower token (exclusive) in the cluster up to its
own token (inclusive). A physical node's set of tokens/vNodes determines which S3 object data will be stored
on the physical node.
For more background information see "How vNodes Work" (page 55).
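The ownership rule above (each vNode owns the range from the next-lower token, exclusive, up to its own token, inclusive, with wraparound at the top of the ring) can be sketched with a sorted-token lookup. The token values below are made up for illustration; this is not Cloudian code:

```python
# Sketch of the vNode ownership rule: find the token whose range
# (prev_token, token] contains a given hashed value, wrapping around the ring
# when the value is above the highest token.
import bisect

def owning_token(ring_tokens, value):
    """Return the token that owns the range containing the hashed value."""
    tokens = sorted(ring_tokens)
    i = bisect.bisect_left(tokens, value)  # first token >= value (upper bound is inclusive)
    if i == len(tokens):                   # above the highest token: wrap to the lowest
        return tokens[0]
    return tokens[i]

ring = [100, 400, 900]
print(owning_token(ring, 150))   # 400: falls in (100, 400]
print(owning_token(ring, 400))   # 400: upper bound is inclusive
print(owning_token(ring, 950))   # 100: wraps around the ring
```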
Cassandra
Cassandra software version installed on the node.
Cassandra Load
Cassandra storage load (quantity of data stored in Cassandra) on the node. There will be some Cassandra
load even if all S3 object data is stored in the HyperStore File System. For example, Cassandra is used for stor-
age of object metadata and service usage data, among other things.
Data Center
Data center in which the node resides.
Rack
Rack in which the node resides.
7.1.1.4. hsstool ls
Subjects covered in this section:
This hsstool command returns a node’s list of HyperStore data mount points, the list of storage tokens currently
assigned to each mount point on the node, the current disk usage per mount point, and the recent errors per
mount point if any.
As an alternative to running hsstool ls on the command line, you can run it through the CMC UI:
The example below shows a response to an hsstool ls command. The command response snippet below is
truncated; the actual response would list all the tokens assigned to each HyperStore data directory mount point
on the node.
/cloudian61/hsfs:
12449505273998224519582214417366908928 HfkGaHQIEduYOZDCRLzW4
37348191303441015132019860096080150528 r1D1xZzjR37DQCOlSVBC4
...
...
Response Items
Filesystem
Device name of the disk drive
Size
Total capacity of the disk
Used
Used capacity of the disk
Avail
Remaining available capacity of the disk
Use%
Percentage of disk capacity used
Mounted on
Mount point of the device
Status
Disk status: either OK or ERROR or DISABLED. For more information on the disk's status see the CMC's Node
Status page, specifically the Disk Detail Info section.
Disk errors
Count of recent disk errors, if any. These counts are categorized into filesystem errors, I/O errors, and file not
found errors. The counts are reset whenever any of these events occur:
Token list
For each HyperStore data mount point, the lower section of the ls command response lists all the storage
tokens currently assigned to that mount point. Displayed alongside the decimal version of each storage token
is the base62 encoding of the token. In the HyperStore File System, base62 encoded tokens will be part of the
directory structure for stored S3 object data. For example, under directory <mount-point>/hsfs/<base62-
encoded-tokenX>/... would be the S3 object replica data associated with the token range for which tokenX is
the upper bound. For more information on S3 storage directory structure see "HyperStore Service and the
HSFS" (page 38).
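To illustrate the kind of encoding involved, here is a hypothetical base62 encoder. The exact alphabet and digit ordering that HyperStore uses is not specified here, so the encoded string may not match HyperStore's actual directory names; only the general shape of the transformation is shown:

```python
# Hypothetical sketch of base62-encoding a storage token, showing why the
# on-disk directory names are much shorter than the decimal tokens. This
# assumes a conventional 0-9, A-Z, a-z alphabet, which may differ from the
# alphabet HyperStore actually uses.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def base62(n):
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, r = divmod(n, 62)
        digits.append(ALPHABET[r])
    return "".join(reversed(digits))

token = 12449505273998224519582214417366908928  # token from the ls example above
encoded = base62(token)
print(len(encoded))  # 21: a 21-character directory-name component
```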
Options
If the object name has spaces in it, enclose the bucket/object name pair in quotes. For example, "mybucket/big
document.doc".
Note In the CMC UI implementation of this command, you enter the bucket name and the full object
name (including folder path) in separate fields. For example, bucket name mybucket and full object
name Videos/Vacation/Italy_2021-06-27.mpg.
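As a sketch of the quoting involved, the following uses Python's shlex.quote to build a safely quoted command line for an object name containing spaces (the hostname is hypothetical):

```python
import shlex

# Sketch: build a correctly quoted hsstool command line for an object name
# that contains spaces. The hostname "node1" is hypothetical.
bucket, objname = "mybucket", "big document.doc"
cmd = "hsstool -h node1 metadata " + shlex.quote(f"{bucket}/{objname}")
print(cmd)  # hsstool -h node1 metadata 'mybucket/big document.doc'
```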
This hsstool command returns metadata for a specified S3 object, such as the object size and the date-time
that the object was last accessed by an S3 client application.
As an alternative to running hsstool metadata on the command line, you can run it through the CMC UI:
The metadata command example below returns the metadata for the specified object.
Response Items
Key
Key that uniquely identifies the S3 object, in format <bucketname>/<objectname>.
PolicyID
System-generated identifier of the storage policy that applies to the bucket in which this object is stored.
Version
Object version, if versioning has been used for the object. Versions are identified by timeuuid values in hexa-
decimal format. If versioning has not been used for the object, the Version field displays "Null".
Compression
Compression type applied to the object, if any.
Create Time
Timestamp for the original creation of the object. Format is ISO 8601 and the time is in Coordinated Universal
Time (UTC).
Last Modified
Timestamp for last modification of the object. Format is ISO 8601 and time is in UTC.
Digest
MD5 digest of the object. This will be a 32 digit hexadecimal number. This digest is used in a variety of oper-
ations including data repair.
Size
The object’s size in bytes.
Region
The HyperStore service region in which the object is stored.
which object metadata is organized per bucket) and CLOUDIAN_OBJMETADATA column family (in which
object metadata is organized per object). This raw object metadata may be useful if you are working with
Cloudian Support to troubleshoot an issue in regard to the object.
Note There is overlap in the content of these two sets of raw object metadata.
Options
This hsstool command returns a list of all hsstool repair, hsstool repairec, hsstool cleanup, and hsstool
cleanupec operations currently running in a service region. You can also use the command to stop all those
in-progress operations.
The target <host> can be any node in the service region. The command will apply to all nodes in the service
region.
With this command you must use either the -l option or the -stop option ("hsstool opctl" by itself doesn't do any-
thing).
Note The hsstool opctl command is not supported in the CMC UI.
The example below shows the responses to hsstool opctl commands. The first command lists the in-progress
repair and cleanup operations in the cluster (in this example only a replica repair operation is in progress). The
second command stops the in-progress operation(s).
min-modified-ts=0,logging=true,full-repair=true,check-metadata=true,cmdno=1,merkletree=true,
computedigest=false
Options
Valid operation types are listed below. If you do not specify a type, status is returned for all supported operation
types.
l cleanup
l cleanupec
l decommissionreplicas
l decommissionec
l proactiverebalance
l proactiverepair
l proactiverepairec
l rebalance
l rebalanceec
l repair
l repaircassandra
l repairec
Note The decommission operation is automatically invoked by the CMC's "Uninstall" feature if you
uninstall a node that's "live" in the Cassandra ring. Status reporting on a decommission operation is
broken out to decommissionreplicas (for replicated data) and decommissionec (for erasure coded
data).
-a Verbose output
(Optional) Verbose status output for a repair, repairec, rebalance, or rebalanceec operation. The additional
status detail that this option provides can be helpful if you are working with Cloudian Support to troubleshoot a
repair or rebalance problem.
For example:
If you want just a history of a particular repair or cleanup operation type, use:
where <operation> is rebalance, rebalanceec, repair, repairec, cleanup, or cleanupec. For example:
Note The CMC does not support the "-q history" option.
This hsstool command returns the status of the most recent runs of repair, cleanup, rebalance, or decom-
mission operations that have been performed on a specified node. For each operation type:
l If a run of the operation is in progress on the node, then that’s the run for which status is returned.
l If the operation is not currently in progress on the node, then status is returned for the most recent run
of that operation on the node, in the time since the last restart of the node.
For checking the status of repair runs other than the most recent run, opstatus also supports a command line
option to return a 90-day history of repairs performed on the target node.
Note For operations that you've launched through the CMC UI, a convenient way to check operation
status is through the CMC's Operation Status page (Cluster -> Operation Status). This page does not
report on operations that you've launched on the command line -- for such operations hsstool opstatus
is your only option for checking status.
As an alternative to running hsstool opstatus on the command line, you can run it through the CMC UI:
Note For in-progress operations, rather than an end time and duration the opstatus response will show
an estimated completion time and estimated time remaining.
Note "REPAIRCASSANDRA" status metrics will appear in opstatus results for the Cassandra key-
space repair part of the auto-repair feature; or for when you manually run hsstool repair either with its
default behavior (which includes a repair of user data keyspaces in Cassandra) or with the "allkey-
spaces" option (which includes a repair of user data keyspaces and service metadata keyspaces in
Cassandra).
failed count: 0
skipped count: 0
stream jobs total: 5
stream jobs completed: 5
streamed bytes: 500
Note For in-progress operations, rather than an end time and duration the opstatus response will show
an estimated completion time and estimated time remaining.
Note The decommission operation is automatically invoked by the CMC's "Uninstall" feature, if you use
the Uninstall feature to remove a node that's "live" in the Cassandra ring. Status reporting on a decom-
mission operation is broken out to decommissionreplicas (for replicated data) and decommissionec (for
erasure coded data).
optype
The type of hsstool operation.
cmdno#
Command number of the run. Each run of a command is assigned a number.
status
Status of the command run: INPROGRESS, COMPLETED, or FAILED
A COMPLETED status means only that the operation did not error out and prematurely end. It does not mean
that the operation succeeded in respect to every object checked by the operation. For high-level information
about decommission operation successes and failures (if any), see the other fields in the response.
A FAILED status means that the operation ended prematurely due to errors. For additional status detail see the
other fields in the response. For details on any FAILED operation you can also scan cloudian-hyperstore.log
for error messages from the period during which the operation was running.
Note Decommission processes replica data first and then erasure coded data. After decommission finishes
for erasure coded data, the HyperStore Service on the node immediately shuts down and can no longer
respond to hsstool commands, including hsstool opstatus. Therefore the decommissionec operation will
never show a status of COMPLETED (since the HyperStore Service shuts down upon decommissionec
completion).
operation ID
Globally unique identifier of the decommission run. This may be useful if Cloudian Support is helping you
troubleshoot a decommission failure. Note that when decommission is run, the
DECOMMISSIONREPLICAS part of the response (for replica data) and DECOMMISSIONEC part of the
response (for erasure coded data) will both have the same operation ID.
Note The "cmdno#" (described further above) cannot serve as a globally unique identifier because that
counter resets to zero -- and subsequently starts to increment again -- when the HyperStore Service is
restarted.
start
Start time of the operation.
progress percentage
Of the total work that the operation has identified as needing to be done, the percentage of work that has been
completed so far.
task count
The total number of files that are evaluated for possible streaming from the node to be decommissioned to
other nodes in the cluster. Each such file constitutes a "task".
completed count
From the total task count, the number of tasks that have been completed so far (that is, the number of files for
which processing has been completed). Each completed task results in the incrementing of either the
"streamed count", the "failed count", or the "skipped count".
streamed count
The number of object replica files successfully streamed (copied) from the decommissioned node to other
nodes in the cluster.
streamed bytes
The total number of object replica file bytes successfully streamed (copied) from the decommissioned node to
other nodes in the cluster.
failed count
The number of files for which the attempt to stream the file to a different node failed as a result of an error. For
information about such failures, on the decommissioned node you can scan /var/log/cloudian/cloudian-hyper-
store.log for error messages from the time period during which the decommission operation was running.
skipped count
The number of files for which the streaming operation is skipped because a file that was going to be streamed
from the decommissioned node to a different target node in the cluster is found to already exist on the target
node.
The "streamed count" plus the "failed count" plus the "skipped count" will equal the "completed count".
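The relationship among these counters can be checked with a quick sketch (the counter values are made up for illustration):

```python
# Sanity-check sketch of the documented relationship among the opstatus
# counters for a decommission run: streamed + failed + skipped should equal
# completed. The values below are invented for illustration only.
counters = {"task": 1000, "completed": 1000,
            "streamed": 940, "failed": 10, "skipped": 50}

assert (counters["streamed"] + counters["failed"] + counters["skipped"]
        == counters["completed"])
print("counters are consistent")
```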
Note For "decommission" operations, opstatus can only report in-progress status, not final, completed
status. This is because the HyperStore service on the decommissioned node is immediately stopped at
the completion of the decommission operation. For final decommission operation status you can review
/var/log/cloudian/cloudian-hyperstore.log on the decommissioned node.
Options
If you use hsstool -h <host> proactiverepairq -a, the command returns the number of nodes that are in need
of proactive repair, the IP addresses of those nodes, and precise information about the count and total size
of objects in need of proactive repair on each node. When you use the -a option, a scan of Cassandra
metadata is done, which is a much more resource-intensive operation than if you omit the -a option.
Note To see how much proactive repair work has already been completed on a given node, use the
"hsstool opstatus" (page 332) command on that node.
Under normal circumstances you should not need to use the -delete <host> option. You might however use
this option if you are working with Cloudian Support to troubleshoot problems on a node.
Note The CMC interface does not support the -delete <host> option. This option is supported only on
the command line.
Optionally you can restrict the immediate proactive repair to a particular category of proactive repair: proactive
repair of replica data for an existing node; proactive repair of erasure coded data for an existing node; or pro-
active repair of object data streaming failures from a recently completed rebalance operation for a newly added
node. For example, use hsstool -h <host> proactiverepairq -start -type rebalance to immediately initiate pro-
active repair on a new host after a rebalance operation that reported failures for some objects.
If you do not include the -type option, then using -start will immediately initiate proactive repair for all types of
proactive repair that are currently needed on the target node.
Note Proactive repair is triggered automatically every hour (by default configuration; see hyperstore-
server.properties.erb:"hyperstore.proactiverepair.poll_time" (page 461)), on all nodes that are in
need of proactive repair, for all proactive repair types. No operator action is required. So there is no
need to use the -start option unless for some reason you want proactive repair to begin immediately on
a particular node rather than waiting for the next automatic hourly run of proactive repair.
If proactive repair is in progress on multiple hosts and you want to stop all the proactive repairs, you must
submit an hsstool -h <host> proactiverepairq -stop command separately for each of those hosts.
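A minimal sketch of issuing the per-host stop commands (the hostnames are hypothetical; replace the print with an actual subprocess call to execute the commands):

```python
# Sketch: hsstool has no cluster-wide stop for proactive repair, so a -stop
# command must be issued once per host. The hostnames here are hypothetical;
# swap the print for subprocess.run(cmd.split()) to actually run the commands.
hosts = ["hyperstore1", "hyperstore2", "hyperstore3"]
cmds = [f"hsstool -h {host} proactiverepairq -stop" for host in hosts]
for cmd in cmds:
    print(cmd)
```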
Note Unlike the -start option, the -stop option does not support specification of a proactive repair type.
Instead the -stop option stops all types of in-progress proactive repair on the target host.
-enable true|false Enable or disable proactive repair throughout the service region
(Optional) Enable or disable the proactive repair feature. By default the feature is enabled.
If you use hsstool -h <host> proactiverepairq -enable false to disable the proactive repair feature, this applies
to all nodes in the service region of the specified host. So, it doesn't matter which host you specify in the com-
mand as long as it's in the right service region. Likewise hsstool -h <host> proactiverepairq -enable true re-
enables the proactive repair feature for all nodes in the service region.
Disabling the proactive repair feature does not abort in-progress proactive repairs. Rather, it prevents any
additional proactive repairs from launching. To stop in-progress proactive repairs on a particular node use the -
stop option.
IMPORTANT ! The proactive repair feature is important for maintaining data integrity in your system.
Do not leave it disabled permanently.
Note The proactive repair feature can also be disabled and re-enabled by using hsstool repairqueue -
h <host> -enable true|false. The difference is that the "hsstool repairqueue" (page 374) approach dis-
ables and re-enables scheduled auto-repairs and also proactive repairs, whereas the hsstool pro-
activerepairq approach disables and re-enables only proactive repair.
If you use both types of commands, then whichever command you used most recently will be operative
in regard to the proactive repair feature. For example if you run hsstool repairqueue -h <host> -enable
false and then subsequently you run hsstool -h <host> proactiverepairq -enable true, then proactive
repair will be enabled in the cluster (while the scheduled auto-repair feature will remain disabled).
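The interplay described in this note can be modeled as a small toy state machine (an illustration of the documented semantics, not HyperStore code):

```python
# Toy model of the documented interplay between the two enable/disable
# commands: repairqueue -enable governs both scheduled auto-repair and
# proactive repair; proactiverepairq -enable governs proactive repair only;
# and for proactive repair, the most recently issued command wins.
state = {"auto_repair": True, "proactive_repair": True}

def repairqueue_enable(flag):       # models: hsstool repairqueue -enable <flag>
    state["auto_repair"] = flag
    state["proactive_repair"] = flag

def proactiverepairq_enable(flag):  # models: hsstool proactiverepairq -enable <flag>
    state["proactive_repair"] = flag

repairqueue_enable(False)           # disables both features
proactiverepairq_enable(True)       # re-enables proactive repair only
print(state)  # {'auto_repair': False, 'proactive_repair': True}
```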
This hsstool command returns information about nodes that are in need of automated proactive repair. This
includes nodes for which automated proactive repair is in progress as well as nodes for which automated pro-
active repair will begin shortly. You can also use the command to immediately start proactive repair (rather
than waiting for the automatic hourly run); or to stop in-progress proactive repairs; or to temporarily disable the
proactive repair feature (and to re-enable it after having disabled it).
For retrieving proactive repair queue information (hsstool -h <host> proactiverepairq) or for disabling or re-
enabling the proactive repair feature (hsstool -h <host> proactiverepairq -enable true|false), the <host> can be
any host in your cluster. The command applies to all nodes in the service region of the host that you specify.
For starting or stopping a proactive repair (hsstool -h <host> proactiverepairq -start or hsstool -h <host> pro-
activerepairq -stop), the <host> is the host on which you want to start or stop a proactive repair.
For more information about the HyperStore proactive repair feature see "Automated Data Repair Feature
Overview" (page 252).
As an alternative to using the command line, in the CMC UI you can run the hsstool proactiverepairq com-
mand's proactive repair queue status reporting function through this interface:
The functions for disabling or re-enabling proactive repair, immediately starting a proactive repair, or stopping
an in-progress proactive repair have their own separate CMC interface and the command is there renamed as
"proactiverepair" (although hsstool proactiverepairq is being invoked behind the scenes):
The proactiverepairq command example below shows that one node in the cluster is in need of proactive
repair, and that an estimated 100 objects are queued for proactive repair on that node. This proactive repair
occurs automatically; no operator action is required. The repair is either already underway or will be triggered
at the next interval (default is hourly).
The proactiverepairq -a command example below shows that one node in the cluster is in need of proactive
repair, and that the needed repair involves eight object replicas totaling about 1.6 MB. This proactive repair
occurs automatically; no operator action is required. The repair is either already underway or will be triggered
at the next interval (default is hourly).
Note The proactiverepairq -a option provides more precise information about the number of
queued objects than the proactiverepairq command without the -a option -- but using the -a
option is resource intensive.
Response Items
Proactive repair
This indicates whether the proactive repair feature is enabled, true or false. By default proactive repair is
enabled.
An exact count is available if you use the proactiverepairq -a option, but using that option is resource intensive.
<hostname>(<IPAddress>)
Hostname and IP address of a node that is in need of proactive repair. The results will show one line for each
node that needs proactive repair, with each such line starting with the node’s IP address.
If the proactive repair is in progress, these numbers indicate how much is left to do.
If the proactive repair is in progress, this number indicates how much is left to do.
Options
-cleanupfile | -cleanup Remove from old nodes the data copied to new node
When you add a new node to your cluster, the new node takes over some portions of the token space from the
existing nodes in the cluster. Based on the new token space allocation, the rebalance operation copies certain
object replicas and/or erasure coded fragments to the new node, from the existing nodes. Then, having been
copied to the new node, replicas and/or fragments can be deleted from existing nodes on which they no longer
belong (as a result of some portions of the token space having been taken over by the new node). This
"cleanup" action frees up storage space on those existing nodes.
l To have the system delete each replica or fragment from the appropriate existing node as soon as it is
successfully copied to the new node, use the -cleanupfile option when you run rebalance. Using the -
cleanupfile option is the recommended method in typical circumstances. This option also cleans up
the corresponding metadata in Cassandra.
l Alternatively, if you use the -cleanup option -- instead of the -cleanupfile option -- the system will clean
up entire token ranges on the appropriate existing nodes, after rebalance tasks (the copying over of
replicas or fragments) have successfully completed for those token ranges. This option is appropriate
only if you have reason to believe that your system had a lot of "garbage files" -- extra replicas and/or
fragments, on nodes on which they do not belong -- before you added the new node.
Note If you use neither the -cleanupfile option nor the -cleanup option when you run rebalance, then
after the rebalance operation completes for the new node -- or after rebalance operations complete for
all new nodes if you are adding multiple new nodes -- you will need to run hsstool cleanup on each of
the older nodes, one node at a time, in order to free up storage space on those nodes. For more details
about this approach to cleaning up data after a rebalance operation, see "hsstool cleanup" (page
310).
When running hsstool -h <host> rebalance -retry, do not include any other options in the command (such as
the -cleanupfile option). Instead, the retry operation will automatically implement the same option(s) that you
used in the original rebalance run, if any.
Note For detailed status information on a rebalance operation that you have already run -- including a
break-down of status by token range, so you can see whether any token ranges failed -- you can run
hsstool -h <host> opstatus rebalance -a (or opstatus rebalanceec -a) on the node.
l A node status of "REQUIRED" means that the node has been added to the cluster (through the CMC's
Add Node operation) but that you have not yet run hsstool rebalance on the new node.
l A node status of "DONE" means that rebalance has been successfully completed for the node.
l A node status of "FAILED" means that the rebalance operation failed for one or more token ranges.
Note This option is supported only on the command line -- not in the CMC UI.
Sample command and response, where one new node has been added to the cluster and rebalance has been
successfully completed on the node:
This option defaults to true, so you only need to specify the -l option if you do not want rebalanced object log-
ging (in which case you’d specify -l false).
Note This option is supported only on the command line -- not in the CMC UI.
You can subsequently use the "hsstool opstatus" (page 332) command to confirm that the rebalance has
been paused (status = PAUSED) and to see how much rebalance progress had been made before the pause.
It may take a while for the operation to become PAUSED, since the operation will first complete the pro-
cessing of whichever token range it was working on when you executed the -pause option.
When you are ready to resume a rebalance that you paused, use the rebalance -resume option.
The system allows pausing multiple rebalance operations concurrently, on different nodes. So if you have
added multiple nodes to your cluster and are running rebalance operations on the new nodes concurrently,
you can pause the rebalance operations on all of those nodes -- so that no rebalance operations are running
in your cluster -- and then later concurrently resume the rebalance operations on each node.
IMPORTANT ! On each new node you can only resume a rebalance operation once. So when you
resume the operation on the target node, do not use the -pause option a second time on the same
node and be careful not to interrupt the operation. If you do pause the operation a second time, or inad-
vertently interrupt the operation, then you will subsequently need to start over by running a full rebal-
ance operation on the node. Do not use -resume a second time on the same node.
Note During a rebalance operation the system automatically disables the auto-repair and proactive
repair features, and the system will not automatically re-enable these features until all rebalance oper-
ations in the service region have been fully completed. So during a rebalance pause, auto-repair and
proactive repair will remain disabled.
Note The -pause option is supported only on the command line -- not in the CMC UI.
paused. The rebalance operation will continue with the token ranges that have not yet been processed.
IMPORTANT ! You can only resume a rebalance operation on a node once. So when you resume
the operation on the target node, do not use the -pause option a second time on the same node and be
careful not to interrupt the operation. If you do pause the operation a second time, or inadvertently inter-
rupt the operation, then you will subsequently need to start over by running a full rebalance operation
on the node. Do not use -resume a second time on the same node.
Note The -resume option is supported only on the command line -- not in the CMC UI.
This hsstool command copies S3 object data from your existing nodes to a specified new node that you have
added to your HyperStore cluster. The rebalance operation populates the new node with its share of S3 object
replica data and erasure coded data, based on the token ranges that the system automatically assigned the
new node when you added it to the cluster.
The target <host> must be a newly added node, unless the -list option is used in which case the target <host>
can be any node.
All other nodes in the cluster must be up and running when you run rebalance on a new node.
The only time to use this command is when you have added a new node or nodes to an existing data cen-
ter. You will then run rebalance on the new node(s).
For complete instructions on adding new nodes including the proper use of the rebalance command within the
context of the procedure, see:
The rebalance is a background operation that may take many hours or days, depending on your HyperStore
cluster size and stored data volume. If you are adding multiple nodes to your cluster, it's OK to have rebalance
operations running on multiple new nodes concurrently. See the "Adding Nodes" (page 212) procedure for
detail.
In the event that the rebalance operation fails for some objects that are supposed to be copied to the new
node, these failures will subsequently be corrected automatically by the "Proactive Repair" (page 253) fea-
ture.
As an alternative to running hsstool rebalance on the command line, you can run it through the CMC UI:
Note If you launch the operation through the CMC UI, you can track the operation progress through the
CMC's Operation Status page (Cluster -> Operation Status). This way of tracking operation progress
is not supported if you launch the operation on the command line. However, regardless of how you
launch the operation you can periodically check on its progress by using the hsstool opstatus com-
mand.
The example below shows a rebalance operation being executed on a newly added node, and the command
response. Note that rebalance is implemented separately for replica data ("REBALANCE cmdno #1" in the
response) and erasure coded data ("REBALANCEEC cmdno #1").
Note For more detailed status information on a rebalance operation -- including a break-down of
status by token range -- you can run hsstool -h <host> opstatus rebalance -a (or opstatus rebalanceec -
a).
Response Items
optype
The type of hsstool operation.
cmdno#
Command number of the run. Each run of a command is assigned a number.
status
Status of the command run: INPROGRESS, COMPLETED, or FAILED
A COMPLETED status means only that the operation did not error out and prematurely end. It does not mean
that the operation succeeded in respect to every object checked by the operation.
A FAILED status means that the operation ended prematurely due to errors. For additional status detail see the
other fields in the operation response. For details on any FAILED operation you can also scan cloudian-hyper-
store.log for error messages from the period during which the operation was running.
arguments
Value of the command arguments used for the run, if any. The status results use internal system names for the
arguments which may not exactly match the command-line arguments that are defined in a command’s syntax,
but the relationships should be clear.
operation ID
Globally unique identifier of the rebalance run. This may be useful if Cloudian Support is helping you
troubleshoot a rebalance failure. Note that when rebalance is run, the REBALANCE part of the response (for
replica data) and REBALANCEEC part of the response (for erasure coded data) will both have the same oper-
ation ID.
Note The "cmdno#" (described further above) cannot serve as a globally unique identifier because that
counter resets to zero -- and subsequently starts to increment again -- when the HyperStore Service is
restarted.
start
Start time of the operation.
end, duration
End time and duration of a completed operation.
time remaining
Estimated time remaining to complete the operation.
progress percentage
Of the total work that the operation has identified as needing to be done, the percentage of work that has been
completed so far.
task count
From the existing cluster, the total number of files that are evaluated for possible streaming (copying) to the
newly added node, based on the newly added node's assigned tokens. Each such file constitutes a "task".
completed count
From the total task count, the number of tasks that have been completed so far -- that is, the number of files that
the system has evaluated and if appropriate has attempted to stream to the new node. Each "completed" task
results in the incrementing of either the "streamed count", the "failed count", the "skipped count", or the
"prqueued count".
At the end of the operation the "completed count" should equal the "task count".
streamed count
The number of files successfully streamed (copied) from the existing nodes to the new node.
failed count
The number of files for which the attempt to stream the file to the new node fails, and the subsequent attempt to
insert the file stream job into the proactive repair queue also fails.
Note If the stream attempt for a file fails, but the stream job for that file is successfully added to the pro-
active repair queue, that file is counted toward the "prqueued count" -- not the "failed count".
For detail about rebalance streaming failures, on the target node for the rebalance operation (the new node)
you can scan /var/log/cloudian/cloudian-hyperstore.log for error messages from the time period during which
the rebalance operation was running.
skipped count
For rebalancing of replicated object data: Replicas of each object to be streamed (copied) to a newly added
node will typically exist on multiple existing nodes. For example, in a 3X replication environment, for a given
object that should be streamed to the new node (based on the new node's token ranges), typically a replica file
will reside on three of the existing nodes. The evaluation and processing of each such replica file counts
towards the "task count" (so, 3 toward the task count in our example). But once a replica is streamed from one
existing node to the new node, it doesn't need to be streamed from the other two existing nodes to the new
node. On those other two existing nodes the replica file is "skipped".
prqueued count
The number of files for which the attempt to stream the file to the new node fails (even after automatic retries),
but the stream job for that file is successfully added to the proactive repair queue. These stream jobs will then
be automatically executed by the next run of the hourly proactive repair process.
Note For rebalance status detail for each individual job (also known as a "session"), run hsstool -h
<host> opstatus rebalance -a for replicated object data or hsstool -h <host> opstatus rebalanceec -a for
erasure coded object data. This detailed information can be helpful if you are working with Cloudian
Support to troubleshoot rebalance problems.
Chapter 7. Reference
streamed bytes
The number of bytes of object data streamed to the new node.
Options
l Use allkeyspaces to repair replicated objects in the HSFS and also repair all the Cassandra keyspaces.
Cassandra repair will be completed first, then HSFS replica repair. The Cassandra keyspaces that will
be repaired are: UserData_<storage-policy-ID> keyspaces; AccountInfo; Reports; Monitoring; and
ECKeyspace. (For more information see the overview of Cassandra keyspaces for HyperStore).
l Use nokeyspaces to repair only replicated objects in the HyperStore File System, and not any Cas-
sandra keyspaces.
l If you specify neither allkeyspaces nor nokeyspaces then the default behavior is to repair replicated
objects in the HSFS and also to repair the Cassandra UserData_<storage-policy-ID> keyspaces (which
store object metadata). Cassandra repair will be completed first, then HSFS replica repair.
Note If you wish, you can have some or all of the scheduled auto-repairs of replica data use the "-computedigest" option to combat bit rot. This aspect of auto-repair is controlled by the "auto_repair_computedigest_run_number" (page 424) setting in common.csv. By default "-computedigest" is not used in auto-repair runs.
If each node in the cluster is being repaired in succession, using this option makes the successive repair oper-
ations less redundant and more efficient.
Note The HyperStore auto-repair feature -- which automatically runs node repairs on a schedule --
uses the -pr option when it initiates the hsstool repair operation. For more on this feature see "Auto-
mated Data Repair Feature Overview" (page 252).
IMPORTANT ! Do not perform more than one mountpoint-specific repair at a time within a Hyper-
Store service region, even if the target disks are in different data centers. Repair just one disk at a
time, within a service region. Wait until repair of one disk is completed before repairing any other disk.
Repairing multiple disks concurrently can potentially result in data loss.
Note If you use the -d <mount-point> -rebuild option do not use the -pr option or the -range <start-
token,end-token> option. The system does not support using these options in combination.
The end-token must be a token that is assigned to the target host. To see what tokens are assigned to each
node you can use the "hsstool ring" (page 378) command. The start-token must be the next-lower token in
the ring from the end-token. Put differently, the start-token is the token that forms the lower boundary of the
vNode that's identified by the end-token.
This option may be useful if a previous full node repair failed for particular ranges. You can obtain information
about range repair failures (including the start and end tokens that bound any failed ranges) by running hsstool
-h <host> opstatus repair -a on the command line.
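The "next-lower token" rule can be illustrated with a toy ring. The token values below are made up; in practice they come from the hsstool ring output:

```shell
# Hypothetical ring of tokens assigned across the cluster
tokens="5000 1000 3000 4000 2000"
end_token=3000

# The start-token is the highest token in the ring that is lower than the end-token,
# i.e. the lower boundary of the vNode identified by the end-token
start_token=$(printf '%s\n' $tokens | sort -n \
    | awk -v e="$end_token" '$1 < e { s = $1 } END { print s }')
echo "repair -range ${start_token},${end_token}"
```

Here the ring sorted is 1000, 2000, 3000, 4000, 5000, so for end-token 3000 the start-token is 2000.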
Note If you use the -range <start-token,end-token> option do not use the -pr option or the -d <mount-
point> option. The system does not support using the -range option together with the -pr option, or the -
range option together with the -d option.
-t <min-timestamp,max-timestamp> Repair only objects last modified within a specified time period
(Optional) If you use this option the repair is performed only for objects that have a last-modified timestamp
equal to or greater than min-timestamp and less than max-timestamp. When using this option, use Unix mil-
liseconds as the timestamp format.
Note This option is not supported in the CMC interface -- only on the command line.
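Building the millisecond timestamps is straightforward with GNU date (assumed here; %s%3N prints epoch milliseconds). The sketch below computes the bounds for objects last modified during January 2024, UTC:

```shell
# Unix-millisecond bounds for a one-month window (UTC); assumes GNU date
min_ts=$(date -u -d '2024-01-01 00:00:00' +%s%3N)
max_ts=$(date -u -d '2024-02-01 00:00:00' +%s%3N)
echo "-t ${min_ts},${max_ts}"

# The resulting values would be passed on the command line as, for example:
#   hsstool -h <host> repair -t ${min_ts},${max_ts}
```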
The log is named cloudian-hyperstore-repair.log and is written into the Cloudian HyperStore log directory of
the target node. Activity associated with a particular instance of a command run is marked with a unique com-
mand number.
If Cassandra repair is in progress -- that is, if repair was launched with the default behavior or the allkeyspaces option, and the Cassandra part of the repair is still in progress -- the Cassandra repair is terminated. The HSFS replica repair -- which would normally launch after the Cassandra repair -- is canceled.
If HSFS replica repair is in progress -- that is, if repair was launched with the nokeyspaces option, or if it was launched with the default behavior or the allkeyspaces option and the Cassandra part of the repair has already completed and HSFS replica repair is underway -- the HSFS replica repair is terminated.
You can subsequently use the "hsstool opstatus" (page 332) command to confirm that the repair has been
stopped (status = TERMINATED) and to see how much repair progress had been made before the stop.
If you subsequently want to resume a repair that you stopped, you can use the repair -resume option.
Note The -stop option stops a single in-progress repair on a single node. It does not disable the Hyper-
Store scheduled auto-repair feature.
-resume Resume an incomplete repair
(Optional) Resuming a stopped or failed repair will repair token ranges that were not repaired by the previous repair.
Note If during the incomplete repair run you used the -pr option or the -d <mount-point> option, you
do not need to re-specify the option when running repair -resume. The system will automatically detect
that one of those options was used in the previous repair and will use the same option when resuming
the repair.
If during the incomplete repair run you used the -range option, then resuming the repair is not sup-
ported. Repair resumption works by starting from whichever token ranges were not repaired by the pre-
vious repair. Since a repair that uses the -range option targets just one token range, resuming such a
repair would not be any different than running that same repair over again. If you want to run the repair
over again, do repair -range <start-token,end-token> again, not repair -resume.
Note If you run repair -resume, then in the command results the Arguments section will include "resum-
ing-cmd=<n>", where <n> is the number of times that repair -resume has been run on the node since
the last restart of the HyperStore Service. This Argument string -- which serves to distinguish repair -
resume result output from regular repair result output -- also appears in "hsstool opstatus" (page 332)
results for repair -resume operations.
Use this hsstool command to check whether a physical node has all of the replicated data that it is supposed
to have (based on the node’s assigned tokens and on replication settings for the system); and to replace or
update any data that is missing or out-of-date. Replacement or update of data is implemented by retrieving cor-
rect and current replica data from other nodes in the system.
Note The system will not allow you to run hsstool repair:
* On a node on which hsstool cleanup is currently running.
* On any node if there is a disabled disk on any node in the same service region.
* On any node if hsstool repair is already running on a different node in the same service region. The
one exception to this rule is if you use the -pr option with each repair run -- you can run hsstool repair -
pr on multiple nodes concurrently.
The HyperStore system automatically uses a combination of read repair, proactive repair, and scheduled
auto-repair to keep the replica data on each node complete and current. Consequently, you should rarely
need to manually initiate a repair operation.
However, there are these uncommon circumstances when you should manually initiate repair on a specific
node:
l If you are removing a "dead" node from your cluster. In this circumstance, after removing the dead node
you will run repair on each of the remaining nodes, one node at a time. See "Removing a Node" (page
236) for details.
1. Monitor the automatic proactive repair that initiates on the node when the node starts up, until it
completes. You can check the CMC's Repair Status page (Cluster -> Repair Status) peri-
odically to see whether proactive repair is still running on the node that you've brought back
online. This proactive repair will repair the objects from the period when proactive repair
metadata was still being written to Cassandra for the node.
2. After proactive repair on the node completes, manually initiate a full repair of the node (using
hsstool repair and, if appropriate for your environment, hsstool repairec). This will repair objects
that were written after the proactive repair queueing time maximum was reached.
Note The repair operation will fail if the HyperStore service is down on any of the nodes affected by the
operation (the nodes storing the affected token ranges). The operation will also fail if any disk storing
affected token ranges is disabled or more than 95% full.
The table below lists data problem cases and shows whether or not they are remedied by regular repair and by
repair that uses the computedigest option. Although regular repair can handle some cases of corruption, if cor-
ruption is suspected on a node and you’re not certain exactly which data is corrupted, it’s best to use repair
computedigest.
Case | Will repair fix it? | Will repair computedigest fix it?
Missing blob file | yes | yes
As an alternative to running hsstool repair on the command line, you can run it through the CMC UI:
Note If you launch the operation through the CMC UI, you can track the operation progress through the
CMC's Operation Status page (Cluster -> Operation Status). This way of tracking operation progress
is not supported if you launch the operation on the command line. However, regardless of how you
launch the operation you can periodically check on its progress by using the hsstool opstatus com-
mand.
The example below shows a default run of hsstool repair, using no options.
Note In the example above there is very little data in the system and so the operation completes
almost instantly. In a real-world environment this is a long-running operation and the command
response will not return until the operation completes. In the meanwhile you can track operation pro-
gress as described in "CMC Support for This Command" (page 356).
Response Items
optype
The type of hsstool operation.
cmdno#
Command number of the run. Each run of a command is assigned a number.
status
Status of the command run: INPROGRESS, COMPLETED, FAILED, or TERMINATED.
A COMPLETED status means only that the operation did not error out and prematurely end. It does not mean
that the operation succeeded in respect to every object checked by the operation. For example in the case of
repair, a COMPLETED status means that all objects in the scope of the operation were checked to see if they
needed repair. It does not mean that all objects determined to need repair were successfully repaired. For
high-level information about object repair successes and failures (if any), see the other fields in the repair
response.
A FAILED status means that the operation ended prematurely due to errors. For additional status detail see the
other fields in the repair response. For details on any FAILED operation you can also scan cloudian-hyper-
store.log for error messages from the period during which the operation was running. More details can also be
had by running hsstool -h <host> opstatus repair -a on the command line.
A TERMINATED status means that the repair run was terminated by an operator, using repair -stop.
arguments
Value of the command arguments used for the run, if any. The status results use internal system names for the
arguments which may not exactly match the command-line arguments that are defined in a command’s syntax,
but the relationships should be clear. For example, hsstool repair command-line syntax supports a "-pr" option,
and within "arguments" response item the use or non-use of this option is indicated as "primary-range=true" or
"primary-range=false".
operation ID
Globally unique identifier of the repair run. This may be useful if Cloudian Support is helping you troubleshoot
a repair failure.
Note The "cmdno#" (described further above) cannot serve as a globally unique identifier because that
counter resets to zero -- and subsequently starts to increment again -- when the HyperStore Service is
restarted.
start
Start time of the operation.
end, duration
End time and duration of a completed operation.
progress percentage
Of the total work that the operation has identified as needing to be done, the percentage of work that has been
completed so far.
The exception is if the "-pr" option was used when the hsstool repair operation was executed, in which case the
repair operation addresses only the target node’s "primary ranges". In this case the "total range count" value
will equal the number of tokens assigned to the node.
Note For repair status detail for each token range, run hsstool -h <host> opstatus repair -a. This
detailed, per-range status information can be helpful if you are working with Cloudian Support to
troubleshoot repair problems.
For information about such failures, on the target node for the repair you can scan /var/log/cloudian/cloudian-
hyperstore-repair.log for the time period during which the repair operation was running. Running hsstool -h
<host> opstatus repair -a on the command line will also provide useful details about repair failures.
keyspace count
Number of Cassandra keyspaces repaired. With the default repair behavior this will equal the number of stor-
age policies that are in your HyperStore system. There is one Cassandra UserData_<policyid> keyspace for
each storage policy. This is where object metadata is stored.
repair file count
Of all the replica files evaluated by the repair operation, this is the number of files that were determined to be in
need of repair. This figure may include files on other nodes as well as files on the target repair node. For
example, if an object is correct on the target node but one of the object’s replicas on a different node is missing
and needs repair, then that counts as one toward the repair file count. For a second example, if two of an
object’s three replicas are found to be outdated, that counts as two toward the "repair file count".
failed count
Of the files that were found to be in need of repair, the number of files for which the attempted repair failed. For
information about such failures, on the target node for the repair you can scan /var/log/cloudian/cloudian-hyper-
store-repair.log for the time period during which the repair operation was running. Running hsstool -h <host>
opstatus repair -a on the command line will also provide useful details about repair failures.
If possible, files for which the repair attempt fails are added to the proactive repair queue (see "pr queued
count" below).
The "repaired count" plus the "failed count" plus the "pr queued count" should equal the "repair file count".
Note One thing that can increment the "failed count" is if the operation entails writing data to a disk that
is in a stop-write condition (which by default occurs when a disk is 90% full). Such write attempts will
fail.
repaired count
Of the files that were found to be in need of repair, the number of files for which the repair succeeded. The
"repaired count" plus the "failed count" plus the "pr queued count" should equal the "repair file count".
pr queued count
The number of files that the hsstool repair operation adds to the proactive repair queue, to be fixed by the next
proactive repair run. If the hsstool repair operation fails to repair a file that requires repair, it adds the file to the
proactive repair queue. Proactive repair is a different type of repair operation and may succeed in cases where
regular hsstool repair failed. By default proactive repair runs every 60 minutes.
The "repaired count" plus the "failed count" plus the "pr queued count" should equal the "repair file count".
completed count
The total number of files that were assessed to see if they were in need of repair. This number reflects rep-
lication across the cluster — for example, if an object is supposed to be replicated three times (with one replica
on the target repair node and two replicas on other nodes), then repair assessment of that object counts as
three files toward the "completed count".
Note that "completed" here does not necessarily mean that all object repair attempts succeeded. For more
information on success or failure of object repair attempts, see the other status metrics.
scan time
Total time in seconds that it took to scan the file systems and build the Merkle Tree that is used to detect dis-
crepancies in object replicas across nodes.
stream time
Total time in seconds that was spent streaming replicas across nodes, in order to implement needed repairs.
Options
You can only specify one -keyspace option per hsstool repaircassandra run. Specifying multiple keyspaces is
not supported. Note again that the default behavior of hsstool repaircassandra -- if you omit the -keyspace
option -- is to repair all of the Cassandra keyspaces.
Example of using the -keyspace option together with a target column family name:
Note The -keyspace option is supported only on the command line, not in the CMC UI.
If each node in the cluster is being repaired in succession, using this option makes the successive repair oper-
ations less duplicative and more efficient.
If you use neither the -local option nor the -dc <dcname> option, then Cassandra data in all of your HyperStore
system's data centers will be repaired.
Note The -local option is supported only on the command line, not in the CMC UI.
-dc <dcname> Repair only the nodes in the specified data center
(Optional) Only repair data on nodes in the specified data center. For example if you specify -dc boston then
only the Cassandra data on the nodes in the data center named boston will be repaired. The <dcname> must
be a valid data center name from your HyperStore system configuration.
If you use neither the -local option nor the -dc <dcname> option, then Cassandra data in all of your HyperStore
system's data centers will be repaired.
Note The -dc <dcname> option is supported only on the command line, not in the CMC UI.
If you need to terminate an in-progress Cassandra repair that was initiated via the native Cassandra utility
nodetool, use hsstool -h <host> repaircassandra -stop enforce. Note that using nodetool to initiate a Cas-
sandra repair is not recommended.
Note In the CMC UI the -stop enforce option is called "force stop".
The hsstool repaircassandra command does not support a 'resume' option. If you stop an in-progress Cas-
sandra repair, to do the repair again use the hsstool repaircassandra command again and the repair will start
over.
Note The -stop option stops a single in-progress Cassandra repair on a single node. It does not dis-
able the HyperStore scheduled auto-repair feature.
Use this hsstool command to repair only the Cassandra data on a node (the system metadata and object
metadata stored in Cassandra) and not the object data on the node. Under normal circumstances you should
not need to use this command, but you might use it when in a troubleshooting or recovery situation.
IMPORTANT ! If you do need to initiate a Cassandra-only repair (with no repair of the object data in
the HyperStore File System), use this command rather than using the native Cassandra utility nodetool
to initiate the repair. Using hsstool repaircassandra has multiple advantages over using nodetool
repair, including that with hsstool repaircassandra you can track the repair's progress with hsstool
opstatus and you can stop the repair if you need to for some reason.
As an alternative to running hsstool repaircassandra on the command line, you can run it through the CMC UI:
Note As is shown in the CMC interface for hsstool repaircassandra and in the status response when
you run the command, the command applies a "ranges=true" argument by default (the status response
includes "ranges=true" in the list of arguments). With this method of Cassandra repair, each impacted
token range is repaired one range at a time, sequentially. This approach improves the performance for
Cassandra repair. Prior to HyperStore release 7.1.5 using this method was optional, but starting with
release 7.1.5 it became the default repair behavior.
Options
Note If you wish, you can have some or all of the scheduled auto-repairs of erasure coded data use
the "-computedigest" option to combat bit rot. This aspect of auto-repair is controlled by the "auto_
repair_computedigest_run_number" (page 424) setting in common.csv. By default "-computedigest"
is not used in auto-repair runs.
If you are performing the disk repair after an "Add Node" operation has successfully completed and before
rebalance has completed for all the new nodes, use the -reb option together with -d <mountpoint> -- for
example, hsstool -h localhost repairec -reb -d /cloudian1. In all other circumstances, omit the -reb option.
IMPORTANT ! Do not perform more than one mountpoint-specific repair at a time within a Hyper-
Store service region, even if the target disks are in different data centers. Repair just one disk at a
time, within a service region. Wait until repair of one disk is completed before repairing any other disk.
Repairing multiple disks concurrently can potentially result in data loss.
Note
* If you use the -d <mountpoint> option do not use the -range <start-token,end-token> option or the -f
<input-file> option. The system does not support combining mountpoint repair with those other repair options.
* The HyperStore replaceDisk function (see "Replacing a HyperStore Data Disk" (page 291)) auto-
matically invokes the hsstool repairec -d <mountpoint> command, after automatically performing other
tasks required for bringing a replacement disk back into service. If you use the replaceDisk function
after adding nodes to your cluster and before rebalancing has been completed, the function auto-
matically invokes the hsstool repairec -d <mountpoint> command.
The end-token must be a token that is assigned to the target host. To see what tokens are assigned to each
node you can use the "hsstool ring" (page 378) command. The start-token must be the next-lower token in
the ring from the end-token. Put differently, the start-token is the token that forms the lower boundary of the
vNode that's identified by the end-token.
This option may be useful if a previous full node repair reported failures.
-f <input-file> [-opid <operationId>] Repair only the objects listed in an input file
(Optional) See "Repairing Objects Specified in a File" (page 367).
Note The CMC does not support the -f <input-file> [-opid <operationId>] option. This option is only sup-
ported if you run hsstool repairec on the command line.
The log is named cloudian-hyperstore-repair.log and is written into the Cloudian HyperStore log directory of
the target node.
Failures to repair particular erasure coded object chunks are logged in a separate log file, named cloudian-
hyperstore-repair-failure.log.
For more information about these logs see "HyperStore Service Logs" (page 546).
Note If the repair that you want to stop is a rebuild of the erasure coded data on a particular disk or a
repair of a specified token range -- a repair launched as hsstool -h <host> repairec -d <mountpoint> or
hsstool -h <host> repairec -range <start-token,end-token> -- you can stop it with simply hsstool -h
<host> repairec -stop. Do not include the -d <mountpoint> option or the -range <start-token,end-token>
option when executing the stop command.
Note The -stop option stops a single in-progress repair. It does not disable the HyperStore scheduled
auto-repair feature.
Note In a large cluster with high data volume hsstool repairec is a long-running operation that may
take multiple weeks to complete.
Use this hsstool command to evaluate and repair erasure coded object data. When you run hsstool repairec
on a target node, the scope of the repair depends on the storage policy or policies that you are using in your
system:
l For an erasure coding storage policy confined to a single data center, the hsstool repairec operation
repairs all erasure coded data on all nodes in the data center in which the target node resides. So, to
repair all the data associated with this policy you only need to run hsstool repairec on any one node in
the data center.
l For a distributed erasure coding storage policy spanning multiple data centers, the hsstool repairec
operation repairs all erasure coded data on all nodes in all of the data centers included in the storage
policy. To repair all the data associated with this type of storage policy you only need to run hsstool
repairec on one node in any one of the data centers included in the storage policy.
l For a replicated erasure coding storage policy spanning multiple data centers, the hsstool repairec
operation repairs all erasure coded data on all nodes in the data center in which the target node
resides. To repair all the data associated with this type of storage policy you must run hsstool repairec
on one node in each of the data centers included in the storage policy.
The repair process entails replacing erasure coded object fragments that are missing, outdated, or corrupted.
Replacement of a missing or bad object fragment is implemented by using the object's good fragments to
decode the object, re-encoding the object, and then re-writing the missing or bad fragment to the correct end-
point node. To repair an erasure coded object in this manner, there must be at least "k" good fragments present
for the object within the system.
l On any node if hsstool repairec is already running on a different node in the same service region.
l On any node if there is a disabled disk on any node in the same service region.
l On a node on which hsstool cleanupec is currently running.
If the HyperStore Service or Cassandra Service is down on a node in the same data center as the target node,
the system does allow you to run hsstool repairec but the repair will skip all objects that have a fragment on the
node on which the HyperStore Service or Cassandra Service is down. Those objects will not be added to the
proactive repair queue -- instead, those objects will remain un-repaired until you subsequently run hsstool
repairec again with the HyperStore Service and Cassandra Service both up on that node.
The command also supports options for repairing just a single disk (mountpoint) or just a single token range.
IMPORTANT ! Do not perform more than one mountpoint-specific repair at a time within a Hyper-
Store service region, even if the target disks are in different data centers. Instead, when using the -d
<mountpoint> option repair just one disk at a time. Wait until repair of one disk is completed before
repairing any other disk. Repairing multiple disks concurrently can potentially result in data loss.
The HyperStore system automatically uses a combination of read repair, proactive repair, and scheduled
auto-repair to keep the erasure coded data on each node complete and current. Consequently, you should
rarely need to manually initiate a repairec operation.
However, if you use erasure coding in your system, there are these uncommon circumstances when you
should manually initiate a repairec operation:
l If you are removing a "dead" node from your cluster. In this circumstance, after removing the dead node
you will run repairec on one node in each of your data centers. See "Removing a Node" (page 236) for
details.
1. Monitor the automatic proactive repair that initiates on the node when the node starts up, until it
completes. You can check the CMC's Repair Status page (Cluster -> Repair Status) periodically to see whether proactive repair is still running on the node that you've brought back online. This proactive repair will repair the objects from the period when proactive repair metadata was still being written to Cassandra for the node.
2. After proactive repair on the node completes, manually initiate a full repair of the node (using
hsstool repairec and hsstool repair). This will repair objects that were written after the proactive
repair queueing time maximum was reached.
You can also run the hsstool repairec command through the CMC UI:
Note If you launch the operation through the CMC UI, you can track the operation progress through the
CMC's Operation Status page (Cluster -> Operation Status). This way of tracking operation progress
is not supported if you launch the operation on the command line. However, regardless of how you
launch the operation you can periodically check on its progress by using the hsstool opstatus com-
mand.
When run on the command line, the hsstool repairec command supports an option for repairing one or more
specific objects that are listed in an input file:
For <input-file>, specify the full absolute path to the input file (including the file name). Both the input file and
the directory in which it is located must be readable by the 'cloudian' user or else the repair command will
immediately fail and report an error.
The target <host> can be any node that is utilized by the erasure coding storage policies used by the objects
that are listed in the input file.
You have two options for obtaining or creating the input file that lists the objects to repair:
l Use the dedicated repair failure log file that the system generates automatically when hsstool repairec
is run and repair fails for some objects.
l Create the input file yourself, to target a specific object or small set of objects.
On each node, the location and name of the current repair failure log is:
/var/log/cloudian/cloudian-hyperstore-repair-failure.log
ChunkName|Path|OperationId|yyyy-mm-dd HH:mm:ss,SSS|FailReason[|TaskType]
Note For more information about this log, including sample entries and log rotation policy, see "Hyper-
Store Service Logs" (page 546).
Within the log entries, the OperationId field indicates a system-generated unique ID for the hsstool repairec run
that resulted in the chunk repair failure.
You can use the repair failure log file as input to the hsstool repairec -f <input-file> command, and if you wish
you can use the -opid <operationId> option to limit the repair to the chunk repair failures that resulted from a
particular hsstool repairec run. If you do not use the -opid <operationId> option, then the repair will be per-
formed for all chunks listed in the log file.
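Because the log's fields are pipe-delimited, you can also extract just the ChunkName|Path pairs for a single run yourself -- useful if you want to inspect or trim the list before repairing. The sketch below uses fabricated log entries and a hypothetical operation ID:

```shell
# Fabricated failure-log entries (format: ChunkName|Path|OperationId|timestamp|FailReason)
log=/tmp/failure-log-sample.log
cat > "$log" <<'EOF'
bucket1/objA|/cloudian1/ec/pol1/aa/11/chunkA|OP-111|2024-01-01 10:00:00,000|IOException
bucket1/objB|/cloudian1/ec/pol1/bb/22/chunkB|OP-222|2024-01-01 10:01:00,000|Timeout
bucket1/objC|/cloudian1/ec/pol1/cc/33/chunkC|OP-111|2024-01-01 10:02:00,000|IOException
EOF

# Keep only the ChunkName|Path pairs for one run (operation ID OP-111 is hypothetical)
awk -F'|' -v opid="OP-111" '$3 == opid { print $1 "|" $2 }' "$log" > /tmp/repair-input.txt
cat /tmp/repair-input.txt
```

The resulting file is in the ChunkName|Path format that hsstool repairec -f <input-file> expects.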
Here is an example of the command syntax, including use of the -opid option:
Note that:
l Run hsstool repairec -f <input-file> [-opid <operationId>] separately for each input file. Wait for the repairec run to complete for one file before moving
on to run it again for the next file (do not run multiple repairec operations concurrently).
l Rotated cloudian-hyperstore-repair-failure.log files are Gzipped. You must unzip them before using
them as input to an hsstool repairec -f <input-file> [-opid <operationId>] run.
l The -opid <operationId> option is supported only if the repair failure log is the input file. The -opid <oper-
ationId>option is not supported if you use an input file that you've created (as described below).
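Decompressing a rotated log for use as an input file can be done with gunzip -c, which writes a plain-text copy and leaves the .gz file in place. The gzipped file below is a stand-in created for illustration:

```shell
# Create a stand-in for a rotated, gzipped failure log
printf 'bucket1/obj|/cloudian1/ec/pol1/dd/44/chunkD|OP-333|2024-01-01 11:00:00,000|IOException\n' \
    | gzip > /tmp/failure.log.1.gz

# Decompress to a plain-text copy that can be passed to repairec -f
gunzip -c /tmp/failure.log.1.gz > /tmp/failure.log.1
head -1 /tmp/failure.log.1
```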
In the context of erasure coding storage policies, after objects are broken into chunks, each object chunk is
erasure coded.
The hsstool repairec -f <input-file> operation acts on individual chunks, and so in the input file each line must
specify a chunk name and the full path to the chunk file, in the following format:
ChunkName|Path
There is no limit on the number of lines that you can include in the file (no limit on the number of chunks that
you can specify in the file).
Here is an example in which the object is smaller than the chunk size threshold and so the object is stored as
just one chunk. In this case the chunk name is simply in the form of <bucketname>/<objectname>. (For
background information about chunk file paths within the HyperStore File System see "HyperStore Service and
the HSFS" (page 38).)
bucket1/HyperFileInstallGuide_v-3.6.pdf|/cloudian1/ec/std8ZdRJDskcPvmOg4/
0bb5332b429ccb76466e05bee2915d34/074/156/90721763863541208072539249099911078458.
1554130786616795163-0A3232C9
Note Although the example above and those that follow below break to multiple lines in this
documentation, in the actual input file each ChunkName|Path combination constitutes just one line in the
file.
Here is a second example, for an object that was uploaded by a regular S3 PUT operation (not a Multipart
Upload) but which is larger than the chunk size threshold and so has been broken into multiple chunks. The
example shows an input file entry for one of those chunks. Note that the chunk name here includes a chunk
number suffix (shown in bold).
bucket1/HyperFileAdminGuide_v-3.6.pdf..0001|/cloudian1/ec/std8ZdRJDskcPvmOg4/
0bb5332b429ccb76466e05bee2915d34/087/073/13080395222414127681573583484873262519.
1554124662019670529-0A3232C9
In this third example, the object has been uploaded via an S3 Multipart Upload operation. In this example that
specifies one of the object's chunks, the chunk name includes a prefix based on the upload ID, as well as a
chunk number suffix and part number suffix (all shown in bold).
m-MDA1NTE5NjExNTU0MTIzODU3Mjg1/bucket1/cloudian-hyperfile_v-3.6.tar.gz..0001.2|
/cloudian1/ec/std8ZdRJDskcPvmOg4/0bb5332b429ccb76466e05bee2915d34/082/100/
39535768889303436494640495599026926454.1554123857285865726-0A3232C9
You have two options for obtaining the chunk name and chunk path for chunks that you want to target for
repair. One option is to use the hsstool whereis command for each object that you want to repair. The whereis
response includes the chunk name(s) and chunk path(s) for the object that you specify. You can copy the
chunk name and path from the whereis response into your input file.
Note The whereis response has information for each erasure coded fragment, on each node on which
the fragments reside. For a given object chunk, each erasure coded fragment has the same chunk
name and chunk path, so you can get this information from any of the fragments, regardless of which
node a particular fragment is stored on.
The other option is to copy one or more entries from the log file cloudian-hyperstore-repair-failure.log
(described in "Using the Repair Failure Log as the Input File" (page 368)) into a separate file, and use that
separate file as the input file. You could do this if you wanted to target a particular object's chunks for another
repair attempt, rather than targeting all failures from a preceding repair run.
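That copy step can be sketched with grep and cut, since only the first two pipe-delimited fields are needed in the input file (the object name here is a hypothetical example):

```shell
# Sketch: copy the ChunkName|Path fields for one object's chunks out of the
# failure log into a separate input file. "bucket1/Guide.pdf" is a hypothetical
# example; chunks of that object may carry suffixes such as ..0001, which the
# prefix match also catches. (For multipart-uploaded objects the chunk name
# carries an upload-ID prefix before the bucket name, so match on that instead.)
grep '^bucket1/Guide\.pdf' /var/log/cloudian/cloudian-hyperstore-repair-failure.log \
    | cut -d'|' -f1,2 > /tmp/one-object-input.txt
```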
Note In the example above there is very little data in the system and so the operation completes
almost instantly. In a real-world environment this is a long-running operation and the command
response will not return until the operation completes. In the meantime you can track operation
progress as described in "CMC Support for This Command" (page 367).
To get more detailed metrics about repair failures -- in the event that the "failed count" in the repairec
response is non-zero -- you can run hsstool -h <host> opstatus repairec -a (the -a is the verbose output flag) on
the same node on which you ran the repairec operation. This will return the same status metrics that are
returned by repairec and opstatus repairec, followed by a categorization of repair failures (if any) into various
failure types. The response excerpt below is for an operation in which no failures occurred.
...
Reason: CONNECTION_ERROR Count: 0
Reason: UNKNOWN Count: 0
Reason: CREATE_TASK_FAILED Count: 0
Reason: EC_DECODE_FAILED Count: 0
Reason: EC_ENCODE_FAILED Count: 0
Reason: REPAIR_TASK_EXPIRED Count: 0
Reason: REPAIR_CYCLE_EXCEEDED Count: 0
Reason: REPAIR_MESSAGE_REQUEST_FAILED Count: 0
Reason: CASSANDRA_CHECK_FAILED Count: 0
Reason: NODE_DOWN Count: 0
These same metrics for failure types are logged in cloudian-hyperstore.log, as part of the standard logging of
repairec operations.
Also, the log file cloudian-hyperstore-repair.log records an entry for each object that the repairec operation was
able to repair; and cloudian-hyperstore-repair-failure.log records an entry for each object that the repairec
operation was unable to repair.
For more information about these logs see "HyperStore Service Logs" (page 546).
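A quick tally of failure reasons can also be pulled straight from the failure log itself, assuming the pipe-delimited entry format shown earlier (FailReason is the fifth field):

```shell
# Sketch: count repair failures by FailReason (field 5), most frequent first.
cut -d'|' -f5 /var/log/cloudian/cloudian-hyperstore-repair-failure.log | sort | uniq -c | sort -rn
```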
Response Items
optype
The type of hsstool operation.
cmdno#
Command number of the run. Each run of a command is assigned a number.
status
Status of the command run: INPROGRESS, COMPLETED, FAILED, or TERMINATED.
A COMPLETED status means only that the operation did not error out and prematurely end. It does not mean
that the operation succeeded in respect to every object checked by the operation. For example in the case of
repair, a COMPLETED status means that all objects in the scope of the operation were checked to see if they
needed repair. It does not mean that all objects determined to need repair were successfully repaired. For
high-level information about object repair successes and failures (if any), see the other fields in the repairec
response.
A FAILED status means that the operation ended prematurely due to errors. For additional status detail see the
other fields in the repairec response. For details on any FAILED operation you can also scan
cloudian-hyperstore-repair-failure.log for error messages from the period during which the repairec operation was running.
A TERMINATED status means that the repair run was terminated by an operator, using repairec -stop.
arguments
Value of the command arguments used for the run, if any. The status results use internal system names for the
arguments which may not exactly match the command-line arguments that are defined in a command’s syntax,
but the relationships should be clear. For example, hsstool repair command-line syntax supports a "-pr" option,
and within the "arguments" response item the use or non-use of this option is indicated as "primary-range=true" or
"primary-range=false".
operation ID
Globally unique identifier of the repairec run. This may be useful if Cloudian Support is helping you
troubleshoot a repair failure.
Note The "cmdno#" (described further above) cannot serve as a globally unique identifier because that
counter resets to zero -- and subsequently starts to increment again -- when the HyperStore Service is
restarted.
start
Start time of the operation.
end, duration
End time and duration of a completed operation.
progress percentage
Of the total work that the operation has identified as needing to be done, the approximate percentage of work
that has been completed so far.
time remaining
Estimated time remaining to complete the operation.
total ranges
The total number of token ranges for which data is being evaluated to determine if it needs repair.
task count
In the HyperStore File System, objects are stored as "chunks". Objects smaller than or equal to the chunk size
threshold (10MB) are stored as a single chunk. Objects larger than the chunk size threshold are broken into
and stored as multiple chunks, with no chunk exceeding the threshold in size. In the case of large objects that
S3 client applications upload to HyperStore by the Multipart Upload method, HyperStore breaks the individual
parts into chunks if the parts exceed the chunk size threshold. In the context of erasure coding storage policies,
after objects (or object parts) are broken into chunks, each object chunk is erasure coded.
In the repairec operation, the evaluation of a single chunk -- to determine whether all of its erasure coded
fragments are present on the nodes on which they should be stored -- constitutes a single "task". For example, the
evaluation of a 100MB object that has been broken into 10 chunks -- each of which has been erasure coded
using a 4+2 erasure coding scheme -- would count as 10 "tasks", with one task per chunk.
The "task count" metric, then, is the total number of chunks that are being evaluated to determine whether any
of them are in need of repair.
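The per-object task arithmetic is simple ceiling division of object size by the chunk size threshold. A sketch using the 10MB default threshold and the 100MB object from the example above:

```shell
# Sketch: number of chunks (and therefore repairec "tasks") for one object,
# with the 10MB default chunk size threshold and a hypothetical 100MB object.
object_mb=100
chunk_mb=10
tasks=$(( (object_mb + chunk_mb - 1) / chunk_mb ))   # ceiling division
echo "$tasks tasks (one per chunk)"
```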
completed count
The number of repair tasks that the repairec operation has completed so far (for the
definition of a "task" see the description of "task count" above). A "completed" task means that an erasure
coded object chunk was evaluated and, if it needed repair, an attempt was made to repair it.
A completed task has one of three possible results: a successful repair, a failed repair attempt, or the
determination that the chunk does not need repair (i.e., all of the chunk's fragments are in the proper locations within
the cluster). These results are tallied by other repairec response metrics:
l "repaired count" -- The number of erasure coded object chunks for which a repair was found to be
necessary and was successfully executed.
l "failed count" -- The number of erasure coded object chunks for which a repair was found to be
necessary, but the repair attempt failed.
l "skipped count" -- The number of erasure coded object chunks that were evaluated and determined not
to need repair.
The "repaired count", "failed count", and "skipped count" should add up to equal the "completed count".
Note Notice from the descriptions above that a failed repair attempt counts as a "completed" task. In
other words, "completed" in this context does not necessarily mean success. It means only that the
repairec operation has finished its processing of that chunk, resulting in one of the three outcomes
described above.
Note If the "completed count" is less than the "task count" this means that the repair was interrupted in
such a way that some erasure coded object chunks were identified by a scan of object metadata in
Cassandra (and thus counted toward the "task count") but were not yet evaluated or repaired.
timer
This shows detailed timing metrics for various parts of the repairec operation. These metrics may be useful if
Cloudian Support is working with you to troubleshoot repairec performance issues in your environment.
Note that even more detailed timing information for a completed repairec operation is available in the
HyperStore Service application log. The timing metrics lines in the log are preceded by a line that says
"Performance meters". For example, here is a log excerpt showing some of the timing detail (this is from a repairec
run in which there was no data to repair and so the timings are "0"):
Performance meters:
Timer name: RocksDB.digests.per.disk.timer, event count: 0, mean rate: 0.0,
recent 15 min rate: 0.0, mean duration: 0.0, median duration: 0.0,
75% events average duration: 0.0, 99% events average duration: 0.0.
Rate unit: events/s, Duration unit: milliseconds.
Timer name: cassandra.iterating.timer, event count: 0, mean rate: 0.0,
recent 15 min rate: 0.0, mean duration: 0.0, median duration: 0.0,
...
...
Options
If you use hsstool -h <host> repairqueue -enable false to disable the auto-repair feature, this applies to all
nodes in the service region of the specified host. So, it doesn't matter which host you specify in the command
as long as it's in the right service region. Likewise hsstool -h <host> repairqueue -enable true re-enables the
auto-repair feature for all nodes in the service region.
If you do not use the optional -t flag (described below) to specify an auto-repair type, then the disabling or re-
enabling applies to all auto-repair types and also to the proactive repair feature. (If you want to disable or
re-enable only proactive repair without impacting the scheduled auto-repair feature see "hsstool
proactiverepairq" (page 340).)
Note that disabling the auto-repair feature does not abort in-progress auto-repairs. Rather, it prevents any
additional scheduled auto-repairs from launching. (For information about stopping in-progress repairs, see
"hsstool repair" (page 352) and "hsstool repairec" (page 363)).
IMPORTANT ! The scheduled auto-repair feature is important for maintaining data integrity in your sys-
tem. Do not leave it disabled permanently.
Note In the CMC UI the enable/disable option is presented as part of a Maintenance -> autorepair
command rather than the Info -> repairqueue command.
l In combination with the -enable true|false option, you can use the -t option to disable or re-enable just a
particular type of auto-repair. For example, use hsstool -h <host> repairqueue -enable false -t ec to
disable auto-repairs of erasure coded object data. In this example auto-repairs would continue to be
enabled for replicated object data and for Cassandra metadata.
Note If you do not use the optional -t flag to specify an auto-repair type, then the disabling or re-
enabling applies to all auto-repair types and also to the proactive repair feature. (If you want
to disable or re-enable only proactive repair without impacting the scheduled auto-repair feature
see "hsstool proactiverepairq" (page 340).)
l Without the -enable true|false option, you can use the -t option to retrieve scheduling information for just
a particular type of scheduled auto-repair. For example, use hsstool -h <host> repairqueue -t replicas to
retrieve scheduling information for auto-repairs of replicated object data. Using hsstool -h <host>
repairqueue by itself with no -t flag will retrieve scheduling information for all auto-repair types.
If you use the -enable true|false -t cassandra option without the -inc option then your enabling or disabling
action applies to both types of Cassandra auto-repair (incremental and full).
Note The -inc option is only supported on the command line, not in the CMC.
The HyperStore "auto-repair" feature implements a periodic automatic repair of replicated object data, erasure
coded object data, and Cassandra metadata on each node in your system. With the hsstool repairqueue
command you can:
l Check on the upcoming auto-repair schedule as well as the status from the most recent auto-repair
runs.
l Temporarily disable auto-repair for a particular repair type or all types.
l Re-enable auto-repair, if it has previously been disabled by the hsstool repairqueue command.
For background information on the auto-repair feature, see "Automated Data Repair Feature Overview"
(page 252).
Note You cannot enable or disable auto-repair for just one particular node — the auto-repair feature is
either enabled or disabled for the cluster as a whole. Therefore when running the command the target
<host> can be any node.
With the hsstool repairqueue command you can disable (and subsequently re-enable) the HyperStore auto-
repair feature. You should disable auto-repair before performing certain cluster operations.
IMPORTANT ! If you disable auto-repair in order to perform an operation, be sure to re-enable it after-
ward.
Note The system automatically disables the auto-repair feature when you upgrade your HyperStore
software version or when you add nodes to your cluster; and the system automatically re-enables auto-
repair after these operations are completed. You do not need to use hsstool repairqueue when you
perform those operations.
In the CMC UI you can run the hsstool repairqueue command's auto-repair queue status reporting function
through this interface:
The function for disabling and re-enabling auto-repair has its own separate CMC interface and the command is
there renamed as "autorepair" (although hsstool repairqueue is being invoked behind the scenes):
Note If you use this command to disable or re-enable auto-repair and you do not specify a repair type,
then the disabling or re-enabling applies also to the "proactive repair" feature. (If you want to disable or
re-enable only proactive repair without impacting the scheduled auto-repair feature see hsstool
proactiverepairq.)
The first repairqueue example below shows the "replicas" auto-repair queue status for a recently installed six-
node cluster.
Note The response attributes for the "ec" and "cassandra" auto-repair queues would be the same as
for the "replicas" queue ("next repair at", "last repair status", and so on) -- except that for the "cassandra"
repair queue the response also includes a "repairScope" attribute which distinguishes between
"INCREMENTAL" (for Cassandra incremental repairs) and "DEFAULT" (for Cassandra full repairs).
The next example command disables "replicas" auto-repair. Note that this disables replicated object data auto-
repair for the whole cluster. It does not matter which node you submit the command to.
Response Items
When you use the repairqueue command to retrieve auto-repair queue information, the command results have
three sections — one for each repair type. Each section consists of the following items:
Queue
Auto-repair type — either "replicas", "ec", or "cassandra"
Auto-repair
Enabled or disabled
#endpoints
Number of nodes in the cluster. Each node is separately scheduled for repair, for each repair type.
<Queue position>
This is an integer that indicates the position of this node within the cluster-wide queue for auto-repairs of this
type. The node at the head of the queue has queue position "1" and is listed first in the command results.
endpoint
IP address of a node
next repair at
For each repair type, each node’s next repair at value is determined by adding the configurable auto-repair
interval for that repair type to the start-time of the last repair of that type done on that node. It’s important to note
that next repair at values are used to order the cluster-wide queues for each repair type, but the next repair of
that type on that node won’t necessarily start at that exact time. This is because the queue processing logic
takes into account several other considerations along with the scheduled repair time.
Note For erasure coded (EC) object repair, the "next repair at" values are not relevant. Ignore these
values. This is because auto-repair for erasure coded objects is run against just one randomly selected
target host in each data center each 29-day auto-repair period (and this results in repair of all EC objects
in the whole data center).
If a node restart interrupts a repair, that repair job is considered FAILED and it goes to the head of the queue.
interval
The configurable interval at which this type of repair is automatically initiated on each node, in number of
minutes. (Note, though, the qualifiers described in the "next repair at" entry above.)
count
The number of repairs of this type that have been executed on this node since HyperStore was installed on the
node.
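The scheduling arithmetic described under "next repair at" can be sketched as the last repair start time plus the configured interval (all values here are hypothetical):

```shell
# Sketch: "next repair at" = start time of the last repair of this type on this
# node, plus the configured interval in minutes (both values hypothetical).
last_start_epoch=1714528800   # last repair start, as a Unix timestamp
interval_min=5040             # configured auto-repair interval, in minutes
next_epoch=$(( last_start_epoch + interval_min * 60 ))
echo "$next_epoch"            # convert with e.g. `date -u -d @$next_epoch` on GNU systems
```

As the "next repair at" description notes, this is the queue-ordering time, not a guaranteed start time.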
This hsstool command provides status information for each of the dozens or hundreds of virtual nodes
(vNodes) in your storage cluster. It is very granular and verbose. In most circumstances you will find more
value in using the hsstool info or hsstool status commands rather than ring.
Since this command retrieves information for the whole cluster, the target <host> can be any node.
As an alternative to running hsstool ring on the command line, you can run it through the CMC UI:
The ring command results display a status line for each virtual node (vNode) in the storage cluster. In a typical
cluster the ring command may return hundreds of lines of information. The returned information is sorted by
ascending vNode token number.
The example below is an excerpt from a ring command response for a four node HyperStore system that spans
two data centers. Each of the four physical nodes has 32 vNodes, so the full response has 128 data lines. The
list is sorted by ascending vNode token number. Note that although the command is submitted to a particular
node ("cloudian-node1"), it returns information for the whole cluster. It doesn’t matter which node you submit
the command to.
Response Items
Address
IP address of the physical node on which the vNode resides.
DC
Data center in which the vNode resides.
Rack
Rack in which the vNode resides.
Cassandra
Cassandra Service status of the vNode. Will be one of: "Up", "Down", "Joining" (in the process of joining the
cluster), "Leaving" (in the process of decommissioning or being removed from the cluster), or "?" (physical host
cannot be reached). All vNodes on a physical node will have the same Cassandra status.
Cassandra-Load
Cassandra load (quantity of data stored in Cassandra) for the physical host on which the vNode resides. There
will be some Cassandra load even if all S3 objects are stored in the HyperStore File System or the erasure
coding file system. For example, Cassandra is used for storage of object metadata and service usage data, among
other things. Note that Cassandra load information is available only for the physical node as a whole; it is not
available on a per-vNode basis.
HSS
HyperStore Service status for the vNode. Will be one of: "Up", "Down", or "?" (physical host cannot be reached).
All vNodes on a physical node will have the same HSS status.
State
HyperStore Service state for the vNode. Will be one of: "Normal" or "Decommissioning". All vNodes on a given
physical node will have the same HSS state.
Token
The vNode’s token (from an integer token space ranging from 0 to 2^127 - 1). This token is the top of the token
range that constitutes the vNode. Each vNode's token range spans from the next-lower token (exclusive) in the
cluster up to its own token (inclusive).
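The ownership rule can be sketched with toy token values (real tokens live in the 0 to 2^127 - 1 space):

```shell
# Sketch: with sorted vNode tokens 100, 200, 300, an item whose token is 150
# falls in the range (100, 200] and so is owned by the vNode with token 200.
# An item token above the highest vNode token wraps around to the lowest token.
item=150
owner=$(printf '100\n200\n300\n' | awk -v t="$item" '$1 >= t { print $1; exit }')
echo "owned by vNode token ${owner:-100 (wrapped)}"
```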
This hsstool command returns status information for the storage cluster as a whole.
Since this command retrieves information for the whole cluster, the target <host> can be any node.
As an alternative to running hsstool status on the command line, you can run it through the CMC UI:
The status command example below retrieves the status of a four-node cluster.
Address
IP address of the node.
DC
Data center in which the node resides.
Rack
Rack in which the node resides.
Cassandra
Cassandra Service status of the node. Will be one of: "Up", "Down", "Joining" (in the process of joining the
cluster), "Leaving" (in the process of decommissioning or being removed from the cluster), or "?" (physical host
cannot be reached).
Cassandra-Load
Of the Cassandra data in the system as a whole, the portion (as a decimal) that is stored on each node.
HSS
HyperStore Service status for the node. Will be one of: "Up", "Down", or "?" (physical host cannot be reached).
State
HyperStore Service state for the node. Will be one of: "Normal" or "Decommissioning".
HyperStore-Disk
HyperStore data disk utilization on the node.
Host-ID
System-generated unique ID for the node.
Hostname
Hostname of the node.
Options
If you use the -a flag -- that is, trmap list -a -- then the command returns the IDs of all snapshots -- the active
snapshots and also the disabled snapshots (snapshots for which the associated rebalancing operation has
completed). When you use the -a flag the return includes a status field for each snapshot, to distinguish active
snapshots from disabled snapshots. For example:
This hsstool command returns a list of token range map snapshot IDs along with information about each
snapshot, such as the snapshot creation time. You can also use the command to return the contents of a specified
token range map snapshot.
The system creates a token range map snapshot each time you add a new node to your cluster. The token
range map identifies, for each storage policy in your system, the nodes (endpoints) that store data from each
token range. The data from each token range will be stored on multiple nodes, with the number of nodes
depending on the storage policy (for example, in a 3X replication storage policy each token range would be
mapped to three storage endpoints). When you've added new nodes to your cluster, the system uses token
range maps to manage the rebalancing of S3 object data from existing nodes to the new nodes.
Typically you should not need to use this command unless you are working with Cloudian Support to
troubleshoot a failed attempt to add nodes to your HyperStore cluster or a failed attempt to rebalance the
cluster after adding nodes.
As the target <host> you can specify the hostname or IP address of any node in the cluster. The command
retrieves cluster-wide information that is available from any node that belongs to the cluster.
IMPORTANT ! Do not use the set or delete options unless instructed to do so by Cloudian Support.
As an alternative to running hsstool trmap on the command line, you can run it through the CMC UI:
In this first example a list of active token range map snapshot IDs is retrieved. These are token range map
snapshots for which the associated rebalancing operation has not yet completed.
In this next example, a token range map snapshot is retrieved (this is from a different system and is not one of
the snapshots listed in the first example). The map is in JSON format. The response is very large, and is
truncated below. In practice, if you use this command you should redirect the output to a text file.
"endPointDetails" : [ {
"endpoint" : "10.10.10.114",
"datacenter" : "DC3",
"rack" : "RAC1"
}, {
"endpoint" : "10.10.10.115",
"datacenter" : "DC2",
"rack" : "RAC1"
}, {
"endpoint" : "10.10.10.111",
"datacenter" : "DC1",
"rack" : "RAC1"
} ]
}, {
"left" : 6341660663762096831290541188712444913,
"right" : 6449737778660216877727472971404857143,
"endPointDetails" : [ {
...
...
Response Items
id
System-generated unique identifier of this token range map snapshot.
version
Version of the token range map snapshot. This integer is incremented each time a new snapshot is created.
timestamp
Timestamp indicating when the token range map snapshot was created.
rebalance
Status of the hsstool rebalance operation in regard to this token range map snapshot, such as REQUIRED or
COMPLETED. Each time you add a node to your system (using the CMC's function for adding a node, in the
Data Centers page), the system automatically generates a token range snapshot. After adding a node, you
then must run hsstool rebalance on the new node (which you can do from the CMC's Nodes Advanced page).
The rebalance operation utilizes the token range map snapshot. For complete instructions on adding nodes,
see "Adding Nodes" (page 212).
policies
This marks the beginning of the per-policy token range map information. The token range map will have
separate token range map information for each of your storage policies.
policyId
System-generated unique identifier of a storage policy. Note that this ID appears three times: at the outset of
the policy block, then again as the policyId attribute, then again within the keyspaceName.
keyspaceName
Name of the Cassandra keyspace in which object metadata is stored for this storage policy. The name is in
format UserData_<policyId>.
replicationScheme
Specification of the replication scheme, if this policy block is for a replication storage policy. In the example the
policy's replication scheme calls for one replica in each of three data centers.
ecScheme
Specification of the erasure coding scheme, if this policy block is for an erasure coding storage policy. In the
example, the policy block is for a replication policy so the ecScheme value is null.
ecMap
Content of the token range map for the policy scheme, if this policy block is for an erasure coding storage
policy. In the example, the policy block is for a replication policy so the ecMap value is null.
replicasMap
Content of the token range map for the policy scheme, if this policy block is for a replication storage policy. The
map consists of lists of endpoints per token range.
left
Token at the low end of the token range (exclusive). This is from the consistent hashing space of 0 to 2^127 from
which HyperStore generates tokens for the purpose of allocating data across the cluster.
right
Token at the high end of the token range (inclusive). This is from the consistent hashing space of 0 to 2^127 from
which HyperStore generates tokens for the purpose of allocating data across the cluster.
endPointDetails
Endpoint mapping information for this particular token range, for this storage policy. This is a list of endpoints
(nodes), with each endpoint identified by IP address as well as data center name and rack name. The number
of endpoints per token range will depend on the storage policy scheme. In the example the policy is a 3X
replication policy, so there are three endpoints listed for each token range. Objects for which the object token
(based on a hash of the bucket name / object name combination) falls into this token range will have replicas
placed on each of these nodes.
Options
If the object name has spaces in it, enclose the bucket/object name pair in quotes. For example, "mybucket/big
document.doc".
Note In the CMC UI implementation of this command, you enter the bucket name and the full object
name (including folder path) in separate fields. For example, bucket name mybucket and full object
name Videos/Vacation/Italy_2021-06-27.mpg.
If you use the -a option do not use the <bucket/object> parameter or the -v <version> parameter. Run the
command simply as hsstool -h <host> whereis -a.
Note The CMC does not support the -a option. To use this option you need to use hsstool whereis on
the command line.
This hsstool command returns the current storage location of each replica of a specified S3 object (or in the
case of erasure coded objects, the location of each of the object’s fragments). The command response also
shows the specified object's metadata such as last modified timestamp and object digest.
Since this command returns information from across the cluster, you can specify any node as the target <host>.
As an alternative to running hsstool whereis on the command line, you can run it through the CMC UI:
When you use the whereis -a command, information about all replicas and erasure coded fragments of all
objects in the entire service region is written to a log file. The log file is written on the HyperStore host that you
connect to when you run whereis -a, and by default the log file path is:
/var/log/cloudian/whereis.log
In the whereis.log file, the start and completion of the output from a single run of whereis -a is marked by
"START" and "END" timestamps. Within those timestamps, the output is organized by user. The start of output
for a particular user is marked by "#user:<canonical UID>". This line is then followed by lines for the user’s
buckets and objects, with the same object detail information as described in the whereis command results
documentation above. Users who do not have any buckets will not be included in the log file.
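Because each user's section begins with a "#user:" marker line, a single user's output can be extracted with awk (the canonical UID here is a hypothetical example):

```shell
# Sketch: print one user's section of whereis.log -- from that user's "#user:"
# marker line up to (but not including) the next "#user:" marker.
awk '/^#user:/ { show = ($0 == "#user:abcdef0123456789abcdef") } show' /var/log/cloudian/whereis.log
```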
The output of multiple runs of whereis -a may be written to the same log file, depending on the size of the
output. Because the output of whereis -a may be very large, it’s also possible that the output of a single run may
be spread across multiple log files, if maximum file size is reached and log rotation occurs.
By default this log is rotated if it reaches 10MB in size or at the end of the day, whichever occurs first. The
oldest rotated whereis log file is automatically deleted if it reaches 180 days in age or if the aggregate size of all
rotated whereis log files (after compression) reaches 100MB. These rotation settings are configurable in the
RollingRandomAccessFile name="APP" section of the /etc/cloudian-<version>-puppet/
modules/cloudians3/templates/log4j-hsstool.xml.erb file. For information about changing these settings see
"Log Configuration Settings" (page 564).
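The per-user structure described above lends itself to simple scripted inspection of a whereis -a log. The sketch below assumes only the "START"/"END" and "#user:" markers documented above; the sample log content itself is fabricated for illustration:

```shell
# Create a small fabricated whereis.log sample using the documented markers.
cat > /tmp/whereis.sample <<'EOF'
START 2023-06-27T10:00:00Z
#user:0f1e2d3c4b5a
bucket1/Guide.pdf ...
#user:a1b2c3d4e5f6
bucket2/notes.txt ...
END 2023-06-27T10:05:00Z
EOF

# Print the canonical UID of each user that had output in this run.
sed -n 's/^#user://p' /tmp/whereis.sample
```

Against a real log you would point sed at /var/log/cloudian/whereis.log (or a rotated copy) instead of the sample file.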
The first whereis command example below retrieves location information for an object named "Guide.pdf". This is from a single-node HyperStore system, so there is just one replica of the object. The location detail information for the object replica is truncated in this example.
388
7.1. Command Line Tools
The second whereis command example retrieves location information for an object named "obj1.txt". This is from a two-data-center HyperStore system, using replicated 2+1 erasure coding. The location detail information for each found fragment is truncated in this example.
Note For objects for which the Type is "TRANSITIONED" (auto-tiered), the response will also include a
URL field.
Response Items
Key
Key that uniquely identifies the S3 object, in format <bucketname>/<objectname>. For example, bucket1/Documents/Meetings_2021-06-27.docx.
PolicyID
System-generated identifier of the storage policy that applies to the bucket in which this object is stored.
Version
Object version, if versioning has been used for the object. Versions are identified by timeuuid values in hexadecimal format. If versioning has not been used for the object, the Version field displays "null".
Compression
Type of server-side compression applied to the object, if any. Possible values are NONE, SNAPPY, ZLIB, or
LZ4. The type of compression applied depends on the storage policy used by the bucket. Each storage policy
has its own configuration as to whether compression is used and the compression type.
Create Time
Timestamp for the original creation of the object. Format is ISO 8601 and the time is in Coordinated Universal
Time (UTC).
Last Modified
Timestamp for last modification of the object. Format is ISO 8601 and time is in UTC.
Size
The object’s size in bytes.
Type
One of:
Region
The HyperStore service region in which the object is stored.
URL
http://s3.amazonaws.com/bucket2.mdazyjgxnjyxndu2ody4mji1nty3/notes.txt
In this example, the tiering destination is Amazon S3; the bucket name in the destination system is bucket2.mdazyjgxnjyxndu2ody4mji1nty3 (which is the HyperStore source bucket name, bucket2 in this case, appended with a 28-character random string); and the object name is notes.txt. Note that the URL field will specify the transfer protocol as http, whereas to actually access the object in the destination system the protocol would typically be https.
Location detail (including indicator of whether the replica or fragment is found at the expected location)
For objects stored locally (objects that are not of type TRANSITIONED), the lower part of the response shows the location of each object replica (for replicated objects) or of each erasure coded object fragment (for EC objects).
l For EC objects only, the <key_suffix_digit> at the beginning of each location is a digit that the system
generates and uses to ensure that each fragment goes to a different node.
l The <base62-encoded-vNode-token> is a base-62 encoding of the token belonging to the vNode to
which the object instance or fragment is assigned.
l The <policyid> segment is the unique identifier of the storage policy applied to the bucket in which the
object is stored.
l The two <000-255> segments of the path are based on a hash of the <filename>, normalized to a
255*255 number.
l The <filename> is a dot-separated concatenation of the object's hash token and the object's Last Modified Time timestamp. The timestamp is formatted as <UnixTimeMillis><6digitAtomicCounter>-<nodeIPaddrHex> (the last element is the IP address -- in hexadecimal format -- of the S3 Service node that processed the object upload request). Note: For objects last modified prior to HyperStore version 6.1, the timestamp is simply Unix time in milliseconds. This was the timestamp format used in HyperStore 6.0.x and older.
l If the replica or fragment is found at the expected location, the <size> field shows the size of the replica or fragment. If the replica or fragment is not found at the expected location, the size field shows "-1". A size of "-1" indicates that the replica or fragment is missing from a location where it is supposed to be. If the digest is also missing, the digest will be absent from the location detail entry.
Note For multipart objects (large objects uploaded via the S3 multipart upload method), storage location detail is shown for each part.
7.1.2. cloudianInstall.sh
The cloudianInstall.sh tool (also known as "the installer") serves several purposes.
The cloudianInstall.sh tool is in your installation staging directory on your Configuration Master node. To perform advanced configurations, or to push configuration file changes to the system and restart services, you would launch the tool simply like this, without using additional command line options:
# ./cloudianInstall.sh
Or like this if you want to specify your cluster survey file up front:
# ./cloudianInstall.sh -s survey.csv
The script also supports additional command line options. For example, if you are not using your DNS environment to resolve HyperStore service endpoints, you can have the installer set up the bundled tool dnsmasq instead (which is not appropriate for production systems) by using the configure-dnsmasq option described below. The syntax is as follows:
Note If you use multiple options, on the command line place options that start with a "-" (such as -s
<survey-filename> or -d) before options that do not (such as no-hosts or configure-dnsmasq).
Alternatively, if you are launching the installer from the HyperStore Shell (HSH) command line, run:
$ hspkg install
The installer's options are the same regardless of whether it is launched from the HSH command line or the
OS command line.
Note After using the installer, exit the installer when you're done. Do not leave it running. Certain automated system tasks invoke the installer and cannot do so if it is already running.
l [-s <survey-filename>] — Name of your cluster survey file (including the full path to the file). If you do not
specify the survey file name argument, the script will prompt you for the file name during installation.
l [-k <ssh-private-key-filename>] — The Configuration Master employs SSH for secure communication
with the rest of your HyperStore installation nodes. By default the install script automatically creates an
SSH key pair for this purpose. But if instead you would prefer to use your own existing SSH key pair for
this purpose, you can use the installer's -k <ssh-private-key-filename> option to specify the name of the
private key file (including the full path to the file). When you run the install script it will copy the private
key and corresponding public key to the installation staging directory, and in the staging directory the
key file will be renamed to cloudian-installation-key. Then from the staging directory, the public key file
cloudian-installation-key.pub will be copied to each node on which you are installing HyperStore.
Note This usage information mentions more command line options than are described here in this Help topic. This is because the usage information includes installer options that are meant for HyperStore internal system use, such as options that are invoked by the CMC when you use the CMC to add nodes to your cluster or remove nodes from your cluster. You should perform such operations through the CMC, not directly through the installer. The CMC implements automations and sanity checks beyond what is provided by the install script alone.
l [no-hosts] — Use this option if you do not want the install tool to append entries for each HyperStore
host on to the /etc/hosts file of each of the other HyperStore hosts. By default the tool appends to these
files so that each host is resolvable to the other hosts by way of the /etc/hosts files.
l [configure-dnsmasq] — Use this option if you want the install tool to install and configure dnsmasq, a lightweight utility that can provide domain resolution services for testing a small HyperStore system. If you use this option the installer installs dnsmasq and automatically configures it for resolution of HyperStore service domains. If you did not create DNS entries for HyperStore service domains as described in "DNS Set-Up" (page 573), then you must use the configure-dnsmasq option in order for the system to be functional when you complete installation. Note that using dnsmasq is not appropriate in a production environment.
Note If you do not have the installer install dnsmasq during HyperStore installation, and then later you decide that you do want to use dnsmasq for your already installed and running HyperStore system, do not use the configure-dnsmasq command line option when you re-launch the installer. Instead, re-launch the installer with no options and use the "Installer Advanced Configuration Options" (page 407) menu to enable dnsmasq for your system.
l [no-firewall] — If this option is used, the HyperStore firewall will not be enabled upon HyperStore installation. By default the HyperStore firewall will be enabled upon completion of a fresh HyperStore installation.
l [force] — By default the installer performs certain prerequisite checks on each node on which you are installing HyperStore and aborts the installation if any of your nodes fails a check. By contrast, if you use the force option when you launch the installer, the installer will output warning messages to the terminal if one or more nodes fails a prerequisite check but the installation will continue rather than aborting. The prerequisite checks that this feature applies to are:
o CPU has minimum of 8 cores
o RAM is at least 128GB
o System Architecture is x86 64-bit
o SELinux is disabled
o firewalld is disabled
o iptables is not running
Note If you specify the force option when running the installer, the force option will "stick" and will be used automatically for any subsequent times the installer is run to install additional nodes (such as when you do an "Add Node" operation via the Cloudian Management Console, which invokes the installer in the background). To turn the force option off so that it is no longer automatically used when the installer is run to add more nodes, launch the installer and go to the Advanced Configuration Options. Then choose option t for Configure force behavior and follow the prompts.
Note Even if the force option is used the installer will abort if it detects an error condition on the
host that will prevent successful installation.
l [uninstall] — If you use this option when launching the installer, the installer main menu will include an additional menu item -- "Uninstall Cloudian HyperStore".
Use this menu option only if you want to delete the entire HyperStore system, on all nodes, including any metadata and object data stored in the system. You may want to use this Uninstall Cloudian HyperStore option, for example, after completing a test of HyperStore -- if you do not want to retain the test system.
IMPORTANT! Do not use this option to uninstall a single node from a HyperStore system that you want to retain (such as a live production system).
7.1.3. system_setup.sh
The system_setup.sh tool is for setting up nodes on which you will install HyperStore software, either during initial cluster installation or during cluster expansion. For basic information about using system_setup.sh, change into the installation staging directory and run the following command:
# ./system_setup.sh --help
You can submit commands to the Redis Monitor primary host through the Redis Monitor CLI. A couple of the
more useful commands can also be executed through the CMC.
l To initiate a Redis Monitor CLI session, use netcat to connect to port 9078 on the node on which the
primary Redis Monitor is running:
# nc <redismon_primary_host> 9078
Specify the hostname or IP address (do not use 'localhost'). Once connected, you can then use any of
the Redis Monitor CLI commands listed below. When you're done using Redis Monitor commands,
enter quit to end your Redis Monitor CLI session and then enter <ctrl>-d to end your netcat session and
return to the terminal prompt.
l To access Redis Monitor commands in the CMC, go to the Node Advanced page (Cluster -> Nodes
-> Advanced) and from the "Command Type" drop-down list select "Redis Monitor Operations". Note that only a small number of Redis Monitor commands are available through the CMC (specifically get cluster and set master).
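Redis Monitor commands can also be scripted rather than run in an interactive netcat session, by piping the command into nc. A minimal sketch: the wrapper function name is an assumption, and it appends the quit command described above so the CLI session closes cleanly.

```shell
# Run a single Redis Monitor command non-interactively against port 9078.
redismon_cmd() {
    cmd=$1; host=$2
    printf '%s\nquit\n' "$cmd" | nc "$host" 9078
}

# Usage against a live Redis Monitor primary host, for example:
#   redismon_cmd 'get master redis.credentials' 10.50.20.12
```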
get cluster
Use this command to retrieve basic status information that the Redis Monitor currently has for a specified Redis
DB cluster. The cluster status information includes:
Note For this and all other Redis Monitor commands, the Redis QoS cluster identifier includes the name of the service region in which the cluster resides. This is necessary since in a multi-region HyperStore system each region has its own Redis QoS cluster. By contrast the Redis Credentials cluster, since it is global (extending across all service regions), does not include a region name in its identifier.
Example:
# nc 10.50.20.12 9078
get cluster redis.credentials
OK master: store1(10.50.20.1):6379, monitoring: enabled, notifications: enabled
nodes: [[store1(10.50.20.1):6379,UP,master], [store4(10.50.20.4):6379,UP,slave],
[store5(10.50.20.5):6379,UP,slave]]
clients: [[store1(10.50.20.1):19080,UP,store1], [store2(10.50.20.2):19080,UP,store1],
[store3(10.50.20.3):19080,UP,store1], [store4(10.50.20.4):19080,UP,store1],
[store5(10.50.20.5):19080,UP,store1], [store6(10.50.20.6):19080,UP,store1],
[store1(10.50.20.1):19081,UP,store1], [store2(10.50.20.2):19081,UP,store1],
[store3(10.50.20.3):19081,UP,store1], [store4(10.50.20.4):19081,UP,store1],
[store5(10.50.20.5):19081,UP,store1], [store6(10.50.20.6):19081,UP,store1],
[store1(10.50.20.1):19082,UP,store1], [store2(10.50.20.2):19082,UP,store1],
[store3(10.50.20.3):19082,UP,store1], [store4(10.50.20.4):19082,UP,store1],
[store5(10.50.20.5):19082,UP,store1], [store6(10.50.20.6):19082,UP,store1]]
state: redis.credentials: master= 10.50.20.1 updatetime= Fri Sep 21 16:17:23 PDT 2018
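For scripted health checks, the master endpoint can be pulled out of the first line of a get cluster response with standard text tools. A sketch, assuming the response shape shown in the example above:

```shell
# Sample first line of a "get cluster" response (copied from the example above).
response='OK master: store1(10.50.20.1):6379, monitoring: enabled, notifications: enabled'

# Extract just the master endpoint (everything between "OK master: " and the
# first comma).
master=$(printf '%s\n' "$response" | sed -n 's/^OK master: \([^,]*\),.*/\1/p')
echo "$master"   # prints store1(10.50.20.1):6379
```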
get master
Use this command to retrieve from the Redis Monitor the identity of the current master node within a specified
Redis cluster.
Example:
# nc 10.50.20.12 9078
get master redis.qos.region1
OK store2(10.50.20.2):6380
get nodes
Use this command to retrieve from the Redis Monitor a list of all current members of a specified Redis cluster. The command response also indicates the status (UP/DOWN) and role (master/slave) of each member node.
Example:
# nc 10.50.20.12 9078
get nodes redis.credentials
OK [[[store1(10.50.20.1):6379,UP,master], [store4(10.50.20.4):6379,UP,slave],
[store5(10.50.20.5):6379,UP,slave]]]
get clients
Use this command to retrieve from the Redis Monitor a list of clients of a specified Redis cluster (client nodes that write to and/or read from the Redis database). These are the clients to which the Redis Monitor sends notifications regarding the Redis cluster's status. For example, if the cluster's master role changes from one node to another, the Redis Monitor will notify these clients of the change.
The clients will include S3 Service instances (identified by JMX listening socket <host>:19080), IAM Service instances (<host>:19084), Admin Service instances (<host>:19081), and HyperStore Service instances (<host>:19082). The command response also indicates the status (UP/DOWN) of each client, and for each client it shows which node the client thinks is the Redis master.
Example:
# nc 10.50.20.12 9078
get clients redis.credentials
OK [[[store1(10.50.20.1):19080,UP,store1], [store2(10.50.20.2):19080,UP,store1],
[store3(10.50.20.3):19080,UP,store1], [store4(10.50.20.4):19080,UP,store1],
[store5(10.50.20.5):19080,UP,store1], [store6(10.50.20.6):19080,UP,store1],
[store1(10.50.20.1):19081,UP,store1], [store2(10.50.20.2):19081,UP,store1],
[store3(10.50.20.3):19081,UP,store1], [store4(10.50.20.4):19081,UP,store1],
[store5(10.50.20.5):19081,UP,store1], [store6(10.50.20.6):19081,UP,store1],
[store1(10.50.20.1):19082,UP,store1], [store2(10.50.20.2):19082,UP,store1],
[store3(10.50.20.3):19082,UP,store1], [store4(10.50.20.4):19082,UP,store1],
[store5(10.50.20.5):19082,UP,store1], [store6(10.50.20.6):19082,UP,store1]]]
In the above example, "store1" is the current Redis Credentials master node. All the clients correctly have this
information.
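Whether all clients agree on the master, as they do in the example above, can be verified mechanically from a saved get clients response: each bracketed entry's third field names the node that client believes is master, so a single distinct value means full agreement. A sketch, under the assumption that the response shape matches the documented example:

```shell
# Abbreviated sample "get clients" response (shape as in the example above).
clients='[[[store1(10.50.20.1):19080,UP,store1], [store2(10.50.20.2):19080,UP,store1]]]'

# Pull each [host,status,master] entry, print its third (master) field, and
# reduce to the distinct values; a single output line means every client agrees.
printf '%s\n' "$clients" | grep -o '\[[^][]*\]' \
    | awk -F, '{gsub(/\]/,"",$3); print $3}' | sort -u
```

For this sample the output is the single line "store1".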
enable monitoring
Use this command to enable monitoring of a specified Redis cluster by the Redis Monitor.
Note Monitoring is enabled by default. This command is relevant only if you have previously disabled
monitoring.
Example:
# nc 10.50.20.12 9078
enable monitoring redis.credentials
OK enabled
disable monitoring
Use this command if you want to temporarily disable Redis Monitor's monitoring of a specified Redis cluster — for example if you are performing maintenance work on the Redis cluster. (You can subsequently use the enable monitoring command to re-enable monitoring of that cluster.)
Example:
# nc 10.50.20.12 9078
disable monitoring redis.qos.region1
OK disabled
enable notifications
Use this command to enable Redis Monitor’s sending of notifications to the clients of a specified Redis cluster.
(The clients are the S3 Service instances, IAM Service instances, Admin Service instances, and HyperStore
Service instances that write to and/or read from that Redis cluster).
The Redis Monitor sends notifications to inform clients of the identity of the Redis cluster’s master node, in
either of these circumstances:
l The Redis master role has switched from one host to another. (This could happen if the original master
goes down and Redis Monitor detects this and fails the master role over to one of the slave nodes; or if
an operator uses the Redis Monitor CLI to move the master role from one node to another).
l The Redis Monitor in its regular polling of cluster clients' status detects that one of the clients has incorrect information about the identity of the Redis cluster master node. In this case the Redis Monitor notifies the client to give it the correct information.
Note Notifications are enabled by default. This operation is relevant only if you have previously disabled notifications using the disable notifications command.
Example:
# nc 10.50.20.12 9078
enable notifications redis.credentials
OK enabled
disable notifications
Use this command to temporarily disable Redis Monitor's sending of Redis cluster status notifications to the clients of that cluster. For more information on the notification feature see "enable notifications" (page 398).
Example:
# nc 10.50.20.12 9078
disable notifications redis.qos.region1
OK disabled
set master
Use this command to assign the Redis master role to a different node within a specified Redis cluster. The
node to which you assign the master role must be one of the current slaves within the same Redis
cluster.
The Redis master node within a cluster is the node to which Redis clients submit writes. The writes are asynchronously replicated to the slave(s) within that cluster. Redis clients read from the slave(s).
An example of when you would move the Redis master role is if you want to remove the current Redis master
host from your cluster.
Using this command is part of a broader procedure for moving a Redis master role to a slave. For the full procedure including the use of this command within the procedure, see "Move the Credentials DB Master Role or QoS DB Master Role" (page 270).
Example:
# nc 10.50.20.12 9078
set master redis.credentials store5:6379
OK set new master store5(10.50.20.5):6379
Note If you do not specify a <host:redisPort> value, the Redis Monitor chooses a slave node at random
(from within the cluster) to elevate to the master role.
7.1.4.9.2. CMC UI
In the CMC UI, use the "Hostname" field to specify the host to which you want to move the Redis master role.
add node
Use this command to add a Redis node to the list of nodes that Redis Monitor is monitoring, for a specified Redis cluster. You would do this if you have used the installer (cloudianInstall.sh in the installation directory on your Configuration Master node) to activate Redis on a HyperStore node that wasn't previously running Redis. In this circumstance you have two options to make Redis Monitor aware of the new member of the Redis cluster:
OR
Example:
# nc 10.50.20.12 9078
add node redis.qos.region1 store3:6380
OK added node store3(10.50.20.3):6380 to redis
add client
This command can be used to add a new S3 Service, IAM Service, Admin Service, and/or HyperStore Service
node to the list of clients to which Redis Monitor will send notifications regarding the status of a specified Redis
cluster.
In normal circumstances you should not have to use this command. If you add a new node to your HyperStore
cluster (as described in "Adding Nodes" (page 212)), the system automatically makes Redis Monitor aware of
the new clients of the Redis Credentials and Redis QoS clusters.
If you do use this command, add only one client per command run. A client is identified by its JMX socket (for
example "cloudian12:19080" for an S3 Service instance running on host cloudian12, or "cloudian12:19081" for
an Admin Service instance running on host cloudian12).
Example:
# nc 10.50.20.12 9078
add client redis.credentials store7:19080
OK added client store7(10.50.20.7):19080 to redis
test dc partition
If your HyperStore system includes multiple data centers (DCs), you can use this command to check whether or
not there is a DC partition in the specified Redis cluster. A Redis cluster is considered to have a DC partition if
the Redis Monitor -- from its location within one of the DCs -- cannot reach any of that cluster's Redis nodes or
any of that cluster's Redis clients (S3, IAM, Admin, HyperStore) in one of the other DCs.
Note The Redis Monitor automatically checks for a DC partition once every five seconds, and if a partition is detected an alert is logged in cloudian-redismon.log on the Redis Monitor node and is displayed in the CMC's Alerts page. So under normal circumstances you should not need to manually trigger a DC partition check by using this command.
Example:
# nc 10.50.20.12 9078
test dc partition redis.credentials
OK
test split brain
Use this command to check whether or not the specified Redis cluster is in a "split brain" condition (that is, more than one node in the cluster simultaneously acting as master).
Note The Redis Monitor automatically checks for a "split brain" condition once every five seconds, and if a split brain condition is detected an alert is logged in cloudian-redismon.log on the Redis Monitor node and is displayed in the CMC's Alerts page. So under normal circumstances you should not need to manually trigger a split brain check by using this command.
Example #1:
# nc 10.50.20.12 9078
test split brain redis.credentials
OK
Number of Master in cluster redis.credentials: 1
Has No Brain: false
Has Split Brain: false
Example #2:
# nc 10.50.20.12 9078
test split brain redis.credentials
OK
Number of Master in cluster redis.credentials: 2
Has No Brain: false
Has Split Brain: true
disable dc partition monitoring
Use this command to disable the Redis Monitor's DC partition monitoring/failover for a specified Redis cluster.
If DC partition monitoring is enabled, then in the circumstance where DC partition has been detected and the Redis master node is in the unreachable DC, the Redis Monitor will continue with its normal master role monitoring and managing behavior by promoting a reachable slave in a different DC to the master role (in other words, failover of the master role will be executed).
If DC partition monitoring is disabled, then in the circumstance where DC partition has been detected and the Redis master node is in the unreachable DC, the Redis Monitor will discontinue its normal master role monitoring behavior and will not promote a reachable slave in a different DC to the master role (in other words, failover of the master role will not be executed). Once the unreachable DC becomes reachable again -- which will be detected by the Redis Monitor -- then the Redis Monitor will resume its normal monitoring behavior and will execute failover of the master role if the existing master node goes down.
Since DC partition monitoring/failover is disabled by default, the only circumstance in which you might want to use the disable dc partition monitoring command is if you have previously changed the redis.monitor.skip.dc.monitoring configuration property (so that DC partition monitoring/failover is enabled by configuration) or if you have previously used the enable dc partition monitoring command (so that DC partition monitoring/failover is enabled in the current session of the Redis Monitor).
Note If the Redis Monitor (or its host) is restarted, it will revert to using the value of the redis.monitor.skip.dc.monitoring configuration property to determine whether DC partition monitoring/failover is enabled or disabled. Note also that the configuration property applies to all Redis clusters in the system, while the enable/disable commands apply only to the Redis cluster that you specify when you run the command.
Note If a Redis cluster DC partition occurs an alert will be written to cloudian-redismon.log on the
Redis Monitor node and an alert will display in the CMC. You can also confirm that the condition exists
by using the test dc partition command. For additional guidance on managing and recovering from a
Redis cluster DC partition condition consult with Cloudian Support.
Example:
# nc 10.50.20.12 9078
disable dc partition monitoring redis.credentials
OK
Skip Monitoring when DC Partition detected
enable dc partition monitoring
Use this command to enable the Redis Monitor's DC partition monitoring/failover for a specified Redis cluster. (For a description of how the Redis Monitor determines that a Redis cluster is in a DC partition condition, see "test dc partition" (page 401)).
If DC partition monitoring is enabled, then in the circumstance where DC partition has been detected and the Redis master node is in the unreachable DC, the Redis Monitor will continue with its normal master role monitoring and managing behavior by promoting a reachable slave in a different DC to the master role (in other words, failover of the master role will be executed).
If DC partition monitoring is disabled, then in the circumstance where DC partition has been detected and the Redis master node is in the unreachable DC, the Redis Monitor will discontinue its normal master role monitoring behavior and will not promote a reachable slave in a different DC to the master role (in other words, failover of the master role will not be executed). Once the unreachable DC becomes reachable again -- which will be detected by the Redis Monitor -- then the Redis Monitor will resume its normal monitoring behavior and will execute failover of the master role if the existing master node goes down.
Note If the Redis Monitor (or its host) is restarted, it will revert to using the value of the redis.monitor.skip.dc.monitoring configuration property to determine whether DC partition monitoring/failover is enabled or disabled. Note also that the configuration property applies to all Redis clusters in the system, while the enable/disable commands apply only to the Redis cluster that you specify when you run the command.
Note If a Redis cluster DC partition occurs an alert will be written to cloudian-redismon.log on the
Redis Monitor node and an alert will display in the CMC. You can also confirm that the condition exists
by using the test dc partition command. For additional guidance on managing and recovering from a
Redis cluster DC partition condition consult with Cloudian Support.
Example:
# nc 10.50.20.12 9078
enable dc partition monitoring redis.credentials
OK
Keep Monitoring when DC Partition detected
disable split brain monitoring
Use this command to disable the Redis Monitor's split brain monitoring/resolution for a specified Redis cluster.
If split brain monitoring is enabled, then in the circumstance where split brain has been detected the Redis
Monitor will continue with its normal master role monitoring and managing behavior by demoting one of the
masters to a slave role. Of the two masters that constitute the "split brain", Redis Monitor will demote the one
that most recently became a master. The node that had been master for a longer period of time will be left as
the one master.
If split brain monitoring is disabled, then in the circumstance where split brain has been detected the Redis
Monitor will discontinue its normal master role monitoring behavior and will not automatically demote one of
the masters to a slave role. Instead it will be left to you to resolve the split brain condition by using the
resolve split brain command (which will let you choose which node should remain as the one master).
By default split brain monitoring/resolution is enabled. This is controlled by the setting "redis.monitor.skip.brain.monitoring" (page 482) in mts.properties.erb (which defaults to false, so that split brain monitoring/resolution is enabled [is not "skipped"]).
Note If the Redis Monitor (or its host) is restarted, it will revert to using the value of the redis.monitor.skip.brain.monitoring configuration property to determine whether split brain monitoring/resolution is enabled or disabled. Note also that the configuration property applies to all Redis clusters in the system, while the enable/disable commands apply only to the Redis cluster that you specify when you run the command.
Note If a Redis cluster "split brain" condition occurs an alert will be written to cloudian-redismon.log on
the Redis Monitor node and an alert will display in the CMC. For additional guidance on managing and
recovering from a Redis cluster split brain condition consult with Cloudian Support.
Example:
# nc 10.50.20.12 9078
disable split brain monitoring redis.credentials
OK
Skip Monitoring when Split Brain detected
enable split brain monitoring
Use this command to enable the Redis Monitor's split brain monitoring/resolution for a specified Redis cluster.
If split brain monitoring is enabled, then in the circumstance where split brain has been detected the Redis
Monitor will continue with its normal master role monitoring and managing behavior by demoting one of the
masters to a slave role. Of the two masters that constitute the "split brain", Redis Monitor will demote the one
that most recently became a master. The node that had been master for a longer period of time will be left as
the one master.
If split brain monitoring is disabled, then in the circumstance where split brain has been detected the Redis
Monitor will discontinue its normal master role monitoring behavior and will not automatically demote one of
the masters to a slave role. Instead it will be left to you to resolve the split brain condition by using the
resolve split brain command (which will let you choose which node should remain as the one master).
By default split brain monitoring/resolution is enabled. This is controlled by the setting "redis.monitor.skip.brain.monitoring" (page 482) in mts.properties.erb (which defaults to false, so that split brain monitoring/resolution is enabled [is not "skipped"]).
Since split brain monitoring/resolution is enabled by default, the only circumstance in which you might want to use the enable split brain monitoring command is if you have previously changed the redis.monitor.skip.brain.monitoring configuration property (so that split brain monitoring/resolution is disabled by configuration) or if you have previously used the disable split brain monitoring command (so that split brain monitoring/resolution is disabled in the current session of the Redis Monitor).
Note If the Redis Monitor (or its host) is restarted, it will revert to using the value of the redis.monitor.skip.brain.monitoring configuration property to determine whether split brain monitoring/resolution is enabled or disabled. Note also that the configuration property applies to all Redis clusters in the system, while the enable/disable commands apply only to the Redis cluster that you specify when you run the command.
Note If a Redis cluster "split brain" condition occurs an alert will be written to cloudian-redismon.log on
the Redis Monitor node and an alert will display in the CMC. For additional guidance on managing and
recovering from a Redis cluster split brain condition consult with Cloudian Support.
Example:
# nc 10.50.20.12 9078
enable split brain monitoring redis.credentials
OK
Keep Monitoring when Split Brain detected
If a Redis cluster is in a "split brain" condition and automatic split brain resolution is disabled, you can use the
resolve split brain command to resolve the condition. When you run the command, it will show you how long
each of the current masters has been acting as a master, and you will be prompted to choose one of the
masters to continue as master. The other master will be demoted to slave.
Note If a Redis cluster "split brain" condition occurs an alert will be written to cloudian-redismon.log on
the Redis Monitor node and an alert will display in the CMC. You can also confirm that the condition
exists by using the test split brain command. For additional guidance on managing and recovering
from a Redis cluster split brain condition consult with Cloudian Support.
7.2. Configuration Settings
Example:
# nc 10.112.2.12 9078
resolve split brain redis.credentials
OK
1. ch-us-east-1-us-east-1a-2-251(10.112.2.251):6379
ch-us-east-1-us-east-1a-2-251(10.112.2.251):6379: Consecutive time being master: 2.51 min
2. ch-us-east-1-us-east-1b-2-33(10.112.2.33):6379
ch-us-east-1-us-east-1b-2-33(10.112.2.33):6379: Consecutive time being master: 8.73 min
Please select which redis instance as master:
1
Forced new Master: ch-us-east-1-us-east-1a-2-251(10.112.2.251):6379
Note As a best practice, you should complete basic HyperStore installation first and confirm that things
are working properly (by running the installer’s Validation Tests, under the "Cluster Management"
menu) before you consider using the installer's advanced configuration options.
To access the advanced configuration options, on the Configuration Master node change into your installation
staging directory and launch the installer.
# ./cloudianInstall.sh
$ hspkg install
Once launched, the installer's menu options (such as those referenced in the steps below) are the same
regardless of whether it was launched from the HSH command line or the OS command line.
At the installer main menu's Choice prompt enter 4 for Advanced Configuration Options.
From this menu you can choose the type of configuration change that you want to make and then proceed
through the interactive prompts to specify your desired settings.
1. After selecting this option from the Advanced Configuration Options menu, follow the prompts to specify
your desired settings. The prompts indicate your current settings. At each prompt press Enter to keep
the current setting value, or type in a new value. The final prompt will ask whether you want to save
your changes -- type yes to do so.
2. Go to the installer's main menu again and choose "Cluster Management" → "Push Configuration Set-
tings to Cluster" and follow the prompts.
3. Go to the "Cluster Management" menu again, choose "Manage Services", and restart the S3 Service.
This operation is specifically in regard to Puppet certificates (used during cluster configuration management
processes) and has nothing to do with SSL for the S3 Service.
Note that if you don’t leave the Puppet agent daemons running, you must remember to trigger a one-time
Puppet sync-up each time you make a HyperStore configuration change on the Puppet master. By contrast, if you
leave the Puppet agent daemons running, a sync-up will happen automatically every 10 minutes even if you
don’t specifically trigger a sync-up after making a configuration change.
If you ever want to check on the current status of your Puppet master and agent daemons, you can do so by
choosing "Cluster Management" from the installer’s main menu; then choose "Manage Services"; then in the
Service Management sub-menu choose "Puppet service (status only)".
terminated unexpectedly — such as by a <CTRL>-c command, or a Puppet master node shutdown — the
temporary lock may fail to release. This unreleased lock would prevent any subsequent sync-ups from being
implemented.
You can clear the lock by running the "Remove Puppet access lock" operation.
Afterward, to confirm that the lock is cleared you can check to make sure that no /tmp/cloudian.installer.lock
directory exists on the Puppet master node.
l When you ran the cloudianInstall.sh script to install your HyperStore system, you had the script
automatically install and configure dnsmasq to perform DNS resolution for HyperStore service domains.
(That is, you used the configure-dnsmasq option when you launched the install script.) However, now
you want to use your own DNS solution for resolving HyperStore domains, and you’ve completed your
DNS configuration as described in "DNS Set-Up" (page 573). You can use the installer’s "Enable or
disable DNSMASQ" menu option to disable dnsmasq. This stops dnsmasq on all HyperStore nodes
and disables dnsmasq by configuration.
l When you ran the cloudianInstall.sh script to install your HyperStore system, you did not have it install
dnsmasq (the default installer behavior is not to install dnsmasq). However, now you want to use
dnsmasq for HyperStore domain resolution. With the installer’s "Enable or disable DNSMASQ" menu
option you can enable dnsmasq:
1. After selecting this option from the Advanced Configuration Options menu, follow the prompts to
choose to enable dnsmasq.
2. Go to the installer's main menu again and choose "Cluster Management" → "Push Configuration
Settings to Cluster" and follow the prompts.
3. Go to the "Cluster Management" menu again, choose "Manage Services", and restart the
DNSMASQ service.
The excluded host will also be excluded when you use the installer's "Cluster Management" → "Manage
Services" menu to stop, start, or restart particular services in the cluster (such as the S3 Service or the
Cassandra Service).
Note This "exclude from configuration push and service restarts" status is different from "maintenance
mode". The "excluded" status merely excludes a node from the list of nodes that the installer uses when
it pushes out configuration changes to the cluster or restarts services across the cluster; it has no effect
other than that. (For more information on "maintenance mode" see Start Maintenance Mode.)
If you put a node into "excluded" status and later exit the installer, the next time you launch the
installer it will display a message indicating that there is a node in excluded status. In the sample below
the node "cloudian-node6" is in this status.
# ./cloudianInstall.sh
The following nodes have been excluded for configuration updates and service
restarts: cloudian-node6
Press any key to continue ...
When the node is back up again and you are ready to make it once again eligible for installer-managed
configuration pushes and service restarts, return to the "Advanced Configuration Options" → "Exclude host(s) from
configuration push and service restarts" function and enter "none" at the prompt.
If you made a system configuration change while a node was down and excluded, then after the node is back
up and you've taken the node out of excluded status, do a configuration push and a service restart (whichever
service restart is appropriate to the configuration change you made). This will bring the node's configuration up
to date with the rest of the cluster.
Note There are a small number of circumstances where the system will automatically place a down
node into the "excluded" status. One such circumstance is when the system is executing automatic fail-
over of the system cronjob host role, in the event that the primary cronjob host goes down. The next
time that you launch the installer it will display a message identifying the node that's in "excluded"
status.
s) Configure Firewall
This lets you enable and configure the built-in HyperStore firewall, on all the HyperStore nodes in your system.
For more information see "HyperStore Firewall" (page 581).
7.2.3. Pushing Configuration File Edits to the Cluster and Restarting Services
Subjects covered in this section:
l A Puppet master node on which all the HyperStore configuration templates reside. This is one of your
HyperStore nodes -- specifically, the node on which you ran the HyperStore installation script. In
HyperStore this is called the Configuration Master node.
l A Puppet agent on every node in your HyperStore system (including the node on which the Master is
running). In HyperStore these are called the Configuration Agents.
This cluster configuration management system enables you to edit HyperStore configuration templates in one
location — on the Configuration Master node — and have those changes propagate to all the nodes in your
HyperStore cluster, even across multiple data centers and multiple service regions. There are two options for
implementing Puppet sync-up: you can trigger an immediate sync-up by using the HyperStore installer, or you
can wait for an automatic sync-up which by default occurs on a 10 minute interval. With either approach, after
the sync-up you must restart the affected service(s) to apply the configuration changes.
IMPORTANT ! Do not directly edit configuration files on individual HyperStore nodes. If you make edits
on an individual node, those local changes will be overwritten when the local Configuration Agent
does its next sync-up with the Configuration Master.
Note To ensure high availability of the Configuration Master role, HyperStore automatically sets up a
backup Configuration Master node and supports a method for manually failing over the Configuration
Master role in the event of problems with the primary Configuration Master node. See "Move the
Configuration Master Primary or Backup Role" (page 266).
/opt/cloudian-staging/7.5
If you forget the location of your staging directory, you can find it displayed in the CMC's Cluster Information
page (Cluster -> Cluster Config -> Cluster Information) toward the bottom of the "Service Information" sec-
tion.
Among the important files in your installation staging directory is the HyperStore installation script
cloudianInstall.sh -- also known as the HyperStore installer -- which you can use for a variety of purposes
including pushing configuration changes out to the cluster and restarting services.
7.2.3.3. Using the Installer to Push Configuration Changes and Restart Services
After you’ve edited configuration file templates on the Configuration Master you can use the HyperStore
installer to trigger a Puppet sync-up and then restart the affected service(s):
1. After logging into the Configuration Master node as root, change into your installation staging
directory. Once in the staging directory, launch the HyperStore installer:
# ./cloudianInstall.sh
$ hspkg install
Once launched, the installer's menu options (such as those referenced in the steps below) are the same
regardless of whether it was launched from the HSH command line or the OS command line.
2. Enter "2" for Cluster Management. This displays the Cluster Management menu.
3. Enter "b" for Push Configuration Settings to Cluster. You will then be prompted to list the hosts on which
you want the Configuration Agents to sync up with the Configuration Master. The default is all your
HyperStore hosts; for this you can just press enter at the prompt. In a multi-region system you are also
given the option to sync-up only the agents in a particular region.
4. After the Puppet run completes for all the agents (and a success message displays on the console),
restart the affected service(s) to apply your configuration change:
a. From the Cluster Management menu, enter "c" for Manage Services. This displays the Service
Management menu.
b. From the Service Management menu, enter the number for the service to restart. The service to
restart will depend on which configuration setting(s) you edited. For example:
mts-ui.properties.erb → CMC
c. After entering the service to manage, enter "restart". Watch the console for messages indicating
a successful stop and restart of the service. Note that the service restart occurs on each node on
which the service is running.
7.2.3.4. Option for Triggering a Puppet Sync-Up from the Command Line
HyperStore supports an alternative way of using the installer to trigger a Puppet sync-up, directly from the
command line. To use this method, run the installer like this:
# ./cloudianInstall.sh runpuppet="[<region>,<host>,<host>,...]"
This triggers a Puppet sync-up, and the sync-up progress displays in your terminal.
Here are some examples for how to specify the runpuppet option:
Note If you use this method, you still need to subsequently launch the installer in the normal way and
then restart the affected service(s) as described in Step 4 from the "Using the Installer to Push
Configuration Changes and Restart Services" (page 413) section, in order to apply your changes.
Note however that this automatic Puppet sync-up does not apply the configuration changes to the currently
running services. Applying the configuration changes requires a service restart.
To apply your changes, first wait long enough to be sure that the automatic Puppet sync-up has occurred (by
default it runs every 10 minutes). Then restart the affected service(s): log into your Configuration Master
node, change into the installation staging directory, launch the installer, then go to the Cluster
Management menu and restart the affected service(s) as described in Step 4 in the "Using the Installer to Push
Configuration Changes and Restart Services" (page 413) section.
<username>@<hostname>$
For example:
sa_admin@hyperstore1$
Note To use the HSH to manage configuration files you must be an HSH Trusted user.
To see the complete list of configuration files that you can view and edit in the HSH, run this command:
The list includes all the files covered in the "HyperStore Configuration Files" section of this documentation
as well as a few additional system configuration files.
Note Although cloudian-crontab.erb is listed, using the HSH you can only view this file -- not edit it.
Specify just the configuration file name (such as common.csv), not the full path to the file.
In the background this invokes the Linux command less to display the configuration file. Therefore you can use
the standard keystrokes supported by less to navigate the display; for example:
Specify just the configuration file name (such as common.csv), not the full path to the file.
In the background this invokes the Linux text editor vi to display and modify the configuration file. Therefore you
can use the standard keystrokes supported by vi to make and save changes to the file; for example:
Note For more information about using the vi text editor, see any reputable online source.
If you made and saved a change to a configuration file, to apply the change you must use the installer to push
the change out to the cluster and restart the relevant service(s). From the HSH you can launch the installer as
follows:
$ hspkg install
For more information about using the installer to push your configuration change and restart services, see
"Using the Installer to Push Configuration Changes and Restart Services" (page 413).
7.2.5.1. common.csv
The common.csv file is the main HyperStore configuration file and under typical circumstances this is the only
configuration file that you may want to edit. On the Configuration Master node the path to the file is:
/etc/cloudian-<version>-puppet/manifests/extdata/common.csv
Specify just the configuration file name, not the full path to the file.
In the background this invokes the Linux text editor vi to display and modify the configuration file. Therefore you
can use the standard keystrokes supported by vi to make and save changes to the file.
IMPORTANT ! If you make any edits to common.csv, be sure to push your edits to the cluster and
restart the affected services to apply your changes. For instructions see "Pushing Configuration File
Edits to the Cluster and Restarting Services" (page 411).
Note The HyperStore installer cloudianInstall.sh writes to this file. All common.csv settings that require
environment-specific customization are automatically pre-configured by the install script, based on
information that you provided during the installation process. Making any further customizations to this
configuration file is optional.
release_version
Current HyperStore release version.
Default = 7.5
Do not edit.
cloudian_license_file_name
Name of your Cloudian license file.
Do not edit manually. To apply an updated license file, use the CMC’s Update License function (Cluster →
Cluster Config → Cluster Information → Update License). That function will automatically update this
configuration setting as appropriate and dynamically apply the change to your live system. No service restart is
necessary.
default_region
The default service region for your S3 service. Must be one of the regions listed for the regions setting. In a
multi-region HyperStore system, the default region plays several roles in the context of S3 storage bucket
creation:
l For PUT Bucket requests that lack a CreateBucketConfiguration element in the request body, the
bucket will be created in the default region.
l For PUT Bucket requests that do include a CreateBucketConfiguration element (with
LocationConstraint attribute), the PUT requests typically resolve to the default region, and S3 Service nodes
in the default region then initiate the process of creating the bucket in the requested region.
If your HyperStore system has only one service region, make that region the default region.
Do not change this setting after your system is installed and running. If for some reason you need to change
which region is your default region, please consult with Cloudian Support.
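The bucket-creation rules above reduce to a simple selection, sketched here in Python (illustrative only; not the actual S3 Service logic):

```python
def bucket_creation_region(location_constraint, default_region):
    """Region in which a new bucket is created: the LocationConstraint
    from the PUT Bucket request's CreateBucketConfiguration if present,
    otherwise the system's default region."""
    return location_constraint if location_constraint else default_region
```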
regions
Comma-separated list of service regions within your HyperStore system. Region names must be lower case
with no dots, dashes, underscores, or spaces. Even if you have only one region, you must give it a name and
specify it here.
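A minimal validity check for the naming rule above might look like this (a sketch; it assumes digits are allowed, as in the "region1" examples used elsewhere in this guide):

```python
import re

# Lower case only, and no dots, dashes, underscores, or spaces.
_REGION_NAME = re.compile(r"^[a-z0-9]+$")

def is_valid_region_name(name):
    return bool(_REGION_NAME.match(name))
```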
javahome
Home directory of Java on your HyperStore nodes.
java_minimum_stack_size
Memory stack size to use for the HyperStore system’s Java-based services (Cassandra, S3 Service, HyperStore
Service, Admin Service).
Default = 256k
Note Optionally you can override this stack size value on a per-server-type basis by adding any of the
following settings to the configuration file and assigning them a value:
cassandra_stack_size
cloudian_s3_stack_size
cloudian_hss_stack_size
cloudian_admin_stack_size
installation_root_directory
Directory in which HyperStore service packages will be physically installed.
Default = /opt/cloudian-packages
run_root_directory
Root directory in which HyperStore services will be run. Links will be created from this directory to the physical
installation directory.
Default = /opt
pid_root_directory
Root directory in which HyperStore service PID (process ID) files will be stored. A /cloudian sub-directory will
be created under this root directory, and the PID files will be stored in that sub-directory.
Default = /var/run
cloudian_user
User information for the user as which to run Cloudian HyperStore services, in format <user_name>,<group_
name>,<optional_numeric_UID>,<login_shell>. If you want this user to be something other than the default,
edit this setting and also the cloudian_runuser setting.
Default = "cloudian,cloudian,,/bin/bash"
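How the four comma-separated fields break down can be sketched as follows (illustrative parsing only, not HyperStore code):

```python
def parse_cloudian_user(value):
    """Split a cloudian_user value into its four fields:
    <user_name>,<group_name>,<optional_numeric_UID>,<login_shell>.
    An empty third field means no explicit numeric UID."""
    user, group, uid, shell = value.strip('"').split(",")
    return {"user": user, "group": group,
            "uid": int(uid) if uid else None, "shell": shell}
```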
cloudian_runuser
User name of the user as which to run Cloudian HyperStore services. This must match the first field of the
cloudian_user setting. If you edit the cloudian_runuser setting you must also edit the first field of the
cloudian_user setting accordingly.
Default = cloudian
user_bin_directory
Directory in which certain user-invocable scripts are stored.
Default = /usr/local/bin
user_home_directory
Directory under which to create the home directory of the HyperStore services runtime user. The system will
append the cloudian_runuser value to the user_home_directory value to get the full home directory path. For
example, with "cloudian" as the cloudian_runuser and "/export/home" as the user_home_directory, the
Cloudian user's home directory is "/export/home/cloudian".
Default = /export/home
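The path construction described above amounts to a trivial join (a sketch):

```python
import posixpath

def runtime_user_home(user_home_directory, cloudian_runuser):
    """Append the runtime user name to user_home_directory to get
    the full home directory path."""
    return posixpath.join(user_home_directory, cloudian_runuser)
```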
cloudian_userid_length
Maximum number of characters allowed in a HyperStore user ID. The highest value you can set this to is 256.
Note This maximum applies also to the user full name. For example if this is set to 64, then when you
are creating a user through the CMC or the Admin API the user ID can be a maximum of 64 characters
long, and also the user full name can be a maximum of 64 characters long.
Default = 64
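The limit described above applies to both the user ID and the full name; a sketch of the check (illustrative only, not the actual CMC or Admin API validation code):

```python
def within_userid_limit(user_id, full_name, cloudian_userid_length=64):
    """True if both the user ID and the user full name fit within
    cloudian_userid_length characters. The setting itself may not
    exceed 256."""
    if cloudian_userid_length > 256:
        raise ValueError("cloudian_userid_length may not exceed 256")
    return (len(user_id) <= cloudian_userid_length
            and len(full_name) <= cloudian_userid_length)
```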
service_starts_on_boot
Whether to have HyperStore services start on host reboot, true or false.
Default = true
cloudian_log_directory
Directory into which to write application logs for the S3 Service, Admin Service, HyperStore Service, Redis
Monitor, Cloudian performance monitoring Agent, and Cloudian performance monitoring Data Collector.
Default = /var/log/cloudian
cleanup_directories_byage_withmatch
The cleanup_directories_byage_* settings configure Puppet to automatically delete certain files from your
HyperStore nodes after the files reach a certain age. The cleanup_directories_byage_withmatch setting is a
comma-separated list of directories in which to look for such files.
This feature works only if you leave the Puppet daemons running on your HyperStore nodes (which is the
default behavior), or if you regularly perform a Puppet push.
Default = "/var/log/cloudian,/tmp,/var/log/puppetserver,/opt/tomcat/logs"
cleanup_directories_byage_withmatch_timelimit
The cleanup_directories_byage_* settings configure Puppet to automatically delete certain files from your
HyperStore nodes after the files reach a certain age. The cleanup_directories_byage_withmatch_timelimit
setting specifies the age at which such files will be deleted (based on time elapsed since file creation). The age
can be specified as <x>m or <x>h or <x>d where <x> is a number of minutes, hours, or days.
This feature works only if you leave the Puppet daemons running on your HyperStore nodes (which is the
default behavior), or if you regularly perform a Puppet push.
Default = 15d
cleanup_directories_byage_matches
The cleanup_directories_byage_* settings configure Puppet to automatically delete certain files from your
HyperStore nodes after the files reach a certain age. The cleanup_directories_byage_matches setting
specifies the file types to delete.
This feature works only if you leave the Puppet daemons running on your HyperStore nodes (which is the
default behavior), or if you regularly perform a Puppet push.
Default = "diagnostics*.gz,diagnostics*.tgz,hiq_metrics*.gz,jna*.tmp,liblz4-java*.so,snappy-*.so,*.cloudian-
bak,cloudian_system_info*.tar.gz,puppetserver*.log.zip,localhost*.log,localhost_access_log*.txt,catalina*.log"
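Taken together, the three cleanup_directories_byage_* settings describe a pass like the following Python sketch (an illustration only; it uses file modification time as a stand-in for creation time, and it is not the actual Puppet implementation):

```python
import fnmatch
import os
import time

def parse_timelimit(value):
    """Parse an age limit of the form <x>m, <x>h, or <x>d into seconds."""
    unit_seconds = {"m": 60, "h": 3600, "d": 86400}
    return int(value[:-1]) * unit_seconds[value[-1]]

def cleanup_pass(directories, patterns, timelimit, now=None):
    """Delete regular files in the given directories whose names match
    any of the patterns and whose age exceeds the time limit. Returns
    the list of deleted paths."""
    now = time.time() if now is None else now
    max_age = parse_timelimit(timelimit)
    deleted = []
    for directory in directories:
        for name in os.listdir(directory):
            path = os.path.join(directory, name)
            if not os.path.isfile(path):
                continue
            if not any(fnmatch.fnmatch(name, p) for p in patterns):
                continue
            if now - os.path.getmtime(path) > max_age:
                os.remove(path)
                deleted.append(path)
    return deleted
```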
cleanup_sysinfo_logs_timelimit
Retention period for Node Diagnostics packages. After a package reaches this age, Puppet will automatically
delete the package.
This cleanup feature works only if you leave the Puppet daemons running on your HyperStore nodes (which is
the default behavior).
Specify this value as a number of days, hours, or minutes, with the value formatted as <n>d or <n>h or <n>m
respectively (such as 15d or 12h or 30m).
Default = 15d
Note When you use the CMC to collect diagnostics on a node (Cluster -> Nodes -> Advanced), the UI
gives you the option to have the diagnostics package automatically uploaded to Cloudian Support, and
also the option to have the system delete the package immediately after it's been successfully
uploaded to Cloudian Support. The retention period set by cleanup_sysinfo_logs_timelimit comes into
play only if you do not use the "upload and then immediately delete" options.
path_style_access
Whether the CMC (and also the installer's basic validation test script) should use "path style" request formatting
when submitting S3 requests to the HyperStore S3 Service. In path style S3 requests, the bucket name is part
of the Request-URI rather than being part of the Host header value.
Options are:
l true — The CMC will use path style HTTP request formatting when submitting S3 requests to the
HyperStore S3 Service. The bucket name associated with the request will be in the Request-URI. For
example:
PUT /bucket1/objectname HTTP/1.1
Host: s3-region1.mycompany.com
l false — The CMC will not use path style HTTP request formatting when submitting S3 requests to the
HyperStore S3 Service. Instead it will use "virtual host" style access. The bucket name associated with
the request will be in the HTTP Host header. For example:
PUT /objectname HTTP/1.1
Host: bucket1.s3-region1.mycompany.com
Note that this setting affects only the behavior of the CMC and the behavior of the installer's basic validation
test script, in their role as S3 clients. Meanwhile the HyperStore S3 Service always supports both path style
access and virtual host style access.
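The difference between the two styles can be condensed into a few lines of Python (a sketch only; the endpoint name mirrors the examples above):

```python
def s3_request_target(bucket, key, endpoint, path_style):
    """Return the (Host header, Request-URI) pair for an S3 request,
    in either path style or virtual host style."""
    if path_style:
        return endpoint, "/{}/{}".format(bucket, key)
    return "{}.{}".format(bucket, endpoint), "/{}".format(key)
```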
IMPORTANT ! If the CMC or any other S3 client applications use virtual host style access to the
HyperStore S3 Service, then your DNS environment must be configured to resolve this type of Host value.
See "DNS Set-Up" in the Cloudian HyperStore Installation Guide.
Default = true
fips_enabled
For information about this setting see "FIPS Support" (page 125).
sshdconfig_disable_override
Note This setting is relevant only to software-only deployments of HyperStore that were first installed
as HyperStore version 7.2.4 or later. It does not apply to HyperStore appliances, or to older HyperStore
software deployments for which the upgrade path included versions 7.2.0, 7.2.1, 7.2.2, or 7.2.4. For
such systems HyperStore has already managed the sshd_config file and the sshdconfig_disable_override
setting should be left at its default of false.
If this setting is left at its default value of false when you install HyperStore, then on each HyperStore host your
existing /etc/ssh/sshd_config file will be appended with a HyperStore-managed section of SSH settings. This
HyperStore-managed section of the file allows HyperStore to implement the fips_enabled setting, if you set
fips_enabled to true; and it also allows HyperStore Shell users to log in to nodes, if you enable the HyperStore
Shell (HSH).
If you set sshdconfig_disable_override to true before executing the HyperStore installation, then HyperStore
will not append to the existing /etc/ssh/sshd_config file on HyperStore host machines. In this case you will not be
able to use the fips_enabled setting or the HyperStore Shell.
For detail about when to make this edit within the context of the installation procedure, see "Installing a New
HyperStore System" in the Cloudian HyperStore Installation Guide.
Default = false
To apply a change to this setting on an already installed system (a change that should not be needed in typical
circumstances), on the Configuration Master node run this command:
Note hsctl is a new node management tool that remains mostly behind the scenes in HyperStore 7.5
but will be more prominent in future HyperStore releases. You can run the hsctl command above from
any directory.
hyperstore_data_directory
A quote-enclosed, comma-separated list of mount points to use for S3 object storage. In a production
environment, use dedicated disks for S3 object storage. Do not use the same disk(s) that are storing the OS and
Cassandra.
The system will automatically assign virtual nodes (vNodes) to each of your S3 data mount points, in a manner
that allocates an approximately equal total token range to each mount point. For more information about
HyperStore vNodes see "How vNodes Work" (page 55).
Do not use symbolic links when specifying your mount points for the hyperstore_data_directory setting. The
HyperStore system does not support symbolic links for these directories.
Default = Set during installation based on operator input, if host has multiple disks. For hosts with only one
disk, default is /var/lib/cloudian
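The balancing goal can be pictured with a simple round-robin sketch (a simplification: actual HyperStore allocation balances the total token range per mount point, not merely the token count):

```python
def assign_vnodes(tokens, mount_points):
    """Distribute vNode tokens across mount points round-robin, so each
    mount point receives an approximately equal share."""
    assignment = {m: [] for m in mount_points}
    for i, token in enumerate(sorted(tokens)):
        assignment[mount_points[i % len(mount_points)]].append(token)
    return assignment
```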
hyperstore_listen_ip
The IP interface on which each HyperStore Service node listens for data operations requests from clients. This
setting must match the cassandra_listen_address setting.
Specify this as an IP address alias. Puppet will use the alias to determine the actual IP address for each
node.
Default = %{::cloudian_ipaddress}
hyperstore_timeout
For the S3 Service’s connections to the HyperStore Service, the transaction completion timeout (session
timeout) in milliseconds.
Default = 10000
For a diagram showing the place of this timeout within the S3 request processing flow, see the description of
mts.properties.erb: "cassandra.cluster.CassandraThriftSocketTimeout" (page 465).
hyperstore_connection_timeout
For the S3 Service’s connections to the HyperStore Service, the connection establishment timeout in
milliseconds.
Default = 10000
For a diagram showing the place of this timeout within the S3 request processing flow, see the description of
mts.properties.erb: "cassandra.cluster.CassandraThriftSocketTimeout" (page 465).
hyperstore.maxthreads.repair
Maximum number of simultaneous client threads for one S3 Service node to use on HyperStore File System
data repairs automatically performed during read operations. For more information on the "repair on read"
mechanism see "Automated Data Repair Feature Overview" (page 252).
Default = 50
hyperstore_jetty_minThreads
Each HyperStore Service node maintains a thread pool to process incoming HTTP requests from clients (S3
Service nodes). Idle threads are terminated if not used within a timeout period — unless the number of threads
in the pool is down to the required minimum pool size, in which case the idle threads are kept.
The hyperstore_jetty_minThreads parameter sets the minimum number of threads to keep in the thread pool.
Default = 100
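The pool-sizing rule described above reduces to a one-line calculation (a simplified model of the thread pool's idle-timeout behavior, for illustration only):

```python
def pool_size_after_idle_reaping(current_threads, idle_threads, min_threads):
    """Idle threads are terminated on timeout, but the pool is never
    shrunk below the configured minimum."""
    return max(min_threads, current_threads - idle_threads)
```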
auto_repair_computedigest_run_number
This property configures the scheduled auto-repair feature such that every Nth run of hsstool repair (for
replicated object data) and hsstool repairec (for erasure coded object data) will use the "-computedigest" option
in order to detect and repair any data corruption on disk ("bit rot"). For example, if this property is set to 3, then
on each node every 3rd run of repair will use the "-computedigest" option and for each data center every 3rd
run of repairec will use the "-computedigest" option.
By default the auto-repair interval for repair is 30 days, and each individual node has its own every-30-days
repair schedule. So if for example you set auto_repair_computedigest_run_number to 3, then on a given node
the automatically triggered repair runs would be implemented like this:
By default the auto-repair interval for repairec is 29 days. With erasure coded data repair, running hsstool
repairec on any one node repairs all the erasure coded data in the local data center. Consequently the
auto-repair feature runs the command on just one randomly selected node in each data center every 29 days.
So if for example you set auto_repair_computedigest_run_number to 3, then for a given data center the
automatically triggered repairec runs would be implemented like this:
Default = 0
Note Because it entails recalculating a fresh MD5 hash of each replica or erasure coded fragment on
the target node, using "-computedigest" on repair runs is an expensive operation in terms of resource
utilization.
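The every-Nth-run rule described above can be sketched as follows; this is an illustrative model, not HyperStore's actual implementation, and it assumes that the default value of 0 disables "-computedigest" runs entirely:

```python
def uses_computedigest(run_count: int, run_number_setting: int) -> bool:
    # run_count: how many auto-repair runs have occurred so far on this node
    # run_number_setting: the auto_repair_computedigest_run_number value
    # Assumption: 0 (the default) means no run ever uses -computedigest.
    if run_number_setting <= 0:
        return False
    return run_count % run_number_setting == 0

# With the setting at 3, every 3rd run uses -computedigest:
flags = [uses_computedigest(n, 3) for n in range(1, 7)]
print(flags)  # → [False, False, True, False, False, True]
```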
hyperstore.maxthreads.write
Maximum number of simultaneous client threads for one S3 Service node to use on writes to the HyperStore
File System.
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
hyperstore.maxthreads.read
Maximum number of simultaneous client threads for one S3 Service node to use on reads of the HyperStore
File System.
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
hyperstore_maxperrouteconnections
The maximum allowed number of concurrently open connections between each S3 Service node and each
HyperStore Service node. This allows for limiting the traffic load between each front-end S3 Service node (as it
processes incoming requests from S3 clients) and any single HyperStore Service node.
Note that each of your S3 Service nodes has its own pool of connections to the HyperStore storage layer, so
the total possible connections from the S3 Service as a whole to a single HyperStore Service node would be
the number of S3 Service nodes multiplied by the value of "Max Connections from One S3 Service Node to
One HyperStore Service Node".
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
hyperstore_maxtotalconnections
The maximum allowed number of concurrently open connections between each S3 Service node and all
HyperStore Service nodes, combined. This allows for limiting the traffic load between each front-end S3 Ser-
vice node (as it processes incoming requests from S3 clients) and the whole back-end HyperStore storage
layer.
Note that each of your S3 Service nodes has its own pool of connections to the HyperStore storage layer, so
the total possible connections from the front-end S3 Service as a whole to the back-end HyperStore storage
layer as a whole would be the number of S3 Service nodes multiplied by the value of "Max Connections from
One S3 Service Node to All HyperStore Service Nodes".
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
hyperstore_jetty_maxThreads
Each HyperStore Service node maintains a thread pool to process incoming HTTP requests from clients (S3
Service nodes). When there is a request to be serviced, a free thread from the pool is used and then returned
to the pool afterward. If a thread is needed for a job but no thread is free, a new thread is created and added to
the pool — unless the maximum allowed number of threads in the pool has been reached, in which case
queued jobs must wait for a thread to become free.
The hyperstore_jetty_maxThreads parameter sets the maximum number of threads to allow in the thread pool.
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
hyperstore_messaging_service_threadpool
Maximum size of the thread pool used by the HyperStore inter-node messaging service. When there is a new
task and there are no idle threads available in the thread pool:
l If fewer than this many threads are in the thread pool, the thread pool executor will create a new thread.
l If this many or more threads are in the thread pool, the thread pool executor will queue the new task.
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
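The create-or-queue dispatch rule that this and the following thread pool settings share can be sketched as below; the function is an illustrative model of the described behavior, not HyperStore code:

```python
def dispatch(pool_size: int, idle_threads: int, max_pool_size: int) -> str:
    # Models the thread pool executor's decision when a new task arrives.
    if idle_threads > 0:
        return "run on idle thread"
    if pool_size < max_pool_size:
        return "create new thread"
    return "queue task"

print(dispatch(pool_size=4, idle_threads=0, max_pool_size=8))  # → create new thread
print(dispatch(pool_size=8, idle_threads=0, max_pool_size=8))  # → queue task
```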
hyperstore_repair_session_threadpool
Maximum size of the thread pool used for hsstool repair or hsstool repairec operations. When there is a new
task and there are no idle threads available in the thread pool:
l If fewer than this many threads are in the thread pool, the thread pool executor will create a new thread.
l If this many or more threads are in the thread pool, the thread pool executor will queue the new task.
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
hyperstore_repair_digest_index_threadpool
Maximum size of the thread pool used for reading file digests on a node in order to build the index required by
Merkle Tree based repair (the default hsstool repair type). When there is a new task and there are no idle
threads available in the thread pool:
l If fewer than this many threads are in the thread pool, the thread pool executor will create a new thread.
l If this many or more threads are in the thread pool, the thread pool executor will queue the new task.
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
hyperstore_rangerepair_threadpool
Maximum size of the thread pool used for running multiple range repair tasks in parallel during hsstool repair.
When there is a new task and there are no idle threads available in the thread pool:
l If fewer than this many threads are in the thread pool, the thread pool executor will create a new thread.
l If this many or more threads are in the thread pool, the thread pool executor will queue the new task.
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
hyperstore_stream_outbound_threadpool
Maximum size of the thread pool used for streaming files from one HyperStore node to another during Merkle
Tree based hsstool repair. When there is a new task and there are no idle threads available in the thread
pool:
l If fewer than this many threads are in the thread pool, the thread pool executor will create a new thread.
l If this many or more threads are in the thread pool, the thread pool executor will queue the new task.
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
hyperstore_downloadrange_session_threadpool
Maximum size of the thread pool used for range download sessions conducted by a HyperStore Service node.
When there is a new task and there are no idle threads available in the thread pool:
l If fewer than this many threads are in the thread pool, the thread pool executor will create a new thread.
l If this many or more threads are in the thread pool, the thread pool executor will queue the new task.
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
hyperstore_uploadrange_session_threadpool
Maximum size of the thread pool used for range upload sessions conducted by a HyperStore Service node.
When there is a new task and there are no idle threads available in the thread pool:
l If fewer than this many threads are in the thread pool, the thread pool executor will create a new thread.
l If this many or more threads are in the thread pool, the thread pool executor will queue the new task.
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
hyperstore_decommission_threadpool
Maximum size of the thread pool used for uploading files away from a node that is being decommissioned.
When there is a new task and there are no idle threads available in the thread pool:
l If fewer than this many threads are in the thread pool, the thread pool executor will create a new thread.
l If this many or more threads are in the thread pool, the thread pool executor will queue the new task.
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
hyperstore_cleanup_session_threadpool
This thread pool size limit determines the maximum number of blobs (object replicas or erasure coded frag-
ments) to process in parallel within each cleanup "job" taking place on a node. Processing a blob entails check-
ing the blob’s corresponding object metadata to determine whether the blob is supposed to be where it is or
rather should be deleted.
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
hyperstore_auto_repair_threadpool
Maximum size of the thread pool used by the HyperStore auto-repair feature. When there is an auto-repair to
kick off and there are no idle threads available in the thread pool:
l If fewer than this many threads are in the thread pool, the thread pool executor will create a new thread.
l If this many or more threads are in the thread pool, the thread pool executor will queue the new task.
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
Note For more information about the auto-repair feature see "Automated Data Repair Feature Over-
view" (page 252).
hyperstore_repairec_sessionscan_threadpool
During an hsstool repairec operation, the threads in this thread pool are tasked with identifying erasure coded
objects in the system and batching them for evaluation. This parameter sets the maximum size of the thread
pool per node.
Default = 50
Reloadable via JMX (HyperStore Service’s JMX port 19082; MBean attribute = com.gem-
ini.cloudian.hybrid.server → FileRepairService → Attributes → RepairECSes-
sionScanThreadPoolCorePoolSize)
hyperstore_repairec_digestrequest_threadpool
During an hsstool repairec operation, the threads in this thread pool are tasked with reading the digests asso-
ciated with batches of erasure coded objects, to determine whether any of those objects need repair. This
parameter sets the maximum size of the thread pool per node.
Default = 30
Reloadable via JMX (HyperStore Service’s JMX port 19082; MBean attribute = com.gem-
ini.cloudian.hybrid.server → FileRepairService → Attributes → RepairECDi-
gestRequestThreadPoolFixedPoolSize)
hyperstore_repairec_task_threadpool
During an hsstool repairec operation, the threads in this thread pool are tasked with repairing erasure coded
objects that have been determined to need repair. This parameter sets the maximum size of the thread pool
per node.
Default = 60
Reloadable via JMX (HyperStore Service’s JMX port 19082; MBean attribute = com.gem-
ini.cloudian.hybrid.server → FileRepairService → Attributes → RepairECTaskThreadPoolCorePoolSize)
hyperstore_repairec_rocksdbscan_threadpool
During an hsstool repairec operation, the threads in this thread pool enable concurrent reads of digest data in
each RocksDB instance, as digest read requests come in from multiple digest request threads on multiple
nodes. This parameter sets the maximum size of the thread pool per RocksDB instance.
Default = 30
Reloadable via JMX (HyperStore Service’s JMX port 19082; MBean attribute = com.gem-
ini.cloudian.hybrid.server → FileRepairService → Attributes → RepairECRock-
sDBScanThreadPoolFixedPoolSize)
hyperstore_disk_check_interval
The interval (in minutes) at which each HyperStore Service node will run a check to see if there is significant
imbalance in disk usage on the node. If an imbalance is found in excess of that configured by disk.bal-
ance.delta, then one or more tokens are automatically moved from the over-used disk(s) to one or more less-
used disk(s) on the same host. For more information see "Automated Disk Management Feature Overview"
(page 281).
Note This feature applies only to HyperStore data disks (on which are stored S3 object data). It does
not apply to disks that are storing only the OS and Cassandra.
phonehome_proxy_host
If you want HyperStore to use a local forward proxy when the Smart Support (or "Phone Home") feature sends
daily system diagnostics packages to Cloudian Support, use this setting to specify the hostname or IP address
of the proxy.
Default = empty
Note For this feature you should configure your forward proxy to support access to *.s3-sup-
port.cloudian.com (that is, to any sub-domain of s3-support.cloudian.com).
Note By default any proxy settings that you configure for the daily Smart Support uploads will also
apply to the sending of on-demand Node Diagnostics packages (triggered by your using the CMC's
Collect Diagnostics function [Cluster -> Nodes -> Advanced]). If you want to use a different proxy for
Node Diagnostics sending than you do for the daily Smart Support upload, use the sysinfo.proxy.* set-
tings in mts.properties.erb to separately configure proxy information for Node Diagnostics sending.
For more background information on these features see "Smart Support and Diagnostics Feature
Overview" (page 81).
phonehome_proxy_port
If you want HyperStore to use a local forward proxy when the Smart Support (or "Phone Home") feature sends
daily system diagnostics packages to Cloudian Support, use this setting to specify the proxy’s port number.
Default = empty
phonehome_proxy_username
If you want HyperStore to use a local forward proxy when the Smart Support (or "Phone Home") feature sends
daily system diagnostics packages to Cloudian Support, use this setting to specify the username that Hyper-
Store should use when connecting to the proxy (if a username and password are required by the proxy).
Default = empty
phonehome_proxy_password
If you want HyperStore to use a local forward proxy when the Smart Support (or "Phone Home") feature sends
daily system diagnostics packages to Cloudian Support, use this setting to specify the password that HyperStore should use when connecting to the proxy (if a username and password are required by the proxy).
Default = empty
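Taken together, the four phonehome_proxy_* settings might look like the following in common.csv (in the file's "setting,value" format). The proxy hostname, port, and credentials below are placeholders for illustration only:

```
phonehome_proxy_host,proxy.example.com
phonehome_proxy_port,3128
phonehome_proxy_username,hyperstore
phonehome_proxy_password,s3cret
```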
phonehome_uri
S3 URI to which to upload system-wide diagnostics data each day. By default this is the S3 URI for Cloudian
Support.
If you set this to a different S3 destination, include the HTTP or HTTPS protocol part of the URI (http:// or
https://).
Default = https://s3-support.cloudian.com:443
Note If you set phonehome_uri to a URI for your own HyperStore S3 Service (rather than the Cloudian
Support URI), and if your S3 Service is using HTTPS, then your S3 Service’s SSL certificate must be a
CA-verified certificate — not a self-signed certificate. By default the phone home function cannot
upload to an HTTPS URI that’s using a self-signed certificate. If you require that the upload go to an
HTTPS URI that’s using a self-signed certificate, contact Cloudian Support for guidance on modifying
the phone home launch script. For information on HTTPS set-up for the S3 Service, see "Setting Up
Security and Privacy Features" (page 104).
phonehome_bucket
l If you leave phonehome_uri at its default value -- which is the Cloudian Support S3 URI -- you can leave
the phonehome_bucket, phonehome_access_key, and phonehome_secret_key properties empty. The
Smart Support feature will automatically extract the Cloudian Support S3 bucket name and security cre-
dentials from your encrypted HyperStore license file.
l If you set phonehome_uri to an S3 URI other than the Cloudian Support URI, set the phonehome_
bucket, phonehome_access_key, and phonehome_secret_key properties to the destination bucket
name and the applicable S3 access key and secret key.
Default = empty
phonehome_access_key
See phonehome_bucket.
Default = empty
phonehome_secret_key
See phonehome_bucket.
Default = empty
phonehome_gdpr
For description of how to use this setting, see "Encrypting Sensitive Fields in Uploaded Log File Copies to
Protect Data Privacy" (page 84).
Default = false
phonehome_gdpr_bucket
For description of how to use this setting, see "Encrypting Sensitive Fields in Uploaded Log File Copies to
Protect Data Privacy" (page 84).
Default = false
admin_auth_user
If the Admin Service is configured to require HTTP(S) Basic Authentication from clients (if admin_auth_
enabled is set to true), this is the username for clients to use when submitting HTTP(S) requests to the Admin
Service.
Default = sysadmin
admin_auth_pass
If the Admin Service is configured to require HTTP(S) Basic Authentication from clients (if admin_auth_
enabled is set to true), this is the password for clients to use when submitting HTTP(S) requests to the Admin
Service. In this setting the password is configured as a comma-separated pair of "<Jetty_obfuscated_pass-
word>,<cleartext_password>".
For information about creating a Jetty-obfuscated password, see the Introduction section in the Cloudian
HyperStore Admin API Reference.
Default if original HyperStore install was version 7.2.2 or newer = "<obfuscated>,<cleartext>" of a random pass-
word generated by the system upon installation.
Default if original HyperStore install was older than version 7.2.2 = "1uvg1x1n1tv91tvt1x0z1uuq,public"
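As a sketch of what a client does with these credentials: HTTP Basic Authentication sends "username:password" base64-encoded in the Authorization header. The example below uses the legacy default credentials mentioned above; substitute your own admin_auth_user value and the cleartext half of admin_auth_pass:

```python
import base64

# Legacy default credentials, shown for illustration only.
user, password = "sysadmin", "public"
token = base64.b64encode(f"{user}:{password}".encode()).decode()
auth_header = f"Basic {token}"
print(auth_header)  # value to send as the Authorization header on Admin Service requests
```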
admin_auth_realm
If the Admin Service is configured to require HTTP(S) Basic Authentication from clients (if admin_auth_
enabled is set to true), this is the realm name used by the Admin Service for Basic Authentication purposes.
Default = CloudianAdmin
admin_auth_enabled
Whether to have the Admin Service require HTTP(S) Basic Authentication from connecting clients. Set to true to
have the Admin Service require Basic Authentication, or false to not require it.
Default if original HyperStore install was older than version 6.0.2 = false
admin_secure
If set to "true", the Admin Service will accept only HTTPS connections from clients (through port 19443 by
default). If set to "false", the Admin Service will allow regular HTTP connections from clients (through port
18080) as well as HTTPS connections (through port 19443).
This setting also controls CMC client-side behavior when the CMC calls the Admin Service for tasks such as
creating users, creating storage policies, and retrieving system monitoring data. If admin_secure is "true", the
CMC will exclusively use HTTPS when making requests to the Admin Service. If admin_secure is "false", the
CMC will exclusively use regular HTTP when making requests to the Admin Service.
Note however that even if you set admin_secure to "false" -- so that the Admin Service accepts HTTP requests
as well as HTTPS requests; and so that the CMC sends only HTTP requests to the Admin Service -- the Admin
Service's HTTPS port will still be accessed by other HyperStore system components. In particular, some of the
"System cron Jobs" (page 274) use HTTPS to make calls to the Admin Service.
Default = true
Note If your original HyperStore install was older than version 6.0.2 and you have upgraded to the cur-
rent version, the admin_secure setting does not appear in the common.csv file and an internal default
value of "false" is used. In such systems, if you want the Admin Service to accept only HTTPS con-
nections from clients, add the line admin_secure,true to the common.csv file.
cmc_admin_secure_port
If the CMC is using HTTPS to connect to the Admin Service (as it will if admin_secure is set to "true"), this is the
Admin Service listening port number to which it will connect. Note that this setting controls CMC client-side con-
figuration, not Admin Service configuration.
Default = 19443
user_password_min_length
For users' CMC passwords, the minimum required length in number of characters. The system will reject a
user's attempt to set a new password that does not meet this requirement.
Default = 9
To apply change, after Puppet propagation restart the S3 Service and CMC.
user_password_dup_char_ratio_limit
When a CMC user creates a new password, no more than this percentage of characters in the new password
can be characters that are in the user's current password. The system will reject a user's attempt to set a new
password that does not meet this requirement.
If this is set to 0, then none of the characters in a user's new password can be characters that are in the user's
current password.
Default = -1
To apply change, after Puppet propagation restart the S3 Service and CMC.
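The percentage rule can be sketched as follows; this is an illustrative model, not HyperStore's implementation, and it assumes that the default of -1 disables the check entirely:

```python
def passes_dup_char_check(new_pw: str, old_pw: str, ratio_limit: int) -> bool:
    # Assumption: a negative ratio_limit (the default is -1) disables the check.
    if ratio_limit < 0:
        return True
    old_chars = set(old_pw)
    shared = sum(1 for c in new_pw if c in old_chars)
    return (shared / len(new_pw)) * 100 <= ratio_limit

# 2 of the 8 new-password characters ('1' and '2') appear in the old password,
# which is 25%, so the check passes with a 25% limit:
print(passes_dup_char_check("zyxwvu12", "abcdef12", 25))  # → True
```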
user_password_unique_generations
When a CMC user creates a new password, the new password cannot be the same as one of the user's past
passwords, going back this many passwords into the user's password history. The system will reject a user's
attempt to set a new password that does not meet this requirement.
For example if you set this to 10, then a user's new password cannot match against any of the user's past 10
passwords.
Default = 0
To apply change, after Puppet propagation restart the S3 Service and CMC.
user_password_rotation_graceperiod
When a CMC user creates a new password, they must wait at least this many days before replacing that pass-
word with another new password.
Default = 0
To apply change, after Puppet propagation restart the S3 Service and CMC.
user_password_rotation_expiration
After a CMC user has had the same password for this many days, the password expires and the CMC will
require the user to create a new password before they can log in again. This will be enforced by the CMC's
login function.
Note that if you change this setting, the lifespan of each user's existing password is considered to have started
when the password was created. For example, if you change this setting's value from 0 to 90, then all users
whose existing passwords are already older than 90 days old will be required to change their passwords the
next time that they log into the CMC.
Default = 0
To apply change, after Puppet propagation restart the S3 Service and CMC.
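The expiry check amounts to simple date arithmetic; a sketch (not HyperStore code), assuming that the default of 0 means passwords never expire:

```python
from datetime import datetime, timedelta

def password_expired(created: datetime, now: datetime, expiration_days: int) -> bool:
    # Assumption: 0 (the default) means passwords never expire.
    if expiration_days <= 0:
        return False
    return now - created >= timedelta(days=expiration_days)

created = datetime(2023, 1, 1)
print(password_expired(created, datetime(2023, 5, 1), 90))  # 120 days old → True
print(password_expired(created, datetime(2023, 2, 1), 90))  # 31 days old → False
```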
user_password_lock_enabled
This setting controls whether or not the CMC password lock feature is enabled:
l If true, the CMC password lock feature is enabled and its behavior is configured by the user_password_
lock_durationsec and user_password_lock_maxfailedattempts settings.
l If false, the CMC password lock feature is disabled.
Note The CMC password lock feature applies only to users for which local authentication
(based on a user ID and password defined within HyperStore) is performed, not users whose
CMC login attempts are authenticated by reference to an LDAP system. For users authenticated
by an LDAP system you can configure a password lockout policy within the LDAP system if
desired.
Default = true for fresh installations of HyperStore 7.3 or later; false for systems upgraded to HyperStore 7.3
from an earlier HyperStore version.
To apply change, after Puppet propagation restart the S3 Service and CMC.
user_password_lock_durationsec
Applicable only if user_password_lock_enabled is set to true.
If a user makes too many consecutive failed CMC login attempts (as configured by user_password_lock_maxfailedattempts), the user is locked out of the CMC for this many seconds. With the default of 1800 seconds, the lock-out period is 30 minutes:
l If the user tries again to log in with an incorrect password, this restarts the 30 minute lock-out period.
l If the user does not try again to log in with an incorrect password, the lock-out period automatically
expires at the end of the 30 minutes, and the user is then allowed to log in if they supply the correct
password.
l Optionally, the system administrator can intervene to release the lock on a user -- before the lock
reaches its automatic expiration -- by using the CMC (Users & Groups -> Manage Users; while on that
page click Help for details) or the Admin API (see the "user" section of the Cloudian HyperStore Admin
API Reference).
Default = 1800
To apply change, after Puppet propagation restart the S3 Service and CMC.
Note If a system administrator becomes locked out of the CMC -- due to too many login attempts with
an incorrect password -- the lock on that administrator can be released by a different system administrator using the CMC, or by the Admin API method mentioned above. Alternatively, as with any type of
locked out user the lock-out will release automatically after the configurable time interval.
user_password_lock_maxfailedattempts
Applicable only if user_password_lock_enabled is set to true.
The number of consecutive failed CMC login attempts after which a user is locked out of the CMC for the period configured by user_password_lock_durationsec.
Default = 6
To apply change, after Puppet propagation restart the S3 Service and CMC.
iam_service_enabled
If this is set to "true" then HyperStore's IAM Service is enabled and IAM functionality will display in the CMC. For
more information about the IAM API see the IAM section of the Cloudian HyperStore AWS APIs Support
Reference. For more information about IAM in the CMC, while on any of the pages in the CMC's IAM section
click Help.
Default = true
iam_port
Port on which the HyperStore IAM Service listens for regular HTTP connections.
Default = 16080
iam_secure
If set to "true", the IAM Service will accept only HTTPS connections from clients (through port 16443 by default).
If set to "false", the IAM Service will allow regular HTTP connections from clients (through port 16080) as well
as HTTPS connections (through port 16443).
This setting also controls CMC client-side behavior when the CMC calls the IAM Service for tasks such as creating IAM users or creating IAM policies. If iam_secure is "true", the CMC will exclusively use HTTPS when making
requests to the IAM Service. If iam_secure is "false", the CMC will exclusively use regular HTTP when making
requests to the IAM Service.
Default = false
To apply change, after Puppet propagation restart the IAM Service and the CMC.
iam_secure_port
Port on which the HyperStore IAM Service listens for HTTPS connections.
Default = 16443
iam_service_endpoint
IAM Service endpoint. This setting is controlled by the installer. Do not edit this setting directly. For instructions
on changing the IAM Service endpoint, see "Changing S3, Admin, CMC, or IAM Service Endpoints" (page
513).
iam_max_groups
The maximum number of IAM groups allowed per Cloudian account root.
Default = 300
Note The maximum number of IAM users per Cloudian account root is 5000. This limit is not con-
figurable.
iam_max_groups_per_user
The maximum number of IAM groups that an IAM user can belong to at a time. If an IAM user belongs to this
many groups, the IAM Service will not allow her to be joined to any more groups.
IMPORTANT ! The more IAM groups an IAM user belongs to, the more complex and time-consuming
will be the operation to assess the various IAM policies that apply to the user when the user submits an
S3 request (to determine whether the user has permission for that request). Therefore, exercise caution
in raising the value of this setting.
Default = 10
mfa_totp_issuer
Users can enable multi-factor authentication (MFA) on their CMC login accounts (see "Multi-Factor Authentic-
ation" (page 123)). The mfa_totp_issuer setting specifies the MFA TOTP (time-based one-time password)
Issuer name that will display in a user's virtual MFA device if the user enables MFA on their CMC account. For example, in the virtual MFA device Google Authenticator the Issuer displays with its default value "Cloudian".
Default = Cloudian
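For context, a TOTP virtual MFA device is provisioned with an otpauth:// URI (the Google Authenticator key URI convention), and the issuer= parameter in that URI is what mfa_totp_issuer controls. The sketch below generates standard RFC 6238 TOTP codes; the secret is the RFC's test secret and the account name is a placeholder, not HyperStore-specific values:

```python
import base64, hashlib, hmac, struct
from urllib.parse import quote

def totp(secret_b32: str, unix_time: int, digits: int = 6, step: int = 30) -> str:
    # Standard RFC 6238 TOTP (HMAC-SHA-1), shown for illustration.
    key = base64.b32decode(secret_b32, casefold=True)
    msg = struct.pack(">Q", unix_time // step)
    digest = hmac.new(key, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# Hypothetical provisioning URI; issuer= is what mfa_totp_issuer sets.
issuer, account = "Cloudian", "user@example.com"
secret = "GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ"  # RFC 6238 test secret, base32
uri = f"otpauth://totp/{quote(issuer)}:{quote(account)}?secret={secret}&issuer={quote(issuer)}"
print(uri)
print(totp(secret, 59))  # RFC 6238 test vector time → "287082"
```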
cloudian_s3admin_min_threads
Minimum number of threads to keep in each Admin Service node’s HTTP request processing thread pool.
The Admin Service uses a thread pool to process incoming HTTP requests from clients. An initial pool of
threads is created at server initialization time, and additional threads may be added if needed to process
queued jobs. Idle threads are terminated if not used within a thread timeout period — unless the number of
threads in the pool is down to cloudian_s3admin_min_threads. If only this many threads are in the pool, then
threads are kept open even if they’ve been idle for longer than the thread timeout.
Default = 10
cloudian_s3admin_max_threads
Maximum number of threads to allow in the Admin Service’s HTTP request processing thread pool. If there are
fewer than this many threads in the pool, new threads may be created as needed in order to handle queued
HTTP request processing jobs. If the maximum thread pool size is reached, no more threads will be created —
instead, queued HTTP request processing jobs must wait for an existing thread to become free.
Default = 50
cloudian_s3admin_max_idletime
When the Admin Service processes HTTP requests from clients, the maximum allowed connection idle time in
milliseconds. If this much idle time passes before a new request is received on an open connection with a cli-
ent, or if this much idle time passes during the reading of headers and content for a request, or if this much idle
time passes during the writing of headers and content of a response, the connection is closed.
Default = 60000
cloudian_s3admin_lowres_maxidletime
Special, "low resource" maximum idle time to apply to Admin Service HTTP connections when the number of
simultaneous connections to an Admin Service node exceeds cloudian_s3admin_lowres_maxconnections.
Configured in milliseconds. With this setting, you can have the Admin Service be less tolerant of connection
idle time during times of high concurrent usage. (For general idle timer behavior, see the description of cloud-
ian_s3admin_max_idletime above.)
Default = 5000
cloudian_s3admin_lowres_maxconnections
If the number of simultaneous HTTP connections to an Admin Service node exceeds this value, the special idle
timer configured by cloudian_s3admin_lowres_maxidletime is applied to that node rather than the usual cloud-
ian_s3admin_max_idletime.
Default = 1000
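The timer selection that these two settings define, together with cloudian_s3admin_max_idletime, can be sketched as follows using the documented defaults (an illustrative model, not HyperStore code):

```python
def effective_idle_timeout(open_connections: int,
                           max_idletime: int = 60000,
                           lowres_maxidletime: int = 5000,
                           lowres_maxconnections: int = 1000) -> int:
    # Defaults above match the documented Admin Service defaults.
    # Under heavy concurrent usage the stricter "low resource" timer applies.
    if open_connections > lowres_maxconnections:
        return lowres_maxidletime
    return max_idletime

print(effective_idle_timeout(500))   # → 60000 (normal idle timer)
print(effective_idle_timeout(1500))  # → 5000 (low-resource idle timer)
```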
cloudian_s3_max_threads
Maximum number of threads to allow in the S3 Service’s HTTP request processing thread pool. If there are
fewer than this many threads in the pool, new threads may be created as needed in order to handle queued
HTTP request processing jobs. If the maximum thread pool size is reached, no more threads will be created —
instead, queued HTTP request processing jobs must wait for an existing thread to become free.
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
cloudian_s3_max_idletime
When the S3 Service processes HTTP requests from clients, the maximum allowed connection idle time in mil-
liseconds. If this much idle time passes before a new request is received on an open connection with a client,
or if this much idle time passes during the reading of headers and content for a request, or if this much idle time
passes during the writing of headers and content of a response, the connection is closed.
Default = 60000
cloudian_s3_lowres_maxidletime
Special, low resource maximum idle timer to apply to S3 Service HTTP connections when the number of sim-
ultaneous connections to an S3 Service node exceeds cloudian_s3_lowres_maxconnections. Configured in
milliseconds.
Default = 5000
cloudian_s3_lowres_maxconnections
If the number of simultaneous HTTP connections to an S3 Service node exceeds this value, the special idle
timer configured by cloudian_s3_lowres_maxidletime is applied to that node.
Default = 2000
cloudian_tiering_useragent
User agent string used by HyperStore when it acts as an S3 client for auto-tiering to an external S3 system.
7.2. Configuration Settings
s3_proxy_protocol_enabled
If you are using HAProxy as the load balancer in front of your S3 Service -- or a different load balancer that
supports the PROXY Protocol -- you can set s3_proxy_protocol_enabled to true if you want the S3 Service to
support the PROXY Protocol. If you do so, the following will happen:
l The S3 Server will create dedicated PROXY Protocol connectors listening on port 81 (for regular PROXY Protocol) and port 4431 (for PROXY Protocol with SSL/TLS). These connectors will be enabled and configured in s3.xml.erb. By default these connectors are disabled in that template.
l If you configure your load balancer to use the PROXY Protocol for communicating with the S3 Service,
the load balancer when relaying each S3 request to the S3 Service will pass along the originating cli-
ent's IP address.
Note For guidance on load balancer configuration consult with your Cloudian Sales Engin-
eering or Professional Services representative.
l In the S3 request log, the S3 request entries will then show the true client IP address as the source address, rather than showing the load balancer's IP address as the source. Also, the true client IP address for each request will be available to support S3 bucket policies that filter based on source IP address, or billing rating plans that "allowlist" certain source IP addresses.
Setting s3_proxy_protocol_enabled to true is appropriate if you're using HAProxy for load balancing -- or a dif-
ferent load balancer that supports the PROXY Protocol -- and you want S3 Service request logging to show the
true origin address associated with S3 requests; and/or you want to implement bucket policies or rating plans
that are responsive to the origin address. If you're using a load balancer that supports PROXY Protocol, using
this protocol is the preferred method for providing the originating client IP address to the S3 layer, rather than
using the X-Forwarded-for header.
Note If you intend to use the PROXY Protocol with TLS/SSL (on S3 Service listening port 4431) you
must set up TLS/SSL for the S3 Service, if you have not already done so. For instructions see "Setting
Up Security and Privacy Features" (page 104).
If s3_proxy_protocol_enabled is set to true then when you configure TLS/SSL for the S3 Service your
configuration information will be applied to PROXY Protocol port 4431 as well as to the regular S3
HTTPS port 443.
Default = false
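As an illustration, an HAProxy backend that speaks the PROXY Protocol to the S3 Service's dedicated connector might look like the following sketch. The backend name, server names, and IP addresses are hypothetical placeholders; consult your Cloudian Sales Engineering or Professional Services representative before changing your load balancer configuration.

```
# haproxy.cfg fragment (illustrative only; names and addresses are placeholders)
backend s3_proxy_protocol
    balance roundrobin
    # "send-proxy" prepends the PROXY Protocol header, which carries the
    # originating client IP, on each connection to the port 81 connector
    server hyperstore1 10.0.0.11:81 check send-proxy
    server hyperstore2 10.0.0.12:81 check send-proxy
```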
cloudian_s3_heap_limit
Maximum heap size limit for the S3 Service application. The JAVA_OPTS -Xmx value passed to the JVM will
be the lower of this value and the percent-of-system-RAM value set by cloudian_s3_max_heap_percent.
Default = 30g
cloudian_s3_max_heap_percent
Maximum heap size for the S3 Service application, as a percentage of host system RAM. The JAVA_OPTS -
Xmx value passed to the JVM will be the lower of this value and the limiting value set by cloudian_s3_heap_
limit.
Default = 15
cloudian_hss_heap_limit
Maximum heap size limit for the HyperStore Service application. The JAVA_OPTS -Xmx value passed to the JVM will be the lower of this value and the percent-of-system-RAM value set by cloudian_hss_max_heap_percent.
Default = 30g
cloudian_hss_max_heap_percent
Maximum heap size for the HyperStore Service application, as a percentage of host system RAM. The JAVA_
OPTS -Xmx value passed to the JVM will be the lower of this value and the limiting value set by cloudian_hss_
heap_limit.
Default = 15
cloudian_admin_heap_limit
Maximum heap size limit for the Admin Service application. The JAVA_OPTS -Xmx value passed to the JVM will be the lower of this value and the percent-of-system-RAM value set by cloudian_admin_max_heap_percent.
Default = 16g
cloudian_admin_max_heap_percent
Maximum heap size for the Admin Service application, as a percentage of host system RAM. The JAVA_OPTS
-Xmx value passed to the JVM will be the lower of this value and the limiting value set by cloudian_admin_
heap_limit.
Default = 5
cloudian_s3_init_heap_percent
Initial heap size (JAVA_OPTS -Xms value) for the S3 Service application, as a percentage of the -Xmx value.
Default = 25
cloudian_hss_init_heap_percent
Initial heap size (JAVA_OPTS -Xms value) for the HyperStore Service application, as a percentage of the -Xmx
value.
Default = 25
cloudian_admin_init_heap_percent
Initial heap size (JAVA_OPTS -Xms value) for the Admin Service application, as a percentage of the -Xmx
value.
Default = 25
cloudian_s3_new_heap_percent
New heap size (JAVA_OPTS -Xmn value) for the S3 Service application, as a percentage of the -Xmx value.
This is the heap size specifically for "young generation" objects.
Default = 25
cloudian_hss_new_heap_percent
New heap size (JAVA_OPTS -Xmn value) for the HyperStore Service application, as a percentage of the -Xmx
value. This is the heap size specifically for "young generation" objects.
Default = 25
cloudian_admin_new_heap_percent
New heap size (JAVA_OPTS -Xmn value) for the Admin Service application, as a percentage of the -Xmx
value. This is the heap size specifically for "young generation" objects.
Default = 25
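Taken together, the heap settings above determine the JAVA_OPTS values roughly as sketched below. This is a simplified illustration of the documented rules, not HyperStore code, using the S3 Service defaults and an assumed host with 128 GiB of RAM:

```python
def jvm_heap_opts(ram_gb, heap_limit_gb, max_heap_pct, init_pct, new_pct):
    """Sketch of how the heap settings combine into JAVA_OPTS values:
    -Xmx is the lower of the absolute limit and the percent-of-RAM value;
    -Xms and -Xmn are percentages of the resulting -Xmx."""
    xmx = min(heap_limit_gb, ram_gb * max_heap_pct / 100)
    xms = xmx * init_pct / 100
    xmn = xmx * new_pct / 100
    return xmx, xms, xmn

# S3 Service defaults on a 128 GiB host: limit 30g, 15% of RAM, 25% init/new
xmx, xms, xmn = jvm_heap_opts(128, 30, 15, 25, 25)
print(xmx, xms, xmn)  # → 19.2 4.8 4.8  (15% of 128 GiB is below the 30g limit)
```

On a much larger host the absolute limit wins instead: with 1 TiB of RAM, 15% would be 153.6 GiB, so -Xmx is capped at the 30g value of cloudian_s3_heap_limit.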
cloudian_heapdump_on_out_of_memory
Whether to enable Java heap dump on an S3 Service or HyperStore Service or Monitoring Data Collector out-of-memory error. Dump paths are /var/log/cloudian/s3.hprof, /var/log/cloudian/hss.hprof, and /var/log/cloudian/datacollector.hprof, respectively. The dump file is in HPROF binary format.
Default = true
To apply change, after Puppet propagation restart S3 Service and HyperStore Service
cassandras_per_s3node
Maximum number of Cassandra nodes to which an individual S3 Service node may keep simultaneous live
connections.
Default = 9
cassandra_max_active
The maximum allowed number of simultaneously active connections in a Cassandra connection pool. If this
limit has been reached and a thread requires a new connection to Cassandra, the thread will wait for a period
configured by cassandra.cluster.MaxWaitTimeWhenExhausted in mts.properties.erb (default 9 seconds).
If this is set to a negative value (e.g. -1), the limit on active connections is disabled.
This setting is controlled by a HyperStore performance optimization script that takes the local hardware envir-
onment into account. Do not manually edit this setting -- your edits will not be applied by the system.
cloudian_s3_aes256encryption_enabled
This setting is for enabling AES-256 in HyperStore, for use with server-side encryption. If AES-256 is not
enabled, AES-128 is used instead.
Default = false
cloudian_s3_autoinvalidateiamcache
When set to true, each region's cache of IAM user information (including IAM group membership and IAM policy-based permissions) is automatically cleared out every six hours, in addition to the cache data being updated whenever there is a change to IAM user information.
When set to false, each region's cache of IAM user information is not automatically cleared out every six hours,
and instead the cache data is only updated whenever there is a change to IAM user information. This is the
preferable setting if you have a multi-region HyperStore system with high latency between regions. The false
setting reduces the need for the HyperStore S3 Service in non-default regions to make calls to the default
region to retrieve IAM information when IAM users submit S3 requests to the non-default regions.
Default = true
To apply change, after Puppet propagation restart S3 Service and IAM Service
s3_perbucket_qos_enabled
This setting enables/disables the keeping of per-bucket stored-bytes counts (the number of net bytes -- exclud-
ing replication or erasure coding overhead -- stored in each bucket) and stored-objects counts (the number of
objects in each bucket) in the system.
l The system will keep track of stored-bytes and stored-objects counts for each bucket. These counts will
be kept in the QoS DB.
Note In some atypical circumstances the keeping of these per-bucket stored-bytes and stored-objects counts, which are updated after every S3 transaction, may cause a minor-to-moderate decline in S3 write performance. Example circumstances would be an exceptionally large number of buckets in your HyperStore system, or a multi-data center HyperStore deployment with higher-than-usual latency between the data centers. If you want to enable per-bucket stored-bytes and stored-objects counters in your system but are concerned about potential performance impacts, consult with Cloudian Support.
l You will be able to query the current stored-bytes and stored-objects counts for individual buckets, by
using the Admin API calls GET /system/bytecount and GET /system/objectcount. For more information
on these Admin API calls see 11.11 system.
l If you use Cloudian's monitoring and visualization product HyperIQ, HyperIQ will be able to report
stored-bytes and stored-objects counts for individual buckets.
l In the CMC, the object listing page for a bucket will display the bucket's current stored-bytes and
stored-objects counts.
IMPORTANT ! After setting s3_perbucket_qos_enabled to true, pushing the change out to the cluster,
and restarting the S3 Service, execute the Admin API call POST /usage/repair?groupId=ALL to bring
the counters up to date for buckets that already have objects in them. For more information about this
Admin API call see 11.13 usage.
Subsequently, the counters will automatically be updated for each bucket each time there is an S3
transaction that impacts the byte count or object count for the bucket.
If you leave s3_perbucket_qos_enabled at its default value false, then per-bucket stored-bytes and stored-
objects counts will not be maintained in the QoS DB; you will not be able to query per-bucket counts through
the Admin API calls referenced above; HyperIQ will not be able to report on per-bucket stored-bytes and
stored-objects counts; and the CMC's object listing page for a bucket will not show stored-bytes and stored-
objects counts.
Default = false
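After enabling this setting, a hypothetical Admin API session might look like the following fragment. The host, port, and credentials shown are placeholders, and the exact request parameters for these calls are documented in the "usage" and "system" sections of the Cloudian HyperStore Admin API Reference.

```
# Illustrative only -- <admin-host>, <admin-port>, and the credentials
# are placeholders for your actual Admin Service endpoint and login.

# Bring per-bucket counters up to date after enabling the setting:
curl -X POST -k -u <admin-user>:<password> \
    "https://<admin-host>:<admin-port>/usage/repair?groupId=ALL"

# Then query per-bucket stored-bytes counts (see the Admin API reference
# for the required query parameters):
curl -k -u <admin-user>:<password> \
    "https://<admin-host>:<admin-port>/system/bytecount"
```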
redis_credentials_master_port
Port on which the Redis Credentials master node listens for data storage requests from clients. The Redis Cre-
dentials slave nodes will listen on this port as well.
Default = 6379
To apply change, after Puppet propagation restart Redis Credentials, S3 Service, and HyperStore Service
redis_monitor_subscription_check
This setting controls certain aspects of HyperStore services start-up behavior, including during a HyperStore
version upgrade operation. Leave this setting at its default value.
Default = false
redis_qos_master_port
Port on which the Redis QoS master node listens for data storage requests from clients. The Redis QoS slave
nodes will listen on this port as well.
Default = 6380
To apply change, after Puppet propagation restart Redis QoS, S3 Service, and HyperStore Service
redis_monitor_listener_port
Port on which the Redis Monitor can be queried for information about the Redis cluster state, via the Redis
Monitor CLI.
Default = 9078
redis_lib_directory
Directory in which to store Redis data files.
Default = /var/lib/redis
If you want to change this for a HyperStore system that’s already in operation, consult with Cloudian Support.
redis_log_directory
Directory in which to store Redis log files.
Default = /var/log/redis
To apply change, after Puppet propagation restart Redis Credentials and Redis QoS
cassandra_max_heap_size
Max Heap size setting (memory allocation) for the Cassandra application. If this setting is assigned a value,
this value will be used for MAX_HEAP_SIZE in cassandra-env.sh. By default the cassandra_max_heap_size
setting is commented out and not used by the system. Instead, the default behavior for Cloudian HyperStore is
for MAX_HEAP_SIZE in cassandra-env.sh to be automatically set based on the host’s RAM size.
cassandra_enable_gc_logging
Whether to enable Java garbage collection (GC) logging for Cassandra. The log is written to /var/log/cassandra/gc.log.
Default = true
Note GC logging is also enabled for the S3 Service, Admin Service, and HyperStore Service. These
GC logs are under /var/log/cloudian and are named s3-gc.log, admin-gc.log, and hss-gc.log, respect-
ively.
cassandra_heapdump_on_out_of_memory
Whether to enable Java heap dump on a Cassandra out-of-memory error. Dump path is /var/log/cassandra/cassandra.hprof. The dump file is in HPROF binary format.
Default = true
cassandra_lib_directory
Directory in which to store Cassandra application state data.
Default = /var/lib/cassandra
If you want to change this for a HyperStore system that’s already in operation, consult with Cloudian Support.
cassandra_log_directory
Directory in which to store Cassandra application log files.
Default = /var/log/cassandra
cassandra_saved_cache_directory
Directory in which to store the Cassandra saved_caches file.
Default = /var/lib/cassandra
If you want to change this for a HyperStore system that’s already in operation, consult with Cloudian Support.
cassandra_commit_log_directory
Directory in which to store the Cassandra commit log file.
Default = Set during installation based on operator input, if host has multiple disks. For hosts with only one
disk, default is /var/lib/cassandra_commit
If you want to change this for a HyperStore system that’s already in operation, consult with Cloudian Support.
cassandra_data_directory
Directory in which to store Cassandra data files. By default, Cassandra is used only for storing S3 object
metadata (metadata associated with individual objects) and service metadata such as account information,
usage data, and system monitoring data.
Default = Set during installation based on operator input, if host has multiple disks. For hosts with only one
disk, default is /var/lib/cassandra/data
If you want to change this for a HyperStore system that’s already in operation, consult with Cloudian Support.
IMPORTANT ! The Cassandra data directory should not be mounted on a shared file system such as a
NAS device.
cassandra_port
Port on which Cassandra listens for data operations requests from clients.
Default = 9160
concurrent_compactors
Number of simultaneous Cassandra compactions to allow, not including validation "compactions" for anti-entropy repair. Simultaneous compactions can help preserve read performance in a mixed read/write workload, by mitigating the tendency of small sstables to accumulate during a single long-running compaction.
Default = 2
cassandra_tombstone_warn_threshold
If while processing a Cassandra query it is found that a single row within a column family has more than this
many tombstones (deleted data markers), a tombstone warning is logged in the Cassandra application log.
An example of a query that can potentially encounter a high number of tombstones is a metadata query triggered
by an S3 Get Bucket (List Objects) operation.
Default = 50000
cassandra_tombstone_failure_threshold
If while processing a Cassandra query it is found that a single row within a column family has more than this
many tombstones (deleted data markers), the query fails and a tombstone error is logged in the Cassandra
application log.
An example of a query that can potentially encounter a high number of tombstones is a metadata query triggered
by an S3 Get Bucket (List Objects) operation.
Default = 100000
cassandra_tombstone_cleanup_threshold
If while processing a Cassandra query it is found that a single row within a CLOUDIAN_METADATA or MPSession column family has more than this many tombstones (deleted data markers), a tombstone purge process is
automatically triggered for that column family.
An example of a query that can potentially encounter a high number of tombstones is a metadata query triggered
by an S3 Get Bucket (List Objects) operation.
Default = 75000
Note You can also manually trigger a tombstone purge for a specific bucket, as described in "Tomb-
stone Cleanup Processing" (page 278).
cassandra_tombstone_gcgrace
While doing a purge of tombstones (deleted data markers) in a CLOUDIAN_METADATA or MPSession column
family, the system will not purge tombstones that are fewer than this many seconds old.
Default = 0
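The relationship among the three tombstone thresholds can be sketched as follows. This is an illustration of the documented behavior, not HyperStore code:

```python
def tombstone_actions(count, warn=50000, cleanup=75000, fail=100000):
    """Return the actions triggered for a single row with `count` tombstones,
    using the default threshold values documented above."""
    actions = []
    if count > warn:
        # cassandra_tombstone_warn_threshold: warning logged in Cassandra log
        actions.append("log warning")
    if count > cleanup:
        # cassandra_tombstone_cleanup_threshold: automatic purge (applies to
        # CLOUDIAN_METADATA and MPSession column families only)
        actions.append("trigger purge")
    if count > fail:
        # cassandra_tombstone_failure_threshold: query fails, error logged
        actions.append("fail query")
    return actions

print(tombstone_actions(60000))   # → ['log warning']
print(tombstone_actions(120000))  # → ['log warning', 'trigger purge', 'fail query']
```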
cassandra_listen_address
For each Cassandra node, the IP interface on which the node will listen for cluster management com-
munications from other Cassandra nodes in the cluster. Specify this as an IP address alias. Puppet will use
the alias to determine the actual IP address for each node.
Default = %{::cloudian_ipaddress}
cassandra_rpc_address
For each Cassandra node, the IP interface on which the node will listen for data operations requests from cli-
ents, via Thrift RPC. Specify this as an IP address alias. Puppet will use the alias to determine the actual IP
address for each node.
If desired, this can be the same IP address alias as used for cassandra_listen_address.
Default = %{::cloudian_ipaddress}
If you want to change this for a HyperStore system that’s already in operation, consult with Cloudian Support.
cassandra_default_node_datacenter
This instance of the setting is not used. Instead, the instance of cassandra_default_node_datacenter in region.csv is used. Typically you should have no need to edit that setting.
admin_whitelist_enabled
Whether to enable the billing "allowlist" feature that allows favorable billing terms for a specified list of source
IP addresses or subnets. If this feature is enabled, allowlist management functionality displays in the CMC.
This functionality is available only to HyperStore system administrators, not to group admins or regular users.
Default = true
allow_delete_users_with_buckets
If this is set to true, then a user who owns buckets can be deleted via the CMC or the Admin API, and the sys-
tem will not only delete the user but will also automatically delete the user's buckets and all the data in
those buckets. This includes buckets and data belonging to any IAM users who have been created under the
user account root. The deleted data will not be recoverable.
If this is set to false, then the system will not allow a user who owns buckets to be deleted via the CMC or the
Admin API. Instead, the user or an administrator must first delete all buckets owned by the user -- via the
CMC or a different S3 client application -- including any buckets belonging to IAM users under the user account
root. Only after all such buckets are deleted can the user then be deleted via the CMC or the Admin API.
Default = false for systems originally installed as 7.3 or later; true for systems originally installed as 7.2.x or
earlier.
cmc_log_directory
Directory in which to store CMC application logs.
Default = /var/log/cloudian
cmc_admin_host_ip
Fully qualified domain name for the Admin Service. The CMC connects to this service.
If you have multiple service regions for your HyperStore system, this FQDN must be the one for the Admin Ser-
vice in your default service region.
cmc_cloudian_admin_user
For the CMC, the login user name of the default system admin user. The system admin will use this user name
to log into the CMC.
Default = admin
cmc_domain
Service endpoint (fully qualified domain name) for the CMC. This endpoint must be resolvable for CMC clients.
To change this endpoint, use the "Installer Advanced Configuration Options" (page 407).
cmc_web_secure
Whether the CMC should require HTTPS for all incoming client connections, true or false.
Set this property to "true" to require HTTPS. In this mode of operation, requests incoming to the CMC’s regular
HTTP port will be redirected to the CMC’s HTTPS port.
Set this property to "false" to allow clients to connect through regular HTTP. In this mode of operation, requests
incoming to the CMC’s HTTPS port will be redirected to the CMC’s regular HTTP port.
Default = true
cmc_http_port
Port on which the CMC listens for regular HTTP requests.
Default = 8888
cmc_https_port
Port on which the CMC listens for HTTPS requests.
Default = 8443
cmc_admin_secure_ssl
If the CMC is using HTTPS to connect to the Admin Service (as it will if "admin_secure" (page 432) is set to
"true"), this setting controls the CMC's requirements regarding the Admin Service's SSL certificate:
l If set to "true", then when the CMC makes HTTPS connections to the Admin Service the CMC's HTTPS
client will require that the Admin Service's SSL certificate be CA validated (or else will drop the con-
nection).
l If set to "false", then when the CMC makes HTTPS connections to the Admin Service the CMC's HTTPS
client will allow the Admin Service's SSL certificate to be self-signed.
Note The SSL certificate that is used with the Admin Service by default is self-signed.
Default = false
cmc_application_name
Name of the CMC web application, displayed in the URL paths for the various CMC UI pages.
Use only alphanumeric characters. Do not use spaces, dashes, underscores, or other special characters.
Default = Cloudian (and so URL paths are in form https://<host>:<port>/Cloudian/<page>.htm. For example
https://enterprise2:8443/Cloudian/dashboard.htm).
cmc_storageuri_ssl_enabled
If this is set to "true", the CMC uses HTTPS to connect to the HyperStore S3 Service (in implementing the CMC
Buckets & Objects functionality). If "false", the CMC uses regular HTTP to connect to the HyperStore S3 Ser-
vice.
Do not set this to "true" unless you have set up HTTPS for your HyperStore S3 Service (for more information
see "Setting Up Security and Privacy Features" (page 104) ).
Default = false
cmc_grouplist_enabled
This setting controls whether the CMC will show a drop-down list of group names when selection of a group is
necessary in interior parts of the CMC UI (parts other than the login page). This is relevant only for system
administrators, since only system administrators have the opportunity to choose among groups for certain fea-
tures (such as user management, group management, or usage reporting). Set this to "false" to have the UI
instead present a text box in which the administrator can type the group name.
Default = true
Note If the number of groups in your system exceeds the value of the cmc_grouplist_size_max setting
(100 by default), then group drop-down lists are not supported and the UI will display a text input box
for group name regardless of how you've set cmc_grouplist_enabled.
cmc_login_languageselection_enabled
This setting controls whether to display at the top of the CMC’s Sign In screen a selection of languages from
which the user can choose, for rendering the CMC's text (such as screen names, button labels, and so on). The
supported languages are English, Japanese, Spanish, German, and Portuguese. With this set to "true", the
CMC language will initially be based on the user's browser language setting, but in the Sign In screen the user
will be able to select a different supported language if they wish.
If you set this to "false", then the language selection will not display at the top of the CMC’s Sign In screen, and
instead the CMC text language will be exclusively based on the user's browser language setting. If the user's
browser language setting matches one of the supported CMC languages, then that language will be used for
the CMC text. If the user's browser language setting does not match any of the CMC's supported languages,
the CMC text will display in English.
Default = true
cmc_login_grouplist_enabled
This setting controls whether to enable the Group drop-down list on the CMC’s Sign In screen. The Group
drop-down list lists all groups registered in the HyperStore system, and when users log in they can choose
their group from the list. If disabled, the drop-down list will not display and instead users will need to enter their
group name in a Group Name text input box when logging into the CMC.
Set this to "false" if you don’t want users to see the names of other groups.
Default = true
Note If the number of groups in your system exceeds the value of the cmc_grouplist_size_max setting
(100 by default), then group drop-down lists are not supported and the UI will display a text input box
for group name regardless of how you've set cmc_login_grouplist_enabled.
cmc_login_grouplist_admincheckbox_enabled
This setting is relevant only if the group name drop-down list is not being displayed in the CMC login page (because cmc_login_grouplist_enabled is set to false or cmc_grouplist_size_max is exceeded).
Default = true
cmc_grouplist_size_max
Maximum number of groups that can be displayed in a CMC drop-down list.
If you have more than this many groups in your HyperStore system, then in parts of the CMC interface that
require the user to select a group the interface will display a text input box rather than a drop-down list of
groups to select from. The CMC will do this regardless of your setting for cmc_login_grouplist_enabled and
cmc_grouplist_enabled.
For example, if cmc_login_grouplist_enabled and cmc_grouplist_enabled are set to "true" and cmc_grouplist_
size_max is set to 100 (the default values), then the CMC will display drop-down lists for group selection if you
have up to 100 groups in your system, or text input boxes for group name entry if you have more than 100
groups in your system.
Default = 100
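The interaction between cmc_grouplist_size_max and the two grouplist-enabled settings can be summarized with this sketch (illustrative logic, not CMC code):

```python
def show_group_dropdown(grouplist_enabled, num_groups, size_max=100):
    """A drop-down list of groups is shown only if the relevant
    *_grouplist_enabled setting is true AND the number of groups does not
    exceed cmc_grouplist_size_max; otherwise the CMC falls back to a
    text input box for group name entry."""
    return grouplist_enabled and num_groups <= size_max

print(show_group_dropdown(True, 80))    # → True  (drop-down list)
print(show_group_dropdown(True, 150))   # → False (text input box)
print(show_group_dropdown(False, 80))   # → False (text input box)
```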
cmc_session_timeout
Session timeout for a user logged in to the CMC, as a number of minutes. After a logged-in user has been inact-
ive for this many minutes, the CMC will terminate the user's session.
Default = 30
cmc_view_user_data
Within the Manage Users function in the CMC GUI, this setting enables or disables the capability of admin
users to access regular users' storage buckets. When allowed this access, admin users can view regular users'
data, add data to users' buckets, and delete users' data. Also when allowed this access, admin users can
change the properties of regular users' buckets and objects.
Options are:
l true — This capability will display for users logged in as a system administrator or group administrator.
For group admins this capability is restricted to their own group.
l false — This capability will not display for any admin users.
l SystemAdmin — This capability will display only for users logged in as a system administrator.
l GroupAdmin — This capability will display only for users logged in as a group administrator.
Default = false
Note Regardless of how the cmc_view_user_data setting is configured, regular users can view and
manage their own object data through the Buckets & Objects section of the CMC.
cmc_crr_external_enabled
This controls whether settings for replicating from a local source bucket to an external S3 system -- a system other than the HyperStore system in which the source bucket resides -- will appear in the Cross Region Replication tab of the CMC's Bucket Properties dialog. The default is false, so that such settings do not appear in
the dialog, and the dialog can only be used to configure replication from a source bucket to a destination
bucket in the same HyperStore system.
If you are considering setting cmc_crr_external_enabled to true you should first read "Cross-System Replication" (page 140) including the limitations and caveats noted in that section.
Default = false
cmc_login_banner_*
For information about using the cmc_login_banner_size, cmc_login_banner_title, cmc_login_banner_message, and cmc_login_banner_button_confirm settings see "Configuring a Login Page Acknowledgment Gate" (page 181).
cmc_csrf_origin_check_enabled
Set this to true if you want the CMC to implement request origin checks as a safeguard against Cross-Site
Request Forgery (CSRF). CSRF exploits users who, while logged in to a secure web application such as the
CMC, are concurrently doing things like checking their email or surfing web sites in other browser tabs. CSRF
tricks such users into clicking on seemingly harmless links or images that are designed to submit malicious
requests to the targeted web application without the user's knowledge. The malicious requests, sent from the
user's browser, can reach the secure web application because the user has already successfully logged in.
If you set cmc_csrf_origin_check_enabled to true then the CMC will check standard HTTP request headers to
confirm that, apart from login requests, all other requests to the CMC domain are originating from the CMC
domain (as would be expected for legitimate requests, such as if the user is clicking links within the CMC or
taking actions within a CMC page). If an HTTP request's origin domain doesn't match the target domain (the
CMC domain), the CMC will reject the request.
If you set cmc_csrf_origin_check_enabled to true and you have the CMC behind a proxy or a load balancer,
then you must also configure the cmc_csrf_origin_allowlist setting, described below.
Default = false
cmc_csrf_origin_allowlist
If you set cmc_csrf_origin_check_enabled to true and you have the CMC behind a proxy or a load balancer,
use the cmc_csrf_origin_allowlist setting to specify the CMC domain(s) associated with the proxy or load bal-
ancer (the CMC FQDN[s] mapped to the proxy or load balancer in your DNS configuration). This is necessary
because in this environment, when the CMC application applies the origin check to guard against CSRF, the
origin domain in incoming HTTP requests should match the CMC domain associated with the proxy or load bal-
ancer -- rather than matching the internal CMC application domain to which the proxy or load balancer for-
wards the requests.
When configuring cmc_csrf_origin_allowlist, include the transfer protocol along with the FQDN -- for example https://cmc.domain.com. If there are multiple CMC domains associated with your proxies or load balancers, configure cmc_csrf_origin_allowlist as a vertical bar separated list -- for example https://cmc.domain1.com|https://cmc.domain2.com.
Default = empty
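The documented origin check can be sketched as follows. This is an illustration, not CMC code, and the domain names used in the example are hypothetical:

```python
def origin_allowed(origin, target, allowlist=""):
    """Sketch of the CSRF origin check: a non-login request passes if its
    Origin matches the target CMC domain, or matches any entry in the
    vertical-bar-separated cmc_csrf_origin_allowlist."""
    allowed = {target} | {entry for entry in allowlist.split("|") if entry}
    return origin in allowed

# Hypothetical proxied deployment: requests arrive with the load balancer's
# CMC domain as Origin, which the allowlist must cover.
allowlist = "https://cmc.domain1.com|https://cmc.domain2.com"
print(origin_allowed("https://cmc.domain1.com", "https://cmc-internal.local", allowlist))  # → True
print(origin_allowed("https://evil.example.com", "https://cmc-internal.local", allowlist))  # → False
```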
cmc_sso_enabled
Whether to enable CMC support for single sign-on functionality, true or false. When this is set to false, if a user
attempts to access the CMC via SSO, the access will be denied and an error will be written to cloudian-ui.log.
Default = false
IMPORTANT ! If you enable CMC SSO functionality, then for security reasons you should set custom values for cmc_sso_shared_key and cmc_sso_cookie_cipher_key (the next two settings below cmc_sso_enabled). Do not leave these settings at their default values.
Note For more information on CMC SSO, see "Implementing Single Sign-On for the CMC" (page
186).
cmc_sso_shared_key
Shared security key used for hash creation, when using the "auto-login with one way hash" method of single
sign-on access to the CMC.
Default = ss0sh5r3dk3y
cmc_sso_cookie_cipher_key
Triple DES key used for cookie encryption.
Default = 123456789012345678901234
Note If you change this value after CMC SSO has already been in service, end users who had used
SSO previously will have on their browser a Cloudian SSO cookie that is no longer valid. If such users
access the CMC after your cmc_sso_cookie_cipher_key change, the CMC detects the invalid cookie,
deletes it, and drops a new, valid one.
cmc_bucket_tiering_default_destination_list
The list of auto-tiering destinations to display in the CMC interface that bucket owners use to configure auto-tiering for a bucket. (For more information on this interface, click Help while on the CMC's Bucket Properties page (Buckets & Objects -> Buckets -> Properties).) Specify this as a quote-enclosed list, with comma separation between destination attributes and vertical bar separation between destinations, like this:
"<name>,<endpoint>,<protocol>|<name>,<endpoint>,<protocol>|..."
The list can contain multiple destinations (as it does by default), or you can edit the setting so that the "list" contains just one destination if you want your users to use only that destination.
There is no hard limit on the number of destinations, but bear in mind that in the interface the dialog box expands to accommodate each additional destination.
The <name> will display in the CMC interface that bucket owners use to configure auto-tiering, as the auto-tiering destination name. The <protocol> must be one of the following:
l s3
l glacier
l azure
l spectra
If you wish you can include multiple destinations of the same type, if those destinations have different endpoints. For example, "Spectra 1,<endpoint1>,spectra|Spectra 2,<endpoint2>,spectra". Each such destination will then appear in the CMC interface for users configuring their buckets for auto-tiering.
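The destination-list format described above can be illustrated with a small parser. This is an illustrative sketch only -- the function, the helper names, and the endpoint URLs are assumptions, not part of HyperStore:

```python
# Illustrative parser for the "<name>,<endpoint>,<protocol>|..." format
# described above. Not HyperStore code; endpoint URLs are invented.

VALID_PROTOCOLS = {"s3", "glacier", "azure", "spectra"}

def parse_destinations(value: str) -> list:
    """Split a quote-enclosed, vertical-bar-separated destination list into
    one record per destination."""
    destinations = []
    for entry in value.strip('"').split("|"):
        name, endpoint, protocol = (field.strip() for field in entry.split(","))
        if protocol not in VALID_PROTOCOLS:
            raise ValueError("unsupported protocol: " + protocol)
        destinations.append(
            {"name": name, "endpoint": endpoint, "protocol": protocol}
        )
    return destinations

dests = parse_destinations(
    '"Spectra 1,https://ep1.example.com,spectra|'
    'Spectra 2,https://ep2.example.com,spectra"'
)
```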
Note If your original HyperStore version was older than 7.1.4, then after upgrade to 7.1.4 or later your
default value here will also include a Spectra destination.
Note For more information about setting up auto-tiering, see "Setting Up Auto-Tiering" (page 131).
That section includes information about how to enable the auto-tiering feature (which is disabled by
default in the CMC interface) and about the option of having all end users use the same system-con-
figured security credentials for accessing the tiering destination (rather than supplying their own secur-
ity credentials for tiering, which is the default behavior). Note that if you choose to have all users use
the same tiering security credentials you will need to specify a single system default tiering destination
-- using a setting in the CMC's Configuration Settings page, as described in "Setting Up Auto-Tiering" (page 131) -- and that configuration setting will override the cmc_bucket_tiering_default_destination_list setting.
awsmmsproxy_host
This setting is obsolete and will be removed from a future HyperStore release. Do not use.
bucketstats_enabled
Whether to enable usage statistics reporting on a per-bucket basis, true or false. If you set this to true, you can subsequently retrieve usage data for a specified bucket or buckets by using the Admin API's GET /usage and POST /usage/bucket methods. Per-bucket usage statistics will be available only dating back to the point in time that you enabled this feature. Per-bucket usage statistics are not tracked by default and will not be available for time periods prior to when you set this parameter to true.
Default = false
cloudian_elasticsearch_hosts
For information about this setting, see "Enabling Elasticsearch Integration for Metadata Search" (page 157).
sqs_*
For information about these settings, see "HyperStore Support for the AWS SQS API" (section 15.1.1).
IMPORTANT ! Do not manually edit settings that appear below this point in the common.csv file.
On rare occasions you might -- in consultation with Cloudian Support -- edit the mts.properties.erb file, the hyperstore-server.properties.erb file, or the mts-ui.properties.erb file.
There is no requirement to manually edit any HyperStore configuration file. Configuration customizations
that are required in order to tailor HyperStore to your environment are implemented automatically by the install-
ation script during the initial installation of your HyperStore system and during cluster expansions or con-
tractions.
Note The configuration settings that operators would most commonly want to adjust during the oper-
ation of a HyperStore system are editable through the CMC's Configuration Settings page (Cluster ->
Cluster Config -> Configuration Settings). The only reason to manually edit configuration files is for
less-frequently-used settings that are not in the CMC's Configuration Settings page.
7.2.5.3. hyperstore-server.properties.erb
The hyperstore-server.properties file configures the HyperStore Service. On each of your HyperStore nodes,
the file is located at the following path by default:
/opt/cloudian/conf/hyperstore-server.properties
Do not directly edit the hyperstore-server.properties file on individual HyperStore nodes. Instead, if you want to make changes to the settings in this file, edit the configuration template file hyperstore-server.properties.erb on the Configuration Master node:
/etc/cloudian-<version>-puppet/modules/cloudians3/templates/hyperstore-server.properties.erb
Certain hyperstore-server.properties.erb properties take their values from settings in common.csv or from set-
tings that you can control through the CMC's Configuration Settings page (Cluster -> Cluster Config -> Con-
figuration Settings). In the hyperstore-server.properties.erb file these properties' values are formatted as
bracket-enclosed variables, like <%= … %>. In the property documentation below, the descriptions of such
properties indicate "Takes its value from <location>: <setting>; use that setting instead." The remaining properties in the hyperstore-server.properties.erb file -- those that are "hard-coded" with specific values -- are settings that in typical circumstances you should have no need to edit. Therefore in typical circumstances you should not need to manually edit the hyperstore-server.properties.erb file.
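For example, a property that takes its value from common.csv might appear in the template along these lines. This is an illustration only -- the variable names are assumptions, not the actual template contents:

```erb
# Values are filled in from configuration settings when the template is
# rendered (illustrative example only; variable names are assumed):
cloudian.storage.datadir=<%= hyperstore_data_directory %>
messaging.service.listen.address=<%= hyperstore_listen_ip %>
```

Properties that appear in the template with literal values, rather than with `<%= … %>` variables, are the "hard-coded" settings referred to above.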
Specify just the configuration file name, not the full path to the file.
In the background this invokes the Linux text editor vi to display and modify the configuration file. Therefore you
can use the standard keystrokes supported by vi to make and save changes to the file.
cloudian.storage.datadir
Takes its value from common.csv: "hyperstore_data_directory" (page 423); use that setting instead.
secure.delete
Set this to true if you want HyperStore to use "secure delete" methodology whenever implementing the dele-
tion of an object from a bucket.
For information about how secure delete works, see "Enabling Secure Delete" (page 124).
IMPORTANT ! Using secure delete has substantial impact on system performance for delete oper-
ations. Consult with your Cloudian representative if you are considering using secure delete.
Default = false
messaging.service.listen.address
Takes its value from common.csv: "hyperstore_listen_ip" (page 423); use that setting instead.
messaging.service.listen.port
The port on which a HyperStore Service node listens for messages from other HyperStore Service nodes. This
internal cluster messaging service is used in support of cluster management operations such as node repair.
Default = 19050
messaging.service.read.buffer.size
When a HyperStore Service node reads data it has received over the network from other HyperStore Service
nodes -- such as during repair operations -- this is the read buffer size.
Default = 65536
messaging.service.write.buffer.size
When a HyperStore Service node writes data to the network for transferring to other HyperStore Service nodes
-- such as during repair operations -- this is the write buffer size.
Default = 1048576
messaging.service.threadpool.corepoolsize
Takes its value from common.csv: "hyperstore_messaging_service_threadpool" (page 426); use that setting
instead.
messaging.service.maxconnections
Maximum simultaneous number of connections that the HyperStore messaging service will accept.
Default = 2000
messaging.service.repairfile.timeout
Maximum time in seconds to allow for repair of a single file on a HyperStore node. File repair entails checking
other HyperStore nodes to find the most recent copy of the file and then downloading that copy.
Default = 120
messaging.service.connection.timeout
Maximum time in seconds that a HyperStore node will allow for establishing a connection to another Hyper-
Store node, for conducting inter-node operations.
Default = 300
repair.session.threadpool.corepoolsize
Takes its value from common.csv: "hyperstore_repair_session_threadpool" (page 426); use that setting instead.
repair.session.rangeslice.maxrows
During hsstool repair or hsstool repairec operations, the maximum number of row keys to retrieve per get_range_slice query performed on Cassandra <GROUPID>_METADATA column families.
Default = 2
Note Cloudian, Inc. recommends that you leave this setting at its default value. Do not set it to a value
lower than 2.
repair.session.columnslice.maxcolumns
During hsstool repair or hsstool repairec operations, the maximum number of columns to retrieve per get_slice or get_range_slice query performed on Cassandra <GROUPID>_METADATA column families.
Default = 1000
repair.session.slicequery.maxretries
During hsstool repair or hsstool repairec operations, the maximum number of times to retry get_slice or get_range_slice queries after encountering a timeout. The timeout interval is configured by cassandra.cluster.CassandraThriftSocketTimeout in mts.properties, and retries are attempted as soon as a timeout occurs.
Default = 3
repair.session.updateobjs.queue.maxlength
During hsstool repair operations, the target maximum number of object update jobs to queue for processing.
Object update jobs are placed in queue by a differencer mechanism that detects discrepancies between object
metadata on remote replicas versus object metadata on the local node.
This target maximum may be exceeded in certain circumstances as described for repair.session.updateobjs.queue.maxwaittime (below).
Default = 1000
Reloadable via JMX (HyperStore Service’s JMX port 19082; MBean attribute = com.gemini.cloudian.hybrid.server → FileRepairService → Attributes → RepairSessionJobQueueMaxLength)
repair.session.updateobjs.queue.waittime
If during hsstool repair operations the differencer detects that the number of queued object update jobs is at or
above the target maximum (as configured by repair.session.updateobjs.queue.maxlength), the number of
seconds to wait before checking the queue size again. During this interval the differencer adds no more object
update jobs to the queue.
Default = 2
repair.session.updateobjs.queue.maxwaittime
During hsstool repair operations, the maximum total number of seconds for the differencer to wait for the
object update job queue to fall below its target maximum size. After this interval, the differencer goes ahead
and writes its current batch of update requests to the queue. In this scenario, the queue can grow beyond the
target maximum size. The next time that the differencer has object update requests, it again checks the queue
size, and if it’s larger than the target maximum size, the wait time procedure starts over again.
Default = 120
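Taken together, the waittime and maxwaittime settings implement a simple bounded backoff on the update-job queue. The following is an illustrative Python sketch of that behavior, not HyperStore source code; the parameter names mirror the settings above:

```python
import time

def enqueue_with_backpressure(queue, jobs, maxlength=1000, waittime=2, maxwaittime=120):
    # Illustrative sketch of the differencer's wait logic described above;
    # not actual HyperStore code.
    waited = 0
    # While the queue is at or above its target maximum, pause before
    # re-checking -- but never wait longer than maxwaittime in total.
    while len(queue) >= maxlength and waited < maxwaittime:
        time.sleep(waittime)
        waited += waittime
    # After maxwaittime the jobs are enqueued anyway, so the queue can
    # temporarily grow beyond maxlength.
    queue.extend(jobs)
```

This is why maxlength is described as a "target" maximum: once the total wait reaches maxwaittime, the pending batch is written to the queue regardless of its size.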
repair.session.object.download.maxretries
During hsstool repair or hsstool repairec operations, the maximum number of times to retry object download requests after encountering a timeout. The timeout interval is configured by mts.properties: cassandra.cluster.CassandraThriftSocketTimeout, and retries are attempted as soon as a timeout occurs.
Default = 3
repair.session.inmemory.fileindex
When performing Merkle Tree based hsstool repair (the default repair type) for files in the HyperStore File Sys-
tem, whether to hold the file indexes in memory rather than writing them to disk. Options are:
l true — For each vNode being repaired, a file index directory is created and is held in memory unless its
size exceeds a threshold in which case it is written to disk (under the HyperStore data mount point that
the vNode is associated with). For most vNodes it will not be necessary to write the file indexes to disk.
If file indexes are written to disk, they are automatically deleted after the repair operation completes.
l false — For each vNode being repaired, a file index directory is created and written to disk (under the
HyperStore data mount point that the vNode is associated with), regardless of size. The file indexes are
automatically deleted after the repair operation completes.
Default = true
Reloadable via JMX (HyperStore Service’s JMX port 19082; MBean attribute = com.gemini.cloudian.hybrid.server → FileRepairService → Attributes → RepairSessionInMemoryFileIndex)
repair.digest.index.threadpool.corepoolsize
Takes its value from common.csv: "hyperstore_repair_digest_index_threadpool" (page 427); use that set-
ting instead.
repair.merkletree.response.waittime
During hsstool repair operations, the repair coordinator node retrieves Merkle Trees from HyperStore endpoint
nodes. When contacted by the coordinator node, the endpoint nodes first must construct the Merkle Trees
before returning them to the coordinator node. Constructing the trees can take some time if a very large num-
ber of objects is involved.
The repair.merkletree.response.waittime property sets the maximum amount of time in minutes that the coordin-
ator node will wait for Merkle Trees to be returned by all endpoint nodes. If not all Merkle Trees have been
returned in this time, the repair operation will fail.
Default = 120
rangerepair.threadpool.corepoolsize
Takes its value from common.csv: "hyperstore_rangerepair_threadpool" (page 427); use that setting instead.
stream.outbound.threadpool.corepoolsize
Takes its value from common.csv: "hyperstore_stream_outbound_threadpool" (page 427); use that setting
instead.
repairec.sessionscan.threadpool.corepoolsize
Takes its value from common.csv: "hyperstore_repairec_sessionscan_threadpool" (page 428); use that set-
ting instead.
repairec.digestrequest.threadpool.fixedpoolsize
Takes its value from common.csv: "hyperstore_repairec_digestrequest_threadpool" (page 428); use that
setting instead.
repairec.task.threadpool.corepoolsize
Takes its value from common.csv: "hyperstore_repairec_task_threadpool" (page 429); use that setting
instead.
repairec.rocksdbscan.threadpool.corepoolsize
Takes its value from common.csv: "hyperstore_repairec_rocksdbscan_threadpool" (page 429); use that set-
ting instead.
repairec.session.queue.maxlength
During "hsstool repairec" (page 363) operations, the target maximum number of object update jobs to queue
for processing. Object update jobs are placed in queue by a differencer mechanism that detects discrepancies
between object metadata on remote replicas versus object metadata on the local node.
This target maximum may be exceeded in certain circumstances as described for repairec.session.queue.maxwaittime (below).
Default = 2000
repairec.session.queue.waittime
If during hsstool repairec operations the differencer detects that the number of queued object update jobs is at or above the target maximum (as configured by repairec.session.queue.maxlength), the number of seconds to wait before checking the queue size again. During this interval the differencer adds no more object update jobs to the queue.
Default = 2
repairec.session.queue.maxwaittime
During hsstool repairec operations, the maximum total number of seconds for the differencer to wait for the
object update job queue to fall below its target maximum size. After this interval, the differencer goes ahead
and writes its current batch of update requests to the queue. In this scenario, the queue can grow beyond the
target maximum size. The next time that the differencer has object update requests, it again checks the queue
size, and if it’s larger than the target maximum size, the wait time procedure starts over again.
Default = 120
downloadrange.session.threadpool.corepoolsize
Takes its value from common.csv: "hyperstore_downloadrange_session_threadpool" (page 427); use that
setting instead.
uploadrange.session.threadpool.corepoolsize
Takes its value from common.csv: "hyperstore_uploadrange_session_threadpool" (page 427); use that set-
ting instead.
decommission.threadpool.corepoolsize
Takes its value from common.csv: "hyperstore_decommission_threadpool" (page 428); use that setting
instead.
cleanup.session.threadpool.corepoolsize
Takes its value from common.csv: "hyperstore_cleanup_session_threadpool" (page 428); use that setting
instead.
cleanup.session.deleteobjs.queue.maxlength
During hsstool cleanup or hsstool cleanupec operations, the target maximum number of object delete jobs to
queue for processing. Object delete jobs are placed in queue by a cleanup job that detects discrepancies
between object metadata in Cassandra versus object metadata on the local node.
This target maximum may be exceeded in certain circumstances as described for cleanup.session.deleteobjs.queue.maxwaittime.
Default = 2000
cleanup.session.deleteobjs.queue.waittime
If during hsstool cleanup or hsstool cleanupec operations a cleanup job detects that the number of queued object delete jobs is at or above the target maximum (as configured by cleanup.session.deleteobjs.queue.maxlength), the number of seconds to wait before checking the queue size again. During this interval the cleanup job adds no more object delete jobs to the queue.
Default = 2
cleanup.session.deleteobjs.queue.maxwaittime
During hsstool cleanup or hsstool cleanupec operations, the maximum total number of seconds for the dif-
ferencer to wait for the object delete job queue to fall below its target maximum size. After this interval, the dif-
ferencer goes ahead and writes its current batch of delete requests to the queue. In this scenario, the queue
can grow beyond the target maximum size. The next time that the differencer has object delete requests, it
again checks the queue size, and if it’s larger than the target maximum size, the wait time procedure starts over
again.
Default = 120
cleanup.session.delete.graceperiod
During hsstool cleanup or hsstool cleanupec operations, only consider an object for deletion if at least this
many seconds have passed since the object’s Last Modified timestamp.
Default = 86400
cleanupjobs.threadpool.corepoolsize
During hsstool cleanup or hsstool cleanupec operations, this setting controls how many cleanup "jobs" can
run in parallel on a single HyperStore node. For each HyperStore data mount point on a node, the object data
directory structure is as follows:
<mountpoint>/<hsfs|ec>/<base62-encoded-vNode-token>/<policyid>/<000-255>/<000-255>/<filename>
Under the hsfs (for replica data) or ec (for erasure coded data) directory level, there are sub-directories for
each of the mount point's vNodes (identified by token), and under those, sub-directories for each storage policy
configured in your system (identified by system-generated policy ID). For more information on this directory
structure, see "HyperStore Service and the HSFS" (page 38).
When a physical HyperStore node is cleaned, there is a separate cleanup "job" for each <policyId> sub-dir-
ectory on the physical node. The cleanupjobs.threadpool.corepoolsize setting controls how many such jobs
can run in parallel on a given physical node. Each concurrent job will run on a different HyperStore data disk
on the node.
Default = 10
cleanup.task.batch.threadpool.size
This setting and the cleanup.task.batch.integritycheck.threadpool.size and cleanup.session.threadpool.batch
settings provide additional performance tuning controls over various aspects of the hsstool cleanupec oper-
ation. The defaults are appropriate for most environments, and typically you should have no need to edit these
settings unless instructed to do so by Cloudian Support.
Default = 20
cleanup.task.batch.integritycheck.threadpool.size
See the description of cleanup.task.batch.threadpool.size.
Default = 10
cleanup.session.threadpool.batch
See the description of cleanup.task.batch.threadpool.size.
Default = 15
max.cleanup.operations.perdc
Maximum number of hsstool cleanup or hsstool cleanupec operations to allow at one time, per data center.
The limit is applied separately to cleanup and cleanupec operations. For example, if this property is set to 1 (as
it is by default), then within a DC you can run one cleanup operation at a time and one cleanupec operation at
a time. The limit does not prevent you from running one cleanup operation and one cleanupec operation sim-
ultaneously.
If you have multiple DCs in your HyperStore system, the limit is applied separately to each DC. For example if
you have two data centers named DC1 and DC2, and if this property is set to 1, you can run one cleanup oper-
ation and one cleanupec operation in DC1 at the same time as you are running one cleanup operation and
one cleanupec operation in DC2.
If you are considering raising this limit from its default of 1, first consult with Cloudian Support.
Default = 1
hyperstore.proactiverepair.poll_time
At this recurring interval each HyperStore node checks to see whether any proactive repair jobs are queued for
itself, and executes those jobs if there are any. Configured as a number of minutes.
This check is also automatically performed when a HyperStore Service node starts up. Subsequently the recur-
ring check occurs at this configured interval.
For more information about the proactive repair feature, see "Automated Data Repair Feature Overview" (page 252).
Default = 60
Note If for some reason you want to trigger proactive repair on a particular node immediately, you can do so by running the "hsstool proactiverepairq" (page 340) command with the "-start" option.
Note For information about temporarily disabling the proactive repair feature, see "Disabling or Stop-
ping Data Repairs" (page 256).
stream.throughput.outbound
During hsstool repair or hsstool repairec operations, HyperStore nodes may stream large amounts of data to
other HyperStore nodes. On each node this setting places an upper limit on outbound streaming throughput
during repair operations, in megabits per second.
Default = 800
Reloadable via JMX (HyperStore Service’s JMX port 19082; MBean attribute = com.gemini.cloudian.hybrid.server → FileStreamingService → Attributes → MaxStreamThroughputOutbound)
auto.repair.threadpool.corepoolsize
Takes its value from common.csv: "hyperstore_auto_repair_threadpool" (page 428); use that setting instead.
auto.repair.scheduler.polltime
Interval (in minutes) at which each HyperStore node’s auto-repair scheduler will check the auto-repair queues
for HSFS repair, Cassandra repair, and erasure coded data repair to see whether it’s time to initiate a repair on
that node.
Default = 10
Reloadable via JMX (HyperStore Service’s JMX port 19082; MBean attribute = com.gemini.cloudian.hybrid.server → FileRepairService → Attributes → AutoRepairSchedulerPollTime)
auto.repair.schedule.interval
Takes its value from the "Replicas Repair Interval (Minutes)" setting in the CMC's Configuration Settings page
(Cluster -> Cluster Config -> Configuration Settings); use that setting instead.
auto.repairec.schedule.interval
Takes its value from the "EC Repair Interval (Minutes)" setting in the CMC's Configuration Settings page
(Cluster -> Cluster Config -> Configuration Settings); use that setting instead.
auto.repaircassandra.schedule.interval
Takes its value from the "Cassandra Repair Interval (Minutes)" setting in the CMC's Configuration Settings
page (Cluster -> Cluster Config -> Configuration Settings); use that setting instead.
cloudian.storage.jmx.port
The port on which the HyperStore Service listens for JMX requests.
Default = 19082 (set elsewhere in the manifest structure; do not edit this property)
disk.fail.action
Takes its value from the "HyperStore Disk Failure Action" setting in the CMC's Configuration Settings page
(Cluster -> Cluster Config -> Configuration Settings); use that setting instead.
disk.repair.rebuild
When HyperStore implements a replaceDisk operation it automatically executes hsstool repair and hsstool
repairec for the replacement disk. With disk.repair.rebuild=true (the default setting), the automatic executions of
hsstool repair and hsstool repairec will use the -rebuild option that those operations support. Using the -rebuild
option is the most efficient way to rebuild data on to a replacement disk. Typically the only occasion you would
have for setting disk.repair.rebuild=false -- so that the -rebuild option is not used -- is if you are instructed to do
so by Cloudian Support in the context of troubleshooting a failed attempt to replace a disk.
Default = true
disk.fail.error.count.threshold
This setting in combination with the disk.fail.error.time.threshold setting provides you the option to specify a
read/write error frequency threshold that must be met before the system takes the automated action that is spe-
cified by the "HyperStore Disk Failure Action" setting.
The threshold, if configured, is in the form of "If disk.fail.error.count.threshold number of HSDISKERROR messages are logged in cloudian-hyperstore.log in regard to the same disk within an interval of disk.fail.error.time.threshold seconds, then take the automated action specified by the disk.fail.action setting."
If you set the two threshold settings to "0", then no threshold behavior is implemented, and instead the auto-
mated action specified by the disk.fail.action setting is triggered by any single occurrence of an
HSDISKERROR message in cloudian-hyperstore.log.
Default = 100
disk.fail.error.time.threshold
Disk error threshold time span in seconds. For more description see disk.fail.error.count.threshold above.
Default = 1800
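The combined effect of the two threshold settings can be sketched as a sliding-window count over the timestamps of HSDISKERROR messages for one disk. This is an illustrative Python sketch, not HyperStore source code; the function name and timestamp representation are assumptions:

```python
from collections import deque

def should_trigger_failure_action(error_times, count_threshold=100, time_threshold=1800):
    # Illustrative sketch of the threshold rule described above (not actual
    # HyperStore code). error_times holds the timestamps, in seconds, of
    # HSDISKERROR messages logged for a single disk.
    if count_threshold == 0 or time_threshold == 0:
        # Thresholds disabled: any single error triggers the action.
        return len(error_times) >= 1
    window = deque()
    for t in sorted(error_times):
        window.append(t)
        # Keep only errors within the trailing time_threshold window.
        while t - window[0] > time_threshold:
            window.popleft()
        if len(window) >= count_threshold:
            return True
    return False
```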
disk.check.interval
Takes its value from common.csv: "hyperstore_disk_check_interval" (page 429); use that setting instead.
disk.balance.delta
When the disk balance check is run (at the disk.check.interval), token migration is triggered if a disk’s utilization
percentage differs from the average disk utilization percentage on the node by more than the configured
disk.balance.delta. If disk.balance.delta = 10 (the default), then, for example:
l If the average disk space utilization on a node is 35%, and the disk space utilization for Disk4 is 55%,
then one or more tokens will be migrated away from Disk4 to other disks on the node (since the actual
delta of 20% exceeds the maximum allowed delta of 10%).
l If the average disk utilization on a node is 40%, and the disk utilization for Disk7 is 25%, then one or
more tokens will be migrated to Disk7 from the other disks on the node (since the actual delta of 15%
exceeds the maximum allowed delta of 10%).
For more information on this feature see "Automated Disk Management Feature Overview" (page 281).
Default = 10
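The two worked examples above can be expressed as a small sketch. This is illustrative Python, not HyperStore source code; the function name and disk labels are assumptions:

```python
def balance_actions(disk_utilization, delta=10):
    # Illustrative sketch of the balance check described above (not actual
    # HyperStore code). disk_utilization maps disk name -> utilization %.
    average = sum(disk_utilization.values()) / len(disk_utilization)
    actions = {}
    for disk, pct in disk_utilization.items():
        if pct - average > delta:
            actions[disk] = "migrate tokens away"  # disk too full vs. average
        elif average - pct > delta:
            actions[disk] = "migrate tokens to"    # disk underutilized
    return actions
```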
disk.audit.interval
At this configurable interval (in number of minutes), the system tries to write one byte of data to each HyperStore data disk. If any of these writes fail, /var/log/messages is scanned for messages indicating that the file system associated with the disk drive in question is in a read-only condition (messages containing the string "Remounting filesystem read-only"). If any such message is found, the disk is automatically disabled in accordance with your configured "HyperStore Disk Failure Action" (in the CMC, Cluster -> Cluster Config -> Configuration Settings).
The scan of /var/log/messages will be limited to the time period since the last time the disk audit was run.
This recurring audit of disk drive health is designed to proactively detect disk problems even during periods
when there is no HyperStore Service read/write activity on a disk.
Default = 60
disk.error.check.fs
When scanning /var/log/messages as part of the disk audit, the messages will be first filtered by this file system
name string.
Default = "EXT4-fs"
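The audit's log scan amounts to a two-part string filter: first by filesystem type, then by the read-only remount message. The following is an illustrative Python sketch, not HyperStore source code; the sample log line in the test is invented:

```python
def readonly_remount_detected(log_lines, fs_filter="EXT4-fs"):
    # Illustrative sketch of the disk audit's log scan described above:
    # filter /var/log/messages lines by filesystem type, then look for the
    # read-only remount message. Not actual HyperStore code.
    return any(
        fs_filter in line and "Remounting filesystem read-only" in line
        for line in log_lines
    )
```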
max.diskusage.percentage
The hsstool repair operation will fail if any HyperStore data disk that stores token ranges impacted by the oper-
ation is more than this percent full.
Default = 95
auto.repair.computedigest.run.number
Takes its value from common.csv: "auto_repair_computedigest_run_number" (page 424); use that setting
instead.
hss.errorlogger.appender
Do not edit this setting.
hss.heallogger.appender
Do not edit this setting.
enable.cassandra.rangerepair
If this is set to true, HyperStore uses a "range repair" approach when executing Cassandra auto-repairs. Each
impacted token range is repaired one range at a time, sequentially. This approach improves the performance
for Cassandra auto-repairs.
For background information on the scheduled auto-repair feature see "Automated Data Repair Feature Over-
view" (page 252).
Default = true
retry.rebalance.ranges
If this is set to true, HyperStore will automatically retry any failed sub-tasks from an hsstool rebalance operation
that has completed on a newly added node. Like other types of proactive repair, this retry of any failed rebal-
ance sub-tasks occurs once per hour.
Default = true
Note The rocksdb.* properties at the bottom of the hyperstore-server.properties.erb file control the
behavior and performance of the RocksDB database in which object digests are stored. Do not edit
these settings.
7.2.5.4. mts.properties.erb
The mts.properties file configures the S3 Service and the Admin Service. On each of your HyperStore nodes,
the file is located at the following path by default:
/opt/cloudian/conf/mts.properties
Do not directly edit the mts.properties file on individual HyperStore nodes. Instead, if you want to make
changes to the settings in this file, edit the configuration template file mts.properties.erb on the Configuration
Master node:
/etc/cloudian-<version>-puppet/modules/cloudians3/templates/mts.properties.erb
Certain mts.properties.erb properties take their values from settings in common.csv or from settings that you can control through the CMC's Configuration Settings page (Cluster -> Cluster Config -> Configuration Settings). In the mts.properties.erb file
these properties' values are formatted as bracket-enclosed variables, like <%= … %>. In the property doc-
umentation below, the descriptions of such properties indicate "Takes its value from <location>: <setting>; use
that setting instead." The remaining properties in the mts.properties.erb file -- those that are "hard-coded" with
specific values -- are settings that in typical circumstances you should have no need to edit. Therefore in typ-
ical circumstances you should not need to manually edit the mts.properties.erb file.
Specify just the configuration file name, not the full path to the file.
In the background this invokes the Linux text editor vi to display and modify the configuration file. Therefore you
can use the standard keystrokes supported by vi to make and save changes to the file.
IMPORTANT ! If you do make edits to mts.properties.erb, be sure to push your edits to the cluster and
restart the S3 Service to apply your changes. Note that restarting the S3 Service automatically restarts
the Admin Service as well. For instructions see "Pushing Configuration File Edits to the Cluster and
Restarting Services" (page 411).
cassandra.cluster.name
Takes its value from region.csv: cassandra_cluster_name. Typically you should have no need to edit that file.
cassandra.cluster.Hosts
Takes its value from topology.csv. Typically you should have no need to edit that file.
cassandra.cluster.CassandraThriftSocketTimeout
After submitting a request to a Cassandra instance via its Thrift socket, the amount of time in milliseconds to wait for a response before the request times out.
Default = 15000
[Diagram: the place of cassandra.cluster.CassandraThriftSocketTimeout and other timeouts within the S3 Service’s request processing flow.]
cassandra.cluster.MaxActive
Takes its value from common.csv: "cassandra_max_active" (page 441); use that setting instead.
Note The S3 Service, Admin Service, and HyperStore Service each separately maintain their own pool
of connections to the Cassandra data store. The configuration settings in this section are applied
separately to each of the three connection pools. For example, if MaxActive=10, then the S3 Service to
Cassandra, Admin Service to Cassandra, and HyperStore Service to Cassandra connection pools are
each allowed a maximum of 10 simultaneously active connections.
cassandra.cluster.MaxIdle
The maximum allowed number of idle connections in a Cassandra connection pool. Any idle connections in
excess of this limit are subject to being closed. Set to a negative value to disable this limit. Note this control is
applicable only if TimeBetweenEvictionRunsMillis is set to a positive value.
Default = 20
cassandra.cluster.MaxWaitTimeWhenExhausted
If cassandra.cluster.MaxActive has been reached for a target Cassandra host and a thread requires a new
connection to the host, the thread will wait this many milliseconds before returning an error to the client.
Default = 9000
For a diagram showing the place of this timeout within the S3 request processing flow, see
cassandra.cluster.CassandraThriftSocketTimeout (above).
cassandra.cluster.RetryDownedHosts
Whether or not to periodically retry Cassandra hosts that have been detected as being down, using a
background thread. If set to "true", the retry is attempted at configurable interval
cassandra.cluster.RetryDownedHostsDelayInSeconds.
Default = true
cassandra.cluster.RetryDownedHostsQueueSize
Maximum number of downed Cassandra hosts to maintain in the downed host retry queue at the same time. If
multiple Cassandra nodes are down, and if cassandra.cluster.RetryDownedHosts=true, then a queue is
maintained for retrying downed nodes. The cassandra.cluster.RetryDownedHostsQueueSize setting limits the
number of nodes that can be in the retry queue simultaneously.
Default = -1 (unlimited)
cassandra.cluster.RetryDownedHostsDelayInSeconds
The number of seconds to wait between retry attempts for downed hosts. Applicable only if
cassandra.cluster.RetryDownedHosts=true.
Default = 10
cassandra.cluster.Lifo
If true, use a “last in, first out” policy for retrieving idle connections from a pool (use the most recently used idle
connection). If false, use "first in, first out" retrieval of idle connections (use the oldest idle connection).
Default = true
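The two retrieval policies can be sketched with a simple double-ended queue (illustration only, not HyperStore code):

```python
from collections import deque

# Illustrative sketch of the Lifo setting: given a pool of idle connections
# ordered oldest-first, LIFO returns the most recently used connection,
# while FIFO returns the oldest one.
def take_idle(pool: deque, lifo: bool):
    return pool.pop() if lifo else pool.popleft()

idle = deque(["conn-oldest", "conn-middle", "conn-newest"])
assert take_idle(deque(idle), lifo=True) == "conn-newest"   # Lifo = true
assert take_idle(deque(idle), lifo=False) == "conn-oldest"  # Lifo = false
```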
cassandra.cluster.MinEvictableIdleTimeMillis
The minimum time in milliseconds that a connection must be idle before it becomes eligible for closing due to
maximum idle connection limits.
Default = 100000
cassandra.cluster.TimeBetweenEvictionRunsMillis
The interval in milliseconds at which to check for idle connections in a pool, for enforcement of maximum idle
connection limits.
Default = 100000
cassandra.cluster.UseThriftFramedTransport
Whether to use framed transport for Thrift. It is strongly recommended to leave this set to true.
Default = true
cassandra.cluster.AutoDiscoverHosts
Whether or not to periodically check for the presence of new Cassandra hosts within the cluster and to use
these same settings to interface with those new hosts. If set to true, a check for new hosts is performed at
configurable interval cassandra.cluster.AutoDiscoveryDelayInSeconds.
Default = true
cassandra.cluster.AutoDiscoveryDelayInSeconds
Number of seconds to wait between checks for new hosts. Applicable only if
cassandra.cluster.AutoDiscoverHosts=true.
Default = 60
cassandra.cluster.RunAutoDiscoveryAtStartup
Whether or not to perform auto-discovery at cluster start-up. See cassandra.cluster.AutoDiscoverHosts.
Default = false
cassandra.cluster.HostTimeoutCounter
If a Cassandra node returns more than this many host timeout exceptions within an interval of
cassandra.cluster.HostTimeoutWindow, mark the node as temporarily suspended. This setting and the other
HostTimeout* settings below are applicable only if cassandra.cluster.UseHostTimeoutTracker=true.
Default = 3
cassandra.cluster.HostTimeoutWindow
If within an interval of this many milliseconds a Cassandra node returns more than
cassandra.cluster.HostTimeoutCounter host timeout exceptions, mark the node as temporarily suspended.
Default = 1000
cassandra.cluster.HostTimeoutUnsuspendCheckDelay
How often to check the suspended nodes list to see which nodes should be unsuspended, in seconds.
Default = 10
cassandra.cluster.HostTimeoutSuspensionDurationInSeconds
When the periodic check of the suspended nodes list is performed, if a node has been suspended for more
than this many seconds, unsuspend the node and place it back in the available pool.
Default = 30
cassandra.cluster.UseHostTimeoutTracker
Whether to keep track of how often each Cassandra node replies with a host timeout exception, and to
temporarily mark nodes as suspended if their timeout exceptions are too frequent. See the HostTimeout*
setting descriptions above for details.
Default = true
cassandra.cluster.UseSocketKeepalive
Whether to periodically send keep-alive probes to test pooled connections to Cassandra hosts.
Default = true
cassandra.data.directories
Takes its value from common.csv: "cassandra_data_directory" (page 445); use that setting instead.
cassandra.fs.keyspace
Base name of the Cassandra keyspaces in which object metadata is stored. A storage policy ID is appended to
this base name to create a keyspace for a particular storage policy (for example,
UserData_b06c5f9213ae396de1a80ee264092b56). There will be one such keyspace for each storage policy
that you have configured in your system.
Default = UserData
cassandra.jmx.port
Cassandra’s JMX listening port.
Default = 7199
cassandra.tombstone_cleanup_threshold
Takes its value from common.csv: "cassandra_tombstone_cleanup_threshold" (page 446); use that setting
instead.
cassandra.tombstone_gcgrace
Takes its value from common.csv: "cassandra_tombstone_gcgrace" (page 446); use that setting instead.
cloudian.repair.eventqueue.create.interval
When proactive repair runs on a node, it reads from a queue of node-specific object write failure events that
is stored in Cassandra. The object write failure events are timestamped and are bundled together based on the
time interval in which they occurred. The cloudian.repair.eventqueue.create.interval setting controls the time
span (in minutes) used for the bundling. For example, with the default of 60, a new bundle starts every 60
minutes if a node is unavailable for longer than this and write failure events for the node are accumulating.
When the node comes back online, the first automatic run of proactive repair will repair the objects from all the
interval-based bundles except for the most current one (the in-progress interval, such as the current hour if the
default of 60 is being used). That bundle will be processed at the next automatic run of proactive repair. By
default proactive repair runs (if needed) every 60 minutes -- this frequency is configurable via
"hyperstore.proactiverepair.poll_time" (page 461).
Default = 60
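The interval-based bundling described above can be sketched as follows (illustration only, not HyperStore code; the bucket id is simply the event timestamp floored to the configured interval, and the in-progress interval is skipped until the next run):

```python
INTERVAL_MIN = 60  # cloudian.repair.eventqueue.create.interval

def bucket_id(epoch_minutes: int, interval: int = INTERVAL_MIN) -> int:
    # Events that fall within the same interval share a bucket id.
    return epoch_minutes // interval

def repairable_buckets(event_minutes, now_minutes, interval=INTERVAL_MIN):
    # Group failure-event timestamps (in minutes) into interval buckets,
    # then exclude the current (in-progress) interval -- that bundle is
    # handled by the next proactive repair run.
    current = bucket_id(now_minutes, interval)
    buckets = {}
    for m in event_minutes:
        buckets.setdefault(bucket_id(m, interval), []).append(m)
    return {b: evs for b, evs in buckets.items() if b != current}

events = [10, 70, 130, 190]          # minutes since some epoch
ready = repairable_buckets(events, now_minutes=200)
assert sorted(ready) == [0, 1, 2]    # minute 190 is in the current interval
```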
cloudian.cassandra.default.ConsistencyLevel.Read
For the Reports keyspace (service usage data), AccountInfo keyspace (user account information), and
Monitoring keyspace (system monitoring data), the consistency level to require for read operations. The reads
are requested by other HyperStore components such as the S3 Service and the Admin Service.
For general information about system metadata storage, see "System Metadata Replication" (page 97).
You can optionally use a comma-separated list to implement Dynamic Consistency Levels.
Default = LOCAL_QUORUM,ONE
cloudian.cassandra.default.ConsistencyLevel.Write
For the Reports keyspace (service usage data), AccountInfo keyspace (user account information), and
Monitoring keyspace (system monitoring data), the consistency level to require for write operations. The writes
are requested by other HyperStore components such as the S3 Service and the Admin Service.
You can optionally use a comma-separated list to implement Dynamic Consistency Levels.
Default = QUORUM,LOCAL_QUORUM
IMPORTANT ! Since this setting applies to writes, using "ONE" is not recommended.
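As a sketch of how such a comma-separated fallback list could behave, assuming the first listed level is tried and the next is used as a fallback when the first cannot be met (illustration only, not HyperStore code; the replica thresholds assume the common case of replication factor 3, where a quorum requires 2 replicas):

```python
# Hypothetical replica counts needed per consistency level, assuming RF=3.
LEVEL_REPLICAS_NEEDED = {"ONE": 1, "LOCAL_QUORUM": 2, "QUORUM": 2}

def pick_level(levels: str, live_replicas: int) -> str:
    # Try each configured level in order; use the first one that the
    # currently reachable replicas can satisfy.
    for level in levels.split(","):
        if live_replicas >= LEVEL_REPLICAS_NEEDED[level]:
            return level
    raise RuntimeError("no configured consistency level can be met")

assert pick_level("LOCAL_QUORUM,ONE", live_replicas=2) == "LOCAL_QUORUM"
assert pick_level("LOCAL_QUORUM,ONE", live_replicas=1) == "ONE"
```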
cloudian.cassandra.UserData.ConsistencyLevel.HyperStore
The consistency level to require when hsstool is reading or writing to the UserData_<policyid> keyspaces in
order to implement a repair or cleanup operation. If the configured consistency requirement can’t be met, the
operation fails and subsequently retries. The operation will abort if two retry attempts fail to achieve the
required consistency level.
You can optionally use a comma-separated list to implement Dynamic Consistency Levels.
Default = QUORUM
IMPORTANT ! Since this setting applies to writes as well as reads, using "ONE" is not recommended.
cloudian.cassandra.localdatacenter
Takes its value from topology.csv. Typically you should have no reason to edit that file.
cloudian.s3.authorizationV4.singleregioncheck
In Cloudian deployments that consist of only one service region, this setting impacts the system’s handling of
incoming S3 requests that are using AWS Signature Version 4 Authentication:
l If this setting is set to "true", the system will validate that the region name that the client used when
creating the request signature (as indicated in the scope information that the client specifies in the request)
is in fact the region name for your single region. If not, the request will be rejected.
l If set to "false" the system will not validate the region name. Instead, the system calculates and validates
the signature using whatever region name is specified in the request.
Default = false
Note Do not set this property to "true" if the S3 service endpoint (URI) for your one region does not
include a region name string. For example, you can set this property to "true" -- thereby enabling region
name validation -- if your S3 endpoint is s3-tokyo.enterprise.com (where tokyo is the region name), but
not if your S3 endpoint is s3.enterprise.com (which lacks a region name string).
cloudian.s3.authorizationV4.multiregioncheck
In Cloudian deployments that consist of multiple service regions, this setting impacts the system’s handling of
incoming S3 requests that are using AWS Signature Version 4 Authentication:
l If this setting is set to "true", the system will validate that the region name that the client used when
creating the request signature (as indicated in the scope information that the client specifies in the request)
is in fact the region name for the region to which the request has been submitted. If not, the request will
be rejected.
l If set to "false" the system will not validate the region name. Instead, the system calculates and validates
the signature using whatever region name is specified in the request.
Default = false
IMPORTANT ! Do not set this property to "true" if the S3 service endpoints (URIs) for your regions do
not include region name strings. For example, you can set this property to "true" -- thereby enabling
region name validation -- if your S3 endpoints are s3-tokyo.enterprise.com and
s3-osaka.enterprise.com (where tokyo and osaka are region names), but not if your S3 endpoints lack
region name strings.
cloudian.s3.serverside.encryption.keylength
Takes its value from common.csv: "cloudian_s3_aes256encryption_enabled" (page 442); use that setting
instead.
cloudian.s3.qos.enabled
Takes its value from "Enforce Configured QoS Limits for Storage Utilization" in the CMC's Configuration
Settings page (Cluster -> Cluster Config -> Configuration Settings); use that setting instead.
cloudian.s3.qos.rate.enabled
Takes its value from "Enforce Configured QoS Limits for Request Rates and Data Transfer Rates" in the CMC's
Configuration Settings page (Cluster -> Cluster Config -> Configuration Settings); use that setting instead.
cloudian.s3.qos.cache.enabled
When QoS enforcement is enabled, for each S3 request the S3 Service checks current user and group QoS
usage against configured QoS limits. The usage counters and configured limits are stored in the Redis QoS
DB. This setting if set to "true" enables the S3 Service to cache QoS counter and limits data and to check that
cached data first when performing its QoS enforcement role. If there is no cache hit then Redis is checked.
Default = true
Reloadable via JMX (S3 Service’s JMX port 19080; MBean attribute = com.gemini.cloudian.s3 → Configuring
→ Attributes → QosCacheEnabled)
cloudian.s3.qos.cache.expiryms
For the S3 Service’s QoS data cache (if enabled), the cache entry expiry time in milliseconds.
Default = 10000
Reloadable via JMX (S3 Service’s JMX port 19080; MBean attribute = com.gemini.cloudian.s3 → Configuring
→ Attributes → QosCacheExpiryMs)
cloudian.s3.qos.storagewrite.batch.enabled
If set to "true" then the AccountHandler’s updates of storage object and storage bytes counters in Redis are
batched together.
Default = false
Reloadable via JMX (S3 Service’s JMX port 19080; MBean attribute = com.gemini.cloudian.s3 → Configuring
→ Attributes → QosStorageWriteEnabled)
cloudian.s3.qos.storagewrite.intervalms
If batching of storage counter updates is enabled, the batch interval in milliseconds.
Default = 60000
Reloadable via JMX (S3 Service’s JMX port 19080; MBean attribute = com.gemini.cloudian.s3 → Configuring
→ Attributes → QosStorageWriteIntervalMs)
cloudian.s3.qos.maxdelay
If a PUT Object or PUT Object Copy operation that is overwriting an existing object takes more than this many
milliseconds to complete, then before the system updates the user’s Storage Bytes (SB) count it re-checks the
existing object’s metadata to ensure that the calculated size differential between the existing object and the
new object is based on fresh metadata. This configurable behavior is intended to reduce the likelihood of
erroneous SB counts resulting from race conditions wherein multiple client requests are overwriting the same
object at the same time.
The context is that in the case of an existing object being overwritten by a new instance of the object, the
system updates the user’s SB count by calculating the delta between the original object’s size (as indicated by its
metadata) and the new object’s size and then applying that difference to the user’s SB count. It’s important
therefore that the calculated delta be based on up-to-date metadata for the object that’s being overwritten,
even in the case where the writing of the new object takes a long time to complete.
Default = 60000
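The SB delta calculation described above can be sketched as follows (illustration only, not HyperStore code):

```python
# On overwrite, the user's Storage Bytes count is adjusted by the size
# difference between the new object and the object it replaces.
def updated_sb(current_sb: int, old_size: int, new_size: int) -> int:
    return current_sb + (new_size - old_size)

# Overwriting a 5 MiB object with an 8 MiB one adds 3 MiB to the count.
assert updated_sb(100 * 1024**2, 5 * 1024**2, 8 * 1024**2) == 103 * 1024**2
```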
cloudian.s3.qos.bucketcounter.enabled
Takes its value from common.csv: "s3_perbucket_qos_enabled" (page 442); use that setting instead.
cloudian.s3.metadata.cache.enabled
If set to true, bucket metadata is cached by the S3 Service. When S3 request processing requires bucket
metadata, the cache is checked first. If there is no cache hit then the needed metadata is retrieved from Redis,
and is cached for subsequent use.
On an ongoing basis, the system detects if the corresponding source metadata in Redis changes and then
automatically invalidates the cached metadata.
Default = true
Reloadable via JMX (S3 Service’s JMX port 19080; MBean attribute = com.gemini.cloudian.s3 → Configuring
→ Attributes → MDCacheEnabled)
cloudian.s3.group.cache.enabled
If set to true, then user group metadata is cached by the S3 Service. When S3 request processing requires
group status, the cache is checked first. If there is no cache hit then the needed metadata is retrieved from
Redis, and is cached for subsequent use.
On an ongoing basis, the system detects if the corresponding source metadata in Redis changes and then
automatically invalidates the cached metadata.
Default = true
Reloadable via JMX (S3 Service’s JMX port 19080; MBean attribute = com.gemini.cloudian.s3 → Configuring
→ Attributes → GroupCacheEnabled)
cloudian.s3.credentials.cache.enabled
If set to true, users' active security credentials are cached by the S3 Service. When S3 request processing
requires a user’s credentials, the cache is checked first. If there is no cache hit then the needed credentials
are retrieved from Redis, and are cached for subsequent use.
On an ongoing basis, the system detects if the corresponding source credentials in Redis change and then
automatically invalidates the cached credentials.
Default = true
Reloadable via JMX (S3 Service’s JMX port 19080; MBean attribute = com.gemini.cloudian.s3 → Configuring
→ Attributes → GroupCacheEnabled)
cloudian.s3.user.cache.enabled
If set to true, user metadata is cached by the S3 Service. When S3 request processing requires a user’s status
and/or display name, the cache is checked first. If there is no cache hit then the needed metadata is retrieved
from Redis, and is cached for subsequent use.
On an ongoing basis, the system detects if the corresponding source metadata in Redis changes and then
automatically invalidates the cached metadata.
Default = true
Reloadable via JMX (S3 Service’s JMX port 19080; MBean attribute = com.gemini.cloudian.s3 → Configuring
→ Attributes → UserCacheEnabled)
cloudian.s3.autoinvalidateiamcache
Takes its value from common.csv: "cloudian_s3_autoinvalidateiamcache" (page 442); use that setting
instead.
cloudian.s3.regions
Takes its value from common.csv: "regions" (page 419); use that setting instead.
cloudian.s3.home_region
Takes its value from region.csv: region_name; use that setting instead.
cloudian.s3.default_region
Takes its value from common.csv: "default_region" (page 418); use that setting instead.
cassandra.cluster.name.<region>
Takes its value from region.csv: cassandra_cluster_name. Typically you should have no reason to edit that set-
ting.
cassandra.cluster.hosts.<region>
Takes its value from region.csv: cassandra_cluster_hosts. Typically you should have no reason to edit that
setting.
cloudian.s3.domain.<region>
Takes its value from region.csv: s3_domain_and_port. Typically you should have no reason to edit that setting.
cloudian.s3.ssl_domain.<region>
Domain and SSL listening port for one of your regions, in format <FQDN>:<port>. This is used if your system
has multiple service regions and you have configured your S3 service to use SSL. For example, if a GET
Object request comes in to the S3 service in region1, and the object is stored in region2, and region2 is
configured to use SSL for its S3 service — then the S3 service in region1 needs to know the domain and SSL port
for the S3 service in region2, so that it can specify a correct Location header in the Redirect message it returns
to the requesting client.
If you have a multi-region HyperStore deployment, the installation script automatically configures the
cloudian.s3.ssl_domain.<region> setting for each of your regions, including the local region. For example:
cloudian.s3.ssl_domain.region1 = s3.region1.mycompany.com:443
cloudian.s3.ssl_domain.region2 = s3.region2.mycompany.com:443
cloudian.s3.ssl_domain.region3 = s3.region3.mycompany.com:443
If you have only one region in your HyperStore deployment, then this setting is not relevant to your system.
cloudian.s3.website_endpoint
Takes its value from region.csv: cloudians3_website_endpoint.
To change this, use the HyperStore "Installer Advanced Configuration Options" (page 407).
cloudian.util.dns.resolver
Method used to resolve foreign buckets to the correct region, in support of the S3 LocationConstraint feature.
Currently only one option is supported:
Default = com.gemini.cloudian.util.dns.RedirectResolver
cloudian.publicurl.port
Takes its value from common.csv: ld_cloudian_s3_port, which is controlled by the installer. To change S3
listening ports use the installer's Advanced Configuration options.
cloudian.publicurl.sslport
Takes its value from common.csv: ld_cloudian_s3_ssl_port, which is controlled by the installer. To change S3
listening ports use the installer's Advanced Configuration options.
cloudian.s3.server
HTTP Server header value to return in responses to S3 requests. If you want no Server header value returned
in responses to S3 requests, uncomment this setting and set it to empty.
cloudian.s3.tmpdir
The directory in which to temporarily store large files in order to reduce memory usage.
cloudian.fs.read_buffer_size
When interacting with the file storage system in Cassandra, the size of the S3 Service’s and Admin Service’s
read buffer, in bytes. A larger read buffer can enhance performance but will also consume more memory.
Default = 65536
Reloadable via JMX (S3 Service’s JMX port 19080; MBean attribute = com.gemini.cloudian.s3 → Configuring
→ Attributes → ReadBufferSize)
cloudian.s3.putobject.max_size
Takes its value from the "Put Object Maximum Size (Bytes)" setting on the CMC's Configuration Settings page
(Cluster -> Cluster Config -> Configuration Settings); use that setting instead.
cloudian.s3.getbucket.max_num_of_keys
When performing an S3 GET Bucket operation (which returns metadata about the objects in the bucket
specified in the request), the maximum number of objects to list in the response. If the client request sets a
"max-keys" parameter, then the lower of the client-specified value and the cloudian.s3.getbucket.max_num_of_keys
value is used.
Default = 1000
Reloadable via JMX (S3 Service’s JMX port 19080; MBean attribute = com.gemini.cloudian.s3 → Configuring
→ Attributes → GetBucketMaxKeys)
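The "lower of the two values" rule above can be sketched as follows (illustration only, not HyperStore code):

```python
# The effective listing limit is the lower of the client's max-keys value
# and the configured cloudian.s3.getbucket.max_num_of_keys.
def effective_max_keys(client_max_keys, configured_max=1000):
    if client_max_keys is None:          # client did not set max-keys
        return configured_max
    return min(client_max_keys, configured_max)

assert effective_max_keys(500) == 500    # client value below the cap
assert effective_max_keys(5000) == 1000  # capped by the configured value
assert effective_max_keys(None) == 1000  # no client value: use the cap
```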
cloudian.s3.max_user_buckets
Takes its value from the "Maximum Buckets Per User" setting on the CMC's Configuration Settings page
(Cluster -> Cluster Config -> Configuration Settings); use that setting instead.
cloudian.s3.delimiter_regex
Regular expression indicating allowed delimiters in getBucketList objects.
Default = .+
Reloadable via JMX (S3 Service’s JMX port 19080; MBean attribute = com.gemini.cloudian.s3 → Configuring
→ Attributes → DelimiterRegex)
cloudian.s3.multipart.maxparts
Takes its value from the "Multipart Upload Maximum Parts" setting on the CMC's Configuration Settings page
(Cluster -> Cluster Config -> Configuration Settings); use that setting instead.
cloudian.s3.multipart.minpartsize
In an S3 Multipart Upload submitted to HyperStore, the required minimum size per part, excluding the last part.
Expressed in number of bytes. The operation will fail if a part other than the last part is smaller than this many
bytes. For example, if you set this to 5242880 (5MB, which is the Amazon S3 default for minimum part size)
then in a Multipart Upload each part uploaded must be at least 5MB in size, except for the last part which is
allowed to be as small as necessary to complete the object.
The HyperStore default of 1 byte essentially places no restriction on minimum part size.
Default = 1
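The minimum-part-size rule can be sketched as follows (illustration only, not HyperStore code):

```python
# Every part except the last must be at least
# cloudian.s3.multipart.minpartsize bytes.
def parts_ok(part_sizes, min_part_size=5 * 1024 * 1024):
    return all(size >= min_part_size for size in part_sizes[:-1])

# With a 5 MiB minimum: two full parts plus a 1-byte final part is fine...
assert parts_ok([5 * 1024**2, 5 * 1024**2, 1])
# ...but an undersized middle part fails the upload.
assert not parts_ok([5 * 1024**2, 1024, 1])
```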
cloudian.s3.unsupported
Comma-separated list of Amazon S3 URI parameters that the HyperStore system does not support. This list
applies across HTTP methods and across S3 resource types. In response to requests that include an
unsupported URI parameter, the HyperStore system will return 501, Not Implemented.
Default = accelerate,requestPayment,analytics,metrics,select,notification
cloudian.util.ntp.Path
Path to ntpstat. Setting this value is required only if ntpstat is not in the environment PATH.
cloudian.util.ntp.MaximumSynchronizationDistance
Maximum system clock skew to allow when servicing S3 requests that require the S3 Service to generate an
object versionId, in milliseconds.
l If this setting is set to a positive value, an S3 Service node that is processing S3 requests that require
generating a versionId will perform an ntpstat check. If the node’s system clock is skewed by more than
cloudian.util.ntp.MaximumSynchronizationDistance milliseconds, the S3 Service rejects the S3 request
with a "503 Service Unavailable" error response. The frequency of the ntpstat check can be limited by
using the cloudian.util.ntp.CheckIntervalMillis setting.
l If this setting is set to 0, the ntpstat check is not performed when processing S3 requests that require
generating a versionId.
Default = 0
cloudian.util.ntp.CheckIntervalMillis
Minimum time between ntpstat checks, in milliseconds.
l If this setting is set to a positive value, then when the S3 Service is processing S3 requests that require
generating a versionId, an ntpstat check will be performed only if cloudian.util.ntp.CheckIntervalMillis
milliseconds or more have passed since the last ntpstat check was performed.
l If this setting is set to 0, an ntpstat check is performed with every S3 request that requires generating a
versionId.
Default = 60000
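The throttling behavior of the two ntp settings above can be sketched as follows (illustration only, not HyperStore code):

```python
# Perform an ntpstat check only if CheckIntervalMillis has elapsed since
# the previous check; an interval of 0 means check on every request.
class NtpCheckThrottle:
    def __init__(self, interval_ms: int):
        self.interval_ms = interval_ms
        self.last_check_ms = None

    def should_check(self, now_ms: int) -> bool:
        if (self.interval_ms == 0 or self.last_check_ms is None
                or now_ms - self.last_check_ms >= self.interval_ms):
            self.last_check_ms = now_ms
            return True
        return False

t = NtpCheckThrottle(interval_ms=60000)
assert t.should_check(0)          # first request always checks
assert not t.should_check(30000)  # within the interval: skipped
assert t.should_check(60000)      # interval elapsed: checked again
```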
cloudian.s3.batch.delete.delay
When executing the batch processing to purge deleted objects from disk (see
cloudian.delete.queue.poll.interval below for background information), the number of milliseconds to pause in
between purges of individual objects.
If cloudian.s3.batch.delete.delay is set to 0 (as it is by default), then when the batch processing job is running
there is no delay between individual object purges.
If you wish you can use the cloudian.s3.batch.delete.delay property to "throttle" the execution of the deleted
object batch processing job (to slow it down so as to reduce its resource demands).
Default = 0
cloudian.delete.queue.poll.interval
When S3 requests for object deletion are received and successfully processed by HyperStore, the system
immediately marks the objects as having been deleted, and stores this object deletion flag in a queue in the
Metadata DB. But the actual purging of object data from disk does not occur until a batch processing job runs,
to physically purge objects that have been marked as deleted.
The cloudian.delete.queue.poll.interval property sets the interval at which to run the batch processing of
queued object deletes, in number of minutes. With the default of 60, the object deletion batch processing job is
run once per hour on each node. For performance reasons the kick-off times are staggered across the nodes --
so, all nodes run the job once per hour (by default configuration) but the jobs do not all kick off at the same
time. However because a batch processing job takes some time to complete, there may be overlap such that
batch processing jobs are running on multiple nodes concurrently. You can limit the degree of concurrency
with the cloudian.delete.dc.instances property (below).
Also for performance reasons, when a batch processing job is being run on a node, that node manages the
batch but the work of implementing the purge actions contained in that batch is spread across the cluster.
Note This property can be set to "0" to disable batch processing of objects marked for deletion, but if
you do this it should only be temporarily, and in consultation with Cloudian Support.
Default = 60
cloudian.delete.dc.instances
Within a data center, this is the maximum number of batch delete processing jobs that will be allowed to run
concurrently. For example, with the default of 2, if batch delete processing jobs are running on two nodes in a
data center, and then the time arrives for a batch delete processing job to kick off on another node in that same
data center (based on cloudian.delete.queue.poll.interval), the kick-off of that job will wait until one of the
currently running batch delete processing jobs completes -- so that no more than two jobs are running
concurrently per data center.
Since the work of executing the physical deletes associated with a batch delete processing job is spread across
the cluster -- while the batch processing jobs themselves are managed by the particular nodes on which they're
running -- this setting has the effect of limiting the overall workload that batch delete processing places on your cluster.
Default = 2
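The per-data-center concurrency cap can be sketched with a bounded semaphore (illustration only, not HyperStore code):

```python
import threading

# A semaphore sized to cloudian.delete.dc.instances makes a newly due
# batch delete job wait until one of the running jobs finishes.
dc_job_slots = threading.BoundedSemaphore(2)  # cloudian.delete.dc.instances

def run_batch_delete_job(purge_batch):
    with dc_job_slots:       # blocks while two jobs are already running
        for obj in purge_batch:
            pass             # actual purge work is spread across the cluster
```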
cloudian.delete.purge.bucket.threads
Maximum number of threads to devote to purging a bucket (while executing the purging associated with the
Admin API call POST /bucketops/purge). For details on this API call see the "bucketops" section of the Cloudian
HyperStore Admin API Reference.
Default = 2
cloudian.s3.client.ConnectionTimeout
When interfacing with the S3 Service in a different Cloudian HyperStore service region, the connection
establishment timeout in milliseconds.
Default = 2000
Reloadable via JMX (S3 Service’s JMX port 19080; MBean attribute = com.gemini.cloudian.s3 → Configuring
→ Attributes → S3ClientConnectionTimeout)
cloudian.s3.client.SocketTimeout
When interfacing with the S3 Service in a different Cloudian HyperStore service region, the request processing
timeout in milliseconds. When a request is sent over an open connection, if a complete response is not
received within this interval, the request times out and the connection is closed.
Default = 50000
Reloadable via JMX (S3 Service’s JMX port 19080; MBean attribute = com.gemini.cloudian.s3 → Configuring
→ Attributes → S3ClientSocketTimeout)
cloudian.s3.client.MaxErrorRetry
When HyperStore's native S3 client is interfacing with a remote S3 Service (such as during auto-tiering), the
maximum number of times to retry sending a request that encounters a temporary error response from the
remote service (such as a 503 error).
For auto-tiering, after this many retries the request will be put back in the auto-tiering queue and tried again at
the next running of the auto-tiering cron job.
Default = 3
Reloadable via JMX (S3 Service’s JMX port 19080; MBean attribute = com.gemini.cloudian.s3 → Configuring
→ Attributes → S3ClientMaxErrorRetry)
cloudian.s3.client.UserAgent
When interfacing with the S3 Service in a different Cloudian HyperStore service region, the value of the first
part of the HTTP User-Agent header, where the whole header is "<ConfigValue>, <AWS SDK Version> <OS
Version> <JDK Version>". For example, if you set this setting to "Agent99", then the resulting User-Agent
header will be "Agent99, <AWS SDK Version> <OS Version> <JDK Version>", with the latter three values
being populated automatically by the system.
Reloadable via JMX (S3 Service’s JMX port 19080; MBean attribute = com.gemini.cloudian.s3 → Configuring
→ Attributes → S3ClientUserAgent)
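The header composition described above can be sketched as follows (illustration only, not HyperStore code; the SDK, OS, and JDK values are placeholders for the values the system fills in automatically):

```python
# The configured value becomes the first part of the User-Agent header;
# the remaining parts are populated by the system.
def user_agent(config_value,
               sdk="aws-sdk-java/1.x", os="Linux/5.x", jdk="java/11"):
    return f"{config_value}, {sdk} {os} {jdk}"

assert user_agent("Agent99") == "Agent99, aws-sdk-java/1.x Linux/5.x java/11"
```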
cloudian.auditlog.enabled
Do not edit this setting.
reports.raw.ttl
Time-to-live for "raw" S3 usage data in the Reports keyspace in Cassandra, in seconds. Raw service usage
data will be automatically deleted this many seconds after its creation. This is set by the system -- do not
manually edit this setting.
This applies to per-bucket usage data (if you have enabled per-bucket usage tracking) as well as to per-user
and per-group usage data.
Default =
l 604800 (seven days), if the "Track/Report Usage for Request Rates and Data Transfer Rates" setting in
the CMC's Configuration Settings page (Cluster -> Cluster Config -> Configuration Settings) is set to
"false", as it is by default
l 86400 (one day) if the "Track/Report Usage for Request Rates and Data Transfer Rates" setting is set to
"true"
For an overview of how the HyperStore system tracks service usage by groups, users, and buckets, see
"Usage Reporting Feature Overview" (page 165).
reports.rolluphour.ttl
Time-to-live for hourly roll-up S3 usage data in the Reports keyspace in Cassandra, in seconds. Hourly roll-up
data will be automatically deleted this many seconds after its creation.
This applies to per-bucket usage data (if you have enabled per-bucket usage tracking) as well as to per-user
and per-group usage data.
IMPORTANT ! This hourly rollup data is the basis for generating billing reports. After hourly rollup data
is deleted it is no longer available for generating billing reports.
reports.rollupday.ttl
Time-to-live for daily roll-up S3 usage data in the Reports keyspace in Cassandra, in seconds. Daily roll-up
data will be automatically deleted this many seconds after its creation.
This applies to per-bucket usage data (if you have enabled per-bucket usage tracking) as well as to per-user
and per-group usage data.
reports.rollupmonth.ttl
Time-to-live for monthly roll-up S3 usage data in the Reports keyspace in Cassandra, in seconds. Monthly roll-
up data will be automatically deleted this many seconds after its creation.
This applies to per-bucket usage data (if you have enabled per-bucket usage tracking) as well as to per-user
and per-group usage data.
reports.auditdata.ttl
Time-to-live for audit data in the Reports keyspace in Cassandra, in seconds. Audit data will be automatically
deleted this many seconds after its creation.
events.acknowledged.ttl
Time-to-live for acknowledged system events in the Monitoring keyspace in Cassandra, in seconds. Acknow-
ledged events will be automatically deleted this many seconds after an administrator acknowledges them.
Unacknowledged events are not subject to this timer. The time-to-live countdown on an event record does not
begin until an administrator acknowledges the event through the CMC’s Alerts page or Node Status page.
If you want alerts to be deleted immediately after they have been acknowledged, set this property to 1.
Note As soon as an alert occurs a record of it is written to the Smart Support log (which by default is
uploaded to Cloudian Support once a day). So the Smart Support record of alerts persists even after
the alerts have been deleted from your system. For more information on Smart Support see "Smart
Support and Diagnostics Feature Overview" (page 81).
monitoring.ophistory.ttl
Time-to-live for per-node data repair and cleanup operation status summaries in Cassandra's Monitoring keyspace, in seconds. Each repair and cleanup status summary will be automatically deleted after it has been stored for this many seconds.
Chapter 7. Reference
For as long as repair and cleanup status summaries are stored in the Monitoring keyspace, they can be
retrieved on a per-node basis by using the hsstool opstatus command, with the "-q history" option. This
returns the operation history of a specified node. The monitoring.ophistory.ttl property controls the maximum
length of that retrievable operation history -- for example with monitoring.ophistory.ttl at its default value, for
each HyperStore node you can retrieve the history of its repair and cleanup operations from the past 90 days.
Repair and cleanup operation status information older than that will be deleted.
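For example, to retrieve the stored operation history for a particular node (the -h option for specifying the target host is shown here as an assumption; check hsstool's usage output in your environment):

```
hsstool -h cloudian-node1 opstatus -q history
```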
The mts.properties file's "Usage" section has settings for tuning the performance of usage data repair operations.
usage.repair.row.size
When querying Cassandra for the object metadata associated with individual service users, the maximum number of users to retrieve per query.
Default = 1000
usage.repair.column.size
When querying Cassandra for the object metadata associated with individual service users, for each user the
maximum number of objects for which to retrieve metadata per query.
Default = 1000
usage.repair.maxdirtyusers
Maximum number of "dirty" users for whom to verify (and if necessary repair) usage data during a single run of
the Admin API POST /usage/repair/dirtyusers operation (for details of this API call see the "usage" section in
the Cloudian HyperStore Admin API Reference).
Default = 1000
usage.rollup.userchunk.size
When performing rollups of usage data, the maximum number of users to take into memory at a time.
Default = 1000
usage.rollup.usagechunk.size
When performing rollups of usage data, the maximum number of usage records to take into memory at a time.
Default = 1000
usage.rollup.hour.maxretry
Each time the system performs an hourly rollup of usage data for the hour that just ended, it will check whether
the hourly rollup data from the preceding hours exists (as it should, unless relevant system services have been
down or unreachable). If the hourly rollup data from preceding hours is missing, then the system retries processing the hourly rollups for those hours that are missing the hourly rollup data. The usage.rollup.hour.maxretry property sets a maximum on the number of preceding hours to check on and (if needed)
perform a retry for.
For example, suppose usage.rollup.hour.maxretry=6. With this setting, if the system is for example about to per-
form the hourly rollup from the 10th hour of the day, it will first check that hourly rollup data exists for the 9th,
8th, 7th, 6th, 5th, and 4th hours of the day -- and if any of those hourly rollups are missing, the system will try
again to execute those hourly rollups. After doing so the system will then perform the hourly rollup of the usage
data from the 10th hour of the day.
Default = 24
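The retry-window check described above can be sketched as follows (the helper name and data shapes are illustrative, not the actual HyperStore implementation):

```python
def hours_to_retry(current_hour, existing_rollups, maxretry=24):
    """Return the preceding hours (oldest first) whose hourly rollup data is
    missing and should therefore be retried before rolling up current_hour."""
    window = range(max(0, current_hour - maxretry), current_hour)
    return [h for h in window if h not in existing_rollups]
```

With maxretry=6 and the 10th hour of the day about to be rolled up, the function checks hours 4 through 9, matching the example above.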
cloudian.s3.usagerates.enabled
Takes its value from the "Track/Report Usage for Request Rates and Data Transfer Rates" setting on the
CMC's Configuration Settings page (Cluster -> Cluster Config -> Configuration Settings); use that setting
instead.
bucketstats.enabled
Takes its value from common.csv: "bucketstats_enabled" (page 454); use that setting instead.
cloudian.s3.redis.retry
The interval in seconds at which the S3 Service will retry a Redis node that has been unresponsive. If the S3
Service finds a Redis node to be unresponsive the S3 Service will temporarily remove that node from the list of
Redis nodes that are available to service requests. At an interval of cloudian.s3.redis.retry seconds the S3 Ser-
vice will retry the Redis node. If it's found to be responsive, the node is added back to the S3 Service's list of
available Redis nodes.
This setting is applicable to Redis Credentials nodes and Redis QoS nodes.
Default = 30
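The set-aside-and-retry behavior described above can be sketched like this (class and method names are illustrative, not the actual S3 Service code): an unresponsive Redis node is removed from the available list and re-probed every retry interval; once responsive again it rejoins the list.

```python
class RedisNodePool:
    def __init__(self, nodes, retry_interval=30):
        self.available = list(nodes)
        self.unavailable = {}            # node -> time of next retry attempt
        self.retry_interval = retry_interval

    def mark_unresponsive(self, node, now):
        # Temporarily remove an unresponsive node from service
        if node in self.available:
            self.available.remove(node)
            self.unavailable[node] = now + self.retry_interval

    def tick(self, now, ping):
        """Retry any set-aside node whose interval has elapsed; `ping` is a
        health-check callable returning True if the node responds."""
        for node, due in list(self.unavailable.items()):
            if now >= due and ping(node):
                del self.unavailable[node]
                self.available.append(node)
```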
redis.monitor.subscription.check
Takes its value from common.csv: "redis_monitor_subscription_check" (page 443); use that setting instead.
Note The HyperStore Redis Monitor Service monitors Redis cluster health and implements automatic
failover of the Redis master node role. For Redis Monitor redundancy, it runs on two of your S3 Service
/ Admin Service nodes, with the Monitor configured as primary on one node and as backup on the
other node.
redis.monitor.primary.pollInterval
The interval at which the backup Redis Monitor instance should check on the health of the primary instance, in
seconds. If the primary Redis Monitor instance is unresponsive, the backup instance takes over the monitoring
duties.
Default = 5
redis.credentials.cluster.pollInterval
Interval at which the Redis Monitor application should check the health of the Redis Credentials servers, in
seconds. At this interval, the Redis Monitor also checks the S3 Service / Admin Service nodes via JMX to
ensure that they are configured to point to the current Redis Credentials master, and updates their con-
figuration if necessary.
Default = 5
redis.credentials.cluster.client.request.waittime
Maximum time for the Redis Monitor to wait for a JMX connection attempt to an S3 Service / Admin Service node
to complete, in seconds. If the connection attempt doesn’t complete (with a success or failure result) within this
interval, the Redis Monitor marks the S3 Service / Admin Service node as DOWN and writes an INFO level mes-
sage to cloudian-redismon.log. Meanwhile, the connection attempt will continue until completion, and sub-
sequently polling of the S3 Service / Admin Service node will resume at the regular polling interval.
Default = 3
redis.monitor.alert.limit
This setting limits the amount of logging that the Redis Monitor's monitoring for DC partition and "split brain" conditions writes to cloudian-redismon.log.
Default = 100
redis.monitor.skip.dc.monitoring
For information about this setting please see "disable dc partition monitoring" (page 402) and "enable dc
partition monitoring" (page 403).
redis.monitor.skip.brain.monitoring
For information about this setting please see "disable split brain monitoring" (page 404) and "enable split
brain monitoring" (page 405).
credentials.user.max
Maximum allowed number of S3 credentials per HyperStore user. Each credential is a key pair consisting of a
public key (access key) and a private key (secret key). These credentials enable a HyperStore user to access
the HyperStore S3 storage system through either the CMC or a third party S3 client.
Inactive credentials count toward this maximum as well as active credentials. Credentials can be created,
made active or inactive, and deleted, through either the CMC or the Admin API.
Note If a HyperStore user creates IAM users under their HyperStore account and creates S3 cre-
dentials for those IAM users, the IAM users' credentials do not count toward the HyperStore user's max-
imum allowed number of S3 credentials. IAM user credentials are limited separately, by the
credentials.iamuser.max property.
Default = 5
credentials.iamuser.max
Maximum allowed number of S3 credentials per IAM user. Each credential is a key pair consisting of a public
key (access key) and a private key (secret key). These credentials enable an IAM user to access the Hyper-
Store S3 storage system through a third party S3 client. IAM users cannot access the HyperStore S3 storage
system through the CMC.
Inactive credentials count toward this maximum as well as active credentials. Credentials can be created,
made active or inactive, and deleted, through either the CMC's IAM User section (which is accessible only to
HyperStore group administrators or regular users -- not system administrators) or the HyperStore imple-
mentation of the Amazon IAM API.
Default = 2
keystore.pass
Password for the Java keystore file /opt/cloudian/conf/.keystore. This keystore file stores the Admin Service’s
pre-generated, self-signed, RSA-based public and private keys for SSL.
Default = adminpass
secure.transact.alias
Alias identifying the Admin Service’s certificate entry within the keystore.
Default = secure
secure.transact.pass
Password to access the certificate entry that’s identified by secure.transact.alias.
Default = private
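If you need to inspect this keystore, the JDK's standard keytool utility can list its entries; for example, using the default keystore password shown above:

```
keytool -list -keystore /opt/cloudian/conf/.keystore -storepass adminpass
```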
admin.auth.realm
Takes its value from common.csv: "admin_auth_realm" (page 432); use that setting instead.
admin.auth.enabled
Takes its value from common.csv: "admin_auth_enabled" (page 432); use that setting instead.
admin.secure
Takes its value from common.csv: "admin_secure" (page 432); use that setting instead.
admin.user.password.length
Maximum allowed character length for users' Cloudian Management Console login passwords.
Default = 64
user.password.min.length
Takes its value from common.csv: "user_password_min_length" (page 433); use that setting instead.
user.password.dup.char.ratio.limit
Takes its value from common.csv: "user_password_dup_char_ratio_limit" (page 433); use that setting
instead.
user.password.unique.generations
Takes its value from common.csv: "user_password_unique_generations" (page 433); use that setting
instead.
user.password.rotation.graceperiod
Takes its value from common.csv: "user_password_rotation_graceperiod" (page 433); use that setting
instead.
user.password.rotation.expiration
Takes its value from common.csv: "user_password_rotation_expiration" (page 434); use that setting instead.
user.password.lock.enabled
Takes its value from common.csv: "user_password_lock_enabled" (page 434); use that setting instead.
user.password.lock.durationsec
Takes its value from common.csv: "user_password_lock_durationsec" (page 434); use that setting instead.
user.password.lock.maxfailedattempts
Takes its value from common.csv: "user_password_lock_maxfailedattempts" (page 435); use that setting
instead.
awsmms.proxy.host
This setting is obsolete and will be removed from a future HyperStore release. Do not use.
awsmms.proxy.port
This setting is obsolete and will be removed from a future HyperStore release. Do not use.
admin.whitelist.enabled
Takes its value from common.csv: "admin_whitelist_enabled" (page 447); use that setting instead.
admin.allow_delete_users_with_buckets
Takes its value from common.csv: "allow_delete_users_with_buckets" (page 447); use that setting instead.
hyperstore.endport
The HyperStore Service listening port to which the S3 Service will submit data operation requests.
Default = 19090 (the value is set elsewhere within the Puppet manifest structure)
hyperstore.maxthreads.read
Takes its value from common.csv: "hyperstore.maxthreads.read" (page 425); use that setting instead.
hyperstore.maxthreads.write
Takes its value from common.csv: "hyperstore.maxthreads.write" (page 425); use that setting instead.
hyperstore.maxthreads.repair
Takes its value from common.csv: "hyperstore.maxthreads.repair" (page 424); use that setting instead.
hyperstore.maxthreads.delete
Maximum number of simultaneous client threads for one S3 Service node to use on deletes from the Hyper-
Store File System.
Default = 10
hyperstore.snd.buffer
Socket send buffer size from S3 nodes to HyperStore nodes.
Default = 0
hyperstore.rcv.buffer
Socket receive buffer size from S3 nodes to HyperStore nodes.
Default = 0
hyperstore.timeout
Takes its value from common.csv: "hyperstore_timeout" (page 424); use that setting instead.
hyperstore.connection.timeout
Takes its value from common.csv: "hyperstore_connection_timeout" (page 424); use that setting instead.
hyperstore.maxtotalconnections
Takes its value from common.csv: hyperstore_maxtotalconnections, which is controlled by a performance
optimization script that runs automatically when you install your cluster or resize your cluster.
hyperstore.maxperrouteconnections
Takes its value from common.csv: hyperstore_maxperrouteconnections, which is controlled by a performance
optimization script that runs automatically when you install your cluster or resize your cluster.
cassandra.range_repair.max.waiting.time.in_sec
During a Cassandra repair operation each of a node's token ranges are repaired one at a time, sequentially.
The system will wait for a maximum of this many seconds for repair of a range to complete. If repair of a range
times out by not being completed within this many seconds, the system moves on to repair the next range in
sequence. Subsequently, after other ranges are repaired, repair of the range that timed out will be retried a
maximum of 3 times. If it still cannot be repaired, the Cassandra repair as a whole will return a Failed status.
Default = 7200
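The sequential repair-and-retry flow described above can be sketched as follows (the helper names are assumptions, not the actual Cassandra repair code): each token range gets one initial attempt, and ranges that exceed the per-range timeout are retried, after the others, up to 3 more times.

```python
def repair_all_ranges(ranges, repair_seconds, max_wait_sec=7200, max_retries=3):
    """repair_seconds(range) -> how long the repair attempt takes; an attempt
    that exceeds max_wait_sec counts as a timeout for that range."""
    pending = list(ranges)
    for _ in range(1 + max_retries):         # initial pass plus up to 3 retries
        # Keep only the ranges whose repair attempt timed out this pass
        pending = [r for r in pending if repair_seconds(r) > max_wait_sec]
        if not pending:
            return "Success"
    return "Failed"                          # some range never completed
```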
phonehome.enabled
The HyperStore Data Collector collects and stores system-wide diagnostic data for your HyperStore system on
an ongoing basis. By default this diagnostics data is automatically uploaded to Cloudian Support via the S3
protocol once a day, as part of the Smart Support feature. For more information about this feature -- including
what data gets sent to Cloudian Support and how they use it for your benefit -- see "Smart Support and Dia-
gnostics Feature Overview" (page 81).
l If you want diagnostics data automatically uploaded to Cloudian Support via S3 each day, there's
nothing you need to do -- just leave phonehome.enabled set to "true" and leave the setting
common.csv: phonehome_uri set to the Cloudian Support S3 URI. This is the default behavior and is
the recommended behavior.
l If you want diagnostics data automatically uploaded each day to an S3 destination other than
Cloudian Support, leave phonehome.enabled set to "true" and set the common.csv: phonehome_uri
setting to the desired S3 URI (and also set common.csv: phonehome_{bucket, access_key, secret_
key}).
l If you do not want diagnostics data automatically uploaded to an S3 destination each day, set
phonehome.enabled to "false". This is not recommended.
Even if you choose not to automatically upload the daily diagnostic data to an S3 destination -- that is, even if
you set phonehome.enabled to "false" -- a diagnostics data file is still generated locally and stored under /var/log/cloudian on the node on which the HyperStore Monitoring Data Collector runs. (To see which of your
nodes is the Data Collector node, go to the CMC's Cluster Information page [Cluster -> Cluster Config ->
Cluster Information] and check for the identity of the "System Monitoring/Cronjob Primary Host".) The "live" dia-
gnostics log -- which is recording the current day’s performance statistics -- is named diagnostics.csv. The
rolled up daily diagnostic packages from previous days -- which include prior days' diagnostics.csv files and
also various application and transaction logs -- are named diagnostics_<date/time>_<version>_<region>.tgz.
Note The deletion of old diagnostics packages is managed by Puppet, as configured by "cleanup_directories_byage_withmatch_timelimit" (page 421) in common.csv. By default Puppet deletes the dia-
gnostics packages after they are 15 days old. This presumes that you have left the Puppet daemons
running in your HyperStore cluster, which is the default behavior. If you do not leave the Puppet dae-
mons running the diagnostics logs will not be automatically deleted. In that case you should delete the
old packages manually, since otherwise they will eventually consume a good deal of storage space.
Default = true
phonehome.uri
Takes its value from common.csv: "phonehome_uri" (page 430); use that setting instead.
Default = empty
phonehome.proxy.host
Takes its value from common.csv: "phonehome_proxy_host" (page 429); use that setting instead.
phonehome.proxy.port
Takes its value from common.csv: "phonehome_proxy_port" (page 430); use that setting instead.
phonehome.proxy.username
Takes its value from common.csv: "phonehome_proxy_username" (page 430); use that setting instead.
phonehome.proxy.password
Takes its value from common.csv: "phonehome_proxy_password" (page 430); use that setting instead.
phonehome.gdpr
Takes its value from common.csv: "phonehome_gdpr" (page 431); use that setting instead.
phonehome.gdpr.bucket
Takes its value from common.csv: "phonehome_gdpr_bucket" (page 431); use that setting instead.
sysinfo.uri
S3 URI to which to upload on-demand Node Diagnostics packages, when you use the CMC's Collect Dia-
gnostics function (Cluster -> Nodes -> Advanced). By default this is the S3 URI for Cloudian Support, but if
you prefer you can set this to a different S3 URI. For an overview of this feature see "Smart Support and Dia-
gnostics Feature Overview" (page 81).
Include the HTTP or HTTPS protocol part of the URI (http:// or https://).
Default = https://s3-support.cloudian.com:443
Note If you set sysinfo.uri to a URI for your own HyperStore S3 storage system (rather than Cloudian
Support), and if your S3 Service is using HTTPS, then your S3 Service’s SSL certificate must be a CA-
verified, trusted certificate — not a self-signed certificate. By default the Node Diagnostics upload func-
tion cannot upload to an HTTPS URI that’s using a self-signed certificate. If you require that the upload
go to an HTTPS URI that’s using a self-signed certificate, contact Cloudian Support.
sysinfo.proxy.host
Takes its value from common.csv: "phonehome_proxy_host" (page 429); use that setting instead.
Note This property, together with the other sysinfo.proxy.* properties below, is for using a local forward
proxy when sending Node Diagnostics packages to Cloudian Support (or another external S3 des-
tination). By default these properties will inherit the same common.csv values that you set for proxying
of the daily Smart Support upload (also known as "phone home"). If you want to use a different proxy for
sending Node Diagnostics packages -- not the same proxy settings that you use for the phone home
feature -- edit the sysinfo.proxy.* settings directly in mts.properties.erb. For example you could change
sysinfo.proxy.host=<%= @phonehome_proxy_host %> to sysinfo.proxy.host=proxy2.enterprise.com.
sysinfo.proxy.port
Takes its value from common.csv: "phonehome_proxy_port" (page 430); use that setting instead.
sysinfo.proxy.username
Takes its value from common.csv: "phonehome_proxy_username" (page 430); use that setting instead.
sysinfo.proxy.password
Takes its value from common.csv: "phonehome_proxy_password" (page 430); use that setting instead.
s3.client.timeout
When uploading a Node Diagnostics package to an S3 destination such as Cloudian Support, the socket
timeout in milliseconds.
Default = 1800000
s3.upload.part.minsize
When HyperStore uses Multipart Upload to transmit a Node Diagnostics package to an S3 destination such as
Cloudian Support, each of the parts will be this many bytes or larger — with the exception of the final part,
which may be smaller. For example, if Multipart Upload is used for an 18MiB object, and the configured min-
imum part size is 5MiB, the object will be transmitted in four parts of size 5MiB, 5MiB, 5MiB, and 3MiB.
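The part-sizing rule can be sketched as follows (the helper name is illustrative, not the actual HyperStore code): every part is at least the configured minimum size except possibly the last.

```python
def split_into_parts(object_size, min_part_size):
    """Split object_size bytes into parts of min_part_size bytes each,
    with any remainder carried in a smaller final part."""
    full, remainder = divmod(object_size, min_part_size)
    parts = [min_part_size] * full
    if remainder:
        parts.append(remainder)
    return parts
```

For the 18MiB example above with a 5MiB minimum, this yields parts of 5MiB, 5MiB, 5MiB, and 3MiB.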
s3.upload.part.threshold
When HyperStore transmits a Node Diagnostics package to an S3 destination such as Cloudian Support, it
uses Multipart Upload if the package is larger than this many bytes.
cloudian.protection.policy.max
Maximum number of bucket protection policies (storage policies) that the system will support. Policies with
status "Active", "Pending", or "Disabled" count toward this system limit.
If the policy maximum has been reached, you will not be able to create new policies until you either delete exist-
ing policies or increase the value of cloudian.protection.policy.max.
Default = 25
Reloadable via JMX (S3 Service’s JMX port 19080; MBean attribute = com.gemini.cloudian.s3 → Configuring
→ Attributes → MaxProtectionPolicies)
For more information about storage policies, see "Storage Policies Feature Overview" (page 91).
cloudian.tiering.useragent
Takes its value from common.csv: "cloudian_tiering_useragent" (page 438); use that setting instead.
cloudian.s3.tiering.client.maxconnections
Takes its value from common.csv: "cloudian_s3_max_threads" (page 438); use that setting instead.
cloudian.s3.ssec.usessl
This setting controls whether the S3 servers will require that incoming S3 requests use HTTPS (rather than reg-
ular HTTP) connections when the request is using Server Side Encryption with Customer-provided encryption
keys (SSE-C). Leaving this setting at its default of "true" -- so that the S3 servers require HTTPS connections for
such requests -- is the recommended configuration. The only circumstance in which you might set this to "false"
is if:
l You are using a load balancer in front of your S3 servers -- and the load balancer, when receiving an
incoming HTTPS request from clients, terminates the SSL and uses regular HTTP to connect to an S3
server over your internal network.
l You trust your internal network to safely transport users' encryption keys from the load balancer to the
S3 servers over regular HTTP.
For background information about HyperStore support for server-side encryption (SSE and SSE-C), see "Set-
ting Up Server-Side Encryption" (page 113).
Default = true
util.awskmsutil.region
Amazon Web Services (AWS) service region to use if you are configuring your HyperStore system to support
AWS-KMS as a method of server-side encryption. For complete instructions see "Using AWS KMS" (page
118).
Default = us-east-1
cloudian.s3.torrent.tracker
If you want your service users to be able to use BitTorrent for object retrieval, use this property to specify the
URL of a BitTorrent "tracker" (a server that keeps track of the clients that have retrieved a particular object and
makes this information available to other clients retrieving the object). This can be a tracker that you implement
yourself or one of the many public BitTorrent trackers. HyperStore itself does not provide a tracker.
The tracker URL that you specify here will be included in the torrent file that the HyperStore S3 Server returns
to clients when they submit a GetObjectTorrent request.
cloudian.elasticsearch.*
For information about the cloudian.elasticsearch.* settings, see "Enabling Elasticsearch Integration for
Metadata Search" (page 157).
The settings in the "Node Status Configuration" section of mts.properties.erb configure a feature whereby the
S3 Service on any node will mark the HyperStore Service on any node as being "Down" if recent requests to
that HyperStore Service node have failed at a rate in excess of defined thresholds. When an S3 Service node
marks a HyperStore Service node as Down, that S3 Service node will temporarily stop sending requests to that
HyperStore Service node. This prevents a proliferation of log error messages that would likely have resulted if
requests continued to be sent to that HyperStore Service node, and also allows for the implementation of fall-
back consistency levels in the case of storage policies configured with "Dynamic Consistency Levels" (page
52).
The configurable thresholds in this section are applied by each individual S3 Service node -- so that each indi-
vidual S3 Service node makes its own determination of when a problematic HyperStore Service node should
be marked as Down.
An S3 Service node will mark a HyperStore Service node as Down in either of these conditions:
l The number of timeout error responses from a HyperStore Service node has exceeded hss.timeout.count.threshold (default = 10) over a period of hss.timeout.time.threshold number of seconds (default = 300) and also the percentage of error responses of any type from that HyperStore Service node has exceeded hss.fail.percentage.threshold (default = 50) over the past hss.fail.percentage.period number of seconds (default = 300).
l The number of other types of error responses from a HyperStore Service node has exceeded hss.fail.count.threshold (default = 10) over a period of hss.fail.time.threshold number of seconds (default = 300) and also the percentage of error responses of any type from that HyperStore Service node has exceeded hss.fail.percentage.threshold (default = 50) over the past hss.fail.percentage.period number of seconds (default = 300).
So by default an S3 Service node will mark a HyperStore Service node as Down if during a five minute period
the HyperStore Service node has returned either more than 10 timeout responses or more than 10 error
responses of other types, while during that same five minute period more than half the requests that the S3 Ser-
vice node has sent to that HyperStore Service node have failed.
Note that these triggering conditions are based on a combination of number of error responses and per-
centage of error responses from the problematic HyperStore Service node. This approach avoids marking a
HyperStore Service node as down in circumstances when a high percentage of a very small number of
requests fail, or when the number of failed requests is sizable but constitutes only a small percentage of the
total requests.
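With the default thresholds, the marking decision described above can be sketched like this (the function is an illustrative model, not the actual S3 Service code): a node is marked Down only when an error count exceeds its threshold AND the overall failure percentage also exceeds its threshold, measured over the same window.

```python
def should_mark_down(timeouts, other_errors, total_requests,
                     count_threshold=10, pct_threshold=50):
    """Model of the default Down-marking rule over one measurement window."""
    if total_requests == 0:
        return False
    fail_pct = 100.0 * (timeouts + other_errors) / total_requests
    count_exceeded = (timeouts > count_threshold
                      or other_errors > count_threshold)
    # Both a count condition and the percentage condition must hold
    return count_exceeded and fail_pct > pct_threshold
```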
Once an S3 Service node has marked a HyperStore Service node as Down, that Down status will persist for
hss.bring.back.wait number of seconds (default = 300) before the Down status is cleared and that S3 Service
node resumes sending requests to that HyperStore Service node.
Note If the S3 Service node is restarted during this interval, then the Down status for that HyperStore
Service node will be lost and the S3 Service node upon restarting will resume sending requests to that
HyperStore Service node.
Note There is special handling in the event that a HyperStore Service node returns a "Connection
Refused" error to an S3 Service node (such as would happen if the HyperStore Service was stopped
on the target node). In this case the S3 Service node immediately marks that HyperStore Service node
as being down, and will then resume sending requests to that HyperStore Service node after a wait
period of 15 seconds. This behavior is not configurable.
hyperstore.proactiverepair.queue.max.time
When eventual consistency for writes is used in the system -- that is, if you have storage policies for which you
have configured the write consistency level to be something less strict than ALL -- S3 writes may succeed in
the system even in circumstances when one or more write endpoints is unavailable. When this happens the
system's proactive repair feature queues information about the failed endpoint writes, and automatically
executes those writes later -- on an hourly interval (by default), without operator intervention. For more inform-
ation about proactive repair see "Proactive Repair" (page 253).
The proactive repair feature's queueing mechanism entails writing metadata to Cassandra, which is subsequently removed when the endpoint writes are executed by proactive repair. To avoid over-burdening Cassandra with proactive repair queueing data, a cap is placed on how long queueing can continue for a given instance of a write endpoint being unavailable. The hyperstore.proactiverepair.queue.max.time property sets this cap, in minutes.
If a node has been unavailable for more than hyperstore.proactiverepair.queue.max.time minutes, the system
stops writing to the proactive repair queue for that node, an error is logged in the S3 application log, and an
alert is generated in the CMC. As indicated by the alert, in this circumstance after the node comes back online
you need to wait for proactive repair to complete on the node (you can monitor this in the CMC's Repair Status page
[Cluster -> Repair Status]) and then you must manually initiate a full repair on the node (see hsstool repair
and hsstool repairec).
Note that once the node is back up, the timer is reset to 0 in terms of counting against the hyperstore.proactiverepair.queue.max.time limit. So if that node subsequently goes down again, proactive repair queueing would again occur for that node for up to hyperstore.proactiverepair.queue.max.time minutes.
Default = 240
Note To disable this limit -- so that there is no limit on the time for which proactive repair queueing
metadata can build up for a node that's unavailable -- set hyperstore.proactiverepair.queue.max.time to
0.
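The queueing cutoff, including the 0-means-unlimited case, can be sketched as follows (the function name is illustrative, not the actual implementation):

```python
def should_queue(downtime_minutes, queue_max_time=240):
    """Queue proactive-repair metadata for a down node only while its
    downtime is within the configured limit; 0 disables the cap."""
    if queue_max_time == 0:
        return True
    return downtime_minutes <= queue_max_time
```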
cloudian.s3.enablesharedbucket
To enable the HyperStore S3 API extension that allows an S3 user to list all the buckets that have been shared
with him or her, set this property to true.
Default = false
Note For information about the relevant S3 API call and how to use the extension, see ListBuckets in
the S3 section of the Cloudian HyperStore AWS APIs Support Reference.
cloudian.userid.length
Takes its value from common.csv: "cloudian_userid_length" (page 420); use that setting instead.
cloudian.iam.max.groups
Takes its value from common.csv: "iam_max_groups" (page 436); use that setting instead.
cloudian.iam.max.groups.per.user
Takes its value from common.csv: "iam_max_groups_per_user" (page 436); use that setting instead.
mfa.totp.issuer
Takes its value from common.csv: "mfa_totp_issuer" (page 436); use that setting instead.
cloudian.fips.enabled
Takes its value from common.csv: fips_enabled. For information about the fips_enabled setting see "FIPS Sup-
port" (page 125).
7.2.5.5. mts-ui.properties.erb
The mts-ui.properties file configures the Cloudian Management Console server (CMC). On each of your Hyper-
Store nodes, the file is located at the following path by default:
/opt/tomcat/webapps/Cloudian/WEB-INF/classes/mts-ui.properties
Do not directly edit the mts-ui.properties file on individual HyperStore nodes. Instead, if you want to make
changes to the settings in this file, edit the configuration template file mts-ui.properties.erb on the Configuration
Master node:
/etc/cloudian-<version>-puppet/modules/cmc/templates/mts-ui.properties.erb
Certain mts-ui.properties.erb properties take their values from settings in common.csv or from settings that you can control through the CMC's Configuration Settings page (Cluster -> Cluster Config -> Configuration Settings). In the mts-ui.properties.erb file these properties' values are formatted as bracket-enclosed variables, like <%= … %>. In the property documentation below, the descriptions of such properties indicate "Takes its value from <location>: <setting>; use that setting instead." The remaining properties in the mts-ui.properties.erb file -- those that are "hard-coded" with specific values -- are settings that in typical circumstances you should have no need to edit. Therefore in typical circumstances you should not need to manually edit the mts-ui.properties.erb file.
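Within mts-ui.properties.erb, the two kinds of properties are easy to tell apart. A sketch of what each looks like (the variable name and the hard-coded value shown here are illustrative, not necessarily what appears in your file):

```
# Takes its value from common.csv -- edit the common.csv setting instead
admin.host=<%= @admin_host %>

# "Hard-coded" with a specific value -- typically no need to edit
admin.conn.timeout=10000
```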
IMPORTANT ! If you do make edits to mts-ui.properties.erb, be sure to push your edits to the cluster
and restart the CMC to apply your changes. For instructions see "Pushing Configuration File Edits to
the Cluster and Restarting Services" (page 411).
admin.host
Takes its value from common.csv: "cmc_admin_host_ip" (page 447); use that setting instead.
admin.port
Takes its value from common.csv: ld_cloudian_s3_admin_port, which is controlled by the installer.
admin.secure
Takes its value from common.csv: "admin_secure" (page 432); use that setting instead.
admin.secure.port
Takes its value from common.csv: "cmc_admin_secure_port" (page 433); use that setting instead.
admin.secure.ssl
Takes its value from common.csv: "cmc_admin_secure_ssl" (page 448); use that setting instead.
admin.conn.timeout
The connection timeout, in milliseconds, for the CMC acting as a client to the Admin Service. If the CMC cannot connect to the Admin Service within this many milliseconds, the connection attempt times out and the CMC interface displays an error message.
Note To provide any of its functions for any type of user, the CMC must successfully connect to the
Admin Service.
iam.enabled
Takes its value from common.csv: "iam_service_enabled" (page 435); use that setting instead.
iam.host
Takes its value from common.csv: "iam_service_endpoint" (page 436); use that setting instead.
iam.port
Takes its value from common.csv: "iam_port" (page 435); use that setting instead.
iam.secure.port
Takes its value from common.csv: "iam_secure_port" (page 436); use that setting instead.
iam.secure
Takes its value from common.csv: "iam_secure" (page 435); use that setting instead.
iam.socket.timeout
If during an HTTP/S connection with the IAM Service this many milliseconds pass without any data being
passed back by the IAM Service, the CMC will drop the connection.
Default = 30000
iam.max.retry
If when trying to initiate an IAM request the CMC fails in an attempt to connect to the IAM Service, the CMC will
retry this many times before giving up.
Default = 3
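The interplay of connection attempts and iam.max.retry can be sketched as follows. This is an illustration of the documented behavior, not the CMC's actual implementation; connect_with_retries and the connect callable are hypothetical stand-ins for the CMC's HTTP client logic.

```python
def connect_with_retries(connect, max_retry=3):
    """Call connect(); on connection failure, retry up to max_retry
    more times before giving up (mirrors iam.max.retry = 3)."""
    failures = 0
    while True:
        try:
            return connect()
        except ConnectionError:
            failures += 1
            if failures > max_retry:
                raise
```

With the default of 3, a request is abandoned only after the initial attempt plus three retries all fail.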
web.secure
Takes its value from common.csv: "cmc_web_secure" (page 448); use that setting instead.
web.secure.port
Takes its value from common.csv: "cmc_https_port" (page 448); use that setting instead.
web.nonsecure.port
Takes its value from common.csv: "cmc_http_port" (page 448); use that setting instead.
storageuri.ssl.enabled
Takes its value from common.csv: "cmc_storageuri_ssl_enabled" (page 449); use that setting instead.
crr.external.enabled
Takes its value from common.csv: "cmc_crr_external_enabled" (page 451); use that setting instead.
path.style.access
Takes its value from common.csv: "path_style_access" (page 422); use that setting instead.
application.name
Takes its value from common.csv: "cmc_application_name" (page 448); use that setting instead.
s3.client.timeout
Socket timeout on requests from the CMC to the S3 Service, in milliseconds.
Default = 1800000
s3.upload.part.minsize
When the CMC uses Multipart Upload to transmit an object to the S3 Service, each of the parts will be this
many bytes or larger — with the exception of the final part, which may be smaller. For example, if Multipart
Upload is used for an 18MiB object, and the configured minimum part size is 5MiB, the object will be transmitted in four parts of size 5MiB, 5MiB, 5MiB, and 3MiB.
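The part-size arithmetic described above can be sketched as follows (an illustration of the documented splitting behavior, not HyperStore code; multipart_sizes is a hypothetical helper name):

```python
MiB = 1024 * 1024

def multipart_sizes(object_size, min_part_size=5 * MiB):
    """Split object_size bytes into parts of the configured minimum
    part size; only the final part may be smaller."""
    parts = []
    remaining = object_size
    while remaining > 0:
        part = min(min_part_size, remaining)
        parts.append(part)
        remaining -= part
    return parts

# An 18MiB object with a 5MiB minimum part size
print([p // MiB for p in multipart_sizes(18 * MiB)])  # → [5, 5, 5, 3]
```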
s3.upload.part.threshold
When a CMC user uploads an object larger than this many bytes, the CMC uses Multipart Upload to transmit
the object to the S3 Service, rather than PUT Object.
s3.qos.bucketcounter.enabled
Takes its value from common.csv: "s3_perbucket_qos_enabled" (page 442); use that setting instead.
query.maxrows
For the CMC's Usage By Users & Groups page, the maximum number of data rows to retrieve when processing a usage report request.
Default = 100000
page.size.default
For the CMC's Usage By Users & Groups page, for usage report pagination, the default number of table rows
to display on each page of a tabular report.
Default = 10
page.size.max
For the CMC's Usage By Users & Groups page, for usage report pagination, the maximum number of table
rows that users can select to display on each page of a tabular report.
Default = 100
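Taken together, page.size.default and page.size.max act as a default and an upper bound on the per-page row count. A minimal sketch of that resolution logic (hypothetical helper, not CMC code):

```python
def resolve_page_size(requested=None, default=10, maximum=100):
    """Return the number of report rows to show per page: the default
    when the user selects nothing, otherwise the selection capped
    at the configured maximum."""
    if requested is None:
        return default
    return min(requested, maximum)
```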
list.multipart.upload.max
In the CMC's Objects page, when a user is uploading multiple objects each of which is large enough to trigger
the use of the S3 multipart upload method, the maximum number of multipart upload objects for which to simultaneously display upload progress.
For example, with the default value of 1000, if a user is concurrently uploading 1005 objects that require the
use of the multipart upload method, the CMC's Objects page will display uploading progress for 1000 of those
objects.
Default = 1000
graph.datapoints.max
For the CMC's Usage By Users & Groups page, the maximum number of datapoints to include within a graphical report.
Default = 1000
csv.rows.max
For the CMC's Usage By Users & Groups page, the maximum number of rows to include within a comma-separated value report.
Default = 1000
fileupload.abort.max.hours
When a very large file is being uploaded through the CMC, the maximum number of hours for the CMC to wait
for the S3 file upload operation to complete. If this maximum is reached, the upload operation is aborted.
Default = 3
license.request.email
Email address to which to send Cloudian license requests. This address is used in a request license information link in the CMC interface.
Default = [email protected]
admin.auth.user
Takes its value from common.csv: "admin_auth_user" (page 431); use that setting instead.
admin.auth.pass
Takes its value from common.csv: "admin_auth_pass" (page 431); use that setting instead.
admin.auth.realm
Takes its value from common.csv: "admin_auth_realm" (page 432); use that setting instead.
user.password.min.length
Takes its value from common.csv: "user_password_min_length" (page 433); use that setting instead.
user.password.dup.char.ratio.limit
Takes its value from common.csv: "user_password_dup_char_ratio_limit" (page 433); use that setting
instead.
user.password.unique.generations
Takes its value from common.csv: "user_password_unique_generations" (page 433); use that setting
instead.
user.password.rotation.graceperiod
Takes its value from common.csv: "user_password_rotation_graceperiod" (page 433); use that setting
instead.
user.password.rotation.expiration
Takes its value from common.csv: "user_password_rotation_expiration" (page 434); use that setting instead.
acl.grantee.public
In the CMC UI dialogs that let CMC users specify permissions on S3 buckets, folders, or files, the label to use
for the ACL grantee "public". If the acl.grantee.public property is not set in mts-ui.properties, then the system
instead uses the acl.grantee.public value from your resources_xx_XX.properties files (for example, for the U.S.
English version of the UI, the value is in resources_en_US.properties).
acl.grantee.cloudianUser
In the CMC UI dialogs that let CMC users specify permissions on S3 buckets, folders, or files, the label to use for the ACL grantee "all Cloudian HyperStore service users". If the acl.grantee.cloudianUser property is not set in mts-ui.properties, then the system instead uses the acl.grantee.cloudianUser value from your resources_xx_XX.properties files (for example, for the U.S. English version of the UI, the value is in resources_en_US.properties).
session.timedout.url
URL of page to display if a CMC user’s login session times out.
If this value is not set in mts-ui.properties, the behavior defaults to displaying the CMC Login screen if the
user’s session times out.
admin.manage_users.enabled
This setting controls whether the Manage Users function (Users & Groups -> Manage Users) will be enabled
in the CMC GUI.
Options are:
l true — This function will display for users logged in as a system administrator or group administrator.
For group admins this function is restricted to their own group.
l false — This function will not display for any users. If you set this to "false" then the Manage Users functionality as a whole is disabled and the more granular admin.manage_users.*.enabled properties below are ignored.
l SystemAdmin — This function will display only for users logged in as a system administrator.
l GroupAdmin — This function will display only for users logged in as a group administrator.
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
Note If you want to enable some aspects of the Manage Users function and not others, you can have admin.manage_users.enabled set so that the function is enabled for your desired user types, and then use the granular admin.manage_users.*.enabled properties below to enable/disable specific capabilities.
admin.manage_users.create.enabled
Within the Manage Users function in the CMC GUI, this setting enables or disables the capability to create new
users.
Options are:
l true — This capability will display for users logged in as a system administrator or group administrator.
For group admins this capability is restricted to their own group.
l false — This capability will not display for any users.
l SystemAdmin — This capability will display only for users logged in as a system administrator.
l GroupAdmin — This capability will display only for users logged in as a group administrator.
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
admin.manage_users.edit.enabled
Within the Manage Users function in the CMC GUI, this setting enables or disables the capability to edit existing users' profiles and service attributes.
Options are:
l true — This capability will display for users logged in as a system administrator or group administrator.
For group admins this capability is restricted to their own group.
l false — This capability will not display for any users.
l SystemAdmin — This capability will display only for users logged in as a system administrator.
l GroupAdmin — This capability will display only for users logged in as a group administrator.
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
admin.manage_users.delete.enabled
Within the Manage Users function in the CMC GUI, this setting enables or disables the capability to delete
users.
Options are:
l true — This capability will display for users logged in as a system administrator or group administrator.
For group admins this capability is restricted to their own group.
l false — This capability will not display for any users.
l SystemAdmin — This capability will display only for users logged in as a system administrator.
l GroupAdmin — This capability will display only for users logged in as a group administrator.
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
admin.manage_users.viewuserdata.enabled
Takes its value from common.csv: "cmc_view_user_data" (page 451); use that setting instead.
admin.manage_users.edit.user_credentials.enabled
Within the Manage Users function in the CMC GUI, this setting enables or disables the capability to change users' CMC login passwords and to view and manage users' S3 access credentials.
Options are:
l true — This capability will display for users logged in as a system administrator or group administrator.
For group admins this capability is restricted to their own group.
l false — This capability will not display for any users.
l SystemAdmin — This capability will display only for users logged in as a system administrator.
l GroupAdmin — This capability will display only for users logged in as a group administrator.
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
admin.manage_users.edit.user_qos.enabled
Within the Manage Users function in the CMC GUI, this setting enables or disables the capability to set Quality
of Service (QoS) controls for specific users.
Options are:
l true — This capability will display for users logged in as a system administrator or group administrator.
For group admins this capability is restricted to their own group.
l false — This capability will not display for any users.
l SystemAdmin — This capability will display only for users logged in as a system administrator.
l GroupAdmin — This capability will display only for users logged in as a group administrator.
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
admin.manage_groups.enabled
This setting controls whether the Manage Groups function (Users & Groups -> Manage Groups) will be
enabled in the CMC GUI.
Options are:
l true — This function will display for users logged in as a system administrator or group administrator.
For group admins this function is restricted to their own group.
l false — This function will not display for any users. If you set this to "false" then the Manage Groups
functionality as a whole is disabled and the more granular admin.manage_groups.*.enabled properties
below are ignored.
l SystemAdmin — This function will display only for users logged in as a system administrator.
l GroupAdmin — This function will display only for users logged in as a group administrator.
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
Note If you want to enable some aspects of the Manage Groups function and not others, you can have admin.manage_groups.enabled set so that the function is enabled for your desired user types, and then use the granular admin.manage_groups.*.enabled properties below to enable/disable specific capabilities.
admin.manage_groups.create.enabled
Within the Manage Groups function in the CMC GUI, this setting enables or disables the capability to create
new groups.
Options are:
l true — This capability will display for users logged in as a system administrator.
l false — This capability will not display for any users.
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
admin.manage_groups.edit.enabled
Within the Manage Groups function in the CMC GUI, this setting enables or disables the capability to edit an
existing group’s profile and service attributes.
Options are:
l true — This capability will display for users logged in as a system administrator or group administrator.
For group admins this capability is restricted to their own group.
Note Even when this capability is enabled for group admins, they will not be able to perform certain group-related actions that are reserved for system admins, such as setting QoS controls for the group as a whole or assigning a default rating plan for the group. Group admins' privileges will be limited to changing their group description and changing the default user QoS settings for the group. The latter capability is controlled by a more granular configuration property admin.manage_groups.user_qos_groups_default.enabled (below), which defaults to "true".
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
admin.manage_groups.delete.enabled
Within the Manage Groups function in the CMC GUI, this setting enables or disables the capability to delete a
group.
Options are:
l true — This capability will display for users logged in as a system administrator.
l false — This capability will not display for any users.
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
admin.manage_groups.user_qos_groups_default.enabled
Within the Manage Groups function in the CMC GUI, this setting enables or disables the capability to set
default Quality of Service (QoS) controls for users within a specific group.
Options are:
l true — This capability will display for users logged in as a system administrator or group administrator.
For group admins this capability is restricted to their own group.
l false — This capability will not display for any users.
l SystemAdmin — This capability will display only for users logged in as a system administrator.
l GroupAdmin — This capability will display only for users logged in as a group administrator.
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
account.profile.writeable.enabled
This setting controls whether a user can edit his or her own account profile information in the Account Profile
section of the CMC GUI (accessible through a drop-down list under the user's login name in the upper right of
the GUI). For user types for which this editing capability is not enabled, account profile information will be read-only.
Options are:
l true — This capability will be enabled for all user types (system administrator, group administrator, and
regular user).
l false — This capability will be disabled for all user types.
l SystemAdmin — This capability will be enabled only for users logged in as a system administrator.
l GroupAdmin — This capability will be enabled only for users logged in as a group administrator.
l User — This capability will be enabled only for users logged in as a regular user.
l You can also specify a comma-separated list of multiple user types — for example, "SystemAdmin,GroupAdmin".
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
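Settings of this family accept "true", "false", a single user type, or a comma-separated list of user types. One way to sketch the evaluation (illustrative only; capability_enabled is a hypothetical helper and the CMC's actual parsing is internal):

```python
def capability_enabled(setting_value, user_type):
    """Evaluate a CMC capability setting ('true', 'false', or a
    comma-separated list of user types such as
    'SystemAdmin,GroupAdmin') for the given user type."""
    value = setting_value.strip()
    if value.lower() == "true":
        return True
    if value.lower() == "false":
        return False
    allowed = {t.strip() for t in value.split(",")}
    return user_type in allowed
```

For example, with the value "SystemAdmin,GroupAdmin", the capability is enabled for system and group administrators but not for regular users.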
account.credentials.enabled
This setting controls whether the Security Credentials function will be enabled in the CMC GUI (accessible
through a drop-down list under the user's login name in the upper right of the GUI).
Options are:
l true — This function will display for all user types (system administrator, group administrator, and regular user).
l false — This function will not display for any user types. If you set this to "false" then the Security Credentials functionality as a whole is disabled and the more granular account.credentials.*.enabled properties below are ignored.
l SystemAdmin — This function will display only for users logged in as a system administrator.
l GroupAdmin — This function will display only for users logged in as a group administrator.
l User — This function will display only for users logged in as a regular user.
l You can also specify a comma-separated list of multiple user types for which to enable this function —
for example, "SystemAdmin,GroupAdmin".
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
Note If you want to enable some aspects of the Security Credentials function and not others, you can have account.credentials.enabled set so that the function is enabled for your desired user types, and then use the granular account.credentials.*.enabled properties below to enable/disable specific capabilities.
account.credentials.access.enabled
Within the Security Credentials function in the CMC GUI, this setting enables or disables the capability of
CMC users to view and change their own S3 storage access keys.
Options are:
l true — This capability will display for all user types (system administrator, group administrator, and regular user).
l false — This capability will not display for any user types.
l SystemAdmin — This capability will display only for users logged in as a system administrator.
l GroupAdmin — This capability will display only for users logged in as a group administrator.
l User — This capability will display only for users logged in as a regular user.
l You can also specify a comma-separated list of multiple user types for which to enable this capability —
for example, "SystemAdmin,GroupAdmin".
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
account.credentials.signin.enabled
Within the Security Credentials function in the CMC GUI, this setting enables or disables the capability of CMC users to change their own CMC login password.
Options are:
l true — This capability will display for all user types (system administrator, group administrator, and regular user).
l false — This capability will not display for any user types.
l SystemAdmin — This capability will display only for users logged in as a system administrator.
l GroupAdmin — This capability will display only for users logged in as a group administrator.
l User — This capability will display only for users logged in as a regular user.
l You can also specify a comma-separated list of multiple user types for which to enable this capability —
for example, "SystemAdmin,GroupAdmin".
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
account.activity.enabled
This setting controls whether the Account Activity function (Users & Groups -> Account Activity) will be enabled in the CMC GUI.
Options are:
l true — This capability will display for users logged in as a system administrator.
l false — This capability will not display for any users.
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
usage.enabled
This setting controls whether the Usage By Users and Groups function (Analytics -> Usage By User &
Group) will be enabled in the CMC GUI.
Options are:
l true — This function will display for all user types (system administrator, group administrator, and regular user).
l false — This function will not display for any user types.
l SystemAdmin — This function will display only for users logged in as a system administrator.
l GroupAdmin — This function will display only for users logged in as a group administrator.
l User — This function will display only for users logged in as a regular user.
l You can also specify a comma-separated list of multiple user types for which to enable this function —
for example, "SystemAdmin,GroupAdmin".
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
security.serviceinfo.enabled
This setting controls whether HyperStore's S3 service endpoints will display for group admins and regular
users in the CMC's Security Credentials page (accessible through a drop-down list under the user's login
name in the upper right of the GUI).
Options are:
l true — S3 service endpoints will display for group admins and regular users in the CMC's Security Credentials page.
l false — S3 service endpoints will not display for group admins or regular users in the CMC's Security
Credentials page.
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
Note For HyperStore system admins, the S3 service endpoint information displays in the Cluster
Information page (Cluster -> Cluster Config -> Cluster Information). The security.serviceinfo.enabled
property has no effect on this display.
login.languageselection.enabled
Takes its value from common.csv: "cmc_login_languageselection_enabled" (page 449); use that setting
instead.
login.grouplist.enabled
Takes its value from common.csv: "cmc_login_grouplist_enabled" (page 449); use that setting instead.
login.grouplist.admincheckbox.enabled
Takes its value from common.csv: "cmc_login_grouplist_admincheckbox_enabled" (page 450); use that setting instead.
login.banner.*
These settings take their values from the cmc_login_banner_* settings in common.csv; use those settings
instead. For information about using those settings see "Configuring a Login Page Acknowledgment Gate"
(page 181).
csrf.origin.check.enabled
Takes its value from common.csv: "cmc_csrf_origin_check_enabled" (page 451); use that setting instead.
csrf.target.origin.allowlist
Takes its value from common.csv: "cmc_csrf_origin_allowlist" (page 452); use that setting instead.
grouplist.enabled
Takes its value from common.csv: "cmc_grouplist_enabled" (page 449); use that setting instead.
grouplist.size.max
Takes its value from common.csv: "cmc_grouplist_size_max" (page 450); use that setting instead.
error.stacktrace.enabled
Throughout the CMC UI, when an exception occurs and an error page is displayed to the user, include on the
error page a link to a stack trace.
Options are:
l true — Stack trace links will display for all user types (system administrator, group administrator, and
regular user).
l false — Stack trace links will not display for any user types.
l SystemAdmin — Stack trace links will display only for users logged in as a system administrator.
l GroupAdmin — Stack trace links will display only for users logged in as a group administrator.
l User — Stack trace links will display only for users logged in as a regular user.
l You can also specify a comma-separated list of multiple user types for which to enable this feature —
for example, "SystemAdmin,GroupAdmin".
Default = false
admin.whitelist.enabled
Takes its value from common.csv: "admin_whitelist_enabled" (page 447); use that setting instead.
sso.enabled
Takes its value from common.csv: "cmc_sso_enabled" (page 452); use that setting instead.
sso.shared.key
Takes its value from common.csv: "cmc_sso_shared_key" (page 452); use that setting instead.
sso.tolerance.millis
Maximum allowed variance between the CMC server time and the timestamp submitted in a client request
invoking the "auto-login with one way hash" method of single sign-on access to the CMC, in milliseconds. If the
variance is greater than this, the request is rejected. This effectively serves as a request expiry mechanism.
Default = 3600000
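The expiry check amounts to comparing the request timestamp against the CMC server clock. A sketch, assuming millisecond Unix timestamps (sso_request_valid is a hypothetical helper, not the CMC implementation):

```python
import time

def sso_request_valid(request_ts_millis, tolerance_millis=3600000,
                      now_millis=None):
    """Accept an auto-login request only if its timestamp is within
    tolerance_millis of the server clock (sso.tolerance.millis)."""
    if now_millis is None:
        now_millis = int(time.time() * 1000)
    return abs(now_millis - request_ts_millis) <= tolerance_millis
```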
sso.cookie.cipher.key
Takes its value from common.csv: "cmc_sso_cookie_cipher_key" (page 452); use that setting instead.
bucket.storagepolicy.showdetail.enabled
Within the Bucket Properties function in the CMC GUI, this setting enables or disables the display of a "Storage Policy" tab that provides information about the storage policy being used by the bucket. This information includes the storage policy's replication factor or erasure coding k+m configuration.
Options are:
l true — This tab will display for all user types (system administrator, group administrator, and regular
user).
l false — This tab will not display for any user types.
l SystemAdmin — This tab will display only for users logged in as a system administrator.
l GroupAdmin — This tab will display only for users logged in as a group administrator.
l User — This tab will display only for users logged in as a regular user.
l You can also specify a comma-separated list of multiple user types for which to display this tab — for
example, "SystemAdmin,GroupAdmin".
Default = Commented out and uses internal default of "true". To assign a different value, uncomment the setting
and edit its value.
bucket.tiering.enabled
Takes its value from "Enable Auto Tiering" in the CMC's Configuration Settings page (Cluster -> Cluster Config -> Configuration Settings); use that setting instead.
tiering.perbucketcredentials.enabled
Takes its value from "Enable Per Bucket Credentials" in the CMC's Configuration Settings page (Cluster ->
Cluster Config -> Configuration Settings); use that setting instead.
tiering.customendpoint.enabled
Takes its value from "Enable Custom Endpoint" in the CMC's Configuration Settings page (Cluster -> Cluster
Config -> Configuration Settings); use that setting instead.
bucket.tiering.default.destination.list
Takes its value from common.csv: "cmc_bucket_tiering_default_destination_list" (page 453); use that setting instead.
bucket.tiering.custom.url
Takes its value from "Default Tiering URL" in the CMC's Configuration Settings page (Cluster -> Cluster Config -> Configuration Settings); use that setting instead.
puppet.master.licensefile
This setting supports the feature whereby an updated HyperStore license file can be uploaded via the CMC.
Do not edit.
puppet.master.fileupdate.location
This setting supports the feature whereby an updated HyperStore license file can be uploaded via the CMC.
Do not edit.
local.ssh.privateKey
Private key to use when the CMC connects via SSH to the Configuration Master node or to other HyperStore nodes when implementing node management functions. The Cloudian installation script automatically populates this setting.
local.ssh.passphrase
Pass phrase to use when the CMC connects via SSH to the Configuration Master node or to other HyperStore nodes when implementing node management functions. The Cloudian installation script automatically populates this setting.
local.ssh.applianceKey
Private key to use when the CMC connects via SSH to a new HyperStore Appliance node when the appliance
node is being added to an existing system. The Cloudian installation script automatically populates this setting.
local.temp.dir
This setting supports the feature whereby an updated HyperStore license file can be uploaded via the CMC.
Do not edit.
remote.ssh.user
The user as which to connect via SSH to the Configuration Master node or other HyperStore nodes.
Default = root
remote.ssh.port
The port to which to connect via SSH to the Configuration Master node or other HyperStore nodes.
Default = 22
offload.services.node.options
Used internally by the CMC when invoking the installer script for certain operations. Do not edit.
Default = -m
uninstall.node.options
Used internally by the CMC when invoking the installer script for certain operations. Do not edit.
Default = -u -r
elasticsearch.enabled
For information about this setting, see "Enabling Elasticsearch Integration for Metadata Search" (page 157).
proactive.repair.queue.warning.enabled
The CMC Dashboard has a feature whereby it displays a "Long proactive repair queue" warning if the
proactive repair queue for a particular node has more than 10,000 objects in it. This feature requires the Dashboard to retrieve proactive repair queue length data from Cassandra each time the Dashboard page is loaded in your browser. This can sometimes result in slower loading times for the Dashboard.
The proactive.repair.queue.warning.enabled property enables and disables this Dashboard feature. If this property is set to "false", then the Dashboard does not retrieve proactive repair queue length data from Cassandra and does not display any warnings in regard to proactive repair queue length.
Default = false
cloudian.userid.length
Takes its value from common.csv: "cloudian_userid_length" (page 420); use that setting instead.
Note The survey file must be kept in the installation staging directory, not in a different directory. Do not
delete or move the survey file.
The survey file contains one line for each HyperStore host in your cluster (including the Configuration Master
host), with each line using the format below.
<regionname>,<hostname>,<ip4-address>,<datacenter-name>,<rack-name>[,<internal-interface>]
l <regionname> — HyperStore service region in which the host is located. The HyperStore system supports having multiple service regions, with each region having its own independent storage cluster and S3 object inventory, and with S3 application users able to choose a storage region when they create storage buckets. Even if you will have only one region you must give it a name. The maximum allowed length is 52 characters. The only allowed character types are lower case ASCII alphanumerical characters and dashes (a-z0-9 and dashes). Do not include the string "s3" in the region name. Make sure the region name matches the region string that you use in your S3 endpoints in your "DNS Set-Up" (page 573).
l <hostname> — Short hostname of the host (as would be returned if you ran the hostname -s command
on the host). This must be the node's short hostname, not an FQDN.
Note Do not use the same short hostname for more than one node in your entire HyperStore system. Each node must have a unique short hostname within your entire HyperStore system, even in the case of nodes in different data centers or service regions that have different domains. For example, in your HyperStore system do not have two nodes with the same short hostname vega for which the FQDN of one is vega.east.com and the FQDN of the other is vega.west.com.
l <ip4-address> — IP address (v4) that the hostname resolves to. Do not use IPv6. This should be the IP
address associated with the host's default, external interface -- not an internal interface.
l <datacenter-name> — Name of the data center in which the host machine is located. The maximum allowed length is 256 characters. The only allowed character types are ASCII alphanumerical characters and dashes (A-Za-z0-9 and dashes).
l <rack-name> — Name of the server rack in which the host machine is located. The maximum allowed
length is 256 characters. The only allowed character types are ASCII alphanumerical characters and
dashes (A-Za-z0-9 and dashes).
Note Within a data center, use the same "rack name" for all of the nodes, even if some nodes
are on different physical racks than others. For example, if you have just one data center, all the
nodes must use the same rack name. And if you have two data centers named DC1 and DC2,
all the nodes in DC1 must use the same rack name as the other nodes in DC1; and all the
nodes in DC2 must use the same rack name as the other nodes in DC2.
l [<internal-interface>] — Use this field only for hosts that will use a different network interface for internal
cluster traffic than the rest of the hosts in the cluster do. For example, if most of your hosts will use "eth1"
for internal cluster traffic, but two of your hosts will use "eth2" instead, use this field to specify "eth2" for
each of those two hosts, and leave this field empty for the rest of the hosts in your survey file. (Later in
the installation procedure you will have the opportunity to specify the default internal interface for the
hosts in your cluster -- the internal interface used by all hosts for which you do not specify the internal-
interface field in your survey file.) If all of your hosts use the same internal network interface — for
example if all hosts use "eth1" for internal network traffic — then leave this field empty for all hosts in the
survey file.
Note Cassandra, Redis, and the HyperStore Service are among the services that will utilize the
internal interface for intra-cluster communications.
This first example survey file is for a single-node HyperStore system with one service region, one data center, and one rack:
region1,arcturus,65.10.2.1,DC1,RAC1
This second example survey file is for a three-node HyperStore cluster with just one service region, one data
center, and one rack:
tokyo,cloudian-vm7,65.10.1.33,DC1,RAC1
tokyo,cloudian-vm8,65.10.1.34,DC1,RAC1
tokyo,cloudian-vm9,65.10.1.35,DC1,RAC1
This third example survey file below is for a HyperStore installation that spans two regions, with the first region comprising two data centers and the second region comprising just one data center. Two of the hosts use a different network interface for internal network traffic than all the other hosts do.
boston,hyperstore1,65.1.0.1,DC1,RAC1
boston,hyperstore2,65.1.0.2,DC1,RAC1
boston,hyperstore3,65.1.0.3,DC1,RAC1
boston,hyperstore4,66.2.0.1,DC2,RAC1
boston,hyperstore5,66.2.0.2,DC2,RAC1
chicago,hyperstore6,68.3.0.1,DC3,RAC1
chicago,hyperstore7,68.3.0.2,DC3,RAC1
chicago,hyperstore8,68.3.2.1,DC3,RAC1,eth2
chicago,hyperstore9,68.3.2.2,DC3,RAC1,eth2
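To illustrate the survey file rules described above, here is a minimal Python sketch. The function names and exact checks are illustrative assumptions (not part of HyperStore); it parses one survey line and enforces the region-name and hostname-uniqueness constraints:

```python
import re

def parse_survey_line(line):
    """Split one survey file line into its fields.

    Format: <regionname>,<hostname>,<ip4-address>,<datacenter-name>,
            <rack-name>[,<internal-interface>]
    """
    fields = line.strip().split(",")
    if len(fields) not in (5, 6):
        raise ValueError("expected 5 or 6 comma-separated fields: %r" % line)
    region, host, ip, dc, rack = fields[:5]
    iface = fields[5] if len(fields) == 6 else None
    # Region name: max 52 chars, lower case alphanumerics and dashes only,
    # and must not contain the string "s3".
    if len(region) > 52 or not re.fullmatch(r"[a-z0-9-]+", region):
        raise ValueError("invalid region name: %r" % region)
    if "s3" in region:
        raise ValueError('region name must not contain "s3": %r' % region)
    return {"region": region, "host": host, "ip": ip,
            "dc": dc, "rack": rack, "iface": iface}

def check_unique_hostnames(lines):
    """Short hostnames must be unique across the whole HyperStore system."""
    seen = set()
    for line in lines:
        host = parse_survey_line(line)["host"]
        if host in seen:
            raise ValueError("duplicate short hostname: %r" % host)
        seen.add(host)
```

For example, parse_survey_line("chicago,hyperstore8,68.3.2.1,DC3,RAC1,eth2") returns a dict with iface "eth2", while a region name containing "s3" raises an error.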
these files.
All of these files are in sub-directories under the /etc/cloudian-<version>-puppet directory. For brevity, in the
section headings that follow /etc/cloudian-<version>-puppet is replaced with "..." and only the sub-directory is
specified.
Specify just the configuration file name, not the full path to the file.
In the background this invokes the Linux text editor vi to display and modify the configuration file. Therefore you
can use the standard keystrokes supported by vi to make and save changes to the file.
File: adminsslconfigs.csv
Purpose: Configures TLS/SSL implementation for the Admin Service’s HTTPS listener.
File: <node>.csv
Purpose: These files are for settings that are tailored to individual nodes, as identified by the <node> segment of the file names (for example, host1.csv, host2.csv, and so on). There will be one such file for each node in your system. These files are created and pre-configured by the HyperStore install script, based on information that you provided during installation. Settings in a <node>.csv file override default settings from common.csv, for the specified node.
File Purpose
pre-configured by the HyperStore install script, based on information that you provided during installation.
File: admin.xml.erb
Purpose: This file configures the Admin Service’s underlying Jetty server functionality, for processing incoming HTTP requests.
File: admin_realm.properties.erb
Purpose: This file configures HTTP Basic Authentication for the Admin Service.
File: alert.allowlist.yaml
Purpose: "Allowlist" for log message based alerts. For more information see Configurable Filtering of Log Message Based Alerts.
File: server.xml.erb
Purpose: This file configures the CMC’s underlying Tomcat server functionality, for processing incoming HTTP requests.
HyperStore uses the open source version of Salt for certain configuration management functions (such as configuration management of the HyperStore firewall -- which from a user perspective is controlled through the installer's Advanced Configuration Options menu). Do not edit any of the configuration files under /etc/cloudian-<version>-puppet/modules/salt.
Note Salt configuration management activity is recorded in the log file /var/log/cloudian/salt.log.
When you use JMX to make a configuration setting change, the change is applied to the service dynamically —
you do not need to restart the service for the change to take effect. However, the setting change persists only
for the current running session of the affected service. If you restart the service it will use whatever setting is
in the configuration file. Consequently, if you want to change a setting dynamically and also have your change
persist through a service restart, you should change the setting in the configuration file as well as changing it
via JMX.
In the documentation of HyperStore configuration files, if a setting supports being dynamically changed via
JMX, the setting description indicates "Reloadable via JMX". It also indicates the name of the dynamically
changeable MBean attribute that corresponds to the configuration file setting. For example, the documentation
of the HyperStore Service configuration property repair.session.threadpool.corepoolsize includes a note that
says:
"Reloadable via JMX (HyperStore Service’s JMX port 19082; MBean attribute = com.gemini.cloudian.hybrid.server → FileRepairService → Attributes → RepairSessionThreadPoolCorePoolSize)"
Note Some settings in HyperStore configuration files can be set through the CMC's Configuration Settings page (Cluster -> Cluster Config -> Configuration Settings). The CMC uses JMX to apply setting changes dynamically, and the CMC also automatically makes the corresponding configuration file change and triggers a Puppet push so that your changes will persist across service restarts. For these settings it’s therefore best to use the CMC to make any desired edits rather than directly using JMX. For such configuration file settings, the descriptions in the HyperStore configuration file documentation indicate that you should use the CMC to edit the setting. Consequently these settings are not flagged in the documentation as JMX reloadable.
If you want to use anti-virus software to monitor OS files other than HyperStore-related files, configure the anti-
virus software to exclude these directories from monitoring:
l All HyperStore data directories (as specified by the configuration setting hyperstore_data_directory in
common.csv)
l /var/lib/{cassandra,cassandra_commit,redis}
l /var/log/{cloudian,cassandra,redis}
l /opt/{cassandra,cloudian,cloudian-packages,cloudianagent,dnsmasq,redis,tomcat}
l /etc/cloudian-<version>-puppet*
Note This topic describes the NTP configuration that HyperStore automatically implements. If instead
you are using your own custom NTP set-up, Cloudian recommends using at least 3 root clock
sources.
Accurate, synchronized time across the cluster is vital to HyperStore service. For example, object versioning
relies on it, and so does S3 authorization. It’s important to have a robust NTP set-up.
When you install your HyperStore cluster, the installation script automatically configures a robust NTP set-up
using ntpd, as follows:
l In each of your HyperStore data centers, four HyperStore nodes are configured as internal NTP servers.
These internal NTP servers will synchronize with external NTP servers -- from the pool.ntp.org project
by default -- and are also configured as peers of each other. (If a HyperStore data center has four or fewer nodes, then all the nodes in the data center are configured as internal NTP servers.)
l All other nodes in the data center are configured as clients of the four internal NTP servers.
l In the event that all four internal NTP servers in a DC are unable to reach any of the external NTP servers, the four internal NTP servers will use "orphan mode" -- which entails the nodes choosing one of themselves to be the "leader" to which the others will sync -- until such time as one or more of the external NTP servers are reachable again.
Each HyperStore data center is independently configured, using this same approach.
To see which of your HyperStore hosts are configured as internal NTP servers, go to the CMC's Cluster Information page (Cluster -> Cluster Config -> Cluster Information).
On the CMC's Cluster Information page you can also view the list of external NTP servers to which the internal
NTP servers will synchronize. By default the external NTP servers used are public servers from the pool.ntp.org
project:
l 0.centos.pool.ntp.org
l 1.centos.pool.ntp.org
l 2.centos.pool.ntp.org
l 3.centos.pool.ntp.org
IMPORTANT ! In order to connect to the default external NTP servers the internal NTP servers must be
allowed outbound internet access.
Note ntpd is configured to automatically start on host boot-up. However, it’s recommended that after
booting a HyperStore host, you verify that ntpd is running (which you can do with the ntpq -p command
— if ntpd is running this command will return a list of connected time servers).
7.2.6.2.1. Changing the List of External NTP Servers or Internal NTP Servers
You can use the HyperStore installer's "Advanced Configuration Options" to change either the list of external
NTP servers or the list of internal NTP servers. For more information see "Change Internal NTP Servers or
External NTP Servers" (page 265).
1. On the Configuration Master node, change to the installation staging directory. Then launch the installer:
# ./cloudianInstall.sh
If you are using the HyperStore Shell (HSH) as a Trusted user, from any directory on the Configuration Master node you can launch the installer with this command:
$ hspkg install
Once launched, the installer's menu options (such as those referenced in the steps below) are the same regardless of whether it was launched from the HSH command line or the OS command line.
2. At the installer's main menu enter 4 for "Advanced Configuration Options". Then at the Advanced Configuration Options menu enter b for "Change S3, Admin or CMC ports".
3. Follow the prompts to specify your desired port numbers. The prompts indicate your current settings. At
each prompt press Enter to keep the current setting value, or type in a new value. The final prompt will
ask whether you want to save your changes -- type yes to do so.
4. Go back to the installer's main menu again and enter 2 for "Cluster Management". Then enter b for "Push Configuration Settings to Cluster", and follow the prompts.
5. After returning to the Cluster Management menu again, enter c for "Manage Services", and restart the
affected services:
l For changed S3 or Admin ports, restart the S3 Service and the CMC. Note that the Admin service
is restarted automatically when you restart the S3 Service.
l For changed CMC port, restart the CMC
Do not exit the installer until you complete Step 6 below, if applicable.
6. If you have the HyperStore firewall enabled (as described in "HyperStore Firewall" (page 581)), the
firewall's configuration is automatically adjusted to accommodate the port number change that you
made. But you must push the updated firewall configuration out to the cluster by taking these steps with
the installer:
a. At the installer's main menu, enter 4 for "Advanced Configuration Options". Then at the
Advanced Configuration Options menu enter s for "Configure Firewall".
b. At the Firewall Configuration menu, enter x for "Apply configuration changes and return to previous menu". When prompted, enter yes to confirm that you want to apply your configuration changes.
After your changes are successfully applied to the cluster you can exit the installer.
these HyperStore service endpoints, their default values, and how the endpoints are used, see "DNS Set-Up"
(page 573).)
1. On the Configuration Master node, change to the installation staging directory. Then launch the installer:
# ./cloudianInstall.sh
If you are using the HyperStore Shell (HSH) as a Trusted user, from any directory on the Configuration Master node you can launch the installer with this command:
$ hspkg install
Once launched, the installer's menu options (such as those referenced in the steps below) are the same regardless of whether it was launched from the HSH command line or the OS command line.
2. From the installer's menu select "Advanced Configuration Options" and then select "Change S3, Admin,
CMC, or IAM/STS endpoints".
3. Follow the prompts to specify your desired endpoints. The prompts indicate your current settings. At
each prompt press Enter to keep the current setting value, or type in a new value.
For the S3 service endpoint, the typical configuration is one endpoint per service region, but you also have the option of specifying multiple endpoints per region (if, for example, you want to have different S3 service endpoints for different data centers within the same region). To do so, simply enter a comma-separated list of endpoints at the prompt for the region's S3 service domain URL. Do not enclose the comma-separated list in quotes. If you want to have different S3 endpoints for different data centers within the same service region, the recommended S3 endpoint syntax is s3-<regionname>.<dcname>.<domain>. For example, if you have data centers named chicago and cleveland both within the midwest service region, and your domain is enterprise.com, the S3 endpoints would be s3-midwest.chicago.enterprise.com and s3-midwest.cleveland.enterprise.com. (Make sure that your DNS set-up resolves the service endpoints in the way that you want -- for example, with one S3 service endpoint resolving to the virtual IP address of a load balancer in your Chicago data center and one S3 service endpoint resolving to the virtual IP address of a load balancer in your Cleveland data center.)
Note The only instance of the string "s3" should be the leading prefix (as in the examples above). Do not also include "s3" in the <regionname> value, the <dcname> value, or the <domain> value, because having two instances of "s3" in the service endpoint may cause S3 service requests to fail. For example, do not have a service endpoint such as "s3-tokyo.s3.enterprise.com".
For the S3 static website endpoint you can have only one endpoint per service region. For the Admin service, the CMC service, and the IAM service, you can have only one endpoint each for your whole HyperStore system.
The final prompt will ask whether you want to save your changes -- type yes to do so.
4. Go to the main menu again and choose "Cluster Management" → "Push Configuration Settings to
Cluster" and follow the prompts.
5. Go to the "Cluster Management" menu again, choose "Manage Services", and restart the S3 Service
and the CMC. If you are using DNSMASQ for HyperStore service endpoint resolution, then also restart
DNSMASQ.
Note If you are using your DNS environment for HyperStore service endpoint resolution, update your DNS entries to match your custom endpoints if you have not already done so. For guidance see "DNS Set-Up" in the Cloudian HyperStore Installation Guide.
Note If you are using per-group filtering of S3 endpoint displays in the CMC and you change an S3 endpoint (using the procedure above), then you must go to the CMC's Manage Groups page and edit the group configuration for any groups that are using S3 endpoint display filtering. If you do not update the S3 endpoint display filtering for such groups, then neither the original S3 endpoint nor the replacement S3 endpoint will display for those groups. Note that per-group filtering of S3 endpoint displays is not the default behavior (by default all users can see all of your system's current S3 endpoints, listed in the CMC's Security Credentials page). If you did not explicitly configure any groups to use S3 endpoint display filtering, then after changing S3 endpoints in the system you do not need to take any action in regard to the CMC's display of S3 endpoints -- all CMC users will automatically be able to see the new S3 endpoints.
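The endpoint naming rules described in this procedure can be expressed as a small check. The following Python sketch is illustrative only -- the function name and validation logic are assumptions, not part of HyperStore:

```python
def make_s3_endpoint(region, domain, dc=None):
    """Build a per-region (or per-data-center) S3 endpoint using the
    recommended syntax: s3-<regionname>.<domain> or
    s3-<regionname>.<dcname>.<domain>."""
    parts = [f"s3-{region}"] + ([dc] if dc else []) + [domain]
    endpoint = ".".join(parts)
    # The only instance of "s3" should be the leading prefix; a second
    # occurrence (e.g. in the domain) may cause S3 requests to fail.
    if endpoint.count("s3") != 1:
        raise ValueError(f"endpoint contains extra 's3': {endpoint}")
    return endpoint
```

For example, make_s3_endpoint("midwest", "enterprise.com", "chicago") yields "s3-midwest.chicago.enterprise.com", while passing a domain that already contains "s3" raises an error.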
Since the script runs automatically during installation and during cluster expansion, under normal circumstances you should not need to run the script yourself. However, you can run the performance configuration optimization script indirectly through the installer's Advanced Configuration Options menu -- if, for example, you have made configuration changes on your own and your system is now under-performing as a result. Running the script in this way will return the configuration settings to the optimized values as determined by the script.
1. On the Configuration Master node, change to the installation staging directory and then launch the
installer:
# ./cloudianInstall.sh
If you are using the HyperStore Shell (HSH) as a Trusted user, from any directory on the Configuration
Master node you can launch the installer with this command:
$ hspkg install
Once launched, the installer's menu options (such as those referenced in the steps below) are the same regardless of whether it was launched from the HSH command line or the OS command line.
2. From the installer's menu select "Advanced Configuration Options" and then select "Configure Performance Parameters on Nodes".
3. At the prompt, specify a node for which to run the performance configuration optimization; or specify a comma-separated list of nodes; or leave the prompt blank and press Enter if you want to run the optimization for all nodes in your system. When the script run is done the installer interface will prompt you to continue to the next steps.
4. Go to the installer's main menu again and choose "Cluster Management" → "Push Configuration Settings to Cluster" and follow the prompts.
5. Go to the "Cluster Management" menu again, choose "Manage Services", and restart the S3 Service, the HyperStore Service, and the Cassandra Service.
l Introduction (below)
l "Admin Service Logs" (page 517)
l "Cassandra (Metadata DB) Logs" (page 518)
l "CMC Logs" (page 519)
l "HyperStore Firewall Log" (page 521)
l "HyperStore Service Logs" (page 521)
l "HyperStore Shell (HSH) Log" (page 526)
l "IAM Service Logs" (page 527)
l "Monitoring Agent and Collector Logs" (page 529)
l "Phone Home (Smart Support) Log" (page 530)
l "Redis (Credentials DB and QoS DB) and Redis Monitor Logs" (page 530)
l "S3 Service Logs (including Auto-Tiering, CRR, and WORM)" (page 533)
l "SQS Service Logs" (page 539)
The major HyperStore services each generate their own application log. The S3 Service, Admin Service, and
HyperStore Service, in addition to generating application logs, also generate transaction (request) logs.
The log descriptions below indicate each log's default location, logging level, rotation and retention policy, log
entry format, and where to modify the log's configuration.
Note With the exception of Cassandra and Redis logs, all HyperStore logs are located in /var/log/cloudian.
Note For information on viewing logs from within the HyperStore Shell, see "Using the HSH to View
Logs" (page 566).
Log Entry Format: The MessageCode uniquely identifies the log message, for messages of level WARN or higher. For documentation of specific message codes including recommended operator action, see the "Log Message Codes" section of the CMC's online Help.
Default Logging Level: INFO
Default Rotation and Retention Policy: Rotation occurs if the live file size reaches 10MB. Rotation also occurs at the end of each day, regardless of live file size. Rotated files are named as cloudian-admin.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip. Deletion of the oldest rotated log file occurs if the aggregate rotated file size (after compression) reaches 100MB or if the oldest rotated file age reaches 180 days.
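The retention behavior described above (delete the oldest rotated files when the aggregate compressed size reaches the cap, or when a file exceeds the age limit) can be sketched as follows. The function and its inputs are illustrative assumptions for the sketch, not HyperStore internals:

```python
def prune_rotated(files, max_total_bytes, max_age_days):
    """Given rotated log files as (name, size_bytes, age_days) tuples,
    return the names to delete: any file past the age limit, plus the
    oldest files while the aggregate size still exceeds the cap."""
    # Consider oldest files first (largest age first).
    ordered = sorted(files, key=lambda f: f[2], reverse=True)
    total = sum(f[1] for f in files)
    to_delete = []
    for name, size, age in ordered:
        if age >= max_age_days or total > max_total_bytes:
            to_delete.append(name)
            total -= size
    return to_delete
```

With a 100MB cap and 180-day limit, three rotated files of 60MB (10 days old), 60MB (5 days old), and 10MB (200 days old) would lose the over-age file plus the oldest remaining file, bringing the aggregate back under the cap.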
Log Entry Format:
yyyy-mm-dd HH:mm:ss,SSS|ClientIpAddress|HttpMethod|Uri|QueryParams|DurationMicrosecs|HttpStatus
Note
* Query parameters are not logged for requests that involve user credentials.
* The request log records Admin API requests for which authentication fails, as well as requests for which authentication succeeds. Success or failure is indicated by the HttpStatus.
Log Entry Example:
2021-10-27 14:54:01,170|10.20.2.57|GET|/group/list|limit:100|188212|200
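Because entries are pipe-delimited, a log line in this format is straightforward to parse. The following Python sketch is illustrative only -- the field names follow the format string above, and the function itself is not part of HyperStore:

```python
ADMIN_REQ_FIELDS = ["timestamp", "client_ip", "http_method", "uri",
                    "query_params", "duration_microsecs", "http_status"]

def parse_admin_request_line(line):
    """Parse one cloudian-admin-request-info.log entry into a dict."""
    values = line.rstrip("\n").split("|")
    if len(values) != len(ADMIN_REQ_FIELDS):
        raise ValueError("unexpected field count: %r" % line)
    entry = dict(zip(ADMIN_REQ_FIELDS, values))
    entry["duration_microsecs"] = int(entry["duration_microsecs"])
    entry["http_status"] = int(entry["http_status"])
    return entry
```

Applied to the example entry above, this yields http_status 200 and duration_microsecs 188212.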
Logging Level: Not applicable
Default Rotation and Retention Policy: Rotation occurs if the live file size reaches 100MB. Rotation also occurs at the end of each day, regardless of live file size. Rotated files are named as cloudian-admin-request-info.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip. Deletion of the oldest rotated log file occurs if the aggregate rotated file size (after compression) reaches 2GB or if the oldest rotated file age reaches 180 days.
Default Logging Level: INFO
Default Rotation and Retention Policy: Rotation occurs if the live file size reaches 20MB. Rotation also occurs at the end of each day, regardless of live file size. Rotated files are named as system.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip. Deletion of the oldest rotated log file occurs if the aggregate rotated file size (after compression) reaches 200MB or if the oldest rotated file age reaches 30 days.
Configuration: /etc/cloudian-<version>-puppet/modules/cassandra/templates/logback.xml.erb. For setting descriptions see the online documentation for Logback:
l FixedWindowRollingPolicy
l SizeBasedTriggeringPolicy
Default Logging Level: DEBUG
Note In log4j-s3.xml.erb on your Configuration Master node there are three different AsyncLogger instances for Cassandra request logging. The ERROR logger logs entries when a Cassandra request results in an error; the SLOW logger logs entries when a Cassandra request takes more than 5 seconds to process; and the NORMAL logger logs all Cassandra requests. The three loggers all write to /var/log/cloudian/cassandra-s3-tx.log, and the implementation prevents duplicate entries across the three loggers. All three loggers are set to DEBUG level by default, and each logger works only if set to DEBUG or TRACE. To disable a logger, set its level to INFO or higher. For example, to disable the NORMAL logger so that only error and slow requests are recorded, set the NORMAL logger's level to INFO. Then do a Puppet push and restart the S3 Service.
Default Rotation and Retention Policy: Rotation occurs if the live file size reaches 10MB. Rotation also occurs at the end of each day, regardless of live file size. Rotated files are named as cassandra-s3-tx.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip. Deletion of the oldest rotated log file occurs if the aggregate rotated file size (after compression) reaches 100MB or if the oldest rotated file age reaches 180 days.
Note Currently only a fraction of request types from the S3 Service to Cassandra support this
request logging feature. These are request types that use a new DataStax Java driver (which supports
the request logging) rather than the older Hector driver (which does not).
Log Entry Format: In the case of log entries for user logins to the CMC, the MESSAGE value will be formatted as follows:
Normal login
SSO login
Default Logging Level: INFO
Default Rotation and Retention Policy: Rotation occurs if the live file size reaches 10MB. Rotation also occurs at the end of each day, regardless of live file size. Rotated files are named as cloudian-ui.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip. Deletion of the oldest rotated log file occurs if the aggregate rotated file size (after compression) reaches 100MB or if the oldest rotated file age reaches 180 days.
Log Entry Format:
yyyy-mm-dd HH:mm:ss,SSS|SourceIP|AdminID|Action|CanonicalUserId|UserId|GroupId|Result
The ui-action.log captures CMC actions relating to login, logout, and the creation, editing, and deletion of user accounts. This log is generated on each node running the CMC, and on each node records only activity processed by that node's CMC instance. Through the CMC you can download a CSV (comma-separated value) file that aggregates all the ui-action.log content from across all CMC instances. For more information about downloading the CSV file, and for more information about the content of this log, see 5.5.1.8 Download User Audit Log.
Log Entry Example:
2021-01-16 09:22:13,812|172.16.6.184|0%7Cadmin|CreateUser||testU2|test|Success
Default Logging Level: Not applicable
Default Rotation and Retention Policy: Rotation occurs if the live file size reaches 10MB. Rotation also occurs at the end of each day, regardless of live file size. Rotated files are named as ui-action.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip. Deletion of the oldest rotated log file occurs if the aggregate rotated file size (after compression) reaches 100MB or if the oldest rotated file age reaches 90 days.
Log Entry Format: The log records information about dropped packets, including the timestamp, host, firewall zone (cloudian-backend [for the designated internal interface] or cloudian-frontend [for all other interfaces]), interface name and MAC address, source and destination address, protocol, TCP flags, and so on.
Default Rotation and Retention Policy: Rotation occurs hourly if the live file size has reached 10MB, or else daily regardless of file size (except that there is no rotation of an empty live log file). Rotated files are named as firewall.log-YYYY-MM-DD.HH.gz. Rotated files are compressed with gzip. Rotated files are retained for 180 days and then automatically deleted.
Configuration: Rotation of this log is managed by the Linux logrotate utility. In the current version of HyperStore, the rotation settings for the HyperStore firewall log are not configurable.
Log Entry Format:
l The MessageCode uniquely identifies the log message, for messages of level WARN or higher. For documentation of specific message codes including recommended operator action, see the "Log Message Codes" section of the CMC's online Help.
l The S3RequestId value is present only in messages associated with implementing S3 requests.
Default Logging Level: INFO
Default Rotation and Retention Policy: Rotation occurs if the live file size reaches 10MB. Rotation also occurs at the end of each day, regardless of live file size. Rotated files are named as cloudian-hyperstore.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip. Deletion of the oldest rotated log file occurs if the aggregate rotated file size (after compression) reaches 100MB or if the oldest rotated file age reaches 180 days.
Log Entry Format:
yyyy-mm-dd HH:mm:ss,SSS|IpAddressOfClientS3Server|S3RequestId|HttpStatus|HttpOperation|OriginalUri|HyperStoreFilePath|ContentLength|DurationMicrosecs|Etag|ECSuffix
Log Entry Example:
2020-11-03 06:23:35,090|10.112.1.79|3f1b624f-d7b7-1e2c-a6b1-0a690b692ef3|200|GET|/ec/ngrp%2Fm-MDA0OTI1MjcxNjA0Mzg0NTM2MzYx%2Fecbn%2Fmpu..0002.1|/cloudian1/ec/std8ZdRJDskcPvmOg4/db35e009d75fff7fce122105f6d3b877/196/037/90637448863864709563044774788357063423.1604384536362536381-0A70014F|5242880|15277|0|3
Replicated object:
2020-11-03 06:42:22,201|10.112.1.79|3f1b627a-d7b7-1e2c-a6b1-0a690b692ef3|200|PUT|/file/ngrp%2Fhsfsbn%2Fhsfs|/cloudian1/hsfs/std8ZdRJDskcPvmOg4/9894597460837bab308e7327144b4bcc/227/232/169187708830921211004707767501030750269.1604385742187742235-0A70014F|1048576|186769|379228ccb18fb5f795aaaa17a459f0ed|null
Logging Level: Not applicable
Default Rotation and Retention Policy: Rotation occurs if the live file size reaches 300MB. Rotation also occurs at the end of each day, regardless of live file size. Rotated files are named as cloudian-hyperstore-request-info.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip. Deletion of the oldest rotated log file occurs if the aggregate rotated file size (after compression) reaches 3GB or if the oldest rotated file age reaches 180 days.
This log has entries when an hsstool cleanup or hsstool cleanupec operation results in files being deleted from the node. A cleanup operation that determines that no files need to be deleted from the node will not cause any entries in this log.

Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS|Command#|ChunkName|ChunkFilePath

Note For background information about "chunks" and chunk names see "hsstool repairec" (page 363). For information about chunk file paths see "HyperStore Service and the HSFS" (page 38).

Log Entry Example
2018-02-28 05:57:25,743|1|buser1/obj1|/var/lib/cloudian/hsfs/
SalA11OVu6oCThSSRafvH/7cf10597b0360421d7564e7c248b2445/165/206/
16398559635448146388914806157301167971.1476861996241

Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cloudian-hyperstore-cleanup.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
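The four pipe-delimited fields in this format are straightforward to split programmatically. A minimal sketch in Python (`parse_cleanup_entry` is a hypothetical helper for illustration, not part of HyperStore):

```python
from datetime import datetime

def parse_cleanup_entry(line: str) -> dict:
    """Split one cloudian-hyperstore-cleanup.log entry into its four
    pipe-delimited fields: Timestamp|Command#|ChunkName|ChunkFilePath."""
    timestamp, command_num, chunk_name, chunk_path = line.rstrip("\n").split("|", 3)
    return {
        "timestamp": datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S,%f"),
        "command": int(command_num),
        "chunk_name": chunk_name,
        "chunk_path": chunk_path,
    }

# The example entry from above, joined onto one line:
entry = parse_cleanup_entry(
    "2018-02-28 05:57:25,743|1|buser1/obj1|/var/lib/cloudian/hsfs/"
    "SalA11OVu6oCThSSRafvH/7cf10597b0360421d7564e7c248b2445/165/206/"
    "16398559635448146388914806157301167971.1476861996241"
)
print(entry["chunk_name"])  # prints "buser1/obj1"
```

The `split("|", 3)` cap keeps the chunk file path intact even though it contains no pipes today; it guards against future fields being appended.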
Chapter 7. Reference
This log has entries when a repair operation results in an attempt to repair data on the node. A repair operation that determines that no repairs are needed on the node will not cause any entries in this log (but will result in entries in the application log cloudian-hyperstore.log).

In cloudian-hyperstore-repair.log there are two different types of entries, with different fields.

Log Entry Format
The first type of entry records information about a repair action being taken and has this format:

yyyy-mm-dd HH:mm:ss,SSS|RepairType|Command#|Coordinator|RepairEndpoint|
StreamFromEndpoint|ChunkName|Path|ChunkSize|Md5Hash|
RepairLatencyMillisecs|Suffix

The Coordinator is the node to which the hsstool repair or hsstool repairec command was submitted. The StreamFromEndpoint is the node from which a replica or erasure coded fragment was streamed in order to repair missing or bad data at the RepairEndpoint node. For erasure coded data repair, the fragment is streamed from the node that performed the decoding and re-encoding of the repaired object.

The Suffix field indicates the fragment suffix in the case of erasure coding repairs. For replica repairs, suffixes are not applicable and this field's value will be "-1".

Note For background information about "chunks" and chunk names see "hsstool repairec" (page 363). For information about chunk file paths see "HyperStore Service and the HSFS" (page 38).

The second type of entry records a successfully completed repair task and has this format:

Ok|yyyy-mm-dd HH:mm:ss,SSS|RepairType|ChunkName|Path|TaskType

The TaskType field value indicates the type of erasure coded fragment problem that was successfully repaired -- one of "[MD5_MISMATCH]", "[MISSING]", or "[DUPLICATE]".

Log Entry Examples
1SZe38K1qp98VDadplVxyq/257cb007cd57283bad02763e83864bb7/075/133/
27936503055090146240017134150181984843.1623390809396809455-0A140151|
524288|8ad63b8f18e934333bdc2164bd7f8553|8|2

Ok|2021-06-10 23:00:12,272|REC|m-MDAzYTMyMTcxNjIzMzkwODA5Mzk2/ecbn/mpmc..0002.2|
/cloudian3/ec/1SZe38K1qp98VDadplVxyq/257cb007cd57283bad02763e83864bb7/075/133/
27936503055090146240017134150181984843.1623390809396809455-0A140151|[MISSING]

Default Logging Level
Not applicable

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cloudian-hyperstore-repair.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
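Because the two entry types share one file, a parser can dispatch on the "Ok|" prefix. A minimal sketch in Python (`parse_repair_entry` is a hypothetical helper, and the repair-action sample line below is synthetic, assembled from the documented format with invented node names):

```python
def parse_repair_entry(line: str) -> dict:
    """Classify and split a cloudian-hyperstore-repair.log entry.
    Completed-task entries are prefixed with "Ok|"; all other entries
    are repair-action records with the twelve fields documented above."""
    fields = line.rstrip("\n").split("|")
    if fields[0] == "Ok":
        _, ts, repair_type, chunk, path, task_type = fields
        return {"type": "completed", "timestamp": ts, "repair_type": repair_type,
                "chunk": chunk, "path": path, "task_type": task_type}
    keys = ["timestamp", "repair_type", "command", "coordinator",
            "repair_endpoint", "stream_from", "chunk", "path",
            "chunk_size", "md5", "latency_ms", "suffix"]
    entry = dict(zip(keys, fields))
    entry["type"] = "action"
    # Suffix is "-1" for replica repairs; an EC repair carries a fragment suffix
    entry["is_ec"] = entry["suffix"] != "-1"
    return entry

ok = parse_repair_entry(
    "Ok|2021-06-10 23:00:12,272|REC|m-MDAzYTMyMTcxNjIzMzkwODA5Mzk2/ecbn/mpmc..0002.2|"
    "/cloudian3/ec/1SZe38K1qp98VDadplVxyq/257cb007cd57283bad02763e83864bb7/075/133/"
    "27936503055090146240017134150181984843.1623390809396809455-0A140151|[MISSING]"
)
# Synthetic repair-action entry (node names node1-node3 are invented):
action = parse_repair_entry(
    "2021-06-10 23:00:10,101|REC|1|node1|node2|node3|ecbn/mpmc..0002.2|"
    "/cloudian3/ec/1SZe38K1qp98VDadplVxyq/257cb007cd57283bad02763e83864bb7/075/133/"
    "27936503055090146240017134150181984843.1623390809396809455-0A140151|"
    "524288|8ad63b8f18e934333bdc2164bd7f8553|8|2"
)
```

Splitting on "|" is safe here because chunk names and paths use "/" as their internal separator.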
When repairec has been run, this log records an entry for each chunk for which the repair attempt failed on the node. These entries have this format:

Log Entry Format
ChunkName|Path|OperationId|yyyy-mm-dd HH:mm:ss,SSS|FailReason[|TaskType]

The OperationId is a system-generated unique identifier of the EC repair run (i.e., each time you run hsstool repairec the run will be assigned its own unique ID, so that different runs can be distinguished from each other).

The FailReason field value indicates the failure reason -- for example "LESS_THAN_K" in the event that fewer than K good fragments were available for the chunk.

The TaskType field may or may not be present, depending on the FailReason. When present, the TaskType field indicates the type of EC fragment problem that the failed repair task was attempting to address -- one of "[MD5_MISMATCH]", "[MISSING]", or "[DUPLICATE]".

Note For background information about "chunks" and chunk names see "hsstool repairec" (page 363). For information about chunk file paths see "HyperStore Service and the HSFS" (page 38).

Log Entry Examples
Example #1:

ecbn/hoge|/cloudian2/ec/std8ZdRJDskcPvmOg4/96f96c866408a3fc9e42237ae94bce5d/
170/045/156312679751569259476355507179662176759.1631169767765768030-0A32C89D|
d8cd52cc-0e72-1cd4-b03c-000c29b5a219|2021-09-08 23:47:42,654|NODE_DOWN

Example #2:

ecbn/sig|/cloudian1/ec/1IVfPFBIijTqx0GvLCwoi0/35418348e30905d622f692c451868e73/
172/190/25879583587069690807015587486564246118.1626338080577080582-0A700108|
3bad2bf2-95ce-1070-b320-0acf12cb1e97|2021-07-15 08:47:55,126|
EC_DECODE_FAILED|[MISSING]

Default Logging Level
Not applicable

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cloudian-hyperstore-repair-failure.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 10000 MB or if oldest rotated file age reaches 180 days.
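The optional trailing TaskType field means entries can have five or six fields. A minimal sketch in Python that accepts both shapes (`parse_repair_failure` is a hypothetical helper, not part of HyperStore):

```python
def parse_repair_failure(line: str) -> dict:
    """Split a cloudian-hyperstore-repair-failure.log entry.
    The trailing TaskType field is optional, so 5 or 6 fields are accepted."""
    fields = line.rstrip("\n").split("|")
    chunk, path, op_id, ts, fail_reason = fields[:5]
    return {
        "chunk": chunk, "path": path, "operation_id": op_id,
        "timestamp": ts, "fail_reason": fail_reason,
        "task_type": fields[5] if len(fields) > 5 else None,
    }

# Example #1 and #2 from above, each joined onto one line:
e1 = parse_repair_failure(
    "ecbn/hoge|/cloudian2/ec/std8ZdRJDskcPvmOg4/96f96c866408a3fc9e42237ae94bce5d/"
    "170/045/156312679751569259476355507179662176759.1631169767765768030-0A32C89D|"
    "d8cd52cc-0e72-1cd4-b03c-000c29b5a219|2021-09-08 23:47:42,654|NODE_DOWN"
)
e2 = parse_repair_failure(
    "ecbn/sig|/cloudian1/ec/1IVfPFBIijTqx0GvLCwoi0/35418348e30905d622f692c451868e73/"
    "172/190/25879583587069690807015587486564246118.1626338080577080582-0A700108|"
    "3bad2bf2-95ce-1070-b320-0acf12cb1e97|2021-07-15 08:47:55,126|"
    "EC_DECODE_FAILED|[MISSING]"
)
```

Grouping parsed entries by `operation_id` lets you report failures per repairec run, matching how the OperationId is described above.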
Configuration
Rotation of this log is managed by the Linux logrotate utility. In the current version of HyperStore, the rotation settings for the HyperStore shell log are not configurable.

Location
Note For an IAM overview see 14.1.1 HyperStore Support for the AWS IAM API.

Log Entry Format
- The MessageCode uniquely identifies the log message, for messages of level WARN or higher. For documentation of specific message codes including recommended operator action, see the "Log Message Codes" section of the CMC's online Help.

Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cloudian-iam.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
Location
Note This log records Security Token Service (STS) requests as well as IAM requests.

Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS|ClientIpAddress|AccountRootUserCanonicalId|
RequestorUserId|GroupId|Protocol:Action|IamUserId|RoleSessionArn|
RoleSessionId|TempCredentialsAccessKey|HttpStatus|ErrorCode|
ResponseData|DurationMicrosecs

Log Entry Examples
2020-08-10 18:58:59,607|10.20.2.34|679d95846fb0f0047f5926ba16546552|testu159986|
myGroup8732|iam:PutRolePolicy|||||200|||6

2020-08-10 18:58:59,619|10.20.2.34|679d95846fb0f0047f5926ba16546552|testu159986|
myGroup8732|sts:AssumeRole|aidcc54d3e60de2a74e89ad639561df0||||200||
arn:aws:iam::679d95846fb0f0047f5926ba16546552:role/iammypath/rolen134094&
rolesn54301&asicbd8ef4bdd6e7e03c|118

Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 100MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cloudian-iam-request-info.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression)
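The Protocol:Action field distinguishes IAM from STS traffic in this log. A minimal parsing sketch in Python (`parse_iam_request` and the snake_cased field names are illustrative assumptions, not part of HyperStore):

```python
IAM_FIELDS = [
    "timestamp", "client_ip", "account_canonical_id", "requestor_user_id",
    "group_id", "protocol_action", "iam_user_id", "role_session_arn",
    "role_session_id", "temp_access_key", "http_status", "error_code",
    "response_data", "duration_us",
]

def parse_iam_request(line: str) -> dict:
    """Split a cloudian-iam-request-info.log entry into the fourteen
    fields documented above, then break Protocol:Action apart."""
    entry = dict(zip(IAM_FIELDS, line.rstrip("\n").split("|")))
    # Protocol:Action looks like "iam:PutRolePolicy" or "sts:AssumeRole"
    entry["protocol"], _, entry["action"] = entry["protocol_action"].partition(":")
    return entry

# The first example entry from above, joined onto one line:
e = parse_iam_request(
    "2020-08-10 18:58:59,607|10.20.2.34|679d95846fb0f0047f5926ba16546552|"
    "testu159986|myGroup8732|iam:PutRolePolicy|||||200|||6"
)
print(e["protocol"], e["action"])  # prints "iam PutRolePolicy"
```

Filtering on `entry["protocol"] == "sts"` is then enough to isolate STS request activity.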
Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS PriorityLevel [ThreadId] ClassName:MESSAGE

Default Logging Level
WARN

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cloudian-agent.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.

Log Entry Format
The MessageCode uniquely identifies the log message, for messages of level WARN or higher. For documentation of specific message codes including recommended operator action, see the "Log Message Codes" section of the CMC's online Help.

Default Logging Level
WARN

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.
Log Entry Format
The MessageCode uniquely identifies the log message, for messages of level WARN or higher. For documentation of specific message codes including recommended operator action, see the "Log Message Codes" section of the CMC's online Help.

Default Logging Level
WARN

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cloudian-phonehome.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
7.3.10. Redis (Credentials DB and QoS DB) and Redis Monitor Logs
Redis Credentials application log (redis-credentials.log)
Location
On Redis Credentials nodes, /var/log/redis/redis-credentials.log
Log Entry Example
18290:S 28 Jul 01:23:42.416 # Connection with master lost.

Default Logging Level
NOTICE

Default Rotation Policy
Not rotated by default. You can set up rotation by using logrotate.

Configuration
Redis Credentials application logging is configured in the main Redis configuration file. The file name depends on the Redis node type -- master or slave. These templates are on the Configuration Master node, under /etc/cloudian-<version>-puppet/modules/redis/templates/:

The only configurable logging settings are the log file name and the logging level. See the commenting in the configuration file for more detail.
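Since this log is not rotated by default and the guide points to logrotate, a sketch of a drop-in policy follows. The log path is the Location documented above; the frequency, size threshold, retention count, and other directives are assumptions to adapt to your environment, not HyperStore defaults:

```
# /etc/logrotate.d/redis-credentials -- example policy only; all limits
# below are illustrative assumptions, not HyperStore defaults
/var/log/redis/redis-credentials.log {
    daily
    size 100M
    rotate 30
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}
```

The copytruncate directive is used here because Redis keeps its log file descriptor open; without it you would need to signal Redis to reopen the log after each rotation.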
Log Entry Example
24401:M 28 Jul 01:41:46.963 * Calling fsync() on the AOF file.

Default Logging Level
NOTICE

Default Rotation Policy
Not rotated by default. You can set up rotation by using logrotate.

Configuration
Redis QoS application logging is configured in the main Redis configuration file. The file name depends on the Redis node type -- master or slave. These templates are on the Configuration Master node, under /etc/cloudian-<version>-puppet/modules/redis/templates/:

The only configurable logging settings are the log file name and the logging level. See the commenting in the configuration file for more detail.
Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS S3RequestId MESSAGE

Log Entry Example
2016-11-17 23:54:01,695 CLIENT setname ACCOUNT_GROUPS_M

Default Logging Level
INFO

Note The default logging level of INFO disables these logs. If you want these logs to be written, you must edit the Puppet template files log4j-s3.xml.erb (for logging S3 Service access to Redis), log4j-admin.xml.erb (for logging Admin Service access to Redis), and/or log4j-hyperstore.xml.erb (for logging HyperStore Service access to Redis). Find the AsyncLogger name="redis.clients.jedis" block and change the level from "INFO" to "TRACE". Then do a Puppet push, and then restart the relevant services (S3-Admin and/or HyperStore).

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as redis-{s3,admin,hss}-tx.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
Log Entry Format
The MessageCode uniquely identifies the log message, for messages of level WARN or higher. For documentation of specific message codes including recommended operator action, see the "Log Message Codes" section of the CMC's online Help.

Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cloudian-redismon.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size reaches 100MB or if oldest rotated file age reaches 180 days.
Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cloudian-s3.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS|ClientIpAddress|BucketOwnerUserId|Operation|
BucketName|RequestorUserId|RequestHeaderSize|RequestBodySize|
ResponseHeaderSize|ResponseBodySize|TotalRequestResponseSize|
DurationMicrosecs|UrlEncodedObjectName|HttpStatus|
S3RequestId|Etag|ErrorCode|SourceBucketName/UrlEncodedSourceObjectName|
GroupId|CanonicalUserId|IamUserId|RoleSessionArn|RoleSessionId|
TempCredentialsAccessKey|UserAgent

- The Operation field indicates the S3 API operation. Note that "getBucket" indicates GET Bucket (List Objects) Version 1 whereas "getBucketV2" indicates GET Bucket (List Objects) Version 2. (In the case of a health check request, the Operation field indicates "healthCheck". In the case of requests submitted to the S3 Service by a system cron job, the Operation field indicates the name of the cron job action, such as "sytemBatchDelete".)

- The Etag field is the Etag value from the response, if applicable to the request type. For information about Etag see for example Common Response Headers from the Amazon S3 REST API spec. This field's value will be 0 for request/response types that do not use an Etag value.

- The ErrorCode field is the Error Code in the response body, applicable only for potentially long-running requests like PUT Object. If there is no Error Code in the response body this field's value will be 0. For possible Error Code values see Error Responses from the Amazon S3 REST API spec.

Note In the case where the Operation field value is deleteObjects, the ErrorCode field will be formatted as objectname1-errorcode1,objectname2-errorcode2,objectname3-errorcode3..., and the object names will be URL-encoded. If there are no errors the field is formatted as objectname1-0,objectname2-0,objectname3-0....
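A minimal parsing sketch for this 25-field format in Python, including the deleteObjects sub-format described in the Note (`parse_s3_request` and the snake_cased field names are illustrative assumptions, not part of HyperStore):

```python
from urllib.parse import unquote

S3_REQUEST_FIELDS = [
    "timestamp", "client_ip", "bucket_owner", "operation", "bucket",
    "requestor", "req_header_size", "req_body_size", "resp_header_size",
    "resp_body_size", "total_size", "duration_us", "object_name",
    "http_status", "s3_request_id", "etag", "error_code", "copy_source",
    "group_id", "canonical_user_id", "iam_user_id", "role_session_arn",
    "role_session_id", "temp_access_key", "user_agent",
]

def parse_s3_request(line: str) -> dict:
    """Split a cloudian-request-info.log entry; keys mirror the
    Log Entry Format above."""
    entry = dict(zip(S3_REQUEST_FIELDS, line.rstrip("\n").split("|")))
    entry["object_name"] = unquote(entry["object_name"])
    if entry["operation"] == "deleteObjects" and entry["error_code"] not in ("", "0"):
        # Per the Note above: objectname1-errorcode1,objectname2-errorcode2,...
        # rsplit on the last "-" assumes error codes contain no hyphen
        entry["per_object_errors"] = [
            (unquote(name), code)
            for name, code in (item.rsplit("-", 1)
                               for item in entry["error_code"].split(","))
        ]
    return entry

# The getService example entry from this section, joined onto one line:
e = parse_s3_request(
    "2021-01-12 01:20:59,403|10.50.200.157||getService||nusr|714|0|55|324|1093|"
    "18816||200|21ed3781-63a0-1891-a017-000c29b5a219|0|0|||"
    "a9bfb2fc43f8e5b3802ab7ea028268e2|||||aws-sdk-java/1.11.723 Linux/3.10.0-"
)
```

Numeric fields are left as strings here because several of them (Etag, ErrorCode) overload "0" as a not-applicable marker; convert selectively as needed.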
NOTE: Cloudian HyperIQ is a solution for dynamic visualization and analysis of HyperStore system monitoring data and S3 service usage data. HyperIQ is a separate product available from Cloudian that deploys as a virtual appliance on VMware or VirtualBox and integrates with your existing HyperStore system. For more information about HyperIQ contact your Cloudian representative.

Log Entry Example
2021-01-12 01:20:59,403|10.50.200.157||getService||nusr|714|0|55|324|1093|18816||200|
21ed3781-63a0-1891-a017-000c29b5a219|0|0|||a9bfb2fc43f8e5b3802ab7ea028268e2|||||
aws-sdk-java/1.11.723 Linux/3.10.0-

Logging Level
Not applicable
Default Rotation and Retention Policy
Rotation occurs if live file size reaches 100MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cloudian-request-info.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 2GB or if oldest rotated file age reaches 180 days.
If you use a load balancer in front of your S3 Service (as would typically be the case in a production environment), then the ClientIpAddress in your S3 request logs will by default be the IP address of a load balancer rather than that of the end client. If you want the S3 request logs to instead show the end client IP address, your options depend on what load balancer you're using.

If your load balancer is HAProxy or a different load balancer that supports the PROXY Protocol, enable S3 support for the PROXY Protocol (see "s3_proxy_protocol_enabled" (page 439) in common.csv) and configure your load balancer to use the PROXY Protocol for relaying S3 requests to the S3 Service. Consult with your Cloudian Sales Engineering or Support representative for guidance on load balancer configuration.

Otherwise, you can use the HTTP X-Forwarded-For header:

1. Configure your load balancers so that they pass the HTTP X-Forwarded-For header to the S3 Service. This is an option only if your load balancers run in "HTTP mode" rather than "TCP mode". Consult with your Cloudian Sales Engineering or Support representative for guidance on load balancer configuration.

2. Configure your S3 Service to support the X-Forwarded-For header. You can enable S3 Service support for this header by editing the configuration file s3.xml.erb on your Configuration Master node. The needed configuration lines are already in that file; you only need to uncomment them.
Before uncommenting:
<!-- Uncomment the block below to enable handling of X-Forwarded- style headers -->
<!--
<Call name="addCustomizer">
<Arg><New class="org.eclipse.jetty.server.ForwardedRequestCustomizer"/></Arg>
</Call>
-->
After uncommenting:
<!-- Uncomment the block below to enable handling of X-Forwarded- style headers -->
<Call name="addCustomizer">
<Arg><New class="org.eclipse.jetty.server.ForwardedRequestCustomizer"/></Arg>
</Call>
After making this configuration edit, do a Puppet push and restart the S3 Service to apply your
change.
Location
Note All nodes participate in tiering objects to their intended destination systems. In the case of regular auto-tiering based on user-defined schedules, the tiering processing workload is randomly spread across the nodes in the service region in which the cron job host resides. In the case of Bridge Mode tiering (also known as proxy tiering), whichever node processes the S3 uploading of an object into its source bucket also processes the immediate tiering of that object to its intended destination system. For more information on auto-tiering see "Auto-Tiering Feature Overview" (page 126).

Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS|Command|Protocol|SourceBucket/Object|
SourceObjectVersion|TargetBucket|TargetObjectVersion|ObjectSize|
TotalRequestSize|Status|DurationMicrosecs|Mode|AttemptCount

- The Mode field will be one of AUTO_TIERING (for regular auto-tiering), BRIDGE_MODE (for Bridge Mode auto-tiering), or SCHEDULER (for retry attempts for Bridge Mode tiering of objects).

- The AttemptCount is applicable only to Bridge Mode tiering and associated retries, and indicates how many attempts have been made to tier the object. This field's value will be "-1" if the Mode is AUTO_TIERING.

Logging Level
Not applicable

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cloudian-tiering-request-info.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
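The Mode and AttemptCount fields make it easy to flag objects that Bridge Mode tiering keeps retrying. A minimal sketch in Python (`bridge_mode_retries` is a hypothetical helper and the sample line is synthetic, assembled from the documented field order):

```python
def bridge_mode_retries(lines, min_attempts=3):
    """Yield (SourceBucket/Object, AttemptCount) for Bridge Mode or
    SCHEDULER entries whose retry count meets a threshold."""
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 13:
            continue  # skip wrapped or partial lines
        mode, attempts = fields[11], fields[12]
        if mode in ("BRIDGE_MODE", "SCHEDULER") and int(attempts) >= min_attempts:
            yield fields[3], int(attempts)

# Synthetic entry for illustration (bucket and version values invented):
sample = ("2021-05-01 10:00:00,123|PUT|S3|srcbkt/obj1|v1|tgtbkt|v2|"
          "1024|2048|FAILED|5000|SCHEDULER|4")
flagged = list(bridge_mode_retries([sample], min_attempts=3))
```

Running such a filter over the rotated files as well as the live log gives a fuller picture, since retries can span days.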
Location
Note Whichever S3 Service node processes a PUT of an object into a source bucket configured for CRR will be the node that initiates the replication of the object to the destination bucket. This node will have an entry for that object in its CRR request log. In the case of retries of replication attempts that failed with a temporary error the first time, the retries will be logged in the CRR request log on the cron job node. For general information on the cross-region replication feature, see "Cross-Region Replication Feature Overview" (page 138).

Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS|SourceBucket/Object|ObjectVersionId|DestinationBucket|
CrrOperation|Status|DurationMillisecs|Size

Log Entry Example
2020-03-04 15:04:58,423|prod-2020-01-16/0299-0166180000362773.pdf|
fe15a1de-de25-f8af-9a28-b4a9fc08ca7e|prod-backup-2020-01-16|REPLICATEOBJECT|
COMPLETED|61|328471

Logging Level
Not applicable

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cloudian-crr-request-info.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
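Since each entry carries both a Size and a DurationMillisecs, per-object replication throughput can be derived directly. A minimal sketch in Python (`parse_crr_entry` and the derived `mb_per_s` key are illustrative assumptions, not part of HyperStore):

```python
def parse_crr_entry(line: str) -> dict:
    """Split a cloudian-crr-request-info.log entry into the eight fields
    documented above, and derive a rough throughput figure."""
    ts, src, version, dest, op, status, duration_ms, size = \
        line.rstrip("\n").split("|")
    entry = {"timestamp": ts, "source": src, "version_id": version,
             "destination": dest, "operation": op, "status": status,
             "duration_ms": int(duration_ms), "size_bytes": int(size)}
    if entry["duration_ms"] > 0:
        # bytes per millisecond divided by 1000 equals megabytes per second
        entry["mb_per_s"] = entry["size_bytes"] / entry["duration_ms"] / 1000.0
    return entry

# The example entry from above, joined onto one line:
e = parse_crr_entry(
    "2020-03-04 15:04:58,423|prod-2020-01-16/0299-0166180000362773.pdf|"
    "fe15a1de-de25-f8af-9a28-b4a9fc08ca7e|prod-backup-2020-01-16|"
    "REPLICATEOBJECT|COMPLETED|61|328471"
)
```

Averaging `mb_per_s` over COMPLETED entries gives a quick sense of the replication link's effective bandwidth.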
Location
Note For information about the WORM feature see "Object Lock Feature Overview" (page 144).

Log Entry Format
DateTime|Hostname|S3RequestId|S3Operation|Headers|Bucket|Object|
ObjectVersionId|CanonicalUserId|CanonicalIamUserId|StatusCode|
StatusMessage

Examples of the Headers field value:
Example 1: TGD30 for a Governance mode configuration with a 30 day retention period
Example 2: T for a bucket with object lock enabled, but no default object lock configuration

If the submitter of the request is an IAM user, the log entry shows the canonical user ID of the IAM user as well as the canonical user ID of the parent user account.

Log Entry Example
2019-10-06 09:59:30,767|arcturus|a9d7e5b1-e85a-11e9-8519-52540014b047|
s3:GetBucketObjectLockConfiguration|TGD90|newbucket|||
b584fb57480af5108e32d17f10c5cb7b||200|OK

Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as s3-worm.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
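Based only on the two documented examples of the Headers value ("T" and "TGD30"), a decoder can be sketched. This is an assumption-heavy illustration: encodings not shown in the examples (for instance a Compliance mode default) are deliberately left unrecognized rather than guessed at, and `decode_worm_headers` is a hypothetical helper, not part of HyperStore:

```python
import re

def decode_worm_headers(value: str):
    """Decode the WORM log's Headers field using only the documented
    examples: "T" = object lock enabled with no default configuration;
    "TGD<n>" = Governance mode default with an n-day retention period.
    Anything else returns None rather than a guess."""
    if value == "T":
        return {"object_lock": True, "default_config": None}
    m = re.fullmatch(r"TGD(\d+)", value)
    if m:
        return {"object_lock": True,
                "default_config": {"mode": "GOVERNANCE",
                                   "retention_days": int(m.group(1))}}
    return None

result = decode_worm_headers("TGD90")  # Headers value from the example entry
```

Extend the recognized patterns only after confirming the additional encodings against your own s3-worm.log output.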
Location
Note For an SQS overview (including information about how to enable the SQS Service, which is disabled by default) see the SQS section in the Cloudian HyperStore AWS APIs Support Reference.

Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS PriorityLevel[ThreadId]MessageCode ClassName:MESSAGE

Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cloudian-sqs-req.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
Location
Note For an SQS overview (including information about how to enable the SQS Service, which is disabled by default) see the SQS section in the Cloudian HyperStore AWS APIs Support Reference.

Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cloudian-sqs-req.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
- Introduction (below)
- "Admin Service Logs" (page 541)
- "Cassandra (Metadata DB) Logs" (page 542)
- "CMC Logs" (page 543)
- "HyperStore Firewall Log" (page 545)
- "HyperStore Service Logs" (page 546)
- "HyperStore Shell (HSH) Log" (page 551)
- "IAM Service Logs" (page 551)
- "Monitoring Agent and Collector Logs" (page 553)
- "Phone Home (Smart Support) Log" (page 554)
- "Redis (Credentials DB and QoS DB) and Redis Monitor Logs" (page 555)
- "S3 Service Logs (including Auto-Tiering, CRR, and WORM)" (page 557)
- "SQS Service Logs" (page 563)

The major HyperStore services each generate their own application log. The S3 Service, Admin Service, and HyperStore Service, in addition to generating application logs, also generate transaction (request) logs.

The log descriptions below indicate each log's default location, logging level, rotation and retention policy, log entry format, and where to modify the log's configuration.

Note With the exception of Cassandra and Redis logs, all HyperStore logs are located in /var/log/cloudian.

Note For information on viewing logs from within the HyperStore Shell, see "Using the HSH to View Logs" (page 566).
Log Entry Format
The MessageCode uniquely identifies the log message, for messages of level WARN or higher. For documentation of specific message codes including recommended operator action, see the "Log Message Codes" section of the CMC's online Help.

Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cloudian-admin.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.

Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS|ClientIpAddress|HttpMethod|Uri|QueryParams|
DurationMicrosecs|HttpStatus

Note
* Query parameters are not logged for requests that involve user credentials.
* The request log records Admin API requests for which authentication fails, as well as requests for which the authentication succeeds. Success or failure is indicated by the HttpStatus.

Log Entry Example
2021-10-27 14:54:01,170|10.20.2.57|GET|/group/list|limit:100|188212|200

Logging Level
Not applicable

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 100MB. Rotation also occurs at end of each day, regardless of live file size.
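The seven-field Admin request log format splits cleanly on the pipe character. A minimal sketch in Python (`parse_admin_request` is a hypothetical helper, not part of HyperStore):

```python
def parse_admin_request(line: str) -> dict:
    """Split an Admin Service request-log entry into the seven fields
    documented above."""
    ts, ip, method, uri, query, duration_us, status = \
        line.rstrip("\n").split("|")
    return {"timestamp": ts, "client_ip": ip, "method": method, "uri": uri,
            "query_params": query, "duration_us": int(duration_us),
            "http_status": int(status)}

# The example entry from above:
e = parse_admin_request(
    "2021-10-27 14:54:01,170|10.20.2.57|GET|/group/list|limit:100|188212|200"
)
print(e["uri"], e["http_status"])  # prints "/group/list 200"
```

Because the log also records failed authentications, filtering for `http_status` values of 401 or 403 is a quick way to audit rejected Admin API access attempts.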
Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 20MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as system.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 200MB or if oldest rotated file age reaches 30 days.

Configuration
/etc/cloudian-<version>-puppet/modules/cassandra/templates/logback.xml.erb. For setting descriptions see the online documentation for Logback:

- FixedWindowRollingPolicy
- SizeBasedTriggeringPolicy
{'class':'LeveledCompactionStrategy'};

Default Logging Level
DEBUG

Note In log4j-s3.xml.erb on your Configuration Master node there are three different AsyncLogger instances for Cassandra request logging. The ERROR logger logs entries when a Cassandra request results in an error; the SLOW logger logs entries when a Cassandra request takes more than 5 seconds to process; and the NORMAL logger logs all Cassandra requests. The three loggers all write to /var/log/cloudian/cassandra-s3-tx.log, and the implementation prevents duplicate entries across the three loggers. All three loggers are set to DEBUG level by default, and each logger works only if set to DEBUG or TRACE. To disable a logger, set its level to INFO or higher. For example, to disable the NORMAL logger so that only error and slow requests are recorded, set the NORMAL logger's level to INFO. Then do a Puppet push and restart the S3 Service.

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cassandra-s3-tx.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.

Note Currently only a fraction of request types from the S3 Service to Cassandra support this request logging feature. These are request types that use a new DataStax Java driver (which supports the request logging) rather than the older Hector driver (which does not).
Log Entry Format
In the case of log entries for user logins to the CMC, the MESSAGE value will be formatted as follows:

Normal login:
Login <groupId>|<userId> from: <ipAddress> Success

SSO login:
Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as cloudian-ui.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS|SourceIP|AdminID|Action|CanonicalUserId|UserId|GroupId|Result

The ui-action.log captures CMC actions relating to login, logout, and the creation, editing, and deletion of user accounts. This log is generated on each node running the CMC, and on each node records only activity processed by that node's CMC instance. Through the CMC you can download a CSV (comma-separated value) file that aggregates all the ui-action.log content from across all CMC instances. For more information about downloading the CSV file, and for more information about the content of this log, see 5.5.1.8 Download User Audit Log.

Log Entry Example
2021-01-16 09:22:13,812|172.16.6.184|0%7Cadmin|CreateUser||testU2|test|Success

Default Logging Level
Not applicable

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.

Rotated files are named as ui-action.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.

Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 90 days.
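Note in the example above that the AdminID is URL-encoded ("0%7Cadmin" is "0|admin"), which conveniently keeps the field's embedded pipe from colliding with the log's delimiter. A minimal parsing sketch in Python (`parse_ui_action` is a hypothetical helper, not part of HyperStore):

```python
from urllib.parse import unquote

UI_ACTION_FIELDS = ["timestamp", "source_ip", "admin_id", "action",
                    "canonical_user_id", "user_id", "group_id", "result"]

def parse_ui_action(line: str) -> dict:
    """Split a ui-action.log entry into the eight fields documented above.
    The AdminID is URL-encoded in the log, so it is decoded here; splitting
    on "|" first is safe because the embedded pipe arrives as %7C."""
    entry = dict(zip(UI_ACTION_FIELDS, line.rstrip("\n").split("|")))
    entry["admin_id"] = unquote(entry["admin_id"])
    return entry

# The example entry from above:
e = parse_ui_action(
    "2021-01-16 09:22:13,812|172.16.6.184|0%7Cadmin|CreateUser||testU2|test|Success"
)
print(e["admin_id"])  # prints "0|admin"
```

A small script like this can reproduce locally what the CMC's aggregated user-audit CSV download provides across all CMC instances.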
Log Entry Format
The log records information about dropped packets, including the timestamp, host, firewall zone (cloudian-backend [for the designated internal interface] or cloudian-frontend [for all other interfaces]), interface name and MAC address, source and destination address, protocol, TCP flags, and so on.

Default Rotation and Retention Policy
Rotation occurs hourly if the live file size has reached 10MB; or else daily regardless of file size (except that there is no rotation of an empty live log file).

Rotated files are named as firewall.log-YYYY-MM-DD.HH.gz. Rotated files are compressed with gzip.

Rotated files are retained for 180 days and then automatically deleted.

Configuration
Rotation of this log is managed by the Linux logrotate utility. In the current version of HyperStore, the rotation settings for the HyperStore firewall log are not configurable.
545
Chapter 7. Reference
Log Entry Format
l The MessageCode uniquely identifies the log message, for messages of level WARN or higher. For documentation of specific message codes including recommended operator action, see the "Log Message Codes" section of the CMC's online Help.
l The S3RequestId value is present only in messages associated with implementing S3 requests.

Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as cloudian-hyperstore.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS|IpAddressOfClientS3Server|S3RequestId|HttpStatus|HttpOperation|OriginalUri|HyperStoreFilePath|ContentLength|DurationMicrosecs|Etag|ECSuffix

Log Entry Examples
2020-11-03 06:23:35,090|10.112.1.79|3f1b624f-d7b7-1e2c-a6b1-0a690b692ef3|200|GET|/ec/ngrp%2Fm-MDA0OTI1MjcxNjA0Mzg0NTM2MzYx%2Fecbn%2Fmpu..0002.1|/cloudian1/ec/std8ZdRJDskcPvmOg4/db35e009d75fff7fce122105f6d3b877/196/037/90637448863864709563044774788357063423.1604384536362536381-0A70014F|5242880|15277|0|3

/file/ngrp%2Fhsfsbn%2Fhsfs|/cloudian1/hsfs/std8ZdRJDskcPvmOg4/9894597460837bab308e7327144b4bcc/227/232/169187708830921211004707767501030750269.1604385742187742235-0A70014F|1048576|186769|379228ccb18fb5f795aaaa17a459f0ed|null

Logging Level
Not applicable

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 300MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as cloudian-hyperstore-request-info.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 3GB or if oldest rotated file age reaches 180 days.
Log Entry Format
This log has entries when an hsstool cleanup or hsstool cleanupec operation results in files being deleted from the node. A cleanup operation that determines that no files need to be deleted from the node will not cause any entries to this log. Entries have this format:

yyyy-mm-dd HH:mm:ss,SSS|Command#|ChunkName|ChunkFilePath

Note For background information about "chunks" and chunk names see "hsstool repairec" (page 363). For information about chunk file paths see "HyperStore Service and the HSFS" (page 38).

Log Entry Example
16398559635448146388914806157301167971.1476861996241

Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as cloudian-hyperstore-cleanup.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
Log Entry Format
This log has entries when a repair operation results in an attempt to repair data on the node. A repair operation that determines that no repairs are needed on the node will not cause any entries to this log (but will result in entries to the application log cloudian-hyperstore.log).

In cloudian-hyperstore-repair.log there are two different types of entries, with different fields. The first type of entry records information about a repair action being taken and has this format:

yyyy-mm-dd HH:mm:ss,SSS|RepairType|Command#|Coordinator|RepairEndpoint|StreamFromEndpoint|ChunkName|Path|ChunkSize|Md5Hash|RepairLatencyMillisecs|Suffix

The Coordinator is the node to which the hsstool repair or hsstool repairec command was submitted. The StreamFromEndpoint is the node from which a replica or erasure coded fragment was streamed in order to repair missing or bad data at the RepairEndpoint node. For erasure coded data repair, the fragment is streamed from the node that performed the decoding and re-encoding of the repaired object.

The Suffix field indicates the fragment suffix in the case of erasure coding repairs. For replica repairs, suffixes are not applicable and this field's value will be "-1".

Note For background information about "chunks" and chunk names see "hsstool repairec" (page 363). For information about chunk file paths see "HyperStore Service and the HSFS" (page 38).

The second type of entry has this format:

Ok|yyyy-mm-dd HH:mm:ss,SSS|RepairType|ChunkName|Path|TaskType

The ECRepairTaskType field value indicates the type of erasure coded fragment problem that was successfully repaired -- one of "[MD5_MISMATCH]", "[MISSING]", or "[DUPLICATE]".

Log Entry Examples
2021-06-10 23:00:12,265|REC|13|10.20.1.81|10.20.1.81|10.20.1.81|m-MDAzYTMyMTcxNjIzMzkwODA5Mzk2/ecbn/mpmc..0002.2|/cloudian3/ec/1SZe38K1qp98VDadplVxyq/257cb007cd57283bad02763e83864bb7/075/133/

Ok|2021-06-10 23:00:12,272|REC|m-MDAzYTMyMTcxNjIzMzkwODA5Mzk2/ecbn/mpmc..0002.2|/cloudian3/ec/1SZe38K1qp98VDadplVxyq/257cb007cd57283bad02763e83864bb7/075/133/27936503055090146240017134150181984843.1623390809396809455-0A140151|[MISSING]
Default Logging Level
Not applicable

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as cloudian-hyperstore-repair.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
Log Entry Format
When repairec has been run, this log records an entry for each chunk for which the repair attempt failed on the node. These entries have this format:

ChunkName|Path|OperationId|yyyy-mm-dd HH:mm:ss,SSS|FailReason[|TaskType]

The OperationId is a system-generated unique identifier of the EC repair run (i.e., each time you run hsstool repairec the run will be assigned its own unique ID, so that different runs can be distinguished from each other).

The FailReason field value indicates the failure reason -- for example "LESS_THAN_K" in the event that fewer than K good fragments were available for the chunk.

The TaskType field may or may not be present, depending on the FailReason. When present, the TaskType field indicates the type of EC fragment problem that the failed repair task was attempting to address -- either "[MD5_MISMATCH]" or "[MISSING]" or "[DUPLICATE]".

Note For background information about "chunks" and chunk names see "hsstool repairec" (page 363). For information about chunk file paths see "HyperStore Service and the HSFS" (page 38).

Log Entry Examples
Example #1:
ecbn/hoge|/cloudian2/ec/std8ZdRJDskcPvmOg4/96f96c866408a3fc9e42237ae94bce5d/170/045/156312679751569259476355507179662176759.1631169767765768030-0A32C89D|d8cd52cc-0e72-1cd4-b03c-000c29b5a219|2021-09-08 23:47:42,654|NODE_DOWN

Example #2:
ecbn/sig|/cloudian1/ec/1IVfPFBIijTqx0GvLCwoi0/35418348e30905d622f692c451868e73/172/190/25879583587069690807015587486564246118.1626338080577080582-0A700108|3bad2bf2-95ce-1070-b320-0acf12cb1e97|2021-07-15 08:47:55,126|EC_DECODE_FAILED|[MISSING]
Default Logging Level
Not applicable

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as cloudian-hyperstore-repair-failure.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 10000 MB or if oldest rotated file age reaches 180 days.
Configuration
Rotation of this log is managed by the Linux logrotate utility. In the current version of HyperStore, the rotation settings for the HyperStore shell log are not configurable.

Location
Note For an IAM overview see 14.1.1 HyperStore Support for the AWS IAM API.
Log Entry Format
l The MessageCode uniquely identifies the log message, for messages of level WARN or higher. For documentation of specific message codes including recommended operator action, see the "Log Message Codes" section of the CMC's online Help.

Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as cloudian-iam.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
Location
Note This log records Security Token Service (STS) requests as well as IAM requests.

Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS|ClientIpAddress|AccountRootUserCanonicalId|RequestorUserId|GroupId|Protocol:Action|IamUserId|RoleSessionArn|RoleSessionId|TempCredentialsAccessKey|HttpStatus|ErrorCode|ResponseData|DurationMicrosecs

Log Entry Examples
2020-08-10 18:58:59,607|10.20.2.34|679d95846fb0f0047f5926ba16546552|testu159986|myGroup8732|iam:PutRolePolicy|||||200|||6

2020-08-10 18:58:59,619|10.20.2.34|679d95846fb0f0047f5926ba16546552|testu159986|myGroup8732|sts:AssumeRole|aidcc54d3e60de2a74e89ad639561df0||||200||arn:aws:iam::679d95846fb0f0047f5926ba16546552:role/iammypath/rolen134094&rolesn54301&asicbd8ef4bdd6e7e03c|118

Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 100MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as cloudian-iam-request-info.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 2GB or if oldest rotated file age reaches 180 days.
Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS PriorityLevel [ThreadId] ClassName:MESSAGE

Default Logging Level
WARN

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as cloudian-agent.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
Log Entry Format
The MessageCode uniquely identifies the log message, for messages of level WARN or higher. For documentation of specific message codes including recommended operator action, see the "Log Message Codes" section of the CMC's online Help.

Default Logging Level
WARN

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as cloudian-datacollector.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
Log Entry Format
The MessageCode uniquely identifies the log message, for messages of level WARN or higher. For documentation of specific message codes including recommended operator action, see the "Log Message Codes" section of the CMC's online Help.

Default Logging Level
WARN

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as cloudian-phonehome.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
7.3.13.10. Redis (Credentials DB and QoS DB) and Redis Monitor Logs
Log Entry Example
18290:S 28 Jul 01:23:42.416 # Connection with master lost.

Default Logging Level
NOTICE

Default Rotation Policy
Not rotated by default. You can set up rotation by using logrotate.

Configuration
Redis Credentials application logging is configured in the main Redis configuration file. The file name depends on the Redis node type -- master or slave. These templates are on the Configuration Master node, under /etc/cloudian-<version>-puppet/modules/redis/templates/. The only configurable logging settings are the log file name and the logging level. See the commenting in the configuration file for more detail.
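As an illustration, rotation could be set up by placing a logrotate configuration file under /etc/logrotate.d/. The log file path and retention values below are assumptions introduced for this sketch -- use the log file name set in your Redis configuration file and retention values appropriate to your environment:

```
/var/log/redis/redis-credentials.log {
    daily
    rotate 90
    compress
    missingok
    notifempty
    copytruncate
}
```

The copytruncate directive lets logrotate copy and then truncate the live file in place, so Redis does not need to be signaled to reopen its log file.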
Log Entry Example
24401:M 28 Jul 01:41:46.963 * Calling fsync() on the AOF file.

Default Logging Level

Default Rotation Policy
Not rotated by default. You can set up rotation by using logrotate.

Configuration
Redis QoS application logging is configured in the main Redis configuration file. The file name depends on the Redis node type -- master or slave. These templates are on the Configuration Master node, under /etc/cloudian-<version>-puppet/modules/redis/templates/. The only configurable logging settings are the log file name and the logging level. See the commenting in the configuration file for more detail.
Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS S3RequestId MESSAGE

Log Entry Example
2016-11-17 23:54:01,695 CLIENT setname ACCOUNT_GROUPS_M

Default Logging Level
INFO

Note The default logging level of INFO disables these logs. If you want these logs to be written, you must edit the Puppet template files log4j-s3.xml.erb (for logging S3 Service access to Redis), log4j-admin.xml.erb (for logging Admin Service access to Redis), and/or log4j-hyperstore.xml.erb (for logging HyperStore Service access to Redis). Find the AsyncLogger name="redis.clients.jedis" block and change the level from "INFO" to "TRACE". Then do a Puppet push, and then restart the relevant services (S3-Admin and/or HyperStore).

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as redis-{s3,admin,hss}-tx.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
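For illustration, after the edit described in the note above, the AsyncLogger block in the template would look broadly like the following. The AppenderRef value shown here is a placeholder, not taken from an actual HyperStore template; change only the level attribute in your file:

```xml
<AsyncLogger name="redis.clients.jedis" level="TRACE" additivity="false">
  <AppenderRef ref="REDISTX"/>
</AsyncLogger>
```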
Log Entry Format
The MessageCode uniquely identifies the log message, for messages of level WARN or higher. For documentation of specific message codes including recommended operator action, see the "Log Message Codes" section of the CMC's online Help.

Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as cloudian-redismon.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size reaches 100MB or if oldest rotated file age reaches 180 days.
Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as cloudian-s3.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression)
Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS|ClientIpAddress|BucketOwnerUserId|Operation|BucketName|RequestorUserId|RequestHeaderSize|RequestBodySize|ResponseHeaderSize|ResponseBodySize|TotalRequestResponseSize|DurationMicrosecs|UrlEncodedObjectName|HttpStatus|S3RequestId|Etag|ErrorCode|SourceBucketName/UrlEncodedSourceObjectName|GroupId|CanonicalUserId|IamUserId|RoleSessionArn|RoleSessionId|TempCredentialsAccessKey|UserAgent

l The Operation field indicates the S3 API operation. Note that "getBucket" indicates GET Bucket (List Objects) Version 1 whereas "getBucketV2" indicates GET Bucket (List Objects) Version 2. (In the case of a health check request, the Operation field indicates "healthCheck". In the case of requests submitted to the S3 Service by a system cron job, the Operation field indicates the name of the cron job action, such as "systemBatchDelete".)

l The Etag field is the Etag value from the response, if applicable to the request type. For information about Etag see for example Common Response Headers from the Amazon S3 REST API spec. This field's value will be 0 for request/response types that do not use an Etag value.

l The ErrorCode field is the Error Code in the response body, applicable only for potentially long-running requests like PUT Object. If there is no Error Code in the response body this field's value will be 0. For possible Error Code values see Error Responses from the Amazon S3 REST API spec.

Note In the case where the Operation field value is deleteObjects, the ErrorCode field will be formatted as objectname1-errorcode1,objectname2-errorcode2,objectname3-errorcode3..., and the object names will be URL-encoded. If there are no errors the field is formatted as objectname1-0,objectname2-0,objectname3-0....
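As an illustration of the field layout, the sketch below parses a request log entry into a dictionary keyed by the field names in the format specification above. This helper is not part of HyperStore; the "Timestamp" and "SourceBucketAndObject" key spellings are shorthand introduced here for readability:

```python
# Illustrative parser for cloudian-request-info.log entries.
# Field names follow the documented Log Entry Format.

FIELDS = [
    "Timestamp", "ClientIpAddress", "BucketOwnerUserId", "Operation",
    "BucketName", "RequestorUserId", "RequestHeaderSize", "RequestBodySize",
    "ResponseHeaderSize", "ResponseBodySize", "TotalRequestResponseSize",
    "DurationMicrosecs", "UrlEncodedObjectName", "HttpStatus", "S3RequestId",
    "Etag", "ErrorCode", "SourceBucketAndObject", "GroupId",
    "CanonicalUserId", "IamUserId", "RoleSessionArn", "RoleSessionId",
    "TempCredentialsAccessKey", "UserAgent",
]

def parse_request_log_entry(line):
    """Split a pipe-delimited request log entry into named fields."""
    # maxsplit keeps any "|" characters inside the trailing UserAgent intact
    values = line.rstrip("\n").split("|", len(FIELDS) - 1)
    return dict(zip(FIELDS, values))

entry = parse_request_log_entry(
    "2021-01-12 01:20:59,403|10.50.200.157||getService||nusr|714|0|55|324"
    "|1093|18816||200|21ed3781-63a0-1891-a017-000c29b5a219|0|0|||"
    "a9bfb2fc43f8e5b3802ab7ea028268e2|||||aws-sdk-java/1.11.723"
)
print(entry["Operation"], entry["HttpStatus"], entry["DurationMicrosecs"])
# prints: getService 200 18816
```

Note that fields not applicable to a given request (for example BucketName in a service-level request like getService) come through as empty strings.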
Note Cloudian HyperIQ is a solution for dynamic visualization and analysis of HyperStore system monitoring data and S3 service usage data. HyperIQ is a separate product available from Cloudian that deploys as a virtual appliance on VMware or VirtualBox and integrates with your existing HyperStore system. For more information about HyperIQ contact your Cloudian representative.

Log Entry Example
2021-01-12 01:20:59,403|10.50.200.157||getService||nusr|714|0|55|324|1093|18816||200|21ed3781-63a0-1891-a017-000c29b5a219|0|0|||a9bfb2fc43f8e5b3802ab7ea028268e2|||||aws-sdk-java/1.11.723 Linux/3.10.0-

Logging Level
Not applicable

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 100MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as cloudian-request-info.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 2GB or if oldest rotated file age reaches 180 days.
If you use a load balancer in front of your S3 Service (as would typically be the case in a production environment), then the ClientIpAddress in your S3 request logs will by default be the IP address of a load balancer rather than that of the end client. If you want the S3 request logs to instead show the end client IP address, your options depend on what load balancer you're using.

If your load balancer is HAProxy or a different load balancer that supports the PROXY Protocol, enable S3 support for the PROXY Protocol (see "s3_proxy_protocol_enabled" (page 439) in common.csv) and configure your load balancer to use the PROXY Protocol for relaying S3 requests to the S3 Service. Consult with your Cloudian Sales Engineering or Support representative for guidance on load balancer configuration.
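For example, with HAProxy the PROXY Protocol is enabled per backend server via the send-proxy option. The sketch below is illustrative only -- the backend name, server names, and addresses are placeholders, and your actual HAProxy configuration will differ:

```
backend hyperstore_s3
    mode tcp
    balance roundrobin
    server hs-node1 10.0.0.11:80 check send-proxy
    server hs-node2 10.0.0.12:80 check send-proxy
```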
Otherwise, you can use the HTTP X-Forwarded-For header instead:

1. Configure your load balancers so that they pass the HTTP X-Forwarded-For header to the S3 Service. This is an option only if you run your load balancers in "HTTP mode" rather than "TCP mode". Consult with your Cloudian Sales Engineering or Support representative for guidance on load balancer configuration.
2. Configure your S3 Service to support the X-Forwarded-For header. You can enable S3 Service support
for this header by editing the configuration file s3.xml.erb on your Configuration Master node. The
needed configuration lines are already in that file; you only need to uncomment them.
Before uncommenting:
<!-- Uncomment the block below to enable handling of X-Forwarded- style headers -->
<!--
<Call name="addCustomizer">
<Arg><New class="org.eclipse.jetty.server.ForwardedRequestCustomizer"/></Arg>
</Call>
-->
After uncommenting:
<!-- Uncomment the block below to enable handling of X-Forwarded- style headers -->
<Call name="addCustomizer">
<Arg><New class="org.eclipse.jetty.server.ForwardedRequestCustomizer"/></Arg>
</Call>
After making this configuration edit, do a Puppet push and restart the S3 Service to apply your
change.
Location
Note All nodes participate in tiering objects to their intended destination systems. In the case of regular auto-tiering based on user-defined schedules, the tiering processing workload is randomly spread across the nodes in the service region in which the cron job host resides. In the case of Bridge Mode tiering (also known as proxy tiering), whichever node processes the S3 uploading of an object into its source bucket also processes the immediate tiering of that object to its intended destination system. For more information on auto-tiering see "Auto-Tiering Feature Overview" (page 126).

Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS|Command|Protocol|SourceBucket/Object|SourceObjectVersion|TargetBucket|TargetObjectVersion|ObjectSize|TotalRequestSize|Status|DurationMicrosecs|Mode|AttemptCount

l The Mode field will be one of AUTO_TIERING (for regular auto-tiering), BRIDGE_MODE (for Bridge Mode auto-tiering), or SCHEDULER (for retry attempts for Bridge Mode tiering of objects).

l The AttemptCount is applicable only to Bridge Mode tiering and associated retries, and indicates how many attempts have been made to tier the object. This field's value will be "-1" if the Mode is AUTO_TIERING.

Logging Level
Not applicable

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day,
Location
Note Whichever S3 Service node processes a PUT of an object into a source bucket configured for CRR will be the node that initiates the replication of the object to the destination bucket. This node will have an entry for that object in its CRR request log. In the case of retries of replication attempts that failed with a temporary error the first time, the retries will be logged in the CRR request log on the cron job node. For general information on the cross-region replication feature, see "Cross-Region Replication Feature Overview" (page 138).

Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS|SourceBucket/Object|ObjectVersionId|DestinationBucket|CrrOperation|Status|DurationMillisecs|Size

Log Entry Example
2020-03-04 15:04:58,423|prod-2020-01-16/0299-0166180000362773.pdf|fe15a1de-de25-f8af-9a28-b4a9fc08ca7e|prod-backup-2020-01-16|REPLICATEOBJECT|COMPLETED|61|328471

Logging Level
Not applicable

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day,
Location
Note For information about the WORM feature see "Object Lock Feature Overview" (page 144).

Log Entry Format
DateTime|Hostname|S3RequestId|S3Operation|Headers|Bucket|Object|ObjectVersionId|CanonicalUserId|CanonicalIamUserId|StatusCode|StatusMessage

Example 1: TGD30 for a Governance mode configuration with 30 day retention period
Example 2: T for a bucket with object lock enabled, but no default object lock configuration

If the submitter of the request is an IAM user, the log entry shows the canonical user ID of the IAM user as well as the canonical user ID of the parent user account.

Log Entry Example
2019-10-06 09:59:30,767|arcturus|a9d7e5b1-e85a-11e9-8519-52540014b047|s3:GetBucketObjectLockConfiguration|TGD90|newbucket|||b584fb57480af5108e32d17f10c5cb7b||200|OK

Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as s3-worm.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
Location
Note For an SQS overview (including information about how to enable the SQS Service, which is disabled by default) see the SQS section in the Cloudian HyperStore AWS APIs Support Reference.

Log Entry Format
yyyy-mm-dd HH:mm:ss,SSS PriorityLevel[ThreadId]MessageCode ClassName:MESSAGE

Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as cloudian-sqs-req.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
Location
Note For an SQS overview (including information about how to enable the SQS Service, which is disabled by default) see the SQS section in the Cloudian HyperStore AWS APIs Support Reference.

Default Logging Level
INFO

Default Rotation and Retention Policy
Rotation occurs if live file size reaches 10MB. Rotation also occurs at end of each day, regardless of live file size.
Rotated files are named as cloudian-sqs-req.log.YYYY-MM-DD.i.gz, where i is a rotation counter that resets back to 1 after each day. Rotated files are compressed with gzip.
Deletion of oldest rotated log file occurs if aggregate rotated files size (after compression) reaches 100MB or if oldest rotated file age reaches 180 days.
Note After making any configuration file edits, be sure to trigger a Puppet sync-up and then restart
the affected service (for example, the S3 Service if you've edited the log4j-s3.xml.erb file).
Within a particular log’s RollingRandomAccessFile instance there are these editable settings:
l PatternLayout pattern="<pattern>" — The log entry format. This flexible formatting configuration is similar to the printf function in C. For detail see PatternLayout from the online Apache Log4j2 documentation.
l TimeBasedTriggeringPolicy interval="<integer>" — Roll the log after this many days pass. (More precisely, the log rolls after interval number of time units pass, where the time unit is the most granular unit of the date pattern specified within the filePattern element — which in the case of all HyperStore logs' configuration is a day). Defaults to rolling once a day if interval is not specified. All HyperStore logs use the default of one day.
l SizeBasedTriggeringPolicy size="<size>" — Roll the log when it reaches this size (for example "10 MB"). Note that this trigger and the TimeBasedTriggeringPolicy operate together: the log will be rolled if either the time based trigger or the size based trigger occurs.
l IfLastModified age="<interval>" — When a rolled log file reaches this age the system automatically
deletes it (for example "180d").
l IfAccumulatedFileSize exceeds="<size>" — When the aggregate size (after compression) of rolled log files for this log reaches this size, the system automatically deletes the oldest rolled log file (for example "100 MB"). Note that this setting works together with the IfLastModified setting -- old rolled log files will be deleted if either the age based trigger or the aggregate size based trigger occurs.
For each log's default value for the settings above, see the "HyperStore Logs" (page 540) overview section.
In the log4j-*.xml.erb files, in addition to RollingRandomAccessFile instances there are also Logger instances. Each Logger instance contains an AppenderRef element that indicates which log that Logger instance applies to, by referencing the log's RollingRandomAccessFile name (for example AppenderRef ref="S3APP" means that the Logger instance is associated with the S3 application log). Note that multiple Logger instances may be associated with the same log — this just means that multiple core components of a service (for example, multiple components within the S3 Service) have separately configurable loggers. If you're uncertain about which Logger instance to edit to achieve your objectives, consult with Cloudian Support.
The Logger instances are where you can configure a logging level, using the level attribute:
l level="<level>" — Logging level. The following levels are supported (only events at the configured level and above will be logged):
o OFF = Turn logging off.
o ERROR = Typically a fail-safe for programming errors or a server running outside the normal operating conditions. Any error which is fatal to the service or application.
o WARN = Anything that can potentially cause application oddities, but where the server can continue to operate or recover automatically. Exceptions caught in "catch" blocks are commonly at this level.
o INFO = Generally useful information to log (service start/stop, configuration assumptions, etc). Info to always have available. Normal error handling, like a user not existing, is an example.
o DEBUG = Information that is diagnostically helpful.
o TRACE = Very detailed information to "trace" the execution of a request or process through the
code.
o ALL = Log all levels.
For each log's default log level, see the "HyperStore Logs" (page 540) overview section.
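Putting the settings above together, a RollingRandomAccessFile instance and an associated Logger instance in one of the log4j-*.xml.erb files have broadly the following shape. This is an illustrative sketch only -- the names, paths, pattern, and values shown here are placeholders, not the contents of an actual HyperStore template:

```xml
<RollingRandomAccessFile name="S3APP"
    fileName="/var/log/cloudian/cloudian-s3.log"
    filePattern="/var/log/cloudian/cloudian-s3.log.%d{yyyy-MM-dd}.%i.gz">
  <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss,SSS} %p[%t] %c:%m%n"/>
  <Policies>
    <!-- Day is the most granular unit in filePattern, so this rolls daily -->
    <TimeBasedTriggeringPolicy/>
    <SizeBasedTriggeringPolicy size="10 MB"/>
  </Policies>
  <DefaultRolloverStrategy>
    <Delete basePath="/var/log/cloudian">
      <IfFileName glob="cloudian-s3.log.*.gz">
        <IfAny>
          <IfLastModified age="180d"/>
          <IfAccumulatedFileSize exceeds="100 MB"/>
        </IfAny>
      </IfFileName>
    </Delete>
  </DefaultRolloverStrategy>
</RollingRandomAccessFile>

<!-- The Logger name here is a placeholder; the level attribute sets the logging level -->
<Logger name="com.example.s3" level="INFO" additivity="false">
  <AppenderRef ref="S3APP"/>
</Logger>
```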
To use the HSH to view HyperStore log files, first log into the Configuration Master node (via SSH) as an
HSH user. Upon successful login the HSH prompt will appear as follows:
<username>@<hostname>$
For example:
sa_admin@hyperstore1$
$ hslog /var/log/cloudian/<logfilename>
Note You must include the file path /var/log/cloudian/ -- not just the log file name.
For example:
$ hslog /var/log/cloudian/cloudian-admin.log
In the background this invokes the Linux command less to display the log file. Therefore you can use the standard keystrokes supported by less to navigate the display; for example, Space to page forward, b to page back, / to search for a pattern, and q to quit.
For details about the logs you can view, see "HyperStore Logs" (page 540).
566
7.4. Host and Network
Note
* For guidance regarding how many nodes you should use to meet your initial workload requirements, consult with your Cloudian sales representative.
* Running HyperStore on VMware ESXi and vSphere is supported, so long as the VMs have comparable specs to those described below. However, avoid KVM or Xen as there are known problems with running HyperStore in those virtualization environments. For more guidance on deploying HyperStore on VMware, ask your Cloudian representative for the "Best Practices Guide: Virtualized Cloudian HyperStore on VMware vSphere and ESXi".
Note If you plan to use erasure coding for object data storage, 2 CPUs
per node is recommended. Also, be aware that the higher your erasure
coding m value (such as with k+m = 9+3 or 8+4), the higher the need for
metadata storage capacity. Consult with your Cloudian representative to
ensure that you have adequate metadata storage capacity to support your
use case.
If you have not already done so, install CentOS 7.4 Minimal or newer (or RHEL 7.4 or newer) in accordance
with your hardware manufacturer's recommendations.
Below, see these additional requirements related to host systems on which you intend to install HyperStore:
l "Partitioning of Disks Used for the OS and Metadata Storage" (page 568)
l "Host Firewall Services Must Be Disabled" (page 568)
l "Python 2.7.x is Required" (page 569)
l "Do Not Mount /tmp Directory with 'noexec'" (page 569)
l "root User umask Must Be 0022" (page 569)
For the disks used for the OS and metadata storage -- typically two mirrored SSDs as noted in the hardware
requirements table above -- do not accept the default partition schemes offered by CentOS/RHEL:
l By default CentOS/RHEL allocates a large portion of disk space to a /home partition. This will leave
inadequate space for HyperStore metadata storage.
l By default CentOS/RHEL proposes using LVM. Cloudian recommends using standard partitions
instead.
Cloudian recommends that you manually create a suitable standard partition scheme for these disks.
To install HyperStore the following services must be disabled on each HyperStore host machine:
l firewalld
l iptables
l SELinux
To disable firewalld:
# systemctl stop firewalld
# systemctl disable firewalld
RHEL/CentOS 7 uses firewalld by default rather than the iptables service (firewalld uses iptables commands
but the iptables service itself is not installed on RHEL/CentOS by default). So you do not need to take action in
regard to iptables unless you installed and enabled the iptables service on your hosts. If that's the case, then
disable the iptables service.
To disable SELinux, edit the configuration file /etc/selinux/config so that SELINUX=disabled. Save your change
and then restart the host.
HyperStore includes a built-in firewall service (a HyperStore-custom version of the firewalld service) that is
configured to protect HyperStore internal services while keeping HyperStore public services open. In fresh
installations of HyperStore 7.2 or later, the HyperStore firewall is enabled by default upon the completion of
HyperStore installation. In HyperStore systems originally installed as a version older than 7.2 and then later
upgraded to 7.2 or newer, the HyperStore firewall is available but is disabled by default. After installation of or
upgrade to HyperStore 7.2 or later, you can enable or disable the HyperStore firewall by using the installer's
Advanced Configuration Options.
Note For information about HyperStore port usage see "HyperStore Listening Ports" (page 577).
The HyperStore installer requires Python version 2.7.x. The installer will abort with an error message if any
host is using Python 3.x. To check the Python version on a host:
# python --version
Python 2.7.5
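If you are scripting your own pre-flight checks, the version requirement can be expressed as a simple test. This is an illustrative sketch, not part of the HyperStore installer, and the check_python_version function name is hypothetical:

```shell
# Hypothetical helper: succeed only for a Python 2.7.x version string.
check_python_version() {
  case "$1" in
    2.7.*) return 0 ;;
    *)     return 1 ;;
  esac
}

# On a live host you would feed it the interpreter's reported version;
# note that Python 2.x prints its version to stderr:
#   check_python_version "$(python --version 2>&1 | awk '{print $2}')"
check_python_version "2.7.5" && echo "Python version OK"
```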
The /tmp directory on your host machines must not be mounted with the 'noexec' option. If the /tmp directory is
mounted with 'noexec', you will not be able to extract the HyperStore product package file and the HyperStore
installer (installation script) will not function properly.
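A quick way to script this check is to look for 'noexec' among a mount's options. The helper below is an illustrative sketch (the has_noexec name is hypothetical):

```shell
# Hypothetical helper: succeed if a comma-separated mount-option string
# contains the 'noexec' option.
has_noexec() {
  case ",$1," in
    *,noexec,*) return 0 ;;
    *)          return 1 ;;
  esac
}

# On a live host you could test /tmp's options from /proc/mounts:
#   has_noexec "$(awk '$2 == "/tmp" {print $4}' /proc/mounts)" && echo "remount /tmp"
has_noexec "rw,nosuid,nodev,noexec,relatime" && echo "noexec is set"
```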
On hosts on which you will install HyperStore, the root user umask value must be '0022' (which is the default on
Linux hosts). If the root user umask is other than '0022' the HyperStore installation will abort.
Cloudian recommends that you use the HyperStore system_setup.sh tool to configure the disks and mount
points on your HyperStore nodes, as described in Configuring Network Interfaces, Time Zone, and Data Disks.
The tool is part of the HyperStore product package (when you extract the .bin file).
If you do not use the system setup tool for disk setup, use the information below to make sure that your
hosts meet HyperStore file system requirements.
l HyperStore will by default use the drive that the OS is on for storing system metadata (in the Metadata
DB, the Credentials DB, and the QoS DB). Cloudian recommends that you dedicate two drives to the
OS and system metadata in a RAID-1 mirroring configuration. Preferably the OS/metadata drives
should be SSDs.
l You must format all other available hard drives with ext4 file systems mounted on raw disks. These
drives will be used for storing S3 object data. RAID is not necessary on the S3 object data drives.
l Mirror the OS on the two SSDs. For more detailed recommendations for partitioning these disks see
"Partitioning of Disks Used for the OS and Metadata Storage" (page 568).
l Format each of the 12 HDDs with ext4 file systems and configure mount points such as /cloudian1,
/cloudian2, /cloudian3 and so on.
Note On the HDDs for storing object data, HyperStore does not support XFS file systems; VirtIO disks; Logical Volume Manager (LVM); or Multipathing. For questions regarding these unsupported technologies, contact Cloudian Support.
For example, if each of your hosts has 12 disks for object storage, then on all your hosts you could name the mount points /cloudian1, /cloudian2, /cloudian3, and so on up through /cloudian12.
If in your installation cluster some hosts have more disks than others, use as much overlap in mount point naming as possible. For example, suppose that most of your hosts have 10 disks for storing object data while one host has 12 disks. In this scenario, all of the hosts can have mount points /cloudian1, /cloudian2, /cloudian3, and so on up through /cloudian10, while the one larger host has those same mount points plus also /cloudian11 and /cloudian12.
Note Although uniformity of mount point naming across nodes (to the extent possible) is desirable for simplicity's sake, the HyperStore installation does support a way to accommodate differences in the number or names of mount points across nodes -- this is described in "A Data Directory Mount Point List (fslist.txt) Is Required" (page 572).
7.4.2.3. Option for Putting the Metadata DB on Dedicated Drives Rather Than the OS Drives
Regarding the Metadata DB (built on Cassandra), another supported configuration is to put your Cassandra
data on dedicated drives, rather than on the OS drives. In this case you would have:
l OS drives in RAID-1 configuration. The Credentials DB and QoS DB will also be written to these drives.
l Cassandra drives in RAID-1 configuration. On these drives will be written Cassandra data and also the
Cassandra commit log.
Note You must create a Cassandra data directory named as <mountpoint>/cassandra (for example cassandradb/cassandra) and a Cassandra commit log directory named as <mountpoint>/cassandra_commit (for example cassandradb/cassandra_commit).
l Multiple drives for S3 object data (with mount points for example /cloudian1, /cloudian2, /cloudian3 and
so on), with no need for RAID protection.
If you are not using UUIDs in fstab currently, follow the instructions below to modify your fstab so that it uses UUIDs for the devices to which you will mount S3 object data directories (you do not need to do this for the OS/metadata mount points).
1. Check whether your fstab is currently using UUIDs for your S3 object data drives. In the example below,
there are two S3 object data drives and they are currently identified by device name, not by UUID.
# cat /etc/fstab
...
...
/dev/sdb1 /cloudian1 ext4 rw,noatime,barrier=0,data=ordered,errors=remount-ro 0 1
/dev/sdc1 /cloudian2 ext4 rw,noatime,barrier=0,data=ordered,errors=remount-ro 0 1
3. Retrieve the UUIDs for your devices by using the blkid command.
# blkid
...
...
/dev/sdb1: UUID="a6fed29c-97a0-4636-afa9-9ba23e1319b4" TYPE="ext4"
/dev/sdc1: UUID="rP38Ux-3wzO-sP3Y-2CoD-2TDU-fjpO-ffPFZV" TYPE="ext4"
For example, after revision the line for /dev/sdb1 would look like this:
UUID=a6fed29c-97a0-4636-afa9-9ba23e1319b4 /cloudian1 ext4 rw,noatime,barrier=0,data=ordered,errors=remount-ro 0 1
6. After editing fstab so that each device on which you will store S3 data is identified by a UUID, save your
changes and close the fstab file.
7. Remount the host’s file systems:
# mount -a
Repeat this process for each host on which you will install HyperStore.
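The revised fstab line format can be sketched as a small helper. This is illustrative only (the emit_fstab_line name is hypothetical); the mount options match the example fstab shown above:

```shell
# Hypothetical helper: print a UUID-based fstab entry for one data disk,
# using the same ext4 mount options as the example fstab above.
emit_fstab_line() {
  # $1 = filesystem UUID (e.g. from 'blkid -s UUID -o value <device>')
  # $2 = mount point (e.g. /cloudian1)
  printf 'UUID=%s %s ext4 rw,noatime,barrier=0,data=ordered,errors=remount-ro 0 1\n' "$1" "$2"
}

emit_fstab_line a6fed29c-97a0-4636-afa9-9ba23e1319b4 /cloudian1
```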
Note If you use the system_setup.sh script to configure the disks and mount points on your nodes, the
script creates the needed mount point list files automatically and you can ignore the instructions below.
If all your nodes have the same data mount points -- for example if all nodes have as their data mount points
/cloudian1, /cloudian2, and so on through /cloudian12 -- you only need to create one mount point list file. If
some nodes have a different set of mount points than do other nodes -- for example if some nodes have more
data disks than other nodes -- you will need to create a default mount point list file and also a node-specific
mount point list file for each node that differs from the default.
In your installation staging directory create a file named fslist.txt and in the file enter one line for each of your
S3 data directory mount points, with each line using the format below.
<deviceName> <mountPoint>
/dev/sdc1 /cloudian1
/dev/sdd1 /cloudian2
...
Optionally, you can also include an entry for the Cassandra data directory and an entry for the Cassandra commit log directory, if you do not want this data to be put on the same device as the operating system (see "Option for Putting the Metadata DB on Dedicated Drives Rather Than the OS Drives" (page 571)). If you do not specify these Cassandra directory paths in fslist.txt, then by default the system automatically puts the Cassandra data and commit log directories on the same device on which the operating system resides.
Do not use symbolic links when specifying your mount points. The HyperStore system does not support sym-
bolic links for data directories.
If some of your hosts have data directory mount point lists that differ from the cluster default, in the installation staging directory create a <hostname>_fslist.txt file for each such host. For example, along with the default fslist.txt file that specifies the mount points that most of your hosts use, you could also have a cloudian-node11_fslist.txt file and a cloudian-node12_fslist.txt file that specify mount points for two non-standard nodes that have hostnames cloudian-node11 and cloudian-node12.
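If you generate fslist.txt from live mount information rather than writing it by hand, one approach is to filter the mount table down to the /cloudianN entries. An illustrative sketch (the filter_cloudian_mounts name is hypothetical):

```shell
# Hypothetical helper: from "<device> <mountpoint>" lines (as in /proc/mounts
# fields 1-2), keep only the /cloudianN data mount points, in fslist.txt format.
filter_cloudian_mounts() {
  awk '$2 ~ /^\/cloudian[0-9]+$/ {print $1, $2}'
}

# Example with literal input; on a live host you might pipe in the first two
# fields of /proc/mounts instead.
printf '%s\n' "/dev/sda1 /" "/dev/sdc1 /cloudian1" "/dev/sdd1 /cloudian2" |
  filter_cloudian_mounts
```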
On each S3 object data disk, the reserved block percentage should be set to 0 so that all of the disk's capacity is available for object storage. To check a device's current settings (including its reserved block count):
# tune2fs -l <device>
To set the reserved block percentage to 0 for a device:
# tune2fs -m 0 <device>
For example:
# tune2fs -m 0 /dev/sdc1
For your HyperStore system to be accessible to external clients, you must configure your DNS name servers with entries for the HyperStore service endpoints. Cloudian recommends that you complete your DNS configuration prior to installing the HyperStore system. This section describes the required DNS entries.
Note If you are doing just a small evaluation and do not require that external clients be able to access any of the HyperStore services, you have the option of using the lightweight domain resolution utility dnsmasq, which comes bundled with HyperStore, rather than configuring your DNS environment to support HyperStore service endpoints. During installation of HyperStore software you can use the configure-dnsmasq option if you want to use dnsmasq for domain resolution. Details are in the software installation procedure.
By default the HyperStore system uses a standard format for each service endpoint, building on two values that are specific to your environment: your organization's domain and your HyperStore service region name(s).
During HyperStore installation you will supply your domain and your service region names, and the interactive installer will show you the default service endpoints derived from the domain and region names. During installation you can accept the default endpoints or specify custom endpoints instead. The table that follows below is based on the default endpoint formats.
Note
* Including the string "s3" in your domain or in your region name(s) is not recommended. By default HyperStore generates S3 service endpoints by prepending an "s3-" prefix to your <regionname>.<domain> combination. If you include "s3" within either your domain or your region name, this will result in two instances of "s3" in the system-generated S3 service endpoints, and this may cause S3 service requests to fail for some S3 clients.
* If you specify custom endpoints during installation, do not use IP addresses in your endpoints.
* HyperStore by default derives the S3 service endpoint(s) as s3-<regionname>.<domain>. However HyperStore also supports the format s3.<regionname>.<domain> (with a dot after the "s3" rather than a dash) if you specify custom endpoints with this format during installation.
The table below shows the default format of each service endpoint. The examples show the service endpoints
that the system would automatically generate if the domain is enterprise.com and the region name is boston.
l IAM Service endpoint (one per entire system) -- iam.<domain> (for example iam.enterprise.com). This is the service endpoint for accessing HyperStore's implementation of the Identity and Access Management API.
l STS Service endpoint (one per entire system) -- sts.<domain> (for example sts.enterprise.com). This is the service endpoint for accessing HyperStore's implementation of the Security Token Service API.
Note Resolve the STS endpoint to the same address as the IAM endpoint, or use a CNAME to map the STS endpoint to the IAM endpoint.
l SQS Service endpoint (one per entire system) -- s3-sqs.<domain> (for example s3-sqs.enterprise.com). This is the service endpoint for accessing HyperStore's implementation of the Simple Queue Service (SQS) API.
Note The SQS Service is disabled by default.
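The default endpoint derivation described above can be sketched as simple string construction (the variable names are illustrative):

```shell
# Illustrative only: default HyperStore endpoint formats, built from your
# domain and region name as described above.
domain="enterprise.com"
region="boston"

s3_endpoint="s3-${region}.${domain}"   # one per region
iam_endpoint="iam.${domain}"           # one per entire system
sts_endpoint="sts.${domain}"           # one per entire system
sqs_endpoint="s3-sqs.${domain}"        # one per entire system

echo "$s3_endpoint"
```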
IMPORTANT ! Cloudian Best Practices suggest that a highly available load balancer be used in production environments where consistent performance behavior is desirable. For environments where a load balancer is unavailable, other options are possible. Please consult with your Cloudian Sales Engineer for alternatives.
For a production environment, in your DNS configuration each HyperStore service endpoint should resolve to
the virtual IP address(es) of two or more load balancers that are configured for high availability. For more detail
see "Load Balancing" (page 577).
If you want to use a custom S3 endpoint that does not include a region string, the installer allows you to do so. Note however that if your S3 endpoints lack region strings, the system will not be able to support the region name validation aspect of AWS Signature Version 4 authentication for S3 requests (but requests can still succeed without the validation).
If you want to use multiple S3 endpoints per service region -- for example, having different S3 endpoints resolve to different data centers within one service region -- the installer allows you to do this. For this approach, the recommended syntax is s3-<regionname>.<dcname>.<domain> -- for example s3-boston.dc1.enterprise.com and s3-boston.dc2.enterprise.com.
Note If you change any endpoints, be sure to update your DNS configuration.
IMPORTANT ! Cloudian Best Practices suggest that a highly available load balancer be used in production environments where consistent performance behavior is desirable. For environments where a load balancer is unavailable, other options are possible. Please consult with your Cloudian Sales Engineer for alternatives. The discussion below assumes that you are using a load balancer.
HyperStore uses a peer-to-peer architecture in which each node in the cluster can service requests to the S3, Admin, CMC, IAM, STS, and SQS service endpoints. In a production environment you should use load balancers to distribute S3, Admin, CMC, IAM, STS, and SQS service endpoint requests evenly across all the nodes in your cluster. In your DNS configuration the S3, Admin, CMC, IAM, STS, and SQS service endpoints should resolve to the virtual IP address(es) of your load balancers; and the load balancers should in turn distribute request traffic across all your nodes. Cloudian recommends that you set up your load balancers prior to installing the HyperStore system.
For high availability it is preferable to use two or more load balancers configured for failover between them (as opposed to having just one load balancer, which would then constitute a single point of failure). The load balancers could be commercial products, or you can use open source technologies such as HAProxy (load balancer software for TCP/HTTP applications) and Keepalived (for failover between two or more load balancer nodes). If you use software-defined solutions such as these open source products, for best performance you should install them on dedicated load balancing nodes -- not on any of your HyperStore nodes.
For a single-region HyperStore system, for each service configure the load balancers to distribute request
traffic across all the nodes in the system.
l Configure each region's S3 service endpoint to resolve to load balancers in that region, which distribute
traffic across all the nodes within that region.
l Configure the Admin, IAM, STS, SQS, and CMC service endpoints to resolve to load balancers in the
default service region, which distribute traffic to all the nodes in the default service region. (You will
specify a default service region during the HyperStore installation process. For example, you might
have service regions boston and chicago, and during installation you can specify that boston is the
default service region.)
For detailed guidance on load balancing set-up, request a copy of the HyperStore Load Balancing Best
Practice Guide from your Cloudian Sales Engineering representative.
Note The HyperStore S3 Service supports PROXY Protocol for incoming connections from a load balancer. This is disabled by default, but after HyperStore installation is complete you can enable it by configuration if you wish. For more information see "s3_proxy_protocol_enabled" (page 439) in common.csv.
Each HyperStore node includes a built-in HyperStore Firewall that implements port restrictions appropriate to a HyperStore cluster. The HyperStore Firewall is disabled by default in HyperStore systems that were originally installed as a version older than 7.2, and enabled by default in HyperStore systems that were originally installed as version 7.2 or newer. You can enable/disable the firewall on all HyperStore nodes by using the installer's Advanced Configuration Options.
Note If you are installing HyperStore across multiple data centers and/or multiple service regions,
the HyperStore nodes in each data center and region will need to be able to communicate with the
HyperStore nodes in the other data centers and regions. This includes services that listen on the
internal interface (such as Cassandra, the HyperStore Service, and Redis). Therefore you will need to
configure your networking so that the internal networks in each data center and region are connected
to each other (for example, by using a VPN).
l S3 Service -- port 81, all interfaces: requests relayed by an HAProxy load balancer using the PROXY Protocol.
l S3 Service -- port 4431, all interfaces: requests relayed by an HAProxy load balancer using the PROXY Protocol with SSL (if enabled by configuration).
l Admin Service -- port 19443, all interfaces. IMPORTANT ! The Admin Service is intended to be accessed only by the CMC and by system administrators using other types of clients (such as cURL). Do not expose the Admin Service to a public network.
l Redis Monitor -- port 9078, internal interface: communication between primary and backup Redis Monitor instances.
Each HyperStore node includes a built-in HyperStore Firewall that is pre-configured with settings appropriate for a typical HyperStore deployment. The HyperStore Firewall is either enabled or disabled by default depending on whether your original HyperStore installation was older than version 7.2:
l In systems originally installed as version 7.2 or newer, the HyperStore Firewall is enabled by default.
l In systems originally installed as a version older than 7.2 and then later upgraded to 7.2 or newer, the HyperStore Firewall is available but is disabled by default.
You can enable or disable the HyperStore Firewall by using the installer's Advanced Configuration Options, as
described below in "Enabling or Disabling the HyperStore Firewall" (page 582). When the Firewall is
enabled, you can optionally customize certain aspects of the Firewall's behavior, as described further below in
"Customizing the HyperStore Firewall" (page 584).
Cloudian, Inc. strongly recommends using a firewall to protect sensitive internal services such as Cassandra, Redis, and so on, while allowing access to public services -- particularly in environments where no dedicated internal interface(s) have been specified during HyperStore installation, or the internal, back-end network is not a closed network available only between HyperStore nodes. The pre-configured HyperStore Firewall serves this purpose, if enabled. Alternatively, if you have upgraded to HyperStore 7.2 from an older version and you already have a custom firewall in place that you have been successfully using with HyperStore, you may prefer to keep using that firewall -- since in HyperStore 7.2 the HyperStore Firewall is limited as to how much it can be customized.
If you have upgraded to HyperStore 7.2 from an older version and you do wish to enable the HyperStore Firewall rather than continuing to use your own custom firewall, then before enabling the HyperStore Firewall do the following:
l If you have created custom firewalld Zone and Service configuration files, make a backup copy of those
files for your own retention needs. When you enable the HyperStore Firewall your existing Zone and
Service configuration files will be deleted from the /etc/firewalld directory.
l Disable your existing firewall service on each node. For example, to disable firewalld do the following
on each node:
# systemctl stop firewalld
# systemctl disable firewalld
The HyperStore installer will not allow you to enable the HyperStore Firewall on your nodes if existing
firewall rules are in effect on any of the HyperStore nodes.
Note The HyperStore Firewall service is implemented as a custom version of the firewalld service,
named cloudian-firewalld.
1. On the Configuration Master node, change into your current HyperStore version installation staging
directory. Then launch the installer.
# ./cloudianInstall.sh
Or, if you are using the HyperStore Shell (HSH):
$ hspkg install
Once launched, the installer's menu options (such as referenced in the steps below) are the same
regardless of whether it was launched from the HSH command line or the OS command line.
4. At the Cloudian HyperStore Firewall Configuration menu enter a for "Enable/Disable Cloudian HyperStore Firewall".
5. In the Enable/Disable Cloudian HyperStore Firewall interface, the Firewall's current status is displayed.
At the prompt enter enable to enable the Firewall or disable to disable the Firewall. After the interface
indicates that the Firewall is set as you specified, press any key to return to the Firewall Configuration
menu.
6. At the Firewall Configuration menu enter x for "Apply configuration changes and return to previous menu". Then at the next prompt that displays enter yes to confirm that you want to apply your configuration changes. This will apply your change to all nodes in your HyperStore system (all nodes in all of your data centers and service regions).
The installer interface should then display a status message "result: OK" for each node, to indicate the successful applying of your configuration change to each node.
Note If the installer displays a Warning about one or more nodes not responding you can try the Firewall Configuration menu's x option again, and this retry may work for the node(s) if the problem the first time was a temporary network issue. If one of your nodes is down, the node will automatically be updated with your configuration change when the node comes back online.
You are now done with enabling or disabling the HyperStore Firewall. You do not need to do a Puppet push
or restart any services.
l On all IP interfaces, all TCP ports will allow inbound traffic originating from other HyperStore nodes.
In a multi-DC or multi-region system, this includes inbound traffic originating from HyperStore nodes in
other DCs or regions.
l On all IP interfaces, only the following TCP ports will allow inbound traffic that originates from
sources other than HyperStore nodes:
o Admin HTTP service port (18081 by default)
o Admin HTTPS service port (19443 by default)
o CMC HTTP service port (8888 by default)
o CMC HTTPS service port (8443 by default)
o IAM HTTP service port (16080 by default)
o IAM HTTPS service port (16443 by default)
o S3 HTTP service port (80 by default)
o S3 HTTPS service port (443 by default)
o S3 PROXY HTTP service port (81)
o S3 PROXY HTTPS service port (4431)
o SSH service port (22)
o SQS HTTP service port (18090 by default)
o SQS HTTPS service port (18443 by default)
Traffic originating from sources other than HyperStore nodes will be blocked (DROP'd) on all TCP ports
other than those listed above.
Note The Firewall also allows incoming ICMP traffic originating from sources other than HyperStore nodes.
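The allow-list above can be expressed as a lookup, which is handy when auditing a node's exposure. This sketch assumes the default port numbers listed above (the is_public_port name is hypothetical):

```shell
# Hypothetical helper: succeed if a TCP port is one that the HyperStore
# Firewall leaves open to non-cluster sources, assuming default port numbers.
is_public_port() {
  case "$1" in
    18081|19443|8888|8443|16080|16443|80|443|81|4431|22|18090|18443) return 0 ;;
    *) return 1 ;;
  esac
}

is_public_port 8443 && echo "CMC HTTPS: open to external clients"
is_public_port 9042 || echo "blocked for non-cluster sources"
```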
When the Firewall is enabled, the Firewall configuration will automatically adjust to system changes in the following ways:
l If you resize your cluster by adding or removing nodes, or by adding or removing a data center or service region, the Firewall accommodates this change automatically. In the case of expanding your cluster, the Firewall will be automatically enabled on new nodes, and the Firewall on the existing nodes will allow inbound traffic from the new nodes, on any port. In the case of removing nodes, the Firewall on the existing nodes will be updated such that the removed nodes are no longer part of the cluster and can only access the cluster's public services.
Note If the Firewall is disabled when you add new nodes to your cluster, the Firewall will also be
disabled on the new nodes.
l If you change the port number that a particular HyperStore public service uses, the Firewall's configuration is automatically adjusted accordingly. After you complete a port change you only need to apply the updated Firewall configuration to the cluster. For instructions see "Changing S3, Admin, or CMC Listening Ports" (page 513).
1. On the Configuration Master node, change into your current HyperStore version installation staging directory. Then launch the installer.
# ./cloudianInstall.sh
Or, if you are using the HyperStore Shell (HSH):
$ hspkg install
Once launched, the installer's menu options (such as referenced in the steps below) are the same
regardless of whether it was launched from the HSH command line or the OS command line.
4. At the Cloudian HyperStore Firewall Configuration menu enter the menu letter corresponding to the service for which you want to deny or allow access (for example h for S3 HTTP).
5. In the Allow/Deny Access to Service <Service Type> interface, the service's current setting is displayed
("Allow" or "Deny"). At the prompt enter deny to deny access to the service or allow to allow access to
the service. After the interface indicates that the service is set as you specified, press any key to return
to the Firewall Configuration menu.
6. At the Firewall Configuration menu enter x to apply your configuration changes. Then at the next
prompt that displays enter yes to confirm that you want to apply your configuration changes. This will
apply your change to all nodes in your HyperStore system (all nodes in all of your data centers and ser-
vice regions).
The installer interface should then display a status message "result: OK" for each node, to indicate the
successful applying of your configuration change.
Note If the installer displays a Warning about one or more nodes not responding you can try the Firewall Configuration menu's x option again, and this retry may work for the node(s) if the problem the first time was a temporary network issue. If one of your nodes is down, the node will automatically be updated with your configuration change when the node comes back online.
You are now done with customizing the HyperStore Firewall. You do not need to do a Puppet push or restart
any services.
l Smart Support — The Smart Support feature (also known as "Phone Home") securely transmits HyperStore daily diagnostic information to Cloudian Support over the internet. HyperStore supports configuring this feature to use an explicit forward proxy for its outbound internet access (after installation, the relevant settings in common.csv are phonehome_proxy_host and the other phonehome_proxy_* settings that follow after it). To use a forward proxy with this feature you should configure your forward proxy to support access to *.s3-support.cloudian.com (that is, to any sub-domain of s3-support.cloudian.com).
l Auto-Tiering and Cross-Region Replication — If you want to use either the auto-tiering feature or the
cross-region replication feature (CRR), the S3 Service running on each of your HyperStore nodes
requires outbound internet access. These features do not support configuring an explicit forward proxy,
but you can use transparent forward proxying if you wish. (Setting up transparent forward proxying is
outside the scope of this documentation.)
l Pre-Configured ntpd — Accurate, synchronized time across the cluster is vital to HyperStore service. In each of your HyperStore data centers four of your HyperStore nodes are automatically configured to act as internal NTP servers. (If a HyperStore data center has only four or fewer nodes, then all the nodes in the data center are configured as internal NTP servers.) These internal NTP servers are configured to connect to external NTP servers — by default the public servers from the pool.ntp.org project. In order to connect to the external NTP servers, the internal NTP servers must be allowed outbound internet access. This feature does not support configuring an explicit forward proxy, but you can use transparent forward proxying if you wish. (Setting up transparent forward proxying is outside the scope of this documentation.)
IMPORTANT ! If you do not allow HyperStore hosts to have outbound connectivity to the internet, then during the interactive installation process -- when you are prompted to specify the NTP servers that HyperStore hosts should connect to -- you must specify NTP servers within your environment, rather than the public NTP servers that HyperStore connects to by default. If HyperStore hosts cannot connect to any NTP servers, the installation will fail.
After HyperStore installation, to see which of your HyperStore nodes are internal NTP servers, log into
the CMC and go to Cluster → Cluster Config → Cluster Information. On that CMC page you can also
see your configured list of external NTP servers.
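The selection rule described above -- four internal NTP servers per data center, or all nodes if the data center has four or fewer -- can be sketched as follows (the function name is illustrative):

```shell
# Illustrative only: number of nodes per data center that act as internal
# NTP servers, per the rule described above.
internal_ntp_count() {
  if [ "$1" -le 4 ]; then
    echo "$1"
  else
    echo 4
  fi
}

internal_ntp_count 3
internal_ntp_count 10
```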