
Lightbits Installation Guide

Lightbits Version: v3.2.1

Lightbits Labs

April 2023


Contents
Lightbits™ v3.x Installation and Configuration Guide 4

About the Installation Guide 5

Lightbits Cluster Overview 6


Lightbits Cluster Topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Planning for the Lightbits Cluster Software Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Installation Files Backup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Lightbits Cluster Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Failure Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

Lightbits Cluster Software Installation Process 11


Installation Preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
General System Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Lightbits Server Prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Required Ports for Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Relevant Lightbits Support Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

Lightbits Cluster Software Installation 15


Before You Begin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Connecting to the Lightbits Software Repository . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Verifying Network Connectivity for the Servers in the Cluster . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Configuring the Ansible Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Prepare Installation Workstation (Ansible Controller) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Ansible Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Docker Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Copying the Ansible Environment Tarball . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Inventory Structure and Adding the Ansible Hosts File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Multi-Tenancy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Red Hat Linux Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Configuring Global Variables in Ansible . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
NTP vs Chrony . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Use Lightos Kernel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Single IP Dual NUMA Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Verifying Hosts Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Testing Connectivity via Ping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Testing Connectivity via SSH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Defining Configuration Files for Each “Ansible Host” (Server) in the Cluster . . . . . . . . . . . . . . . . . . . 31
Defining Failure Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Setting the SSD Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Confirming the Required Directory Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Running the Ansible Installation Playbook to Install Lightbits Cluster Software . . . . . . . . . . . . . . . . . . 38
Lightbits Cluster Installation Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Ansible Installation Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Prebuilt Ansible Docker Installation Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Verify Successful Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Post Installation Steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Back Up Important Content . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Check Cluster Health . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

Linux Cluster Client Software Installation 44


Connecting to the Cluster Client DEB Repository . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Connecting to the Cluster Client RPM Repository . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Installing the New Kernel on CentOS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Configuring the Client to Boot from the New Kernel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Installing the Lightbits NVMe Command Line Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
How To Replace With Latest Version From Lightbits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47


Installing the Lightbits NVMe Command Line Interface (Ubuntu) . . . . . . . . . . . . . . . . . . . . . . . . . 48


Loading the NVMe/TCP Host Software and Enabling Multipath . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Load the NVME TCP module . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Multipath . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Reboot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

Provisioning Storage and Connecting the Cluster Client to Lightbits 50


Creating a Volume on the Lightbits Storage Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Connecting the Cluster Client to Lightbits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Confirming the Cluster Client Connection to Lightbits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

Troubleshooting 55
Ansible Role Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
SSH Strict Key Errors When Using sshpass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Free Space in Linux OS for etcd Logical Volume Manager Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Recovering from Cluster Installation Failure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Log Artifacts Collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Fully Clean Lightbits From Servers or Cluster . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

Appendixes 59
Host Configuration File Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Host Configuration File Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Example 1: Data Network Interface Manually Configured . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Example 2: Data Network Interface Automatically Configured . . . . . . . . . . . . . . . . . . . . . . . . . 63
Example 3: Override the Lightbits Configurations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Example 4: Provide Custom Datapath Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Example 5: Use the Linux Volume Manager (LVM) Partition for etcd Data . . . . . . . . . . . . . . . . . 65
Example 6: Profile-Generator Overrides . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Example 7: Dual Instance Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Example 8: Single IP Dual NUMA Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Performing an Offline Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Offline Ansible Controller Installation and Self-Signed Certificates . . . . . . . . . . . . . . . . . . . . . 70
Configuring the Data Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Automatic Data Network Configuration (Recommended) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Manual Data Network Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
etcd Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Using SSH-Key Authentication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Network Time Protocol Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Automated Client Connectivity Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Configuring Grafana and Prometheus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Prerequisite . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Installing Grafana and Prometheus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Outcome . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Lightbits Monitoring Integration with Existing Grafana and Prometheus . . . . . . . . . . . . . . . . . . . 75
Using Grafana and Prometheus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Open TCP Ports and Verify . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
Installation Behind HTTP-Proxy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Single-IP-Dual-NUMA Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Adding a JWT Token To a Configuration File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Running lbcli From a Non-Lightbits Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

About - Legal 99


Lightbits™ v3.x Installation and Configuration Guide


The Lightbits™ cluster storage solution distributes services and replicates data across different Lightbits servers to guarantee service and data availability when one or more Lightbits servers experience transient or permanent failures. A cluster of Lightbits servers replicates data internally and keeps it fully consistent and available in the presence of failures. From the perspective of clients accessing the data, data replication is transparent, and server failover is seamless.
Lightbits also protects the storage cluster from additional failures not related to the SSDs (e.g., CPU, memory, or NIC failures), software failures, network failures, or rack power failures. It provides additional data security through in-server Erasure Coding (EC) that protects servers from SSD failures, and it enables non-disruptive maintenance routines that temporarily disable access to storage servers (e.g., TOR firmware upgrades).


About the Installation Guide


This installation guide is for system administrators who are installing the Lightbits storage server software. It includes
instructions for installing the Lightbits cluster software, installing the cluster client software, and connecting clients to the
Lightbits Cluster.
Use the information in this installation guide to:
• Plan for the Lightbits cluster software installation in your environment.
• Successfully install the software so that a cluster of Lightbits servers is ready for use.
Lightbits Labs™ recommends that you follow the installation instructions in the order that they are written to ensure a
successful installation.


Lightbits Cluster Overview


This section provides you with information about the major components of the Lightbits cluster software solution and
how they work together.
It also contains recommended best practice tips for collecting the information required to use the automated installation script. For the installation script to download and install the Lightbits software onto your system's storage nodes, you must have details about your specific environment, such as your networking details.

Lightbits Cluster Topology


The following is a basic diagram that shows the components and resources required to automatically install Lightbits
software onto your servers in your data center.

Note: There are two types of installation methods: the online installation method, which connects to online
repositories to download the Lightbits software; and the offline installation method, which grabs the Lightbits
software files locally from the Ansible host. Note that for the offline method, Lightbits will provide the software
files in advance. This guide mainly covers the online installation method, but notes are provided on what differs
with the offline installation method as well.

Figure 1: Lightbits Cluster Resources

Based on the numbers next to each component or resource in the diagram, see the following table for a description of the
components and resources in the Lightbits cluster topology diagram.
Lightbits Cluster Topology Components Table

1. dl.lightbitslabs.com: Lightbits-supplied configuration files and installation tools, delivered via a remote repository.

2. dl.lightbitslabs.com: The Lightbits software is maintained in a password-protected software repository, referred to as "The Lightbits Repo".

3. Preferred Linux repo.org: There are many publicly available Linux repositories that are already configured in your environment. Some standard tools or updated Linux kernel files might need to be downloaded from these repos to allow your clients to perform NVMe/TCP functions. These standard files are not available from Lightbits but are part of the core operating system.

4. Public Network Time Protocol Server: Lightbits cluster nodes remain in sync using NTP or Chrony. This is automatically configured by the Ansible installation script; however, a custom time service configuration is possible.

5. Management Network: To connect to and configure a given server, the standard Secure Shell (SSH) protocol is used. Each server must be reachable and is configured over this management network. This network is separate from the network that Lightbits uses to send and receive application data. It is also possible to use the same network for both management and data.

6. Data Network: Acts as the interconnection between the enterprise's "Clients" or "Application Servers" and the Lightbits servers. This is a separate network from the management network and carries all of the NVMe/TCP data traffic.

7. Cluster Installation Workstation: The server where you download the Lightbits installation software, which consists of an Ansible script. This server must be outside of the planned cluster. It automatically downloads the Lightbits cluster software files from the dl.lightbitslabs.com repo to each of the Lightbits cluster servers. Note: For information about the system requirements for the installation workstation, see System Requirements.

8. Lightbits Cluster Servers: The cluster servers store the application data. Each node makes up the data storage portion of a Lightbits cluster. All of the Lightbits software from "The Lightbits Repo" is installed on these servers.

9. Clients: The enterprise's application servers, where your applications live. The "client" part of the cluster is connected to the Lightbits cluster via the data network. It might be necessary to update the client kernel using a standard repo manager program such as "yum" in the case of CentOS or RHEL.

Planning for the Lightbits Cluster Software Installation


At a very high level, the following command automatically installs a Lightbits cluster:
ansible-playbook -i ansible/inventories/cluster_example/hosts playbooks/deploy-lightos.yml

When this command completes, you will have a Lightbits cluster.


To use this Ansible command successfully, you will need to prepare the environment and provide the Ansible software with information about your data center's specific environment in specific configuration files. These include the hosts file seen above and other Ansible YAML files. This means that you should gather some details and enter them into text files that Ansible uses during the Lightbits installation operations.
The remainder of this Installation Guide is organized into the following flow:
• Section 3 covers preparing the environment.
• Section 4 covers setting up, configuring, and running the Ansible installation, which deploys/installs the Lightbits cluster.
• Sections 5 and 6 cover connecting a client to a live cluster.
• Section 7 covers additional troubleshooting.
• Section 8 covers additional information that can be useful for other sections.
The installation process generally follows the path in the following diagram. When Ansible runs, it reads the text files you
configured (in step 3), connects to the Lightbits software repository (in step 1), and downloads Lightbits software onto
each storage server that will exist in the cluster (in step 4).

Installation Files Backup


During the installation process, Ansible generates the certificate files required by etcd (as used by Lightbits), the API service, and Admin. These files are not needed for day-to-day operation, but they are very important when changes are required in the cluster (adding, replacing, or recovering a server).
Lightbits recommends backing up the lightos-certificates directory (whether you install using Ansible directly or using lb-docker) for future use, if required.
If the certificate files are lost, Lightbits can help work through a procedure to regenerate those files.
Additionally, we advise backing up the Ansible configuration files in the ansible directory and the created JWT files:
lightos-system-jwt and lightos-default-admin-jwt.
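As a minimal sketch, a single tar command can capture these files once they exist. The paths below are assumptions based on the ~/light-app layout used later in this guide; adjust them to wherever the certificates, Ansible files, and JWT files were actually generated in your environment.

$ tar -czf lightos-install-backup-$(date +%F).tgz \
      ~/light-app/ansible \
      ~/light-app/lightos-certificates \
      ~/light-app/lightos-system-jwt \
      ~/light-app/lightos-default-admin-jwt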

Lightbits Cluster Architecture


The Lightbits cluster storage solution distributes services and replicates data across different Lightbits servers. This
guarantees service and data availability when one or more Lightbits servers experience transient or permanent failures. A
cluster of Lightbits servers replicates data internally and keeps it fully consistent and available in the presence of failures.
From the perspective of clients accessing the data, data replication is transparent, and server failover is seamless.


Figure 2: Lightbits Installation Path

Lightbits also protects the storage cluster from additional failures not related to the SSDs (e.g., CPU, memory, NICs),
software failures, network failures, or rack power failures. It provides additional data security through in-server Erasure
Coding (EC) that protects servers from SSD failures and enables non-disruptive maintenance routines that temporarily
disable access to storage servers (e.g., TOR firmware upgrades).
The following sections describe the failure domain and volume components used in the Lightbits cluster architecture.

Note: For more information about Lightbits cluster architecture, see the Deploying Reliable High-Performance
Storage with Lightbits Whitepaper.

Nodes
Each server can be split into multiple logical nodes. Each logical node owns a specific set of SSDs and CPUs, and a
portion of the RAM and NVRAM. The physical network can be shared or exclusive per node.
Nodes can be across NUMAs or per NUMA. There is no relation or limitation between a logical node and the NUMA of
the resources used by the logical node.
Each storage server runs a single Node Manager service. The service controls all the logical nodes of the storage server.

Note: The current Lightbits release only supports up to two logical nodes per server. The single logical node
deployment is commonly referred to as “single instance, node or NUMA deployment”. Dual logical node deployment
is referred to as “dual instance, node or NUMA deployment”.

Failure Domains
Users define Failure Domains (FDs) based on the data center topology and the level of protection that they strive to achieve.
Each server in the cluster can be assigned to a set of FDs.
An example of an FD definition is separating racks of servers by FD labels. In this case, all servers in the same rack are
assigned the same FD label, while servers in different racks are assigned distinct labels (e.g., FD label = rack ID). Two
replicas of the same volume will not be located on two nodes in the same rack.
The system stores different replicas of the data on separate FDs to keep data protected from failures.
The definition of an FD is expressed by assigning FD labels to the storage nodes. Single or multiple FD labels can be
assigned to every node.


Another example of an FD definition is grid topology, in which every node is assigned a label of a row and a label of a
column. In this case, the volume is not stored on two servers that are placed on the same row or on the same column.

Note: Per the previous section, servers can be configured using a single or dual instance. The same Failure Domain rules apply to dual instance; in addition, replicas of the same volume will never be placed on the two nodes of the same server. This is because any server failure will usually affect both nodes.

For more information on Failure Domain configuration, see the Lightbits Administration Guide.


Lightbits Cluster Software Installation Process


The process of installing Lightbits products includes the installation of the Lightbits software on the storage servers. It
can also include the installation of a new kernel on the client, if the client's kernel version is older than v5.3.5.
The following chart summarizes the steps for completing the Lightbits cluster software installation and required actions
on the clients.

Figure 3: Lightbits Clustering Multipath Replication Design

Lightbits recommends you complete each of these steps in the order that they are written to ensure a successful software
installation and connection between the Lightbits Storage Server and the clients.

Note: To complete the installation process, you must have the Lightbits Installation - Customer Addendum
that was sent to you by Lightbits. The customer addendum contains customer-specific information and is referred
to throughout the installation procedure.

Installation Preparation
Before you begin the installation, Lightbits recommends that you create a reference table to list the networking and server
names you will use for your Lightbits cluster. The following is an example of a table you can use with the Configuring
the Ansible Environment section.
Installation Planning Table

Note: The following represents a cluster with three Lightbits servers and a single client.

Server Name   Role                        Management Network IP   Data NIC Interface Name   Data NIC IP    NVMe Drives
server00      Lightbits Storage Server 1  192.168.16.22           ens1                      10.10.10.100   6
server01      Lightbits Storage Server 2  192.168.16.92           ens1                      10.10.10.101   6
server02      Lightbits Storage Server 3  192.168.16.32           ens1                      10.10.10.102   6
client00      Client                      192.168.16.45           ens1                      10.10.10.103   N/A

Note that during this installation, client00 will function as the Ansible installation host. However, any server that has SSH connectivity to the management IP of each server will work.


This table appears throughout this installation guide to help you follow the Lightbits installation process, to show the
progress you have made to complete the installation, and to successfully configure a cluster of servers.
Additional relevant information about this cluster:
• The servers are Centos 7.9; however, other OSs are supported with different build releases. For additional information, see General System Requirements.
• The Lightbits GA release will be installed. Our Red Hat build is supported for additional OSs. For additional information, see General System Requirements.
• We will install in Single Instance/NUMA/node mode; however, if the servers have dual or more NUMAs, we could also do a dual instance/NUMA/node installation. Examples for this will be provided.
Lightbits Cluster Installation Process

1. Connecting your installation workstation to Lightbits' software repository
2. Verifying the network connectivity of the servers used in the cluster
3. Setting up an Ansible environment on your installation workstation
4. Installing a Lightbits cluster by running the Ansible installation playbook
5. Updating clients (if required)
6. Provisioning storage, connecting clients, and performing IO tests

General System Requirements


Before you begin installing the Lightbits product, you should be aware of the following installation considerations:
• The system administrator performing this installation must have the following permissions:
– SSH accessibility (needed packages/permissions)
– Root user permissions are required to complete the installation (can use normal user with sudo access).

• The Linux distribution that your clients use must have the NVMe/TCP client-side drivers. These drivers are included
starting with Linux kernel v5.3.5 and above.
– If your system’s Linux distribution does not include this kernel version or a later version, download back-ported
NVMe/TCP client side drivers for specific kernels and distributions from the Lightbits drivers webpage.

Lightbits Server Prerequisites


Consider the following prerequisites for the storage servers that will host the Lightbits software.
• Lightbits recommends that you plan to use two networking interfaces on the Lightbits servers: one for management
and another as a data interface. This is not required, as the data interface can function as both management and
data. For dual instance/node/NUMA configurations, an additional network interface can be used.
• We support the following server-based OS installations: Centos 7.9, Alma/Rocky/Red Hat 8.4, 8.6, 8.7.
• If persistent memory (Intel Optane or NVDIMM) is used, configure the pmem properly per different vendors’ servers.
Please consult with the server vendor or with Lightbits for any additional questions. Please note: you should enable
the memory interleaving in the BIOS/uEFI, and for Intel Optane, you should use the App Direct Mode.
• You must have Python v3.6 (or higher) installed on the Lightbits servers. Additionally, it is advised to have network-scripts, yum-utils, and net-tools installed.
Example commands to install:
$ yum install -y python3
$ yum install -y network-scripts
$ yum install -y yum-utils
$ yum install -y net-tools

• The Lightbits software kernel requires a boot partition with at least 512 MB available.


• To complete the installation process, you will need information from your version of the Lightbits Installation-
Customer Addendum. If you do not have the customer addendum, contact a Lightbits representative to receive
a copy.
• For more information about which Python version supports Ansible, see the Ansible Installation Guide.
• Lightbits comes in two kinds of releases: GA or Red Hat. The GA releases support the Centos OS and come with
a Lightbits-customized kernel. The Red Hat releases support Red Hat-based OS (Alma and Rocky included), and
require a specific system kernel to be installed. For more, see Red Hat Linux Installation.
The following table details the supported Lightbits operating systems and kernels.

Lightbits Release Release Type Kernel Version Supported OS


3.2.1~b1237 RHEL 4.18.0_425.19.2.el8 Alma, Red Hat 8.6
3.2.1~b1245 RHEL 4.18.0_372.9.1.el8 Alma, Red Hat 8.6
3.2.1~b1236 GA 4.14.252_0017303255861b045c6f9_rel_lb Centos 7.9
3.1.2~b1127 RHEL 4.18.0_425.3.1.el8 Alma, Red Hat 8.6
3.1.2~b1130 RHEL 4.18.0_372.9.1.el8 Alma, Red Hat 8.6
3.1.2~b1125 GA 4.14.252_0017303255861b045c6f9_rel_lb Centos 7.9
3.1.1~b1119 RHEL 4.18.0-425.3.1.el8 Alma, Red Hat 8.6
3.1.1~b1118 RHEL 4.18.0-372.9.1.el8 Alma, Red Hat 8.6
3.1.1~b1116 GA 4.14.252_0017303255861b045c6f9_rel_lb Centos 7.9
3.0.5~b1107 RHEL 4.18.0-372.32.1.el8_6 Alma, Red Hat 8.6
3.0.5~b1105 RHEL 4.18.0-372.9.1.el8 Alma, Red Hat 8.6
3.0.5~b1102 GA 4.14.252-0017303255861b045c6f9_rel_lb Centos 7.9
3.0.4~b1085 RHEL 4.18.0-372.32.1.el8_6 Alma, Red Hat 8.6
3.0.3~b1062 RHEL 4.18.0-372.26.1.el8_6 Alma, Red Hat 8.6
3.0.3~b1061 RHEL 4.18.0-372.9.1.el8 Alma, Red Hat 8.6
3.0.3~b1059 GA 4.14.252_0017303255861b045c6f9_rel_lb Centos 7.9
2.3.22~b1031 RHEL 4.18.0-372.26.1.el8_6 Alma, Red Hat 8.6
3.0.1~b1007 RHEL 4.18.0-372.19.1.el8_6 Alma, Red Hat 8.6
3.0.1~b1004 GA 4.14.252_0017303255861b045c6f9_rel_lb Centos 7.9
2.3.20~b988 RHEL 4.18.0-372.19.1.el8_6 Alma, Red Hat 8.6
2.3.19~b962 RHEL 4.18.0-372.16.1.el8_6 Alma, Red Hat 8.6
2.3.18~b951 RHEL 4.18.0-372.13.1.el8_6 Alma, Red Hat 8.6
2.3.17~b930 RHEL 4.18.0-305.12.1.el8_4 Red Hat 8.4
2.3.17~b927 RHEL 4.18.0-372.9.1.el8 Alma, Red Hat 8.6
2.3.17~b923 GA 4.14.252_001730324769e3ea3c709_rel_lb Centos 7.9
2.3.16~b887 RHEL 4.18.0-305.12.1.el8_4 Red Hat 8.4
2.3.16~b886 GA 4.14.252_001730324769e3ea3c709_rel_lb Centos 7.9
2.3.14~b806 RHEL 4.18.0-305.12.1.el8_4 Red Hat 8.4
2.3.14~b805 GA 4.14.252_001730324769e3ea3c709_rel_lb Centos 7.9
2.3.12~b793 GA 4.14.252_001730324769e3ea3c709_rel_lb Centos 7.9
2.3.8~b664 GA 4.14.216_41421769bde239058b6e_rel_lb Centos 7.9

Note: For Lightbits GA releases, the kernel version shown is installed on the servers by the Ansible installation. For
Lightbits RHEL releases, the kernel version shown must be pre-installed on the servers in order for the Ansible installation
of Lightbits to work.
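For example, before starting a RHEL-based installation you can confirm the running kernel on each storage server and compare it against the table above for your chosen Lightbits build (the output below is an illustrative value for the 3.2.1~b1237 RHEL entry):

$ uname -r
4.18.0-425.19.2.el8.x86_64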

Required Ports for Installation


The Lightbits cluster software requires access to several ports to complete its installation process.
The following table lists the default ports used by the Lightbits components:
Required Ports

• Management CLI: Management NIC, TCP port 443. Default location: None.
• etcd peer port: Data NIC, TCP port 2380. Default location: roles/etcd/defaults/main.yml.
• Exporter port: Management NIC, TCP port 8090. Default location: roles/install-lightos/defaults/main.yml.
• Duroslight port: Data NIC, TCP ports 4420 and 8009. Default location: roles/install-lightos/defaults/main.yml. Note: The NVMe client connects to Duroslight via this port.
• Replicator port: Data NIC, TCP port 22226. Default location: roles/install-lightos/defaults/main.yml. Note: Other nodes connect to this node for replication via this port.

Note: If using Single IP Dual Numa configuration (see Single-IP-Dual-NUMA Configuration), open the above
ports and two additional ports: 4421 and 22227. Duroslight will use 4420 and the additional 4421 port. Replicator
will use 22226 and the additional 22227 port.

See the Open TCP Ports and Verify appendix for examples of how to open and test TCP ports.
If you need to check a port's accessibility, you can use the following procedure with the open-source nmap program:
1. Install the open-source nmap program with the following command:
$ yum install -y nmap
Note: If testing port accessibility from a non-rpm/yum based operating system, the installation will differ, but the commands below should still work, as nmap installs and relies on nc (netcat).
2. Check a port's accessibility with either of the following commands:
$ nc -v -z <ip> <start port>-<end port>
or
$ nc -v -u <ip> <start port>-<end port>
3. You must have the netcat program running in listen mode on the server you are testing, with the following command:
$ nc -l -p <port>
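As a concrete illustration using values from this guide, the following checks the Duroslight and Replicator ports on server00's data IP (10.10.10.100) from another cluster server. This is only a sketch; if nothing is listening yet, start nc -l -p <port> on server00 first, as described in step 3.

$ for port in 4420 8009 22226; do nc -v -z 10.10.10.100 $port; done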

Relevant Lightbits Support Documentation


This installation and configuration guide is part of a documentation set that provides complete information about using
Lightbits products.
This document set includes the following Lightbits Support documentation.

• Lightbits Installation Guide (this document): Contains the instructions to install the Lightbits cluster software, install the Linux cluster client software, and then connect the cluster client to Lightbits.
• Installation Guide - Customer Addendum: Includes customer-specific passwords to access installation files.
• Lightbits Administration Guide: Provides detailed information about the operations you can perform using the Lightbits lbcli CLI command and REST API. Note: After you complete the installation process in this document, refer to the Administration Guide for important management and automation instructions.
• User's Manual: Lightbits REST and CLI API: Lists the low-level details for REST API and CLI command usage. This document is typically used as a reference manual when building and administering the system. Note: See the Administration Guide for detailed examples of using the REST API and CLI commands.

The following diagram shows how to use the documents to install, test, and maintain Lightbits products, and how the
above referenced documents can be used to support the typical user experience.


Figure 4: Lightbits Documentation Set

Lightbits Cluster Software Installation


Installation Planning Table
The table below details a Lightbits cluster installed on three servers, with a connected client.

Server Name   Role                        Management Network IP   Data NIC Interface Name   Data NIC IP    NVMe Drives
server00      Lightbits Storage Server 1  192.168.16.22           ens1                      10.10.10.100   6
server01      Lightbits Storage Server 2  192.168.16.92           ens1                      10.10.10.101   6
server02      Lightbits Storage Server 3  192.168.16.32           ens1                      10.10.10.102   6
client00      Client                      192.168.16.45           ens1                      10.10.10.103   N/A

Note that:
• The servers are Centos 7.9; however, other OSs are supported with different build releases. For additional information, see General System Requirements.
• The Lightbits GA release will be installed. Our Red Hat build is supported for additional OSs. For additional information, see General System Requirements.
• We will install using Single Instance/NUMA/node mode; however, if the server supports dual or more NUMAs, we could also do a dual instance/NUMA/node installation. Example Ansible configurations are provided in Host Configuration File Examples.

This section includes:
• Before You Begin
• Connecting to the Lightbits Software Repository
• Verifying Network Connectivity for the Servers in the Cluster
• Configuring the Ansible Environment
• Running the Ansible Installation Playbook to Install Lightbits Cluster Software

Before You Begin


Lightbits recommends that you plan to use two networking interfaces for the Lightbits cluster installation: one for management and another for data storage traffic.
Before you begin, it is recommended to review the General System Requirements and Required Ports sections of this guide.
Notes:
• The data interfaces must be on the same subnet (either pre-configured on the interfaces or provided as input for Ansible).
• To install the cluster software, you need the Ansible application-deployment tool, v4.2.0 or later.
• The Python netaddr module is required; it is used to represent and manipulate network addresses.
• Multiple Ansible tags are supported (for cleanup, for example) by using comma-separated tags.
• Based on the placement of SSDs in the server, check whether you need to allow cross-NUMA devices in the profile.


Also, review the data networking and NVMe drive placement of the servers. This will be important during the installation
configuration phase.
The online installation requires an internet connection and the configuration of several files on your system. Verify that the file repository URL is accessible and that the RPMs are updated. The data interfaces of each server must be pre-configured on the same subnet. In our example the subnet is 10.10.10.0/24 (if the data interfaces are not pre-configured, they can be configured later using Ansible).
Additionally, check the NUMA placement of the NVMe drives. The command below shows which NUMA node each NVMe drive belongs to (you can update your table with this information). Note that the main example of this installation section assumes that each server has six NVMe drives in NUMA 0.
$ lspci -mm | grep -Ei "nvme|SSD|Non-Volatile memory controller" | awk '{print $1}' | \
    xargs -I{} bash -c 'D=/sys/bus/pci/devices/0000:{}/; echo -n "$D: "; echo $(cat $D/numa_node 2> /dev/null), $(cat $D/label 2> /dev/null)' | nl
# The example output is unrelated to the cluster we are installing. However, it shows how to
# interpret the command output. This shows 8 drives in NUMA 0 and 8 drives in NUMA 1.
# The column after numa_node shows the NUMA ID.
 1  /sys/bus/pci/devices/0000:62:00.0/ numa_node 0
 2  /sys/bus/pci/devices/0000:63:00.0/ numa_node 0
 3  /sys/bus/pci/devices/0000:64:00.0/ numa_node 0
 4  /sys/bus/pci/devices/0000:65:00.0/ numa_node 0
 5  /sys/bus/pci/devices/0000:66:00.0/ numa_node 0
 6  /sys/bus/pci/devices/0000:67:00.0/ numa_node 0
 7  /sys/bus/pci/devices/0000:68:00.0/ numa_node 0
 8  /sys/bus/pci/devices/0000:69:00.0/ numa_node 0
 9  /sys/bus/pci/devices/0000:b3:00.0/ numa_node 1
10  /sys/bus/pci/devices/0000:b4:00.0/ numa_node 1
11  /sys/bus/pci/devices/0000:b5:00.0/ numa_node 1
12  /sys/bus/pci/devices/0000:b6:00.0/ numa_node 1
13  /sys/bus/pci/devices/0000:b7:00.0/ numa_node 1
14  /sys/bus/pci/devices/0000:b8:00.0/ numa_node 1
15  /sys/bus/pci/devices/0000:b9:00.0/ numa_node 1
16  /sys/bus/pci/devices/0000:ba:00.0/ numa_node 1
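As an optional cross-check (a sketch that assumes the standard Linux sysfs layout, where each NVMe controller's PCI device exposes a numa_node attribute), you can also read the NUMA node of each controller directly:

$ for d in /sys/class/nvme/nvme*; do echo "$(basename $d): NUMA node $(cat $d/device/numa_node)"; done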

As noted above, the online installation requires an internet connection on the Lightbits servers and the configuration of several files on your system.

Note: An offline installation method is available that does not require an internet connection to access the file
repository URL. For more information, see Performing an Offline Installation.

Connecting to the Lightbits Software Repository


Lightbits Cluster Installation Process

1. Connecting your installation workstation to Lightbits' software repository
2. Verifying the network connectivity of the servers used in the cluster
3. Setting up an Ansible environment on your installation workstation
4. Installing a Lightbits cluster by running the Ansible installation playbook
5. Updating clients (if required)
6. Provisioning storage, connecting clients, and performing IO tests

Notes:
- To proceed, see the Linux Repo File Customer TOKEN section in your Lightbits Installation Customer Addendum for the token that is required to access the yum repository. Access to this repository is required to install the Lightbits cluster software.
- Contact Lightbits Support if you do not have this addendum document.
- If you are using the offline installation method, you can skip this step and proceed to Verifying Network Connectivity for the Servers in the Cluster.
- For information on installing Red Hat, see Red Hat Linux Installation. Note that Red Hat releases will have a slightly different baseurl, which will be visible in the Lightbits Installation Customer Addendum.

Verify that you have the TOKEN & baseurl for the Lightbits RPM Repository. Log in to any of the future Lightbits
servers and test the connection to the repository.

Note: Ideally you will want to test from each Lightbits server. However, testing on one and verifying that the rest
have internet connectivity should be sufficient. If one of the servers is not able to reach the repository, there will
be clear error messages during the install, which can be resolved later.

1. In your preferred text editor, open a new file at the following path (CentOS): /etc/yum.repos.d/lightos.repo
2. Copy the following template into the file (a filled-in example appears after these steps).

# Lightbits repository
[lightos]
name=lightos
baseurl=https://dl.lightbitslabs.com/<YOUR_TOKEN>/lightos-3-<Minor Ver>-x-ga/rpm/el/7/$basearch
repo_gpgcheck=0
enabled=1
gpgcheck=0
autorefresh=1
type=rpm-md

For the <YOUR_TOKEN>, enter the Lightbits token that was included in your copy of the Lightbits Installation Customer Addendum.
Verify that the baseurl path is correct against the Lightbits Installation Customer Addendum, specifically the parts after the <YOUR_TOKEN>.
3. Save the lightos.repo file.
4. Verify your system’s connectivity to the repository by entering the yum repolist command. This command displays
the enabled software repositories. For example:
$ yum repolist

Make sure that the command exits successfully. If it shows any error, address those before continuing.
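For reference, a filled-in lightos.repo for this v3.2.1 example might look like the following. The token shown is a made-up placeholder and the exact path after the token can differ per release, so always copy the token and full baseurl from your Lightbits Installation Customer Addendum.

# Lightbits repository - example only; replace the token with your own
[lightos]
name=lightos
baseurl=https://dl.lightbitslabs.com/EXAMPLE_TOKEN_0123456789/lightos-3-2-x-ga/rpm/el/7/$basearch
repo_gpgcheck=0
enabled=1
gpgcheck=0
autorefresh=1
type=rpm-md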

Note: For information on installing Red Hat, see Red Hat Linux Installation.


Verifying Network Connectivity for the Servers in the Cluster


Lightbits Cluster Installation Process

1. Connecting your installation workstation to Lightbits' software repository
2. Verifying the network connectivity of the servers used in the cluster
3. Setting up an Ansible environment on your installation workstation
4. Installing a Lightbits cluster by running the Ansible installation playbook
5. Updating clients (if required)
6. Provisioning storage, connecting clients, and performing IO tests

Lightbits recommends that you verify the network connectivity for the servers you plan to use in the Lightbits cluster
before you run the Ansible playbook. To simply confirm the connectivity status, use a ping command for each of the
management NIC IPs and data NIC IPs in the servers.
Referring back to the Installation Planning Table, the example uses three Lightbits servers. Each server has a management
IP.
Before proceeding with the installation, enter the following ping command from the Ansible installation host, to confirm
that each Lightbits server is accessible via the Management Network IP.
$ ping -c 4 192.168.16.22
PING 192.168.16.22 (192.168.16.22) 56(84) bytes of data.
64 bytes from 192.168.16.22: icmp_seq=1 ttl=64 time=0.208 ms
--- 192.168.16.22 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.208/0.208/0.208/0.000 ms

Continuing with the example, a ping command is sent to each of the management network IPs and data network IPs.

Server               Management Network IP       Data Network IP
Lightbits server00   ping -c 4 192.168.16.22     ping -c 4 10.10.10.100
Lightbits server01   ping -c 4 192.168.16.92     ping -c 4 10.10.10.101
Lightbits server02   ping -c 4 192.168.16.32     ping -c 4 10.10.10.102
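To run these checks in one pass, a short loop such as the one below can be used (a sketch built from the example IPs in the planning table; substitute your own addresses). Run the management-IP checks from the Ansible installation host and the data-IP checks from one of the Lightbits servers, per the list that follows.

$ for ip in 192.168.16.22 192.168.16.92 192.168.16.32 10.10.10.100 10.10.10.101 10.10.10.102; do
      ping -c 4 -q $ip > /dev/null && echo "$ip reachable" || echo "$ip UNREACHABLE"
  done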

Before continuing, confirm the following connections:
• The Ansible installation host has ping connectivity to each Lightbits storage server's management network IP. It does not need to have data network connectivity.
• Each Lightbits storage server has connectivity to all of the management network IPs.
• Each Lightbits storage server has connectivity to all of the data network IPs.
• Additionally, review the section on Required Ports, and make sure all of those ports are open and accessible on the Lightbits storage servers.

Configuring the Ansible Environment


Lightbits Cluster Installation Process

1. Connecting your installation workstation to Lightbits' software repository
2. Verifying the network connectivity of the servers used in the cluster
3. Setting up an Ansible environment on your installation workstation
4. Installing a Lightbits cluster by running the Ansible installation playbook
5. Updating clients (if required)
6. Provisioning storage, connecting clients, and performing IO tests

This section includes:
• Prepare Installation Workstation (Ansible Controller)
• Copying the Ansible Environment Tarball
• Creating the Inventory Structure and Adding the Ansible Hosts File
• Multi-Tenancy
• Red Hat Linux Installation
• Configuring Global Variables in Ansible
• Verifying Hosts Connection
• Defining Configuration Files for Each "Ansible Host" (Server) in the Cluster
• Defining Failure Domains
• Setting the SSD Configuration
• Confirming the Required Directory Structure

Prepare Installation Workstation (Ansible Controller)


The Ansible Installation host or Ansible Controller is the host running the Ansible playbook to install the
Lightbits cluster.
We support two ways to set up the Ansible Controller:
• Ansible and dependencies installed on the Ansible Controller.
• Using a prebuilt Ansible Docker image.
Choose one of the methods and follow the steps - for either the Ansible or Docker method.

Ansible Method
For the Ansible method, follow these instructions to install the dependencies.

Install Ansible And Dependencies


The following tools are required to complete the Lightbits cluster software installation:
• sshpass or ssh-key authentication

• Python v3.6 or higher


• Python Modules: ansible, netaddr, python_jwt, six
If you have validated the networking environment as described in the previous section, log in to your installation workstation and begin downloading and installing the required tools.

Install sshpass or Use ssh-key Authentication


Ansible needs to run commands remotely on each of the servers used in the cluster. To run these commands, you must set up Secure Shell (SSH) authentication. There are two ways to do this.

• Use the Linux sshpass utility.


To install “sshpass”, enter the following command at the CLI:
$ yum install -y sshpass

• Use ssh-key authentication.


To use ssh-key authentication, see Using SSH-Key Authentication.
• Package required for Multi-Tenancy:
$ yum install -y libselinux-python3

Install the Required Python Version from CentOS Repo


The Ansible installer is a module installed with Python. Lightbits recommends that you have Python v3.6 or above
installed on your system.
If Linux reports that Python 3.x is not installed, use the following command:
$ yum install python36
...
Complete!

Install Ansible Module Using PyPI


Check if Ansible is installed, as well as its version; for example:
$ ansible --version
Command 'ansible' not found

If not found, you can also install Ansible using pip for Python3:
$ pip3 install ansible

Verify that Ansible is installed, as well as its version, by entering:


$ ansible --version
ansible [core 2.11.2]
  config file = None
  configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.9/site-packages/ansible
  ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/local/bin/ansible
  python version = 3.9.2 (default, Feb 19 2021, 17:33:48) [GCC 10.2.1 20201203]
  jinja version = 3.0.1
  libyaml = False


Note: If the installation fails due to a UnicodeEncodeError, it is because the locale is not fully configured on the
Ansible host. Set the LC_ALL environment variable and run the "pip3 install ansible" command again. For
example, for UTF-8 systems, set the locale with: export LC_ALL=en_US.UTF-8

Install Additional Python Modules


The Python netaddr module and python_jwt are also required.
These modules are used to manipulate network addresses, and generate JWT tokens as part of the installation.
At your workstation CLI enter:
$ pip3 install netaddr python_jwt six
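A quick way to confirm that all of these modules are importable is shown below (a sketch; it assumes python3 is the interpreter that pip3 installed into, and it will raise an ImportError naming any missing module):

$ python3 -c "import ansible, netaddr, python_jwt, six; print('required modules OK')"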

Docker Method
Rather than installing and preparing Ansible and its dependencies, you can use a custom Ansible Docker image provided by Lightbits that contains all dependencies needed to deploy the Lightbits cluster.

Using a Prebuilt Ansible Docker Image

Note: Make sure SELinux is in Permissive mode or disabled:

$ getenforce
Permissive
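If getenforce reports Enforcing, it can be switched to Permissive for the current boot with the standard SELinux commands below; to persist the change across reboots, also set SELINUX=permissive in /etc/selinux/config:

$ setenforce 0
$ getenforce
Permissive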

The only prerequisites to use this image are:
• Having Docker installed.
• Access to the Lightbits public registry or a private registry to fetch the lb-ansible image.
• This method requires a Docker login. Log in using the steps below and the credentials provided in the Lightbits Installation Customer Addendum.

$ docker login docker.lightbitslabs.com
Username: lightos-3-<Minor Ver>-x-ga
Password: <YOUR_TOKEN>

Note: The Docker username and password can be extracted from the repository baseurl. The username is the path component after the TOKEN in the baseurl, and the password is the token itself.

Copying the Ansible Environment Tarball


Lightbits Support provided you with an installation tarball along with your Installation Addendum that contains all of
the configuration files that the Ansible playbook requires.
Copy the tarball file to your installation workstation home directory and unpack the tar file using the instructions below.
Create a light-app directory and extract the contents into the new directory.


Note: The directory does not have to be called light-app and does not have to be in $HOME. We only suggest to
do so, as the example going forward through the sections will assume that the installation package was extracted
into $HOME/light-app/.

$ mkdir ~/light-app

$ tar -xvzf light-app-install-environment-v<Version>.tgz -C ~/light-app

Unpacking this tarball creates the following Ansible directory structure inside of the light-app directory, which contains
the Ansible environment where the “ansible-playbook” command runs.
ansible/
├── inventories/
│   └── ...
├── ansible.cfg
├── playbooks/
│   └── ...
├── plugins/
│   └── ...
└── roles/
    └── ...

Inventory Structure and Adding the Ansible Hosts File


The Ansible playbook installer requires configuration files to drive it.
In Ansible terminology, each Lightbits storage server is referred to as a “host”. Details about the Lightbits storage servers
must be entered into the Ansible “hosts” file that is stored in an “inventory” temporary directory structure.
Complete the following steps to configure the Ansible “hosts” file, which describes all of the Lightbits storage server names
and their management IPs.

Notes:
- The servers' hostnames and Ansible names do not have to match. We usually refer to the servers' Ansible names as server00, server01, server02, etc. These become the servers' identifying names within the Lightbits software, so each server will be known as serverXX going forward.
- For the ansible_host field, provide the management IP. However, if the servers are only configured with data IPs and no management IP, then provide the data IPs of the servers (in this case the data IP doubles as both the management and data IP).

1. Open a text editor and edit the copied hosts example file, which is now found at ~/light-app/ansible/inventories/cluster_example/hosts. Replace the ansible_host, ansible_ssh_pass, and ansible_become_user values with your environment's relevant values, for each server that will be in your cluster. Refer to the following example for reference.


server00 ansible_host=192.168.16.22 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_pass=light ansible_become_user=root ansible_become_pass=light
server01 ansible_host=192.168.16.92 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_pass=light ansible_become_user=root ansible_become_pass=light
server02 ansible_host=192.168.16.32 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_pass=light ansible_become_user=root ansible_become_pass=light
client00 ansible_host=192.168.16.45 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_pass=light ansible_become_user=root ansible_become_pass=light

[duros_nodes]
server00
server01
server02

[duros_nodes:vars]
local_repo_base_url=https://dl.lightbitslabs.com/<YOUR_TOKEN>/lightos-3-<Minor Ver>-x-ga/rpm/el/7/$basearch
auto_reboot=true
cluster_identifier=ae7bdeef-897e-4c5b-abef-20234abf21bf

[etcd]
server00
server01
server02

[initiators]
client00

• You can replace the ansible_host flag's value with the interface DNS name or IP address. In this example, the management network IP addresses from the cluster details table are used, not the data network IPs.
• Also in this example hosts file, there is a local_repo_base_url entry that includes <YOUR_TOKEN>. This information was provided to you in the Customer Addendum. You will need to enter this value here before proceeding.
2. Remove the client00 line from the top section and the "[initiators]" section.

Note: It is possible to set up the Ansible files to install and configure clients. However, this section only describes
how to install the Lightbits storage servers. The next section details how to configure and connect clients.


server00 ansible_host=192.168.16.22 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_pass=light ansible_become_user=root ansible_become_pass=light
server01 ansible_host=192.168.16.92 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_pass=light ansible_become_user=root ansible_become_pass=light
server02 ansible_host=192.168.16.32 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_pass=light ansible_become_user=root ansible_become_pass=light

[duros_nodes]
server00
server01
server02

[duros_nodes:vars]
local_repo_base_url=https://dl.lightbitslabs.com/<YOUR_TOKEN>/lightos-3-<Minor Ver>-x-ga/rpm/el/7/$basearch
auto_reboot=true
cluster_identifier=ae7bdeef-897e-4c5b-abef-20234abf21bf

[etcd]
server00
server01
server02

3. Take the following information into account when filling out the “hosts” file:
• The top section of the “hosts” file describes how Ansible connects to the servers to install Lightbits. It also provides a friendly name for each Lightbits server, for example “server00”. These names are used going forward by the Lightbits software as the servers’ identifying names.
• The “duros_nodes” section describes where Lightbits will be installed.
• The “duros_nodes:vars” section describes where Lightbits will be installed from. In this example the repo URL is provided, because this uses the online method; the offline installation method uses a different form of this section.
• The “local_repo_base_url” field must be filled in with the TOKEN and the remainder of the URI to properly direct the Ansible installation to the correct Lightbits repository. Different release versions have slightly different paths. The TOKEN is provided in the Customer Installation Addendum. The “local_repo_base_url” must be correct or the installation will fail. Its value should be the same as the “baseurl” value used in Connecting to the Lightbits Software Repository; if that step worked successfully, the same value is correct here.
• The “auto_reboot” and “cluster_identifier” fields can be left as is.
• The “auto_reboot” field instructs the servers to reboot during the installation. This is an important part of the installation.
• The “cluster_identifier” can be left as is. Note that the cluster also gets an auto-generated ID, the “clusterName” key, which can be changed after installation.
• Lightbits uses “etcd” for the cluster’s key/value database. The “etcd” section describes where etcd will be installed and which servers will become etcd members. Every server appearing in “duros_nodes” must also appear in the “etcd” section.
Host File Server Variables

• local_repo_base_url (Required): Mandatory unless the offline installation method is used. This is the same value entered for the “baseurl” you configured in the Connecting to the Lightbits Software Repository section.
• auto_reboot (Optional): A False value means that the installation will wait for user instructions on whether to reboot after installation. A True value means that the installation will reboot in case of a kernel change without user instructions. The default value is False.
• cluster_identifier (Optional): An identifier of the cluster that is used to filter the logs of a specific cluster.

Note: To use ansible_ssh_private_key_file instead of ansible_ssh_pass, see Using SSH-Key Authentication.
Note: For information on installing Red Hat, see Red Hat Linux Installation.

4. The final “hosts” file will look similar to the output above. Save and exit the ~/light-app/ansible/inventories/cluster_example/hosts file.

Multi-Tenancy
Lightbits v2.2.1 and above enforces tenant isolation on the control plane (“multi-tenancy”). With multi-tenancy, multiple
tenants can share a Lightbits cluster without being able to see or affect each other’s resources when accessing the Lightbits
API or using the Lightbits command line tools.
Command line tools and all other API users must use the v2 Lightbits API. The v2 API includes provisions for authenti-
cation and authorization via standard JSON Web Tokens (“JWTs”), as well as transport security for all API operations.
The following three predefined roles are created by default:
• cluster-admin (system scope)
• admin (project scope)
• viewer (project scope)
Currently, roles cannot be added.
At installation, the user can provide their own certificate and CA to be used by the peers. If these files are not provided,
the installation will generate self-signed certificates.
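For illustration, once the cluster is installed, the command line tools authenticate every call by passing a JWT. The following is a minimal sketch using the system JWT file generated by the installer and the default project (JWT handling and lbcli usage are covered in the post-installation steps; the exact set of lbcli commands available depends on your release):

# Sketch: load the installer-generated JWT into the shell and run an authenticated, project-scoped query.
source ~/lightos-system-jwt
lbcli -J $LIGHTOS_JWT list volumes --project-name=default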

Certificates Directory
By default, certificates are stored at certificates_directory=~/lightos-certificates on the Ansible controller machine.
certificates_directory can be overridden on the command line:
ansible-playbook playbooks/deploy-lightos.yml \
    -e 'certificates_directory=/path/to/certs' ...

Or via group_vars/all.yml:
certificates_directory: /path/to/certs

Certificate Types
Implementing multi-tenancy involves three sets of certificates:
• etcd Certificates For mTLS Peer Communication
• API Service Certificates For TLS
• System Scope Cluster Admin Certificates


etcd Certificates for mTLS Peer Communication


All etcd services serve client APIs only on localhost. This minimizes the exposure of etcd to outside malicious activity.
Peer communication must be encrypted at all times, since etcd passes sensitive traffic between its peers.
The installation script expects the following files to be present at certificates_directory on the Ansible controller
machine:
etcd-ca-key.pem
etcd-ca.pem
{ansible_hostname}-cert-etcd-peer-key.pem
{ansible_hostname}-cert-etcd-peer.pem

• etcd-ca: Certificate authority (CA) parameters for etcd certificates. This CA is used to sign certificates used by
etcd (such as peer and server certificates).
• {ansible_hostname}-cert-etcd-peer: The peer certificate is used by etcd for peer communication.
These files are passed to the following etcd parameters: --peer-cert-file and --peer-key-file.

Note: {ansible_hostname} is the name we gave the etcd node in the hosts file.

Example
A 3-node cluster with server00-02 will result in:
etcd-ca-key.pem
etcd-ca.pem
server00-cert-etcd-peer-key.pem
server00-cert-etcd-peer.pem
server01-cert-etcd-peer-key.pem
server01-cert-etcd-peer.pem
server02-cert-etcd-peer-key.pem
server02-cert-etcd-peer.pem

Notes: - These names are hard-coded in the installation script. Only the source directory can change.
- If these files are not provided, the installation will generate self-signed certificates and place them at
certificates_directory on the Ansible controller machine.

API Service Certificates For TLS


All API endpoints are TLS-enabled by default.
The user can provide their own SSL certificates, or the installation process will generate a self-signed certificate.
These are the files used by api-service to set up TLS communication.
cert-lb-api-service-key.pem
cert-lb-api-service.pem

System Scope Cluster Admin Certificates


These files will be stored in etcd and used to authenticate a system-scope project.
These are the files used to generate system scope credentials:


cert-lb-admin-key.pem
cert-lb-admin.pem

Generating Self-Signed Certificates


The Lightbits installation playbook checks whether certificates_directory exists. If it does not exist, the folder is created and populated with self-signed certificates. If it exists, the playbook verifies that all expected certificate files are present; if any file is missing, the installation will fail.

Notes: - Certificate file names are hard-coded in the installation script. Only the source directory can change. The certificate and key files are pairs and go together.

File name format:
• <name>.pem: Certificate.
• <name>-key.pem: RSA private key that matches the certificate.

- To regenerate the self-signed certificates, delete the certificates_directory and all of its contents.

Bring Your Own Certificates


You can provide your own certificates for each of the components.
You can override some or all of the files before running the deploy-lightos.yml playbook.
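As a rough sketch only, a self-signed etcd CA and a peer certificate pair for server00 could be produced with openssl as follows. The file names follow the hard-coded names the installer expects; the key sizes, validity periods, and subjects are arbitrary, and in practice you will likely also need subjectAltName entries for the peer addresses, which are omitted here for brevity:

openssl genrsa -out etcd-ca-key.pem 4096
openssl req -x509 -new -key etcd-ca-key.pem -subj "/CN=etcd-ca" -days 3650 -out etcd-ca.pem
openssl genrsa -out server00-cert-etcd-peer-key.pem 4096
openssl req -new -key server00-cert-etcd-peer-key.pem -subj "/CN=server00" -out server00.csr
openssl x509 -req -in server00.csr -CA etcd-ca.pem -CAkey etcd-ca-key.pem -CAcreateserial -days 3650 -out server00-cert-etcd-peer.pem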

Red Hat Linux Installation


The following summarizes the key points of distribution-specific information for Red Hat 8.
Note that the default “GA” Lightbits builds are based on CentOS and the “RHEL” builds are based on Red Hat. The “GA” builds install a Lightbits-modified kernel, whereas the “RHEL” builds are installed on top of the Red Hat kernel.
Make sure that all of the Lightbits servers are on the same Red Hat distribution and kernel. Also ensure that the specific kernel is set as the default kernel via grubby --default-kernel.
Install the Lightbits release that matches the kernel and distribution of the servers. For additional information on supported
Lightbits operating systems and kernels, see General System Requirements.
Before the Ansible installation:

Note: If the latest Lightbits release supports a kernel or distribution newer than your OS, upgrade your OS and
kernel to match the supported OS and kernel before continuing with the Lightbits installation.

1. (Applies only to Red Hat) Make sure the Red Hat subscription manager is registered and a subscription is attached (see the sketch after this list).
2. Edit the hosts file with the required target details. Consult Lightbits Support for the Red Hat repository baseurl value.
3. To ensure that the kernel does not get overwritten by another kernel, add use_lightos_kernel: false to group_vars/all.yml.
4. From Red Hat 8 based releases and onward, Chrony replaced NTP as the default network time protocol. Edit all.yml to ensure that NTP is not installed and that Chrony is configured: comment out the NTP sections and set the following NTP variables to false.


ntp_enabled: false
chrony_enabled: true
ntp_manage_config: false
use_lightos_kernel: false
# ntp_servers:
#   - "0{{ '.' + ntp_area if ntp_area else '' }}.pool.ntp.org iburst"
#   - "1{{ '.' + ntp_area if ntp_area else '' }}.pool.ntp.org iburst"
#   - "2{{ '.' + ntp_area if ntp_area else '' }}.pool.ntp.org iburst"
#   - "3{{ '.' + ntp_area if ntp_area else '' }}.pool.ntp.org iburst"

# ntp_version: "ntp-4.2.6p5-29.el7.centos.x86_64"

# ntp_packages:
#   - "autogen-libopts*.rpm"
#   - "ntpdate*.rpm"
#   - "ntp*.rpm"

5. Install the Lightbits software as described in the Lightbits Cluster Software Installation Process section.
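For step 1, the following is a minimal sketch of registering and attaching a Red Hat subscription with subscription-manager (the credentials are placeholders; your organization may use activation keys or a different registration workflow):

subscription-manager register --username <rh-user> --password <rh-pass>
subscription-manager attach --auto
subscription-manager status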

Configuring Global Variables in Ansible


Review the configuration of the global variables in the Ansible file all.yml, located at:
~/light-app/ansible/inventories/cluster_example/group_vars/all.yml

If all of the machines in the cluster have PMEM (NVDIMM or Intel Optane) installed, the persistent_memory flag must be set as follows:
persistent_memory: true

If any machines in the cluster do not have PMEM installed, set this flag to false.

Note: Because the persistent_memory flag is a global property for the whole cluster, declare it only once, in the all.yml file, and not in host_vars files with different values.

IP ACL provides restricted or non-restricted access to a cluster. This feature must be enabled during installation by setting the enable_iptables flag; otherwise it cannot be used.
When the enable_iptables flag is set to true, access to the cluster nodes is allowed only from client IPs that are defined per volume using each volume’s ip_acl setting. By default, the flag is set to false. To use this mode, add the following to all.yml:
enable_iptables: true

NTP vs Chrony
Check whether your OS prefers NTP or Chrony, and proceed with that option.

NTP:
For NTP configurations, the default settings in the all.yml file can be used (note that the defaults are commented out). You can uncomment and edit the preferred NTP servers, the NTP version, and its dependency packages. For more information on these parameters, see Network Time Protocol Configuration.


# nvme_subsystem_nqn_suffix: "some_suffix"

# ntp_enabled: true
# chrony_enabled: false
# ntp_manage_config: true
# ntp_servers:
#   - "0{{ '.' + ntp_area if ntp_area else '' }}.pool.ntp.org iburst"
#   - "1{{ '.' + ntp_area if ntp_area else '' }}.pool.ntp.org iburst"
#   - "2{{ '.' + ntp_area if ntp_area else '' }}.pool.ntp.org iburst"
#   - "3{{ '.' + ntp_area if ntp_area else '' }}.pool.ntp.org iburst"
# ntp_version: "ntp-4.2.6p5-29.el7.centos.x86_64"
# ntp_packages:
#   - "autogen-libopts*.rpm"
#   - "ntpdate*.rpm"
#   - "ntp*.rpm"

Note: When configuring to use NTP, make sure chrony_enabled is either commented out like above, or set to
false.

Chrony:
For Chrony configurations - which are the default for Red Hat 8 based releases and onward - configure the all.yml settings
as follows:
Disable NTP and enable Chrony. Additionally, make sure that the Chrony service is configured on your servers.
ntp_enabled: false
chrony_enabled: true
ntp_manage_config: false
use_lightos_kernel: false
# ntp_servers:
#   - "0{{ '.' + ntp_area if ntp_area else '' }}.pool.ntp.org iburst"
#   - "1{{ '.' + ntp_area if ntp_area else '' }}.pool.ntp.org iburst"
#   - "2{{ '.' + ntp_area if ntp_area else '' }}.pool.ntp.org iburst"
#   - "3{{ '.' + ntp_area if ntp_area else '' }}.pool.ntp.org iburst"
# ntp_version: "ntp-4.2.6p5-29.el7.centos.x86_64"
# ntp_packages:
#   - "autogen-libopts*.rpm"
#   - "ntpdate*.rpm"
#   - "ntp*.rpm"

Note: Verify that the date and time are in sync on the Lightbits storage servers and the Ansible installation host. For example, you can run date simultaneously on all servers and compare the output.
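A minimal sketch of such a check, using the same Ansible ad-hoc pattern as the connectivity tests below:

$ cd ~/light-app
$ ansible all -i ansible/inventories/cluster_example/hosts -m command -a date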

Use Lightos Kernel


If a Red Hat based release is used, add the following to all.yml:
use_lightos_kernel: false

Setting use_lightos_kernel to false ensures that the kernel already on the servers remains as is.
If a GA based release is used, remove that line or set it to true:


use_lightos_kernel: true

Setting use_lightos_kernel to true will install the Lightbits supplied kernel, which is a requirement for the Lightbits GA
releases.

Single IP Dual NUMA Configuration


The Single IP dual NUMA configuration (see Single-IP-Dual-NUMA Configuration) requires changes to the all.yml file. Follow the directions in that section to properly configure this setup.

Verifying Hosts Connection


Verify connectivity from the Ansible Installation Host to all of the machines where Lightbits will be installed. First verify
that ping works. Then verify that SSH connectivity works using the normal user and then using a privileged (become)
user.

Testing Connectivity via Ping


Use the ping command to verify that all machines in the cluster respond. Enter the following Ansible shell command:
$ cd ~/light-app
$ ansible -i ansible/inventories/cluster_example/hosts all -m ping

A successful response from this Ansible ping is as follows:


server02 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
server00 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
server01 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}

Note: If you see an error related to ssh-key authentication, see Troubleshooting.

Testing Connectivity via SSH


This command tests a connection using the user configured with the ansible_ssh_user field in the “hosts” file.


$ cd ~/light-app
$ ansible all -i ansible/inventories/cluster_example/hosts -m command -a id

The expected output for each machine is its “id” output, which should return the username and groups.
The following is an example of a good output:
server00 | CHANGED | rc=0 >>
uid=0(root) gid=0(root) groups=0(root)
server01 | CHANGED | rc=0 >>
uid=0(root) gid=0(root) groups=0(root)
server02 | CHANGED | rc=0 >>
uid=0(root) gid=0(root) groups=0(root)

Any non-error output is a good output. If there are connection issues, verify that SSH is properly set up and that ansible_ssh_user and ansible_ssh_pass are properly configured.
After the above command succeeds, test that the privileged user can access the machines over SSH. Note that if ansible_ssh_user is root, you can skip this final verification.
$ cd ~/light-app
$ ansible all -i ansible/inventories/cluster_example/hosts -m command -a id -b

Notes: - The last test uses the ansible_become_user from the hosts file, which is usually root. It connects to each machine via ansible_ssh_user and ansible_ssh_pass, and then raises the privilege to ansible_become_user using the sudo password, which is configured via ansible_become_pass.
- If key-based authentication is used instead and you get a connectivity error, make sure that it is properly configured.

Defining Configuration Files for Each “Ansible Host” (Server) in the Cluster
Return to the ~/light-app/ansible/inventories/cluster_example directory you created in Inventory Structure and Adding the Ansible Hosts File.
~/light-app/ansible/inventories/
|-- cluster_example
    |-- group_vars
    |   |-- all.yml
    |-- hosts
    |-- host_vars
        |-- client00.yml   <- This file can be ignored or deleted.
        |-- server00.yml
        |-- server01.yml
        |-- server02.yml

From this path we will edit each of the yml files found in the ~/light-app/ansible/inventories/cluster_example/host_vars
subdirectory. In our example cluster, we have three Lightbits storage nodes that are defined by the files:
• host_vars/server00.yml
• host_vars/server01.yml
• host_vars/server02.yml


1. In each of the host variable files, update the following required variables:
Required Variables for the Host Variable File

• name: The cluster server’s name, for example serverXX. Must match the file name (without the extension) and the server names configured in the “hosts” file.
• instanceID: The configuration parameters for the logical node in this server. Currently, Lightbits supports up to two logical nodes per server.
• ec_enabled: (Per logical node) Enables Erasure Coding (EC), which protects against SSD failure within the storage server by preventing IO interruption. Normal operation continues during reconstruction when a drive is removed.
• failure_domains: (Per logical node) The servers sharing a network, power supply, or physical location that are negatively affected together when network, power, cooling, or another critical service experiences problems. Different copies of the data are stored in different FDs to keep data protected from various failures. To specify the servers in the FD, you must add the server names. For further information, see Defining Failure Domains.
• data_ip: (Per logical node) The data IP used to connect to other servers. Can be IPv4 or IPv6.
• storageDeviceLayout: (Per logical node) Sets the SSD configuration for a node. This includes the number of initial SSD devices, the maximum number of SSDs allowed, allowance for NUMA across devices, and memory partitioning and total capacity. For further information, see Setting the SSD Configuration.
• initialDeviceCount: The number of NVMe drives this instance will initially use.
• maxDeviceCount: The maximum number of NVMe drives the instance can support. Commonly configured equal to initialDeviceCount or higher.
• allowCrossNumaDevices: Leave this set to “false” if all of the NVMe drives for this instance are in the same NUMA node. Set it to “true” if this instanceID needs cross-NUMA communication to access its NVMe drives.
• deviceMatchers: Determines which NVMe drives will be considered for data and which will be ignored. For example, if the OS drive is an NVMe drive, it can be ignored using the name option. The default settings work well, counting only NVMe drives larger than 300 GiB and without partitions as data drives.

To update these parameters, the cluster details table is useful.


Installation Planning Table Sample

Note: The following is an example for three Lightbits servers in a cluster with a single client.


Server Name   Role                         Management Network IP   Data NIC Interface Name   Data NIC IP    NVMe Drives
server00      Lightbits Storage Server 1   192.168.16.22           ens1                      10.10.10.100   6
server01      Lightbits Storage Server 2   192.168.16.92           ens1                      10.10.10.101   6
server02      Lightbits Storage Server 3   192.168.16.32           ens1                      10.10.10.102   6
client00      client                       192.168.16.45           ens1                      10.10.10.103   N/A

Examples for the three host variable files follow.


server00.yml
name: server00
nodes:
- instanceID: 0
  data_ip: 10.10.10.100
  failure_domains:
  - server00
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 6
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"

server01.yml
name: server01
nodes:
- instanceID: 0
  data_ip: 10.10.10.101
  failure_domains:
  - server01
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 6
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"

server02.yml


name: server02
nodes:
- instanceID: 0
  data_ip: 10.10.10.102
  failure_domains:
  - server02
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 6
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"

Notes: - See Host Configuration File Variables for the entire list of variables available for the host variable files.
- You can also reference additional host configuration file examples.
- Typically the servers should already be configured with the data_ip. However, the Ansible playbook can configure the data NIC IP; for that you will need to add a data_ifaces section with the data interface name. For further information, see Configuring the Data Network. Section 4.4.9 also shows an example of this configuration.
- If you need to create a separate partition for etcd data on the boot device, see etcd Partitioning.
- Based on the placement of SSDs in the server, check if you need to make a change in the client profile to permit cross-NUMA devices.
- Starting from version 3.1.1, the data IP can be IPv6. For example: data_ip: 2600:80b:210:440:ac0:ebff:fe8b:ebc0

Defining Failure Domains


A Failure Domain (FD) encompasses a section of a network, a power supply, or a physical location whose servers are negatively affected together when network, power, cooling, or another critical service experiences problems. Different copies of the data are stored in different FDs to keep data protected from various failures.
To specify the servers in an FD, add items to the failure_domains array in the server configuration files. Consider the server00.yml and server01.yml configurations below.
The server00 failure_domains array is configured with its own server name and the rack it is placed in, “rack00”.


name: server00
data_ifaces:
- bootproto: static
  conn_name: ens1
  ifname: ens1
  ip4: 10.10.10.100/24
nodes:
- instanceID: 0
  data_ip: 10.10.10.100
  failure_domains:
  - server00
  - rack00
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 6
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"

The server01 failure_domains array is configured with its own server name and the rack it is placed in, “rack00”.
name: server01
data_ifaces:
- bootproto: static
  conn_name: ens1
  ifname: ens1
  ip4: 10.10.10.101/24
nodes:
- instanceID: 0
  data_ip: 10.10.10.101
  failure_domains:
  - server01
  - rack00
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 6
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"

Note the items in both the server00 and server01 failure_domains arrays.
Since both servers share the same “rack00”, volume replicas will not be shared between these two servers (and their nodes).
If the lists were left at their defaults, volume replicas could be shared between the servers. Default means the server00 failure_domains array only had “server00”, and the server01 failure_domains array only had “server01”.

Notes: - At a minimum (and as a good default configuration), configure failure_domains with the server names. Add other items, such as “rack00” above, to help control how volume replicas are placed.
- In a dual instance/node setup, volume replicas will not land on other nodes of the same server.
- See Host Configuration File Variables for the entire list of variables available for the host variable files.

Notes: - The configurations above include a “data_ifaces” section for each server configuration. Typically this section is not included, because the servers should be preconfigured with their data IPs. However, Ansible can be instructed to configure the data IPs during the Lightbits installation; the “data_ifaces” section tells Ansible to configure the IP and subnet on the specified interface.
- For IPv6 addresses, use the ‘ip6: ip/prefix’ format. For example: ip6: 2001:0db8:0:f101::1/64.
- The addresses used for the ip4 or ip6 fields must match the address used in data_ip. The only difference is that ip4 and ip6 also include the subnet or prefix, while data_ip contains only the address.

Setting the SSD Configuration


To allow for future storage expansion, during the initial Lightbits configuration process set the Maximum Device Count to the total number of drive slots physically available in the Lightbits node.
Setting the Maximum Device Count to the maximum number of drive slots allows you to start the Lightbits node with empty drive slots in the server chassis. This is useful when you initially need only a small amount of storage and plan to add more SSDs into the empty drive slots as demand increases.
For example, if your storage server chassis has 12 SSD slots but you initially only want to configure Lightbits to use eight drives, you need to (see the fragment after this list):
• Set your Maximum Device Count to 12.
• Physically install only eight drives.
• Leave four drive slots empty for later use.
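In host_vars terms, this scenario corresponds to a storageDeviceLayout fragment like the following sketch (only the storageDeviceLayout portion is shown; the other fields are as in the full examples in this guide):

storageDeviceLayout:
    initialDeviceCount: 8
    maxDeviceCount: 12
    allowCrossNumaDevices: false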

Note: If Erasure Coding is enabled (ec_enabled: true), you must have a minimum of six SSDs installed in that
node.

To specify the SSD configuration for a node, enter the total number of drive slots available for your Lightbits node in the host configuration file, as follows:


name: server00
data_ifaces:
- bootproto: static
  conn_name: ens1
  ifname: ens1
  ip4: 10.10.10.100/24
nodes:
- instanceID: 0
  data_ip: 10.10.10.100
  failure_domains:
  - server00
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 6
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"

Note: See Host Configuration File Variables for the entire list of variables available for the host variable files.

Confirming the Required Directory Structure


Before proceeding, change to the light-app directory. It is important to be in this directory when running the ansible-playbook in the next section, because Ansible depends on some files and directories being in certain places.
$ cd ~/light-app

Run ls and make sure you see the following files and folders.
ansible
ansible .cfg
playbooks
plugins
roles

Additionally, confirm that the structure of your ansible directory is similar to this:


$ tree ansible
ansible
└── inventories
    └── cluster_example
        ├── group_vars
        │   └── all.yml
        ├── host_vars
        │   ├── client00.yml
        │   ├── server00.yml
        │   ├── server01.yml
        │   └── server02.yml
        └── hosts

Running the Ansible Installation Playbook to Install Lightbits Cluster Software


Lightbits Cluster Installation Process

#   Installation Steps
1   Connecting your installation workstation to Lightbits’ software repository
2   Verifying the network connectivity of the servers used in the cluster
3   Setting up an Ansible environment on your installation workstation
4   Installing a Lightbits cluster by running the Ansible installation playbook
5   Updating clients (if required)
6   Provisioning storage, connecting clients, and performing IO tests

As discussed in Prepare Installation Workstation (Ansible Controller), we support installing using Ansible directly or using a prebuilt Docker image that contains Ansible. Pick the method that applies to your installation environment and follow the commands to install the Lightbits cluster software on the storage servers. Afterwards, go to the bottom of the section to confirm a successful installation.

Note: For both methods, we show the simple default installation. More advanced installation configuration examples are provided in the Ansible Docker section; those same examples can be adapted to the plain Ansible method by skipping the Docker commands and using the ansible-playbook commands as the template.

Ansible Installation Method


Running the Ansible Controller

Note: The Ansible playbook operations below can take several minutes. The output will report the status of all
the tasks that succeeded/failed on the nodes.

To install the cluster software and configure the cluster, change into the light-app directory with cd ~/light-app, and enter the following command to run the playbook:
ansible-playbook -i ansible/inventories/cluster_example/hosts playbooks/deploy-lightos.yml -vvv


Notes: - This command must be run from the directory where light-app was extracted, so that all of the paths work as displayed.
- The inventory file points to a “hosts” file, which instructs Ansible where to deploy Lightbits.
- The selected playbook, “deploy-lightos.yml”, instructs Ansible on how to install and configure the Lightbits cluster on the servers listed in the “hosts” file.
- Ansible logs to its default path as specified by ansible.cfg; by default that is /var/log/ansible.log. The log path can be changed by prefixing the command with ANSIBLE_LOG_PATH=<path>, as in ANSIBLE_LOG_PATH=/var/log/ansible.log ansible-playbook ...
- The following files will be created in the home directory: lightos-system-jwt and lightos-default-admin-jwt.
- Certificates used by the cluster will be saved into a new directory, lightos-certificates, created in the home directory.
- It is recommended to make a secure backup of this content, or at a minimum, the JWT files and the lightos-certificates directory.
- Debug-level verbosity is enabled with -vvv. It helps diagnose any issues if they happen.
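For example, a sketch of the same invocation with the log sent to a custom path (the path here is arbitrary):

ANSIBLE_LOG_PATH=/tmp/lightos-install.log ansible-playbook \
    -i ansible/inventories/cluster_example/hosts \
    playbooks/deploy-lightos.yml -vvv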

When the installation is done, the cluster will be bootstrapped with a system-scope project, and you will need access to the JWT. By default, the cluster-admin JWT is placed in ~/lightos-system-jwt on the Ansible host. This path can be changed by editing group_vars/all.yml before running the ansible-playbook and appending the variable: system_jwt_path: "{{ '~/lightos-system-jwt' | expanduser }}"

Prebuilt Ansible Docker Installation Method


If you are using the prebuilt Docker image with Ansible, ensure that Docker is logged in. Refer to the bottom of Prepare Installation Workstation (Ansible Controller) for instructions on how to log in using Docker.
The first subsection below shows how to launch a simple, default Lightbits installation. The subsections after it show other variations of the installation that may apply to more complex configurations.
Additionally, each Docker example requires the correct Docker URL docker.lightbitslabs.com/lightos-3-<Minor Ver>-x-ga/lb-ansible:4.2.0. Note that this path is incomplete and requires substitution. Refer to the Lightbits Installation Customer Addendum for the correct Docker image URL.
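As a quick sanity check, you can pull the image ahead of time; this is a sketch, and you must substitute the exact image URL from your Customer Addendum:

docker pull docker.lightbitslabs.com/lightos-3-<Minor Ver>-x-ga/lb-ansible:4.2.0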

Running Using the lb-ansible Docker Image


mkdir -p /opt/lightos-certificates
cd ~/light-app
docker run -it --rm --net=host \
    -v /opt/lightos-certificates:/lightos-certificates \
    -v `pwd`:/ansible \
    -w /ansible \
    -e ANSIBLE_LOG_PATH=/ansible/ansible.log \
    docker.lightbitslabs.com/lightos-3-<Minor Ver>-x-ga/lb-ansible:4.2.0 \
    sh -c 'ansible-playbook \
        -e system_jwt_path=/ansible/lightos_jwt \
        -e lightos_default_admin_jwt=/ansible/lightos_default_admin_jwt \
        -e certificates_directory=/lightos-certificates \
        -i ansible/inventories/cluster_example/hosts \
        playbooks/deploy-lightos.yml -vvv'

We will pre-create the /opt/lightos-certificates directory, so that our certificates get saved outside of the container.
Command breakdown:
• Mount host’s /opt/lightos-certificates to docker’s /lightos-certificates to store generated certificates on the host.
Docker will create the /opt/lightos-certificates directory on the host if it is missing.
• Mount the current working directory or $PWD to /ansible inside the container, to have access to the playbook and
roles. The current working directory at this point will be where light-app was extracted.
• Set the WORKDIR to /ansible inside the container. This sets the current working directory within docker to
/ansible.


• Configure Ansible to write logs to /ansible/ansible.log.


• Run the playbook with specified hosts from an inventory folder in the ansible/inventories/cluster_example.
• Set system_jwt_path to be placed at $PWD/lightos_jwt after the container is closed.
• Set lightos_default_admin_jwt to be placed at $PWD/lightos_default_admin_jwt after the container is closed.

Note: For information on installing Red Hat, see Red Hat Linux Installation.

Custom Inventory Folder


If the inventory folder is placed in a different location, such as: /path/to/inventory, you can mount this path as well
and use it:
mkdir -p /opt/lightos-certificates
cd ~/light-app
docker run -it --rm --net=host \
    -v /opt/lightos-certificates:/lightos-certificates \
    -v `pwd`:/ansible \
    -v /path/to/inventory:/inventory \
    -w /ansible \
    -e ANSIBLE_LOG_PATH=/ansible/ansible.log \
    docker.lightbitslabs.com/lightos-3-<Minor Ver>-x-ga/lb-ansible:4.2.0 \
    sh -c 'ansible-playbook \
        -e system_jwt_path=/ansible/lightos_jwt \
        -e lightos_default_admin_jwt=/ansible/lightos_default_admin_jwt \
        -e certificates_directory=/lightos-certificates \
        -i /inventory/hosts \
        playbooks/deploy-lightos.yml -vvv'

Note: For information on installing Red Hat, see Red Hat Linux Installation.

Using SSH-Keys Present On Ansible-Controller Host


If you use SSH keys present on your ansible-controller machine, and you have copied these keys to the authorized_keys file on the target hosts, you will want to use the key inside the container.
The following example shows how to mount the ~/.ssh folder so that Ansible running inside the container can use it.
mkdir -p /opt/lightos-certificates
cd ~/light-app
docker run -it --rm --net=host \
    -v /opt/lightos-certificates:/lightos-certificates \
    -v `pwd`:/ansible \
    -v ${HOME}/.ssh:${HOME}/.ssh \
    -w /ansible \
    -e ANSIBLE_LOG_PATH=/ansible/ansible.log \
    docker.lightbitslabs.com/lightos-3-<Minor Ver>-x-ga/lb-ansible:4.2.0 \
    sh -c 'ansible-playbook \
        -e system_jwt_path=/ansible/lightos_jwt \
        -e lightos_default_admin_jwt=/ansible/lightos_default_admin_jwt \
        -e certificates_directory=/lightos-certificates \
        -i ansible/inventories/cluster_example/hosts \
        playbooks/deploy-lightos.yml -vvv'


Verify Successful Installation


When the installation completes with no errors, you will see an output similar to the following:
PLAY RECAP *************************************************************************
server00 : ok=68 changed=19 unreachable=0 failed=0 skipped=34 rescued=0 ignored=0
server01 : ok=67 changed=18 unreachable=0 failed=0 skipped=33 rescued=0 ignored=0
server02 : ok=67 changed=18 unreachable=0 failed=0 skipped=33 rescued=0 ignored=0

Notes: - The “failed=0” indicates that the installation finished without errors.
- If the installation process failed, see Recovering from Cluster Installation Failure.

The installation flow is now complete, and you can move on to the client configuration sections of the Installation Guide.

Note: You should also make sure you back up your installation files properly. For more, see Lightbits Software
Installation Planning.

Post Installation Steps


After a successful Lightbits cluster installation, perform the following steps.

Back Up Important Content


Back up the following contents to a secure location. This will be useful if another node is added in the future or other
troubleshooting is required.
1. Back up the ~/light-app directory and all of its contents. The contents of the Ansible directory will be helpful in
the future if there is a need to add servers or check over how a previous installation was done.
2. Back up the generated JWTs. Back up the system-jwt and the default project jwt. By default they are placed in
the home directory: ~/lightos-system-jwt and ~/lightos-default-admin-jwt.
3. Back up the generated certificates. By default these are placed in ~/lightos-certificates.
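For example, a minimal sketch that bundles these default paths into a single archive (adjust the paths if you changed any of the defaults):

tar -czvf ~/lightbits-install-backup-$(date +%F).tgz \
    ~/light-app \
    ~/lightos-system-jwt \
    ~/lightos-default-admin-jwt \
    ~/lightos-certificates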

Check Cluster Health


1. Copy the contents of the system jwt file ~/lightos-system-jwt into the clipboard.
From the Ansible host, run cat ~/lightos-system-jwt; echo;. Note that we add the last echo to generate a new line
at the end, so that it is easy to determine where the JWT ends.
The contents will be similar to this:
export LIGHTOS_JWT=eyJhbGciOiJSUzI1Ni<...CONTENTS OF JWT...>5PcYPBRBaFEuMsT9gQNQA

Note: The contents of a JWT are long and all are on a single line.

2. Log in to any Lightbits server and paste the contents into the shell. The JWT will now be available via the
$LIGHTOS_JWT environment variable.
3. Check the state of the servers, nodes, and cluster.
The servers are healthy, as they all report “NoRiskOfServiceLoss”:


server00:~# lbcli -J $LIGHTOS_JWT list servers
NAME       UUID                                   State     RiskOfServiceLoss State   LightOSVersion
server00   5c2b7375-64fa-583e-8ebe-e82b8e1e1a53   Enabled   NoRiskOfServiceLoss       3.1.2~b1125
server01   bb5433d2-9740-5130-a5eb-47f623c19b4d   Enabled   NoRiskOfServiceLoss       3.1.2~b1125
server02   a9097bf0-6005-5dca-8f07-eeaca114ec70   Enabled   NoRiskOfServiceLoss       3.1.2~b1125

The nodes also look healthy, as they are all in the “Active” state:

server00:~# lbcli -J $LIGHTOS_JWT list nodes
Name         UUID                                   State    NVMe endpoint       Failure domains   Local rebuild progress
server00-0   d2c336f2-c9e5-5a4c-951e-ede739d10774   Active   10.10.10.100:4420   [server00]        None
server01-0   810ed593-a97b-5495-b08a-be5e37a65f82   Active   10.10.10.101:4420   [server01]        None
server02-0   2cf6ce67-fe5e-5a98-86bf-98a5792a8916   Active   10.10.10.102:4420   [server02]        None

Also check that the cluster health state is ok:


server00:~# lbcli -J $LIGHTOS_JWT get cluster -o yaml

ETag: "0"
MaxAllowedVersion: 3.2.X
MinAllowedVersion: 3.1.X
MinVersionInCluster: 3.1.2~b1125
UUID: c2719be6-00b8-4e96-b80a-6a84ecc3e638
apiEndpoints:
- 10.10.10.100:443
- 192.168.16.22:443
- 10.10.10.101:443
- 192.168.16.92:443
- 10.10.10.102:443
- 192.168.16.32:443
clusterName: c2719be6-00b8-4e96-b80a-6a84ecc3e638
currentMaxReplicas: 3
discoveryEndpoints:
- 10.10.10.100:8009
- 10.10.10.101:8009
- 10.10.10.102:8009
health:
  numDegradedVolumes: 0
  numInactiveNodes: 0
  numNotAvailableVolumes: 0
  numReadOnlyVolumes: 0
  state: OK   <------------------------------------------ health state is OK
statistics:
  compressionRatio: 1
  effectivePhysicalStorage: "52311333273"
  estimatedFreeLogicalStorage: "41226327633"
  estimatedLogicalStorage: "51963745873"
  freePhysicalStorage: "41490028953"
  installedPhysicalStorage: "94489280512"
  logicalStorage: "16106127360"
  logicalUsedStorage: "10737418240"
  managedPhysicalStorage: "60129542144"
  physicalUsedStorage: "10821304320"
  physicalUsedStorageIncludingParity: "10821304320"
subsystemNQN: nqn.2016-01.com.lightbitslabs:uuid:2304a078-e5d0-40e8-94d6-3656d24d1337:suffix
supportedMaxReplicas: 3

At this point the cluster’s health has been confirmed at the node, server, and cluster level.


Linux Cluster Client Software Installation


To connect to the Lightbits storage server, the cluster client software requires the appropriate NVMe over TCP kernel module and application support. The client should support NVMe/TCP with an ANA (Asymmetric Namespace Access) enabled kernel, and NVMe multipath must also be enabled on the clients.
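A quick sketch of checks you can run on a client to see where it stands; these assume the nvme-tcp module is available and that nvme_core is loaded (the sections below cover installing a suitable kernel, loading the module, and enabling multipath):

$ uname -r
$ modinfo nvme_tcp | head -n 3
$ cat /sys/module/nvme_core/parameters/multipath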

Connecting to the Cluster Client DEB Repository


Prepare your Debian/Ubuntu based client with the Lightbits repository, using these steps.

Note: The Linux Repo File Customer TOKEN section in your Lightbits Installation Customer Addendum has
the TOKEN that is required to install the Lightbits NVME-Client-DEBs.

1. Run the following commands:


apt-get install -y debian-keyring
apt-get install -y debian-archive-keyring
apt-get install -y apt-transport-https
apt-get install curl

2. Add the apt key:


curl -1sLf 'https://dl.lightbitslabs.com/<TOKEN>/lightos-3-<Minor Ver>-x-ga/cfg/gpg/gpg.<KEY>.key' | apt-key add -

Note: All of the required parameters for the curl command above will be in your Lightbits Installation
Customer Addendum. This includes the TOKEN, path, and GPG KEY fingerprint.

3. Create the lightos repo:


curl -1sLf 'https://dl.lightbitslabs.com/<TOKEN>/lightos-3-<Minor Ver>-x-ga/cfg/setup/config.deb.txt?distro=ubuntu&codename=xenial' > /etc/apt/sources.list.d/lightos.list

Notes: - Token and path are provided via the Customer Addendum.
- Replace “xenial” in the URL with the correct codename of your Ubuntu OS.
- Run lsb_release -a to verify your codename.

4. Edit the repository file to point at the correct GPG key, or force trust.
Correct GPG Key:
By default, the created lightos.list repo file points to an incorrect path for the GPG key, so running apt-get update at this point will fail.
First, confirm the correct GPG key path by running apt-key list.
Locate the Lightbits key. It should sit in the /etc/apt/trusted.gpg file.
Edit the repo file /etc/apt/sources.list.d/lightos.list and replace [signed-by=/path/to/key] with the correct path, [signed-by=/etc/apt/trusted.gpg].


$ cat /etc/apt/sources.list.d/lightos.list

deb [signed-by=/etc/apt/trusted.gpg] https://dl.lightbitslabs.com/<TOKEN>/lightos-3-<Minor Ver>-x-ga/deb/ubuntu xenial main
deb-src [signed-by=/etc/apt/trusted.gpg] https://dl.lightbitslabs.com/<TOKEN>/lightos-3-<Minor Ver>-x-ga/deb/ubuntu xenial main

Force Trust:
If you want to bypass the GPG verification, edit the /etc/apt/sources.list.d/lightos.list file and replace [signed-by=/path/to/key] with [trusted=yes] after the deb and deb-src parts:
$ cat /etc/apt/sources.list.d/lightos.list

deb [trusted=yes] https://dl.lightbitslabs.com/<TOKEN>/lightos-3-<Minor Ver>-x-ga/deb/ubuntu xenial main

deb-src [trusted=yes] https://dl.lightbitslabs.com/<TOKEN>/lightos-3-<Minor Ver>-x-ga/deb/ubuntu xenial main

5. Run the apt-get update command:

apt-get update

Connecting to the Cluster Client RPM Repository


Follow the steps in Connecting to the Lightbits Software Repository to prepare the Lightbits repository on your RPM-
based client. Make sure to have the TOKEN ready, which is provided in your Lightbits Installation Customer Addendum.
This is required to install the Lightbits NVME-Client-RPMs.

Note: You can copy the repo file content from an installed Lightbits server’s file, /etc/yum.repos.d/lightos.repo.
It will have the correct token and baseurl.

Note: An optional Ansible playbook is available to you that performs the following:
- Installs kernel v5.x, which includes the nvme-tcp upstream driver.
- Creates a small 4GB volume with a replication factor of 2.
- Runs the nvme connect command to connect the client machine to the cluster volume.
- Runs an fio read/write workload for 30 seconds.
- Performs a cleanup that disconnects the nvme client and removes the volume.

For more information about using this optional playbook, see Automated Client Connectivity Verification.

Installing the New Kernel on CentOS

Notes: - Before proceeding with the installation, you must have the GNU Wget software installed. You can download it at https://www.gnu.org/software/wget/
- You can use any kernel version v5.3.5 or above; the following instructions use such a kernel.
- The instructions below are only for CentOS 7.9. Updating the kernel varies across OSs; verify the upgrade procedure with the official documentation for your OS.

To install the latest kernel on CentOS 7.9, perform the following steps.
1. Update the yum repo:


yum update
2. Install elrepo for CentOS 7.9:
yum install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org

3. Install the kernel:


yum --enablerepo=elrepo-kernel -y install kernel-ml
Configuring the Client to Boot from the New Kernel details how to ensure that the kernel is set to default, so that it
boots up on reboot.

Configuring the Client to Boot from the New Kernel


You must configure the client to boot from the new kernel that you just installed.
1. Identify the installed kernel from Installing the New Kernel on CentOS. In this example we assume it is kernel-ml-5.4.11-1.el7.elrepo.x86_64; in your case, the kernel will be a newer version with a higher number.
2. Find the new kernel grub entry with the following command.
$ grubby --info=ALL

3. Identify the new kernel index in the output list of the command above. In the following example, the new kernel
has an index value of 0 because it is at the top of the list of available kernels.
index=0
kernel=/boot/kernel-ml-5.4.11-1.el7.elrepo.x86_64
args="ro crashkernel=auto rd.lvm.lv=CentOS_rack05-server67/root rd.lvm.lv=CentOS_rack05-server67/swap rhgb quiet LANG=en_US.UTF-8"
root=/dev/mapper/CentOS_rack05--server67-root
initrd=/boot/initramfs-kernel-ml-5.4.11-1.el7.elrepo.x86_64.img
title=CentOS Linux (kernel-ml-5.4.11-1.el7.elrepo.x86_64) 7 (Core)
index=1
kernel=/boot/vmlinuz-3.10.0-957.el7.x86_64
args="ro crashkernel=auto rd.lvm.lv=CentOS_rack05-server67/root rd.lvm.lv=CentOS_rack05-server67/swap rhgb quiet LANG=en_US.UTF-8"
root=/dev/mapper/CentOS_rack05--server67-root
initrd=/boot/initramfs-3.10.0-957.el7.x86_64.img
title=CentOS Linux (3.10.0-957.el7.x86_64) 7 (Core)
index=2
kernel=/boot/vmlinuz-0-rescue-9758554168974f5dbe0d6dac5a6ac621
args="ro crashkernel=auto rd.lvm.lv=CentOS_rack05-server67/root rd.lvm.lv=CentOS_rack05-server67/swap rhgb quiet"
root=/dev/mapper/CentOS_rack05--server67-root
initrd=/boot/initramfs-0-rescue-9758554168974f5dbe0d6dac5a6ac621.img
title=CentOS Linux (0-rescue-9758554168974f5dbe0d6dac5a6ac621) 7 (Core)
index=3
non linux entry

4. Use the following command to set the default kernel index value.
In this example, the new kernel grub entry index value number is 0. So we set the default index to 0. This will make
the OS boot off of this kernel on the next boot.


$ grubby --set-default-index 0

5. Verify the correct kernel version is set.


$ grubby --default-kernel
/boot/kernel-ml-5.4.11-1.el7.elrepo.x86_64

6. Reboot the system to load the Lightbits kernel.


$ shutdown -r now

7. After the client reboots, log in and verify that the client is now running the new kernel using the Linux command uname -r.
For example:
$ uname -r
5.4.11-1.el7.elrepo.x86_64

Installing the Lightbits NVMe Command Line Interface


The NVMe command line interface (CLI) is a standard command line interface to run NVMe over fabrics commands
from the client. Lightbits provides a customized NVMe CLI for Lightbits that will be available in future versions of the
public/upstream NVMe CLI version.
The nvme CLI program is provided by the nvme-cli package. Typically this is already installed; however, if it is missing
or you are unsure, you can install using apt-get install nvme-cli.

How To Replace With Latest Version From Lightbits


1. Make sure the Lightbits repository is configured on the client. Refer to Connecting to the Cluster Client DEB
Repository.
2. (Optional) If a public NVMe CLI version is installed on your system, you can replace it with the NVMe CLI version
supplied by Lightbits. Before installing the supplied NVMe CLI from the Lightbits repository, you’ll need to remove
the public NVMe CLI from your system.
To check if you have an NVMe CLI package installed, enter the following in the system’s command shell:
$ apt list --installed | grep nvme-cli

WARNING: apt does not have a stable CLI interface. Use it with caution in scripts.
3. (Optional) If the command shows an installed nvme-cli package, delete it from your system with the following command:
$ apt-get remove nvme-cli

4. With the public NVMe CLI version deleted from the system, you can install the NVMe CLI from the Lightbits repository by entering the following in the system’s command shell:


$ apt-get install nvme-cli

5. Enter the following command to verify that the NVMe CLI version is v1.9.1.
$ apt list --installed | grep nvme-cli

WARNING: apt does not have a stable CLI interface. Use it with caution in scripts.

Installing the Lightbits NVMe Command Line Interface (Ubuntu)


The NVMe command line interface (CLI) is a standard command line interface to run NVMe over fabrics commands
from the client. Lightbits provides a customized NVMe CLI for Lightbits that will be available in future versions of the
public/upstream NVMe CLI version.

Note: These instructions will work on any Lightbits client’s side deb that you want to install on your client.

1. (Optional) If a public NVMe CLI version is installed on your system, you can replace it with the NVMe CLI version supplied by Lightbits. Before installing the supplied NVMe CLI from the Lightbits repository, you will need to remove the public NVMe CLI from your system.
To check if you have an NVMe CLI package installed, enter the following in the system’s command shell:
$ apt list --installed | grep nvme-cli

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

nvme-cli/bionic-updates,now 1.5-1ubuntu1 amd64 [installed]

2. (Optional) If the command returns this value, you need to delete the NVMe CLI package from your system with
the following command:
$ apt-get remove nvme-cli

3. With the public NVMe CLI version deleted from the system, you can install the NVMe CLI from the Lightbits repository by entering the following in the system’s command shell:
$ apt-get install nvme-cli

4. Enter the following command to verify that the NVMe CLI version is v1.9-1.
$ apt list --installed | grep nvme-cli

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

nvme-cli/xenial,now 1.9~2.3.4-1~bionic amd64 [installed]

The output for this command can include additional package names with the nvme string.


Loading the NVMe/TCP Host Software and Enabling Multipath


For a client to connect to an NVMe over TCP volume on a Lightbits cluster, the nvme-tcp modules must be loaded. To
fully utilize the multipathing features of Lightbits, multipath must then be enabled on the nvme-core module.

Load the NVME TCP module


To use NVMe/TCP, you must load the NVMe host modules by entering the following command in the system’s command
shell.
$ modprobe nvme-tcp
$ lsmod | grep nvme

The output is similar to the following example:


nvme_tcp       24576  0
nvme_fabrics   20480  1 nvme_tcp
nvme_core      49152  4 nvme_fabrics,nvme_tcp

Make the setting persistent across boots by loading the module on boot:
$ echo nvme_tcp > /etc/modules-load.d/nvme_tcp.conf

Multipath
By default, multipath should be enabled with the nvme_core module.
However, you can run the following command to check:
$ grep -r "" /sys/ module / nvme_core / parameters

If you see /sys/modules/nvme_core/parameters/multipath:N, then multipath is not enabled.


Enable multipath using one of the following methods.

Enable Multipath Using Grubby


Figure out your currently loaded kernel with grubby:
$ grubby --default-kernel

The output shows the full path of the kernel in the format /boot/vmlinuz-...
Now configure the kernel boot arguments to enable multipath. Make sure to put the full path of the default kernel
into the command below:
$ grubby --args=nvme_core.multipath=Y --update-kernel=/boot/vmlinuz-...


Note: grubby is available out of the box on Red Hat and CentOS-based distributions. For distributions that do not have
grubby, use the next method.

Enable Multipath Using a Configuration File


First, create a configuration file that enables multipath:
echo "options nvme_core multipath=Y" > /etc/modprobe.d/50-nvme_core.conf

Then, update the initramfs, which the OS uses to load and configure modules on boot. Use the appropriate tool for the OS:
• On Red Hat/CentOS, run dracut -f.
• On Debian/Ubuntu systems, run update-initramfs -u.

Reboot
It is recommended to reboot the client to make sure that all of the settings are loaded properly. Make sure that the
nvme_tcp modules are loaded on boot and that multipath is enabled.
$ lsmod | grep nvme; grep -r "" /sys/module/nvme_core/parameters
nvme_tcp 24576 0
nvme_fabrics 20480 1 nvme_tcp
nvme_core 49152 4 nvme_fabrics,nvme_tcp
...
/sys/module/nvme_core/parameters/multipath:Y
...

Provisioning Storage and Connecting the Cluster Client to Lightbits


Lightbits Cluster Installation Process

1. Connecting your installation workstation to the Lightbits software repository
2. Verifying the network connectivity of the servers used in the cluster
3. Setting up an Ansible environment on your installation workstation
4. Installing a Lightbits cluster by running the Ansible installation playbook
5. Updating clients (if required)
6. Provisioning storage, connecting clients, and performing IO tests

With the Lightbits software installed and the Lightbits management service running, you can create a volume and connect
that volume to your application clients.

This section includes:

Creating a Volume on the Lightbits Storage Server


Connecting the Cluster Client to Lightbits

Creating a Volume on the Lightbits Storage Server


To create a volume on the cluster, log into any of the Lightbits cluster servers and enter the lbcli create volume command.


Sample Command
$ lbcli -J $LIGHTOS_JWT create volume --size="2 GiB" --name=vol1 --acl="acl3" --replica-count=3 --project-name=default

Note: By default, the LIGHTOS_JWT is generated during the Lightbits installation on the Ansible installation
host, and is saved to ~/lightos-system-jwt. See Post Installation Steps for an example of how to get
LIGHTOS_JWT.
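As a quick sketch (assuming the token was saved to the default ~/lightos-system-jwt path mentioned in the note above), you can load it into the shell variable used by the sample commands in this guide:

# Load the cluster JWT generated by the installation into the LIGHTOS_JWT variable
$ export LIGHTOS_JWT=$(cat ~/lightos-system-jwt)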

Sample Output
Name  UUID                                  State     Size     Replicas  ACL
vol1  76c3eae8-7ade-4394-82e5-056d05a92b5e  Creating  2.0 GiB  3         values:"acl3"

This example command creates a volume with 2 GiB of capacity, an Access Control List (ACL) string “acl3”, and a
replication factor of 3.

Note: Only clients that mention the ACL value of “acl3” during connect can connect to this volume. This is
detailed in Connecting the Cluster Client to Lightbits.

Connecting the Cluster Client to Lightbits


After creating a volume on the Lightbits storage server, log in to one or more of your application clients and use the
Lightbits NVMe CLI utility to make a connection to the Lightbits cluster.
Before you begin, enter a Linux ping command to check the TCP/IP connectivity between your application client and
the Lightbits storage servers. In the example below, the client has a data NIC connected to the 10.10.10.x network.
The IP 10.10.10.100 is the data NIC of one of the Lightbits storage servers.

Sample Command
$ ping -c 1 10.10.10.100

Sample Output
PING 10.10.10.100 (10.10.10.100) 56(84) bytes of data.
64 bytes from 10.10.10.100: icmp_seq=1 ttl=255 time=0.032 ms

--- 10.10.10.100 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.032/0.032/0.032/0.000 ms

This output indicates that this application client has a connection to the data NIC IP address on the Lightbits storage
server where volumes were created.
Repeat this ping check for the other Lightbits cluster servers: 10.10.10.101 and 10.10.10.102.
After you have checked the TCP/IP connectivity between your application client and the Lightbits storage servers, use
the nvme CLI utility to connect the application client via NVMe/TCP to the Lightbits storage server.
To use the nvme CLI utility on your application client, you will need the following details.


Required Lightbits Storage Cluster Connection Details

• Lightbits data NIC IP address (-a): The data NIC IP address of each Lightbits cluster node. These values can be
  retrieved from the Lightbits management server using the lbcli list nodes command.
• ACL string (-q): The ACL string you used when you created the volume on the Lightbits storage server node.
• Subsystem NQN (-n): The Lightbits cluster subsystem NQN value, which can be retrieved from the Lightbits
  management server using the lbcli get cluster command.
• TCP port (-s): The data TCP port for each of the Lightbits cluster nodes, which can be retrieved from the Lightbits
  management server using the lbcli list nodes command.

Enter the lbcli get cluster command on any Lightbits storage server to identify the subsystem NQN.

Sample Command
$ lbcli -J $LIGHTOS_JWT get cluster -o yaml

Note: By default, the LIGHTOS_JWT is generated during the Lightbits installation on the Ansible installation
host and is saved to ~/lightos-system-jwt. Post Installation Steps shows one way to get LIGHTOS_JWT.

Sample Output
UUID: 95a251b6-0885-4f5b-a0eb-90e90a2009a3
currentMaxReplicas: 3
...
subsystemNQN: nqn.2014-08.org.nvmexpress:NVMf:uuid:b5fe744a-b919-465a-953a-a8a0df7b9d31   <--- subsystem NQN
supportedMaxReplicas: 3

Enter the lbcli list nodes command to identify the NIC IP address and TCP port.

Sample Command
$ lbcli -J $LIGHTOS_JWT list nodes

Sample Output


Name        UUID                                  State   NVMe endpoint      Failure domains  Local rebuild progress
server00-0  08fdb3bd-925a-5e73-adde-8daf881969d3  Active  10.10.10.100:4420  [server00]       None
server01-0  112a555f-8168-5f07-a4e0-bf8f5b59c740  Active  10.10.10.101:4420  [server01]       None
server02-0  bc759c13-856d-5521-9ba2-752259abf8f0  Active  10.10.10.102:4420  [server02]       None

With the IP, port, subsystem NQN and ACL values for the volume, you can execute the nvme connect command to
connect to all of the nodes in the cluster.

Sample NVMe Connect Commands


$ nvme connect -t tcp -a 10.10.10.100 --ctrl-loss-tmo -1 -n \
  nqn.2014-08.org.nvmexpress:NVMf:uuid:b5fe744a-b919-465a-953a-a8a0df7b9d31 -s 4420 -q acl3
$ nvme connect -t tcp -a 10.10.10.101 --ctrl-loss-tmo -1 -n \
  nqn.2014-08.org.nvmexpress:NVMf:uuid:b5fe744a-b919-465a-953a-a8a0df7b9d31 -s 4420 -q acl3
$ nvme connect -t tcp -a 10.10.10.102 --ctrl-loss-tmo -1 -n \
  nqn.2014-08.org.nvmexpress:NVMf:uuid:b5fe744a-b919-465a-953a-a8a0df7b9d31 -s 4420 -q acl3

Notes: - We are using an ACL value/hostnqn of "acl3" so that we can connect to the volume created, as detailed
in Creating a Volume on the Lightbits Storage Server.
- Repeat this client procedure for each node in the cluster. Remember to use the correct NVMe endpoint for each
node.
- Using the --ctrl-loss-tmo -1 flag allows infinite attempts to reconnect nodes, and prevents a timeout from
occurring when attempting to connect to a node in a failure state.
- Starting from version 3.1.1, the data IP can be IPv6.
- See the discovery-client documentation in the Lightbits Administration Guide. Like nvme connect, it can
connect to NVMe over TCP volumes. However, it can also monitor the nodes, and if new nodes/paths are created
or removed, it will properly maintain those.

Confirming the Cluster Client Connection to Lightbits


Lightbits Cluster Installation Process

1. Connecting your installation workstation to the Lightbits software repository
2. Verifying the network connectivity of the servers used in the cluster
3. Setting up an Ansible environment on your installation workstation
4. Installing a Lightbits cluster by running the Ansible installation playbook
5. Updating clients (if required)
6. Provisioning storage, connecting clients, and performing IO tests


Each /dev/nvmeX is a successful NVMe over TCP connection to a server in the cluster. When the optimized path is
connected, a block device is created with the name /dev/nvmeXnY, which can then be used like any block device (you can
create a filesystem on top of it and mount it).
If you see a multipath error (with the NVMe block devices showing up as 0 bytes, or each replica/NVMe connection showing
up as a separate NVMe block device), refer to the Lightbits Troubleshooting Guide, or contact Lightbits Support.
After you have entered the nvme connect command, you can confirm the client’s connection to Lightbits by entering
the nvme list command. This will list all of the NVMe block devices. For more information on each connection’s
multipathing, you can use nvme list-subsys, which will list all of the NVMe character devices.

Note: The nvme list and lsblk commands will show the NVMe block device that is created upon a successful
connection. It will be of the format nvme0n1. The nvme list-subsys command will list all of the paths that make up these
block devices; these paths appear as character devices. So from the output below we can conclude that block device
nvme0n1 is made up of 3 character devices: nvme0, nvme1, and nvme2. When we need to interact with the block device
- for example, to create a filesystem and mount it - we interact with the block device, nvme0n1, and not the
character devices (nvme0, nvme1, and nvme2).

Sample Command
$ nvme list-subsys

Sample Output
nvme-subsys0 - NQN=nqn.2014-08.org.nvmexpress:NVMf:uuid:b5fe744a-b919-465a-953a-a8a0df7b9d31
\
+- nvme0 tcp traddr=10.10.10.100 trsvcid=4420 live
+- nvme1 tcp traddr=10.10.10.101 trsvcid=4420 live
+- nvme2 tcp traddr=10.10.10.102 trsvcid=4420 live

Next, review your connected block devices to see the newly connected NVMe/TCP block device using the Linux lsblk
command.

Sample Command
$ lsblk

Sample Output
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1 259:1 0 2G 0 disk
sdb 8:16 0 223.6G 0 disk
|-sdb2 8:18 0 222.6G 0 part
| |-CentOS00-swap 253:1 0 22.4G 0 lvm [SWAP]
| |-CentOS00-home 253:2 0 150.2G 0 lvm /home
| |-CentOS00-root 253:0 0 50G 0 lvm /
|-sdb1 8:17 0 1G 0 part /boot
sda 8:0 0 111.8G 0 disk

A new nvme0n1 block device with 2GB of storage is identified and available.
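At this point the device can be used like any local disk. As a usage sketch (not part of the original procedure; the filesystem type and mount point are arbitrary examples):

# Create a filesystem on the NVMe/TCP block device and mount it
$ mkfs.xfs /dev/nvme0n1
$ mkdir -p /mnt/lightbits-vol1
$ mount /dev/nvme0n1 /mnt/lightbits-vol1
$ df -h /mnt/lightbits-vol1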


To determine which node in the cluster is the primary and which is secondary for this block device, enter the nvme
list-subsys command with the block device name.

Sample Command
$ nvme list-subsys /dev/nvme0n1

Sample Output
nvme-subsys0 - NQN=nqn.2014-08.org.nvmexpress:NVMf:uuid:b5fe744a-b919-465a-953a-a8a0df7b9d31
\
+- nvme0 tcp traddr=10.10.10.100 trsvcid=4420 live optimized
+- nvme1 tcp traddr=10.10.10.101 trsvcid=4420 live inaccessible
+- nvme2 tcp traddr=10.10.10.102 trsvcid=4420 live

In the output, the optimized status identifies the primary node, and the inaccessible status identifies a secondary node. In this
case we can see that server 10.10.10.100 is the primary node with the optimized path. All of the IO from the client will
go to 10.10.10.100. The cluster will then replicate the data to the other nodes.

Troubleshooting

Note: For additional troubleshooting-related information, see the Lightbits Troubleshooting Guide, or contact
Lightbits Support.

Ansible Role Errors


Confirm that the duroslight ports are synchronized between the Ansible defaults yml file (which can be overridden in
inventory ymls) and the node-manager configuration Ansible defaults yml:
~/light-app/ansible/roles/install-lightos/defaults/main.yml

SSH Strict Key Errors When Using sshpass


If you use the sshpass utility method in your hosts file, you can receive an error related to SSH keys in the Known Hosts
file, such as:
$ ansible -i ansible/inventories/cluster_example/hosts all -m ping
node02 | FAILED! => {
    "msg": "Using a SSH password instead of a key is not possible because Host Key
    checking is enabled and sshpass does not support this. Please add this host's
    fingerprint to your known_hosts file to manage this host."
}

To avoid this error, you need to disable StrictHostKeyChecking in /etc/ssh/ssh_config, or log into each node
from your installation workstation at least once.
By default, StrictHostKeyChecking is enabled in the file /etc/ssh/ssh_config. You can disable it by uncommenting
the line in ssh_config and setting it to:


StrictHostKeyChecking no

Or, you can leave StrictHostKeyChecking enabled and log into each node from the installation workstation and
“answer yes” to permanently add the host to the Known Hosts files.
The first time you SSH from one server to another the following SSH exchange occurs:
$ ssh [email protected]
The authenticity of host '192.168.16.22 (192.168.16.22)' can't be established.
ECDSA key fingerprint is SHA256:zouTZEZF2oUXfIGpnvWutrOR4/fBnd5ARqXNJj0iqD0.
ECDSA key fingerprint is MD5:7d:0f:0a:3f:27:08:2e:66:93:ae:f5:08:c8:13:23:af.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.16.22' (ECDSA) to the list of known hosts.
[email protected]'s password:
Last login: Wed Nov 13 19:06:13 2019 from cluster-manager
[root@node00 ~]#

So, by logging into all the servers at least once from your installation workstation before you run the Ansible playbook,
there will be no issues using the sshpass method.
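For example, a short shell loop can be used to log into every server once and accept its host key up front (a sketch only; the hostnames are placeholders for the ansible_host values in your hosts file):

# Accept the SSH host key of each cluster server once, answering 'yes' when prompted
$ for h in rack11-server92 rack11-server93 rack11-server94; do ssh root@"$h" true; done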

Free Space in Linux OS for etcd Logical Volume Manager Use


If your Linux operating system has logical volumes that were created for the home, root, and swap file systems and are
utilizing 100% of the storage, you must reduce one of these logical volumes. The Lightbits installer requires at least 10GB
of space to create an LVM logical volume for use with etcd.
For example, review the Linux OS logical volumes. The lvs command (part of LVM) is used in this example.
$ lvs
LV   VG                Attr       LSize   Pool Origin Data% Meta% Move Log Cpy%Sync Convert
home CentOS_lightos-c3 -wi-ao---- <64.24g
root CentOS_lightos-c3 -wi-ao----  50.00g
swap CentOS_lightos-c3 -wi-ao----   4.00g

Note: If the lvs command reports anything but "CentOS" for the volume group name used for the Linux OS file
system, you will need to specify the exact name in the ~/light-app/ansible/inventories/cluster_example/host_vars
file for that node. For more information, see the etcd_vg_name variable description in the Host Configuration File
Variables list.

In this example, the Linux OS was installed onto a 118 GB drive and the entire amount is allocated. You can shrink the
home logical volume by 20 GB to free up some space.
To resize this file system, you need to:
1. Move any files you have in the /home file system to a safe location.
2. Unmount, resize, and recreate the file system.


3. Remount the file system.


To identify how much space is available to free up, use lsblk as follows:
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 119.2G 0 disk
|-sda1 8:1 0 1G 0 part /boot
|-sda2 8:2 0 118.2G 0 part
  |-CentOS-root 253:0 0 50G 0 lvm /
  |-CentOS-swap 253:1 0 4G 0 lvm [SWAP]
  |-CentOS-home 253:2 0 64.2G 0 lvm /home

In this example, the Linux OS is installed on device "sda", which has 119.2 GB of space, with the LVM volumes on partition sda2. It is
possible to take 20 GB away from home to free up some space and still have over 44 GB remaining.
1. Run mount and record the current mount path for home.
$ mount
/dev/mapper/CentOS_lightos--c3-home on /home type xfs (rw,relatime,attr2,inode64,noquota)

2. Unmount home and then resize it.


$ umount /home
$ lvresize -L -20G CentOS_lightos-c3/home

3. Remake the home file system.


$ mkfs.xfs -f /dev/mapper/CentOS_lightos--c3-home

4. Remount home.
$ mount /dev/mapper/CentOS_lightos--c3-home /home

Note: For information on installing Red Hat, see Red Hat Linux Installation.

Recovering from Cluster Installation Failure


At times during cluster deployment, errors occur and the configuration step must be retried. To do that, a playbook is
provided to stop all services and delete the data-plane and control-plane data and configuration.
Cleanup command for a full cluster:
ansible-playbook -i ansible/inventories/cluster_example/hosts playbooks/cleanup-lightos-playbook.yml -t cleanup

Cleanup for one Lightbits server:


ansible-playbook -i ansible/inventories/cluster_example/hosts playbooks/cleanup-lightos-playbook.yml -t cleanup --limit <server_name>

Note: Replace <server_name> with the name of the server to be removed, as listed in the hosts file. It can be of the
form server00, server01, etc.

Reconfigure command:
ansible-playbook -i ansible/inventories/cluster_example/hosts playbooks/configure-lightos-playbook.yml

The following is important for better understanding cleanup and configure:
• When a Lightbits installation is done via a deploy-lightos playbook - as described in Running the Ansible Installation
Playbook to Install Lightbits Cluster Software - it runs two playbooks in order. First it runs an install playbook, which
installs all of the Lightbits dependencies and packages and does a reboot. Then it runs the configure playbook, which
sets up all of the Lightbits services.
• The cleanup playbook removes all of the Lightbits configurations. It does not uninstall any of the packages that were installed.
• The configure playbook does not install any Lightbits packages; it simply reconfigures all of them. Only run this if you are
certain that the deploy-lightos playbook ran through the install playbook on all servers; otherwise, use the deploy-lightos
playbook as described in Running the Ansible Installation Playbook to Install Lightbits Cluster Software.

Log Artifacts Collection


Logs can be gathered from each Lightbits server configured in the hosts file.
$ ansible-playbook -i ansible/inventories/cluster_example/hosts playbooks/logs.yml

The output of each Lightbits server is saved into a dated directory inside /tmp/ on the Ansible host.

Notes: - This can be run against servers that have Lightbits installed, as well as servers that do not have Lightbits
installed.
- To properly gather logs from Lightbits servers, the playbook depends on the JWT being in the ~/lightos-system-jwt
file.

Fully Clean Lightbits From Servers or Cluster


To fully clean Lightbits packages and configuration from a server, run the following steps.

Note/Caution: Do not run this on servers that are active or show up in a Lightbits cluster. Only run this against
servers that need Lightbits removed; otherwise, the cluster state will be in danger. All of the commands run on
the servers must be run with the highest privilege (root).

1. From the Ansible host, run the cleanup playbook to unconfigure the server.
ansible-playbook -i ansible/inventories/cluster_example/hosts playbooks/cleanup-lightos-playbook.yml -t cleanup --limit <server_name>


Note: Make sure <server_name> matches the name of the server from which Lightbits must be removed. Usually it will be of the
form "server00" or "server01", etc.

2. Erase ETCD from the Lightbits server.


rm -vrf /usr/bin/etcd /usr/bin/etcdctl /etc/etcd /opt/etcd /var/lib/etcd;
rm -vf /etc/systemd/system/etcd.service;
rm -vf /etc/systemd/system/multi-user.target.wants/etcd.service

3. Uninstall Lightbits packages with these steps from the Lightbits server.
First, find out the version of Lightbits that is installed.
lbcli version | awk '{print $NF}'

The output will be like “3.0.1~b1004”. This is the format used for the next steps.
You can also see the latest installed rpms using rpm -qa --last, which will list the latest installed packages at the top
of the list.
Extract the version number in a format similar to this: “A.B.C~bD”. In this example, it is 3.0.1~b1004. But each case
could be different.
rpm -qa | grep 3.0.1~b1004

Now you can manually remove each package with yum remove PKG -y or rpm -e PKG. However, if all of the packages
look Lightbits-related, then run the next command and it will uninstall them.
Important: Make sure to replace 3.0.1~b1004 with your version.
bash <(echo "(set -xeu"; rpm -qa | grep 3.0.1~b1004 | xargs -I % echo yum remove % -y; echo ")")

4. To fully uninstall Lightbits from a server, note that some releases (GA releases) install a specific kernel during deploy,
so it is recommended to uninstall it and set another kernel as the default. Refer to your OS documentation on how
to uninstall a kernel and set another kernel as default.
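On Red Hat/CentOS-based systems, one possible sequence looks like the sketch below (the kernel path and package name are placeholders; list your installed kernels first and substitute the kernel you want to keep and the kernel package you want to remove):

# List installed kernels and pick the one to keep as the default
$ grubby --info=ALL | grep ^kernel
# Set the chosen kernel as the default boot entry
$ grubby --set-default=/boot/vmlinuz-<version-to-keep>
# Remove the kernel package that was installed during the Lightbits deploy
$ yum remove -y kernel-<lightbits-installed-version>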

Note: The Lightbits Red Hat releases do not need to run this step. Instead, edit /etc/yum.conf and remove or
comment out the line that shows exclude=redhat-release* kernel* kmod-kvdo*.

Appendixes
The following sections provide additional information to help you complete the Lightbits installation.

This section includes:

Host Configuration File Variables


Host Configuration File Examples
Configuring the Data Network
etcd Partitioning
Using SSH-Key Authentication
Network Time Protocol Configuration

Automated Client Connectivity Verification


Configuring Grafana and Prometheus
Using Grafana and Prometheus
Open TCP Ports and Verify
Installation Behind HTTP-Proxy
Single-IP-Dual-NUMA Configuration
Adding a JWT Token To a Configuration File

Host Configuration File Variables


Each host configuration file includes some basic configuration variables.
See Host Configuration File Examples for instances of how these variables are used in a host configuration file.
data_ifaces:
  bootproto:
  ifname:
  ip4:
  ip6:
instances:
  instanceID:
  data_ip:
  failure_domains:
  ec_enabled:
  storageDeviceLayout:
    initialDeviceCount:
    maxDeviceCount:
    allowCrossNumaDevices:
    deviceMatchers:
      - model =~
      - partition ==
      - size >=
      - path =
      - name =~
use_lvm_for_etcd:
etcd_lv_name:
etcd_settings_user:
etcd_lv_size:
etcd_vg_name:
auto_reboot:
datapath_config:
listen-client-urls:

Host Configuration File Variable Notes


Variable (Required) - Description

data_ifaces (No) - If provided, the Ansible playbook configures the interface. The configuration is permanent, and
results in a new ifcfg-<iface-name> file. If this variable is not provided, no action is taken; the playbook assumes that
the interfaces are valid and the link is up and configured. If data_ifaces is used, you must also use the bootproto,
conn_name, ifname, and ip4 variables.

bootproto (No) - IP allocation type (dynamic or static). Only static is supported in Lightbits 2. Default value: static.

ifname (No) - The interface name, such as eth0 or enp0s2, that the data path in the ip4 variable is dedicated to.

ip4 (No) - The data IPv4 address and subnet to set for the interface mentioned in the ifname field. The format is
"IP/subnet". Example: "10.10.10.100/24"

ip6 (No) - The data IPv6 address and prefix to set for the interface mentioned in the ifname field. The format is
"IP/prefix". Example: "2001:0db8:0:f101::1/64"

data_ip (Yes) - The data/etcd IP used to connect to other nodes. Only the address is required, not the subnet or prefix.

instances (Yes) - A list of instance IDs, one for each logical data-path instance.

failure_domains (No) - The servers sharing a network, power supply, or physical location that are negatively affected
together when network, power, cooling, or another critical service experiences problems. For more information, see the
Defining Failure Domains procedure. Default value: empty list.

instanceID (Yes) - A unique number assigned to this logical node. Only two logical nodes per server are supported in
Lightbits, so the value is "0" and/or "1".

storageDeviceLayout (Yes) - The storageDeviceLayout key, under the node-specific settings, groups the information
required to detect the initial storage configuration of the node.

initialDeviceCount (Yes) - The initial count of physical drives that the system will start with on the first startup.
maxDeviceCount (Yes) - The pre-determined maximum number of physical NVMe drives that this node can contain;
i.e., the number of NVMe drive slots available for this node.

allowCrossNumaDevices (No) - An optional setting specifying whether block devices can be used by system-nodes that
are affiliated with a different NUMA ID than the instance ID that the block device is attached to. Default: false (do not
allow devices attached on different NUMAs to be used by this node).

deviceMatchers (No) - A list of matching conditions for locating the wanted physical drives to be used by the system.

partition (No) - Whether or not a device is a partition (true/false).

model (No) - The vendor/model of the device. For example: name =~ "nvme5n.*" or name == "nvme1n1".

name (No) - The file name of the device.

path (No) - The full path of the device.

size (No) - The capacity of the physical device. Examples: size >= mib(1000), size == gib(20), size <= tib(50).

ec_enabled (Yes) - Enables Erasure Coding (EC) for protecting against SSD failure within the storage server. Normal
operation continues during reconstruction when a drive is removed. At least six NVMe devices must be in the node for
erasure coding to be enabled.

name (Yes) - A unique, user-friendly name for the node.

use_lvm_for_etcd (No) - Use a Linux Volume Manager (LVM) partition for etcd data. Default value: false. Note: If
this variable is not used in the host configuration file, the system uses the default value. The following etcd variables are
only relevant if the use_lvm_for_etcd variable value is true.

etcd_lv_name (No) - Logical volume name for etcd data local volume management.

etcd_settings_user (No) - Key-value map for overriding the etcd service settings.

etcd_lv_size (No) - Logical volume size for etcd data local volume management.

etcd_vg_name (No) - Volume group name for etcd data local volume management. Mandatory if use_lvm_for_etcd
is used.

datapath_config (Yes) - The path to the system-profile yml file.
etcd_settings_user (No) - User etcd settings.

listen-client-urls (No) - http://127.0.0.1:2379

profile_generator_overrides_dir (No) - Directory path containing a <system-profile>.yaml file to override the one
generated by profile-generator.

auto_reboot (No) - If set to false, the system will not automatically reboot after installation.

Notes: - The user must provide the etcd volume group name in the etcd_vg_name variable, and confirm that
there is enough server space to create a new logical volume. The default logical volume name (etcd_lv_name) is
"etcd" and the default volume size (etcd_lv_size) is 10GB.
- If there is not enough space in the server, the user must reduce the other logical volume sizes before the cluster
software installation to allocate the required space. For more details, see
https://www.rootusers.com/lvm-resize-how-to-decrease-an-lvm-partition.

Host Configuration File Examples


This section covers various configuration examples and how they affect the host configuration files. The first host
configuration file shown is server00.yml; the host configuration files for the remaining servers follow the same pattern.
For more complex examples, two host configuration files are shown to help better understand the configuration.

Example 1: Data Network Interface Manually Configured


Host configuration with no data interfaces (data_ifaces) provided. The playbook does not need to configure the IP on
the data interface. The user configured the interfaces prior to running the playbook.
---
nodes:
- instanceID: 0
  data_ip: 10.10.10.100
  failure_domains:
  - server00
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 6
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"

Example 2: Data Network Interface Automatically Configured


Host configuration with a single data interface. The playbook configures the interface; the user did not configure the
interface with the data IP beforehand.


---
data_ifaces:
- bootproto: static
  conn_name: ens1
  ifname: ens1
  ip4: 10.10.10.100/24
nodes:
- instanceID: 0
  data_ip: 10.10.10.100
  failure_domains:
  - server00
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 6
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"

Note: The example above shows the format for setting IPv4 addresses using ip4: ip/subnet. IPv6 addresses
can be set using ip6: ip/prefix.

Example 3: Override the Lightbits Configurations


Host configuration with Lightbits override. The provided value overrides the key listen-client-urls.
---
nodes:
- instanceID: 0
  data_ip: 10.10.10.100
  failure_domains:
  - server00
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 6
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"
etcd_settings_user:
  listen-client-urls: http://127.0.0.1:2379,http://10.16.173.14:2379,http://192.168.16.219:2379

Example 4: Provide Custom Datapath Configuration


Host configuration with custom datapath configuration provided.


By default, the playbook inspects the remote machine and determines the directory containing the specific configuration for
Duroslight and backend services (the datapath configuration, excluding the node-manager configuration), using the following
naming logic:
<system_vendor>-<processor_count>-processor-<processor_cores>-cores

---
nodes:
- instanceID: 0
  data_ip: 10.10.10.100
  failure_domains:
  - server00
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 6
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"
datapath_config: custom-datapath-config

Example 5: Use the Linux Volume Manager (LVM) Partition for etcd Data
Host configuration with custom lvm partition for etcd data.
---
nodes:
- instanceID: 0
  data_ip: 10.10.10.100
  failure_domains:
  - server00
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 6
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"
use_lvm_for_etcd: true
etcd_lv_name: etcd
# etcd_settings_user:
etcd_lv_size: 15 GiB
etcd_vg_name: centos


Note: For information on installing Red Hat, see Red Hat Linux Installation.

Example 6: Profile-Generator Overrides


This enables you to override the profile-generator output and provide, for each server, a custom file that profile-generator
uses as the system-profile.
Each host may be different, so each host can specify its own override file.
nodes:
- instanceID: 0
  data_ip: 10.10.10.100
  failure_domains:
  - server00
  ec_enabled: true
  storageDeviceLayout:
    initialDeviceCount: 6
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    - partition == false
    - size >= gib(300)
profile_generator_overrides_dir: /tmp/overrides.d/server00

In case the cluster is homogeneous and we want to apply the same override to all nodes, we can provide a single setting
in the group_vars/all.yml file or via the command line with:
ansible-playbook -i ansible/inventories/cluster_example/hosts playbooks/deploy-lightos.yml -e profile_generator_overrides_dir=/tmp/overrides.d

In the above example, we specify profile_generator_overrides_dir which is a directory on the Ansible Controller that
will be copied to the target machine.

Example 7: Dual Instance Configuration


In a dual instance configuration, Lightbits runs on both NUMA nodes of the CPU.
Below is an example configuration for server00.yml and server01.yml.
In this configuration, both servers have 12 NVMe drives on each of NUMA 0 and NUMA 1, for a total of 24 NVMe drives per server.
Each server has a single management IP configured (not shown below) and two data IPs (one for each NUMA). Each
instance has a unique data IP set for it.
server00.yml


---
name: server00
nodes:
- instanceID: 0
  data_ip: 172.16.10.10
  failure_domains:
  - server00
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 12
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"
- instanceID: 1
  data_ip: 172.16.20.10
  failure_domains:
  - server00
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 12
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"

server01.yml


---
name: server01
nodes:
- instanceID: 0
  data_ip: 172.16.10.11
  failure_domains:
  - server01
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 12
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"
- instanceID: 1
  data_ip: 172.16.20.11
  failure_domains:
  - server01
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 12
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"

Example 8: Single IP Dual NUMA Configuration


This example is for Single IP Dual Instance configuration. For more information, see Single-IP-Dual-NUMA Configuration.
The below is an example of a full server00.yml, with a physical configuration similar to the example above. However,
instead of using two data IPs, it only uses one data IP for both of its instances.

Note: Unlike dual instance configuration from the example above, which had a unique data IP per instance, each
instance on a server has the same IP.


---
name: server00
nodes:
- instanceID: 0
  data_ip: 10.10.10.100
  failure_domains:
  - server00
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 12
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"
- instanceID: 1
  data_ip: 10.10.10.100
  failure_domains:
  - server00
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 12
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"

Performing an Offline Installation


The offline installation scenario is used when there is no internet access to download the required Lightbits RPMs and
their dependencies. In such a case, the machine being used for installation should include the Lightbits cluster software
RPM files and their dependencies.
During the offline installation, the software packages are copied to the target machine and installed locally.
To complete the offline installation:
1. Copy the packages to the installation server.
2. Enter the following variables in the hosts file on the installation machine. See Creating the Inventory Structure and
Adding the Ansible Hosts File.
source_type=offline
source_etcd_binary=<path to etcd binary zip>
source_rpms_dir=<path to rpms>
source_dependencies_rpms_dir=<path to dependencies rpms>
dest_dir=<path>

For example:


[duros_nodes:vars]
source_type=offline
source_etcd_binary="/root/lightos_release/deps/etcd-v3.4.1-linux-amd64.tar.gz"
source_rpms_dir="/root/lightos_release/target_rpms"
source_dependencies_rpms_dir="/root/lightos_release/deps"
dest_dir="/tmp/rpms"

Offline Ansible Controller Installation and Self-Signed Certificates


The Lightbits cluster installation requires SSL certificates. You can provide these certificates, or the playbook will
automatically generate self-signed certificates. To create these certificates, Ansible downloads some binaries from the
internet at runtime.
In the case of an offline Ansible controller, the installation script requires that the certificates_directory be present and
contain all needed certificates before running the playbook.
This directory and its content can come from two sources:
• Your own organization's certificates.
• The self-signed certificates generated by the initial cluster installation process.
You will need to copy the folder credentials_directory to the Ansible Controller machine before running the
installation script.

Configuring the Data Network


Nodes in the Lightbits server clusters communicate via a high-speed data network interface.
All nodes in the cluster must be configured with an IP address from the same accessible network before running the
Ansible playbook.
You can configure the network using an automatic (recommended) or manual method.

Note: For full examples, see Host Configuration File Examples.

Automatic Data Network Configuration (Recommended)


The Ansible playbook can automatically set the data interface IP when the optional network host variables are
provided.
This means that, to make deployment easier, the playbook persistently configures the data network interface for you when
you specify the data_ifaces list variable for each host. For example:
data_ifaces:
- bootproto: static
  conn_name: ens1
  ifname: ens1
  ip4: 10.20.20.10/27

In this example, we have set the playbook to permanently configure interface ens1 with static IP 10.20.20.10.

Manual Data Network Configuration


In this method, you assign the data IPs on the data interfaces for each node on the cluster.
To set the data IPs:
1. Log into the machine with the following command:


$ ssh root@rack03-server70

2. Detect the data interface with the lshw businfo command.


$ lshw -c net -businfo
Bus info          Device  Class    Description
=======================================================
pci@0000:01:00.0  eno1    network  I350 Gigabit Network Connection
pci@0000:01:00.1  eno2    network  I350 Gigabit Network Connection
pci@0000:02:00.0  ens1f0  network  MT27710 Family [ConnectX-4 Lx]
pci@0000:02:00.1  ens1f1  network  MT27710 Family [ConnectX-4 Lx]

3. Set a new data interface IP and net mask IP for the data NIC. In the following example, the card is ens1f0:
$ cat >/etc/sysconfig/network-scripts/ifcfg-ens1f0 <<EOL
DEVICE=ens1f0
NM_CONTROLLED=no
IPADDR=10.20.20.10
NETMASK=255.255.255.224
ONBOOT=yes
BOOTPROTO=static
EOL

4. Toggle the NIC down and then up again by entering the ifdown command, waiting at least 30 seconds, and then
entering the ifup command.
$ ifdown ens1f0
$ ifup ens1f0

5. Verify that the data interface’s IP is updated.


$ ip -4 a
...
4: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    inet 10.20.20.10/27 brd 10.20.20.31 scope global ens1f0
       valid_lft forever preferred_lft forever

Note: Use nmtui (NetworkManager-tui) if NetworkManager is installed, and ip -4 -br a or ip -br a to verify
the ip (for a cleaner view).

etcd Partitioning
Based on your boot device’s write latency performance, you might need to create a separate partition for etcd data on
the boot device. If you have questions about the need to use etcd partitioning, contact Lightbits.
To use etcd partitioning:
1. Confirm that a partition pre-allocated for etcd exists on the node and has at least 10 GB of space.


2. If it does not already exist on the node, configure an LVM group.


3. Enter the LVM group name for the etcd_vg_name variable in the host configuration file.
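If no suitable volume group exists, a minimal sketch for creating one is shown below (the spare partition /dev/sda3 and the volume group name etcd-vg are examples only; substitute your own device and name):

# Initialize the spare partition for LVM and create a volume group for etcd data
$ pvcreate /dev/sda3
$ vgcreate etcd-vg /dev/sda3
$ vgs etcd-vg    # verify the volume group and its free space

The volume group name (etcd-vg in this sketch) is the value to set for etcd_vg_name; with use_lvm_for_etcd enabled, the playbook creates the etcd logical volume inside it.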

Using SSH-Key Authentication


To use key authentication, you must provide the SSH key file used in all the cluster servers to the hosts file, which is
usually located under the light-app directory at this path: ansible/inventories/cluster_example/hosts.
To use SSH-key authentication instead of a plain text password, see the knowledge base article How To Configure SSH
Key-Based Authentication on a Linux Server.
After you have configured the SSH key for authentication, you can connect from the installation server to the target with
ansible_ssh_private_key_file instead of ansible_ssh_pass, in the following format:
ansible_ssh_private_key_file=<private RSA key file path>

With the default configuration, the top section of the hosts file is configured as below:
server00 ansible_host=rack11-server92 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_pass=light ansible_become_user=root ansible_become_pass=light

As an example, assume that the SSH key for the servers is located at /root/mykey.txt. If so, change the configuration line
to this:
server00 ansible_host=rack11-server92 ansible_connection=ssh ansible_ssh_user=root ansible_ssh_private_key_file=/root/mykey.txt ansible_become_user=root ansible_become_pass=light

Network Time Protocol Configuration


The Network Time Protocol (NTP) must be installed and configured on the cluster nodes to keep the cluster nodes in
sync with each other.
You can use one of the following methods to install the required NTP packages.

Method 1
The latest RPMs are retrieved from the OS repository and installed on the cluster nodes.

Method 2
The specific NTP version required by the customer is installed on the cluster nodes. To use this method:
1. Edit the all.yml file: ~/light-app/ansible/inventories/cluster_example/group_vars/all.yml.
2. Edit or append the following to the all.yml file, using the specific version that you want to install. For example:
ntp_version: ntp-4.2.6p5-29.el7.CentOS.x86_64

Method 3
The NTP is installed using an offline method.
1. Edit the all.yml file: ~/light-app/ansible/inventories/cluster_example/group_vars/all.yml.


2. Under the new group_vars directory, create a new all.yml file.


3. Edit or append the following section to the all.yml file with the desired packages to install. This section’s order
first lists the prerequisites and then the desired package. For example:
ntp_packages:
- "autogen-libopts*.rpm"
- "ntpdate*.rpm"
- "ntp*.rpm"

4. The desired NTP packages must be copied to the dest_dir. For more, see Performing an Offline Installation.

Configuring the NTP Server


After you have installed NTP on all of the cluster nodes, you must configure the NTP service to sync with a global NTP
server that is inside or outside the enterprise.
The default NTP configuration is implemented during the cluster software installation and configuration process run by
the Ansible tool, which uses the defaults provided in the NTP package (Global Server Pool).
To overwrite the defaults provided in the NTP package and provide these overrides to other NTP servers, complete the
following steps:
1. Edit the all.yml file: ~/light-app/ansible/inventories/cluster_example/group_vars/all.yml.
2. Edit or append the following sections to the all.yml, using the relevant NTP servers for your system.
ntp_enabled: true
ntp_manage_config: true
ntp_servers:
- "0{{ '.' + ntp_area if ntp_area else '' }}.pool.ntp.org iburst"
- "1{{ '.' + ntp_area if ntp_area else '' }}.pool.ntp.org iburst"
- "2{{ '.' + ntp_area if ntp_area else '' }}.pool.ntp.org iburst"
- "3{{ '.' + ntp_area if ntp_area else '' }}.pool.ntp.org iburst"

Additional Note
To ensure NTP client consistency and synchronization with the NTP servers, it is highly recommended to prevent
NetworkManager from updating /etc/resolv.conf. Incorrect configuration of that file could prevent the NTP client from
communicating with the NTP server, and therefore create time drift between the cluster nodes.
To do this:
As the root user, use a text editor to create the /etc/NetworkManager/conf.d/90-dns-none.conf file with the following
content:
[main]
dns=none

Reload the NetworkManager service:

# systemctl reload NetworkManager

Note: After you reload the service, NetworkManager no longer updates the /etc/resolv.conf file. However, the last
contents of the file are preserved.

Optionally, remove the "Generated by NetworkManager" comment from /etc/resolv.conf to avoid confusion.


Note: For information on installing Red Hat, see Red Hat Linux Installation.

Automated Client Connectivity Verification


After you finish installing Lightbits and configuring the nodes for a cluster, you can use an optional Ansible playbook that
verifies the success of the cluster installation and runs basic IO operations to verify the client connectivity.
To use this optional Ansible playbook, the following must be configured:
1. In the host files, the Ansible hosts file must have the "initiators" section to declare another client server. For more,
see Creating the Inventory Structure and Adding the Ansible Hosts File.
2. If you want the Ansible script to configure the IP, you must add the host variables file that includes the data_ifaces
section. For more, see Configuring the Data Network.

Note: It is important that the inventory folder is shared with the cluster inventory folder so that you can
fetch all cluster IPs.

Enter the following command to start the Ansible playbook:


$ ansible-playbook -i <hosts file> playbooks/deploy-nvme-tcp-initiator.yml

Configuring Grafana and Prometheus


Prometheus gathers statistics from the Lightbits cluster. Grafana in turn represents everything in graphs on dashboards.
This monitoring package can monitor several clusters at once, and multiple clusters can be configured.

Prerequisite
• docker-ce

Installing Grafana and Prometheus

Note: These monitoring packages should be installed on host machines, not on the Lightbits target servers.

sudo yum install lightos-monitoring-images lightos-monitoring-clustering

Note: See Connecting to the Lightbits Software Repository for additional information.

Usage
After installing the lightos monitoring RPMs (lightos-monitoring-clustering, lightos-monitoring-images), run the following:
/var/lib/monitoring-images/deploy.sh deploy-clustering

Edit the following file:


/var/lib/monitoring-clustering/configure_grafana/configure_grafana.yml

In the Clusters section, change the instance names for your cluster hosts (remove the extra lines in case of a single cluster).


clusters:
  cluster_1:
  - rack01-server01
  - rack02-server02
  - rack03-server03
  cluster_2:
  - rack04-server04
  - rack05-server05

Then run:
/var/lib/monitoring-images/deploy.sh configure-monitor

Outcome
Running the following:
docker ps

• We should see two Docker containers running: Prometheus and Grafana.


• They are reachable at http://localhost:9090 (Prometheus) and http://localhost:3000 (Grafana).
• The Grafana user/password is:
– user: admin
– password: foobar
• Inside Grafana we should have two dashboards:
– cluster_tab - showing information about the cluster.
– nodes_tab - showing information per node.

Lightbits Monitoring Integration with Existing Grafana and Prometheus


You can also integrate Lightbits’ reference monitoring metrics with your existing Prometheus and Grafana platform.
Getting the Metrics
After installing Lightbits monitoring, you can get Lightbits' reference monitoring metrics in the
/var/lib/monitoring-clustering/ folder, as illustrated in the example below.


[root@localhost ~]# tree /var/lib/monitoring-clustering/

/var/lib/monitoring-clustering/
├── configure_grafana
│   ├── configure_grafana.yml
│   ├── README.md
│   └── roles
│       ├── grafana
│       │   ├── defaults
│       │   │   └── main.yml
│       │   ├── requirements.yml
│       │   └── tasks
│       │       └── main.yml
│       └── prometheus
│           ├── tasks
│           │   └── main.yml
│           └── templates
│               ├── api.yaml.j2
│               └── lightbox-exporter.yaml.j2
├── file_sd_configs
│   ├── api
│   │   └── targets.yaml
│   └── lightbox-exporter
│       └── targets.yaml
├── grafana
│   └── dashboards
│       ├── cluster_tab.json
│       ├── nodes_tab.json
│       └── performance_tab.json
└── prometheus
    ├── alert.rules.yaml
    ├── prometheus.yml
    └── record.rules.yaml

Integrating Grafana
There are two options for integrating the Grafana reference metrics:
• Manually create the data source for Lightbits Prometheus with the Grafana GUI, and then manually create a dashboard
by importing the reference metrics.
• Integrate the reference files directly, as shown in the example below.
Merge the data source configuration in monitoring-clustering/configure_grafana/roles/grafana/defaults/main.yml with
the existing data source.
Note that a different version of Grafana may have a different format for the configuration.
You can also easily create a data source manually with the GUI.


[root@localhost ~]# vim /usr/share/grafana/conf/provisioning/datasources/sample.yaml
...
datasources:
- name: Prometheus
  type: prometheus
  url: http://localhost:9090

Copy the dashboard metrics files to the Grafana configuration folder.


[root@localhost ~]# tree monitoring-clustering/grafana/
monitoring-clustering/grafana/
└── dashboards
    ├── cluster_tab.json
    ├── nodes_tab.json
    └── performance_tab.json

[root@localhost ~]# vim /usr/share/grafana/conf/provisioning/dashboards/sample.yaml
...
providers:
- name: 'default'
  orgId: 1
  folder: ''
  folderUid: ''
  type: file
  options:
    path: /var/lib/grafana/dashboards

[root@localhost ~]# mkdir /var/lib/grafana/dashboards


[root@localhost ~]# cp monitoring-clustering/grafana/dashboards/* /usr/share/grafana/conf/provisioning/dashboards/ -a
# restart the Grafana service

[root@localhost ~]# /usr/sbin/grafana-server -homepath /usr/share/grafana/

Use the GUI to verify the result. Access the Prometheus GUI using the instructions above. For example:
http://localhost:9090/ or http://monitoring-server:9090/. Note that when using the installation above, Grafana and
Prometheus run on the same host.
Integrating Prometheus
To integrate Prometheus, merge the configuration from the Lightbits reference Prometheus configuration files into your
existing Prometheus configuration files, as shown below:
[root@localhost ~]# tree monitoring-clustering/prometheus/
monitoring-clustering/prometheus/
├── alert.rules.yaml
├── prometheus.yml
└── record.rules.yaml

You will need to manually merge contents inside of prometheus.yml with your existing prometheus.yml.


[root@localhost prometheus]# vim /usr/local/prometheus/prometheus.yml


...
rule_files:
- "alert.rules.yaml"
- "record.rules.yaml"

Add the scrape configuration for the Lightbits jobs (lightos, lightos-smart, and api-service):


scrape_configs:
  - job_name: lightos
    scheme: http
    scrape_timeout: 25s
    scrape_interval: 30s
    metrics_path: /metrics
    honor_timestamps: True
    params:
      collect[]:
        - clustering
        - datapath
        - meminfo
        - textfile
        - lightfield
        - netstat
        - netdev
        - cpufreq
    file_sd_configs:
      - refresh_interval: 10s
        files:
          - 'file_sd_configs/lightbox-exporter/*.yaml'

  - job_name: lightos-smart
    scheme: http
    scrape_timeout: 10s
    scrape_interval: 5m
    metrics_path: /metrics
    honor_timestamps: True
    params:
      collect[]:
        - smart
    file_sd_configs:
      - refresh_interval: 10s
        files:
          - 'file_sd_configs/lightbox-exporter/*.yaml'

  - job_name: api-service
    scheme: https
    tls_config:
      insecure_skip_verify: True
    scrape_timeout: 10s
    scrape_interval: 15s
    metrics_path: /metrics
    honor_timestamps: True
    file_sd_configs:
      - refresh_interval: 10s
        files:
          - 'file_sd_configs/api/*.yaml'
...

And copy the other two associated rule configuration files to the Prometheus configuration file folder.
[root@localhost prometheus]# cp alert.rules.yaml record.rules.yaml /usr/local/prometheus/


Copy the Lightbits reference target files to the Prometheus configuration file folder, and update the IP address of the
Lightbits cluster.
[root@localhost ~]# tree monitoring-clustering/file_sd_configs
monitoring-clustering/file_sd_configs
├── api
│   └── targets.yaml
└── lightbox-exporter
    └── targets.yaml
[root@localhost ~]# cp monitoring-clustering/file_sd_configs /usr/local/prometheus/ -a

Update the prometheus.yml with the new location of the target files.
[root@localhost prometheus]# vim /usr/local/prometheus/prometheus.yml
...
files:
- 'file_sd_configs/lightbox-exporter/*.yaml'
...

Restart the Prometheus service.
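How you restart Prometheus depends on how it is run in your environment. For example (a sketch only, assuming Prometheus is managed by systemd under a unit named prometheus):

$ systemctl restart prometheus
$ systemctl status prometheus --no-pager   # confirm the service restarted with the new configuration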


Use the GUI to check whether Prometheus is working properly. Access the Prometheus GUI using the instructions above.
For example: http://localhost:9090/ or http://monitoring-server:9090/. Note that when using the installation above,
Grafana and Prometheus run on the same host.

Using Grafana and Prometheus


Using Grafana
Log in to Grafana using the access instructions in Configuring Grafana and Prometheus.


From the welcome dashboard, click Dashboards and then Manage.

Figure 5: Grafana Welcome Dashboard

Three dashboards will be visible:


• cluster_tab: stats and monitoring for the full cluster
• nodes_tab: stats and monitoring for each node
• performance_tab: performance stats and monitoring


Below is a screenshot of the cluster_tab dashboard. This is composed of multiple sections of graphs, statistics, and
tables.

Figure 6: cluster_tab Dashboard


Hovering over each section reveals an arrow with additional options.

Figure 7: cluster_tab Arrows


Click on each artifact’s arrow button. Clicking the View option will expand the window to full screen.

Figure 8: cluster_tab View

Using Prometheus
Log in to Prometheus using the access instructions in Configuring Grafana and Prometheus.
Prometheus can be used to query any of the time series metrics received from a Lightbits cluster. The metrics come in at
the cluster level and node level. This means that most metrics can be viewed for each node and also for the cluster as a
whole. Prometheus is also helpful in figuring out the full names of metrics, which then can be used for creating dashboards
in Grafana.
As an example, let’s look at the write bandwidth for the whole cluster. The values will be shown in their raw format, which we can assume is in bytes per second; if in doubt, the value can be compared with other known measurements, such as fio or iostat output.
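Prometheus expressions also support arithmetic, so the raw value can be converted on the fly. A minimal sketch that divides the metric used later in this example down to MiB/s:

instance:cluster:write_throughput / 1024 / 1024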


Step 1
Make sure Use Local Time and Enable Autocomplete are enabled. Local time will help in lining up the times to
your timezone, regardless of the server’s timezone. Autocomplete will help explore all of the different metrics.

Figure 9: Prometheus Autocomplete


Step 2
Start by writing “instance:cluster” into the expression field. As characters are entered, it will show available metrics in
the drop-down. As more characters are entered, the drop-down menu converges on specific metrics.

Figure 10: Prometheus Metrics


With Enable Autocomplete, as text is typed into the expression field, Prometheus will then show metrics that have
matching text as a drop-down.

Figure 11: Prometheus Enable Autocomplete


As you enter more text, the drop-down narrows to fewer, more specific metrics.

Figure 12: Prometheus Specific Metrics


Scroll to the bottom of the drop-down metric names:

Figure 13: Prometheus Metrics Drop-Down

Here we can see that we have “write_iops” and “write_throughput” as options. Since we want to know about write
bandwidth, the suitable metric would be “instance:cluster:write_throughput”.
Tip
One good way to know what to type into the Expression field is to study the drop-down. Another is to simply view all
of the available metrics.
To view all possible Prometheus metrics, curl, wget or open your browser to Prometheus Metrics.
The output will be large, but it will have all of the metrics. Here are example snippets of the output (searching for the
word “throughput”):

" instance : cluster : read_throughput "," instance : cluster : total_read_bytes "," instance : cluster
: total_reads "," instance : cluster : total_write_bytes "," instance : cluster : total_writes ","
instance : cluster : write_iops "," instance : cluster : write_throughput "…

" instance :node: read_throughput "," instance :node: receive_bytes_total "," instance :node:
receive_drop_total "," instance :node: receive_errs_total "," instance :node:
receive_packets_total "," instance :node: replication_write_iops_rx "," instance :node:
replication_write_iops_tx "," instance :node: replications_write_throughput_rx ","
instance :node: replications_write_throughput_tx "
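One way to pull this list from the command line is the Prometheus HTTP API, which returns all metric names as a JSON array. The example below assumes Prometheus is reachable on its default port 9090 and filters for the word "throughput":

$ curl -s http://localhost:9090/api/v1/label/__name__/values | tr ',' '\n' | grep throughput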


Step 3
Finish typing “instance:cluster:write_throughput” into the Expression field, or select it from the drop-down menu, and click Execute.

Figure 14: Prometheus Expression Field

Here we can see the raw value of the cluster write_throughput, expressed in bytes per second. The current write throughput is 98709485 bytes per second, or roughly 94 MiB/s (98709485 / 1048576 ≈ 94), which matches the fio job running in the background.
The following is the fio command that was launched from the same client.
root@rack02-server65 [client_0]:~# fio --direct=1 --rw=write --numjobs=8 --iodepth=1 --ioengine=posixaio --bs=4k --group_reporting=1 --filesize=1G --directory=/test/ --time_based=1 --runtime=3600s --name=test


The fio output also shows 93 MiB/s:

Figure 15: Prometheus fio Output


The following is the output of iostat -tmx 3, also showing 93 MiB/s:

Figure 16: Prometheus iostat Output


Step 4
Click Graph to view the graph output. The graph's duration, end time, and shading are adjustable with the buttons.
Here the graph shows the last hour's worth of data; however, any time period can be viewed by adjusting the values in the boxes.
Note that there was a period of no throughput when the fio job was temporarily cancelled.

Figure 17: Prometheus Cancelled fio Job


Step 5
In Prometheus, you can also:
• Create alerts (this can also be done in Grafana). A sample alert rule is shown after the figure below.
• Stack other metrics to compare. Click Add Panel and then follow the same steps above to add another expression. As an example, in the screenshot below, another panel was added at the bottom showing the write IOPS metric for the entire cluster, using the expression “instance:cluster:write_iops”.

Figure 18: Prometheus Add Panel
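As an illustration of the alerting option, the following hypothetical rule group could be appended under the existing groups: list in alert.rules.yaml, or placed in a separate file listed under rule_files (the top-level groups: key is shown for context). The metric name comes from this example; the threshold, duration, and labels are placeholders to adapt to your environment:

groups:
- name: lightos-example-alerts
  rules:
  - alert: LowClusterWriteThroughput
    expr: instance:cluster:write_throughput < 1000000
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Cluster write throughput has been below 1 MB/s for 5 minutes"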

Open TCP Ports and Verify


In CentOS 7, for example, TCP ports can be blocked either by the iptables service or by the firewalld service. The following is an example of how to use iptables to open a TCP port and then test it using the netcat utility.
1. Check if a port is blocked.
In this example, we can check if port 80 is accepting traffic by entering the iptables command with grep:
$ iptables -nL | grep 80

If the iptables command returns no data, the port needs to be opened.


2. To open TCP Port 80, enter the iptables command as follows:


$ iptables -I INPUT -p tcp --dport 80 -j ACCEPT

3. Re-enter the iptables -nL command to see if the port is now open.
$ iptables -nL | grep 80
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:80

4. Install the netcat utility.


$ yum install nc

5. Run netcat as a server listening on port 80.


$ nc -l -p 80

6. From another server, install the netcat utility.


$ yum install nc

7. Run netcat against the server that is listening on port 80 to verify that the port is accepting connections.
$ nc -z -v 192.168.16.7 80
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 192.168.16.7:80.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.

Open TCP Port Example


Open TCP Port 80:
# nc -z -v 192.168.40.41 80
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 192.168.40.41:80.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.

Closed TCP Port Example


Closed TCP Port 80:
# nc -z -v 192.168.40.31 80
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection refused.
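
If the host uses firewalld rather than direct iptables rules, an equivalent way to open and verify the same port is shown below (a sketch, assuming the firewalld service is running):

# firewall-cmd --permanent --add-port=80/tcp
# firewall-cmd --reload
# firewall-cmd --list-ports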


Installation Behind HTTP-Proxy


In order to install behind an HTTP proxy, you will need to update group_vars/all.yml with the following:
proxy_env:
  http_proxy: http://<proxy-host>
  https_proxy: https://<proxy-host>

These settings will be passed to all installation tasks that access the web to download RPMs and other binaries, so that they go through the proxy provided.

Note: You will need to ensure that the formatting is valid YAML.
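For example, with a hypothetical proxy host proxy.example.com listening on port 3128 (replace with your own proxy address and port):

proxy_env:
  http_proxy: http://proxy.example.com:3128
  https_proxy: https://proxy.example.com:3128

If PyYAML is available on the installation workstation, a quick way to confirm the file still parses as valid YAML is:

python3 -c 'import yaml, sys; yaml.safe_load(open(sys.argv[1]))' group_vars/all.yml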

Single-IP-Dual-NUMA Configuration
Starting from version 3.1.2, Lightbits supports a dual-NUMA configuration with a single data IP (network interface) per server.
A typical installation uses one data IP for instance ID 0 and another data IP for instance ID 1; with this feature, both instance IDs use the same IP. Therefore, instead of using two network interfaces in a dual-NUMA server, only one network interface is used for the data network on both NUMA nodes.
To allow a single IP to serve both NUMA nodes, duroslight and the replicator use different ports for each NUMA node.
Example
Configure this by appending the following lines into all.yml:
duroslight_ports:
  0: "4420"
  1: "4421"
replicator_ports:
  0: "22226"
  1: "22227"

The above settings allow duroslight and replication to run off the same IP, with a different port for each instance, so that two Lightbits instances can share a single data IP.


With this configuration, the host_vars file for each server specifies the same data_ip for both instances. For example:

==> ansible/inventories/cluster_example/host_vars/server00.yml <==
---
name: server00
nodes:
- instanceID: 0
  data_ip: 10.16.64.1
  ... skip ...
- instanceID: 1
  data_ip: 10.16.64.1
  ... skip ...

==> ansible/inventories/cluster_example/host_vars/server01.yml <==
---
name: server01
nodes:
- instanceID: 0
  data_ip: 10.16.64.2
  ... skip ...
- instanceID: 1
  data_ip: 10.16.64.2
  ... skip ...

==> ansible/inventories/cluster_example/host_vars/server02.yml <==
---
name: server02
nodes:
- instanceID: 0
  data_ip: 10.16.64.3
  ... skip ...
- instanceID: 1
  data_ip: 10.16.64.3
  ... skip ...

Adding a JWT Token To a Configuration File


When running lbcli commands, the JWT must be provided via the -J flag, like this: lbcli -J $LIGHTOS_JWT get cluster.
Alternatively, the system JWT can be configured in a configuration file on the Lightbits server, so that lbcli commands can be run from that server without the -J flag.

Note: Having the system JWT preconfigured introduces security concerns, because any lbcli command can then be run from that server. Therefore it is important to ensure that the server is secured.

1. After deploying the cluster, grab the system JWT. On the Ansible installation host, the file is located at ~/lightos-system-jwt. Show the content of the file with cat:
cat ~/lightos-system-jwt

The output should show the token, as below. Note that the token has been cut for brevity.
export LIGHTOS_JWT=eyJhbGciOi<remaining jwt content>BaFEuMsT9gQNQA


Copy the JWT token portion (everything after “LIGHTOS_JWT=”). Note that the long token will wrap across multiple lines in the terminal; however, it must occupy a single line in the file.
2. On a Lightbits server, edit /etc/lbcli/lbcli.yaml and append the jwt to the bottom.
jwt: <jwt>
The full content of /etc/lbcli/lbcli.yaml will be similar to this:
output-format: human-readable
dial-timeout: 5s
command-timeout: 60s
insecure-skip-tls-verify: true
debug: false
api-version: 2
insecure-transport: false
endpoint: https://127.0.0.1:443
jwt: eyJhbGciOi<remaining jwt content>BaFEuMsT9gQNQA
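
To confirm the configuration, run an lbcli command from that server without the -J flag; if the JWT was applied correctly, the command returns the cluster details instead of an authorization error. For example:

# lbcli get cluster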

Running lbcli From a Non-Lightbits Server


Lightbits supports running lbcli remotely against a cluster from a non-Lightbits server. The limitation is that it must be an x86_64 Linux server. Make sure that the remote server has IP connectivity to the management or data IP.
1. Copy the lbcli binary to /usr/bin/ on the remote server.
2. On the remote server, create the /etc/lbcli directory.
3. Create the /etc/lbcli/lbcli.yaml file, specifying the management or data IP and the jwt.
output-format: human-readable
dial-timeout: 5s
command-timeout: 60s
insecure-skip-tls-verify: true
debug: false
api-version: 2
insecure-transport: false
endpoint: https://10.10.10.100:443
jwt: eyJhbGciOi<remaining jwt content>BaFEuMsT9gQNQA

Note: You can only provide one endpoint.
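
Putting these steps together, the following is a minimal sketch for preparing the remote server; the source server name is a placeholder, and the final command is just a connectivity check:

# copy the binary from an existing Lightbits server and make it executable
scp root@<lightbits-server>:/usr/bin/lbcli /usr/bin/lbcli
chmod +x /usr/bin/lbcli
# create the configuration directory and file
mkdir -p /etc/lbcli
vi /etc/lbcli/lbcli.yaml    # add the contents shown above
# verify that lbcli can reach the cluster
lbcli get cluster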


About - Legal
Lightbits Labs (Lightbits) is leading the digital data center transformation by making high-performance elastic block
storage available to any cloud. Creators of the NVMe® over TCP (NVMe/TCP) protocol, Lightbits software-defined
storage is easy to deploy at scale and delivers performance equivalent to local flash to accelerate cloud-native applications
in bare metal, virtual, or containerized environments. Backed by leading enterprise investors including Cisco Investments,
Dell Technologies Capital, Intel Capital, JP Morgan Chase, Lenovo, and Micron, Lightbits is on a mission to make
high-performance elastic block storage simple, scalable and cost-efficient for any cloud.
www.lightbitslabs.com
[email protected]
US Offices
1830 The Alameda, San Jose, CA 95126
1412 Broadway 21st Floor, New York, NY 10018
Israel Office
17 Atir Yeda Street, Kfar Saba, Israel 4464313

The information in this document and any document referenced herein is provided for informational purposes only, is
provided as is and with all faults and cannot be understood as substituting for customized service and information that
might be developed by Lightbits Labs ltd for a particular user based upon that user’s particular environment. Reliance
upon this document and any document referenced herein is at the user’s own risk.
The software is provided “As is”, without warranty of any kind, express or implied, including but not limited to the
warranties of merchantability, fitness for a particular purpose and non-infringement. In no event shall the contributors or
copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise,
arising from, out of or in connection with the software or the use or other dealings with the software.
Unauthorized copying or distributing of included software files, via any medium is strictly prohibited.
COPYRIGHT (C) 2023 LIGHTBITS LABS LTD. - ALL RIGHTS RESERVED

