Aerospike 3
Configuration:
Basic Configuration
Young Paik
Director of Sales Engineering
young@aerospike.com
Aerospike (aer·o·spike) [air-oh-spahyk]
noun, 1. tip of a rocket that enhances speed and stability
Curriculum
This training module is an overview of the
Aerospike Database and covers the basic
configuration of a single Aerospike Database
Cluster.

This course has been split into 3 areas:
1. Overview
2. Database Service Configuration
3. Database Storage Configuration

© 2014 Aerospike. All rights reserved. Confidential

Aerospike
Basic Configuration
Overview
Overview
In order to understand how to configure
Aerospike, it is important to understand some of
the basic concepts and the terminology regarding
the database.

Agenda
• High Level Terminology
• Cluster Formation
• How Data Is Distributed
• What Happens When A Node Fails
• What Happens When A Node Is Added
• The Client
Terminology High Level
1. Clients are web/application servers that run your code (in blue), which uses the Aerospike SDK (in yellow) to make connections to the Aerospike database service.
2. The Aerospike database service is provided by a cluster. Your code does not need to be aware of the cluster's internal structure.
3. A cluster is made up of individual nodes (servers) that store data in a distributed manner. The cluster can store multiple copies of the data; the number of copies is the replication factor. For example, a primary copy plus 1 secondary copy means a replication factor of 2.

[Figure: clients connecting to a cluster made up of Node1, Node2, Node3, Node4, ... NodeN]
Cluster Formation
The basic way this operates is that each node
must send heartbeats that can be heard by other
nodes. When enough of the heartbeats from one
server have been missed by the others, it will be
removed from the cluster.
We will look into this in
more detail later.

[Figure: a cluster of four nodes, Node 1 to Node 4]
Distributing Data: The Partition Map
Distributing data can be done in many ways.
Aerospike has chosen a method that:
1. Automatically balances data across nodes.
2. Makes it easy to migrate (rebalance) should a
node crash or be added.
3. Allows for a single network connection from
any client to the cluster.
4. Does not require the developer to understand
how the data is distributed.
5. Takes into account replica copies of the data.
Partitioning Map
For simplicity, let’s take a 3 node cluster with only 9 partitions and a
replication factor of 2. Aerospike normally uses 4096 partitions.

No Sharding & No Hotspots
Data is distributed randomly, using hashing.
Example using 3 nodes in a cluster:
cookie-abcdefg-12345678  ->  182023kh15hh3kahdjsh
➤ Every key is hashed into a 20-byte (fixed-length) string using a hash function.
➤ This hash plus additional data (a fixed 64 bytes) is stored in DRAM in the index.
➤ 12 bits of this hash are used to compute the partition ID.
➤ There are 4096 partitions.
➤ The partition ID maps to the node ID.

Partition ID   Master node   Replica node
…              1             2
1820           2             3
1821           3             2
4096           2             1
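The hashing steps on this slide can be sketched as follows. This is only an illustration: SHA-1 is used as a stand-in because it also produces a 20-byte digest, and the choice of which 12 bits to take is an assumption. Neither is claimed to match what the server actually does.

```python
import hashlib

N_PARTITIONS = 4096  # 12 bits of the hash give 2**12 = 4096 partitions


def partition_id(key: str) -> int:
    """Map a key to a partition ID in the range 0..4095 (illustrative only)."""
    # Assumption: SHA-1 stands in for the real hash function because it
    # also yields a fixed 20-byte digest.
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    # Assumption: take the low 12 bits of the first two digest bytes.
    return int.from_bytes(digest[:2], "little") & (N_PARTITIONS - 1)


if __name__ == "__main__":
    pid = partition_id("cookie-abcdefg-12345678")
    print(f"partition id: {pid}")
```

Because the partition ID depends only on the key, every client computes the same partition for the same key without consulting a central directory.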
Losing A Node
Take a 3-node cluster with 12 partitions and a replication factor of 2. When
the cluster is stable, the partitions are evenly distributed across the nodes.
Partition Map
Partition   Master   Replica
A           N1       N2
B           N2       N3
C           N3       N1
D           N1       N3
E           N2       N1
F           N3       N2
G           N1       N2
H           N2       N3
I           N3       N1
J           N1       N3
K           N2       N1
L           N3       N2

Node   Master partitions   Replica partitions
N1     A, D, G, J          C, E, I, K
N2     B, E, H, K          A, F, G, L
N3     C, F, I, L          B, D, H, J
Losing A Node
So what happens if a node dies?

[Figure: partition map (Partition, Master, Replica) and per-node master/replica partitions for the 3-node cluster]
Losing A Node
Some of the partitions will only have a single copy.

[Figure: partition map (Partition, Master, Replica) and per-node master/replica partitions for the 3-node cluster]
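To make the "single copy" point concrete, here is a small sketch that finds the partitions left with one copy after a node fails. The map is the 12-partition example as best it can be read from the slides, so treat the exact assignments as an assumption; the function names are hypothetical.

```python
# Example partition map: partition -> (master, replica).
# Assumption: assignments reconstructed from the slide's 12-partition example.
PARTITION_MAP = {
    "A": ("N1", "N2"), "B": ("N2", "N3"), "C": ("N3", "N1"),
    "D": ("N1", "N3"), "E": ("N2", "N1"), "F": ("N3", "N2"),
    "G": ("N1", "N2"), "H": ("N2", "N3"), "I": ("N3", "N1"),
    "J": ("N1", "N3"), "K": ("N2", "N1"), "L": ("N3", "N2"),
}


def surviving_copies(partition_map, failed_node):
    """Return {partition: [nodes still holding a copy]} after a node fails."""
    return {
        pid: [node for node in holders if node != failed_node]
        for pid, holders in partition_map.items()
    }


def single_copy_partitions(partition_map, failed_node):
    """List the partitions left with exactly one copy."""
    copies = surviving_copies(partition_map, failed_node)
    return sorted(pid for pid, nodes in copies.items() if len(nodes) == 1)


if __name__ == "__main__":
    print(single_copy_partitions(PARTITION_MAP, "N3"))
```

With a replication factor of 2, no partition loses both copies when a single node fails, which is why the cluster can rebuild the missing copies from the survivors.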
Losing A Node
So the cluster will exclude the missing node and create a new partition map.

[Figure: new partition map in which only N1 and N2 appear as master and replica for partitions A to L, with the per-node master/replica partitions]
Losing A Node
It will then begin to make copies of all the data, one partition at a time.

[Figure: partition map for N1 and N2, with partitions being copied between the two nodes one at a time]
Losing A Node
Once it has completed all the partitions, the cluster will be in a stable state
again, with 2 full copies of all data.
[Figure: final partition map for N1 and N2; each node holds a master or replica copy of every partition A to L]
Adding A Node
Now let’s start with the same situation, but add a node this time. The same
starting state: 12 partitions, 3 nodes, replication factor of 2.
[Figure: starting partition map (Partition, Master, Replica) and per-node master/replica partitions for N1 to N3]
Adding A Node
When the new node is added, it starts empty.

[Figure: partition map and per-node master/replica partitions for N1 to N4; the new node N4 holds no partitions]
Adding A Node
The cluster creates a new partition map, with the new node included.

Partition Map
Partition   Master   Replica
A           N1       N2
B           N2       N4
C           N3       N1
D           N4       N3
E           N2       N1
F           N3       N2
G           N1       N4
H           N4       N3
I           N3       N1
J           N1       N3
K           N2       N4
L           N4       N2

[Figure: per-node master/replica partitions for N1 to N4; N4 is still empty at this point]
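Comparing the old and new partition maps shows which copies must move. The sketch below uses the 12-partition example as best it can be read from the slides (an assumption), and the function name is hypothetical; it is not a description of how the server plans migrations.

```python
# Assumption: both maps are reconstructed from the slides' example.
OLD_MAP = {
    "A": ("N1", "N2"), "B": ("N2", "N3"), "C": ("N3", "N1"),
    "D": ("N1", "N3"), "E": ("N2", "N1"), "F": ("N3", "N2"),
    "G": ("N1", "N2"), "H": ("N2", "N3"), "I": ("N3", "N1"),
    "J": ("N1", "N3"), "K": ("N2", "N1"), "L": ("N3", "N2"),
}
NEW_MAP = {
    "A": ("N1", "N2"), "B": ("N2", "N4"), "C": ("N3", "N1"),
    "D": ("N4", "N3"), "E": ("N2", "N1"), "F": ("N3", "N2"),
    "G": ("N1", "N4"), "H": ("N4", "N3"), "I": ("N3", "N1"),
    "J": ("N1", "N3"), "K": ("N2", "N4"), "L": ("N4", "N2"),
}


def copies_to_migrate(old_map, new_map):
    """Return sorted (partition, destination) pairs for copies a node must receive."""
    moves = []
    for pid, new_holders in new_map.items():
        old_holders = old_map.get(pid, ())
        for node in new_holders:
            if node not in old_holders:
                moves.append((pid, node))
    return sorted(moves)


if __name__ == "__main__":
    for pid, node in copies_to_migrate(OLD_MAP, NEW_MAP):
        print(f"partition {pid} -> {node}")
```

In this example only half of the partitions are touched, and every move targets the new node; the other partitions keep their existing master and replica.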
Adding A Node
The cluster will then migrate (rebalance) the partitions to the new node, one
at a time. During this time, the partition map may be out of sync with the
actual data distribution; a node that receives a request for data it does not
hold will proxy the request.
[Figure: new partition map and per-node master/replica partitions for N1 to N4, with partitions migrating to N4 one at a time]
Adding A Node
Once all the partitions have migrated, the database will be in a new stable
state, with replicated copies of all data again.
[Figure: final partition map and per-node master/replica partitions, evenly spread across N1 to N4]
The Client – Who Does What
Developers do not need to handle any of this themselves. The
Aerospike SDK will automatically handle any rerouting.

Your code:
• Operate (read/write/update) on
a key

Aerospike SDK:
• Continually maintain partition
map
• Hash key, determine master
node
• Communicate with master node
• Optionally communicate with
the replica if the master does
not respond
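The division of labor above can be sketched as follows. Everything here is hypothetical: `route`, `send`, and the map layout are illustrative names, SHA-1 is only a stand-in for the real hash, and a real SDK also keeps the partition map up to date in the background.

```python
import hashlib

N_PARTITIONS = 4096


def partition_id(key: str) -> int:
    # Assumption: SHA-1 is a stand-in 20-byte hash; the SDK's hash may differ.
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "little") & (N_PARTITIONS - 1)


def route(key, partition_map, send):
    """Send a request to the master for the key's partition, then the replica.

    partition_map: {partition_id: (master_node, replica_node)}
    send: callable(node, key) that raises ConnectionError if the node is down.
    """
    master, replica = partition_map[partition_id(key)]
    try:
        return send(master, key)
    except ConnectionError:
        # Optional fallback from the slide: ask the replica instead.
        return send(replica, key)
```

Application code would only ever call something like `route` (or the SDK's read/write methods); the hashing and node selection stay inside the client library.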

The Client – Supported Languages
Language     Aerospike 2.x   Aerospike 3.x
Java         ✔               ✔
C            ✔               ✔
C#           ✔               ✔
C libevent   ✔               *
Erlang       ✔               *
PHP          ✔               *
Python       ✔               *

Aerospike 3 supports:
➤ User Defined Functions
➤ Secondary Indexes
➤ Aggregation queries
* Aerospike 2 clients will support Aerospike 2 features in Aerospike 3.
Agenda
• High Level Terminology
• Cluster Formation
• How Data Is Distributed
• What Happens When A Node Fails
• What Happens When A Node Is Added
• The Client
Aerospike
Basic Configuration
Database Service
Special Note
These training slides move from topic to topic. While
this generally corresponds to a location (stanza) in
the configuration file, this is not always true.
Parameters that are most commonly problematic are
denoted in RED. Pay special attention to these, since
the ramifications of improperly setting these
variables may take months to show up or be difficult
to fix once set.

Prerequisites
In order to properly configure the database, it is
important to have information on the following:
• Programming language(s) used by clients.
• Network configuration (will you be using unicast or multicast?).
• The kind of storage you will be using (RAM, SSD).
• Storage volume requirements.
• Hardware you will be using.
Aerospike Configuration
Administrators must configure Aerospike in many
different areas:
• Server process
• Logging
• Network
• UDF configuration
• Data storage (covered in Part 2 of Webinar Series)
• Cross Datacenter Replication (XDR, not covered)

Many of the settings in the default configuration file will work on most
servers, but the defaults are usually not optimal and may result in poor
performance. This training module covers only the most important variables;
many more configuration options are covered in other modules.
Aerospike Configuration File Notes
The main Aerospike configuration file contains all
the configuration variables for a node.
• Located at /etc/aerospike/aerospike.conf on each node.
• NOT centrally managed by Aerospike.
• Most variables can be changed dynamically while the Aerospike node is up.
• If you wish for changes to be persistent, you must edit the configuration file manually.
• You may choose to use shorthand (K, M, G) to represent large numbers. For example, 4 gigabytes can be represented as 4G, which is mathematically 4*1024*1024*1024.
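The shorthand arithmetic can be checked with a short sketch. The helper name `parse_size` is hypothetical, and this is not the server's parser; it only mirrors the K/M/G rule stated above using powers of 1024.

```python
# Multipliers for the K, M, G shorthand (powers of 1024, as on the slide).
_MULTIPLIERS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(value: str) -> int:
    """Convert a value such as '4G' or '15000' to a plain integer."""
    text = value.strip()
    if text and text[-1] in _MULTIPLIERS:
        return int(text[:-1]) * _MULTIPLIERS[text[-1]]
    return int(text)


if __name__ == "__main__":
    print(parse_size("4G"))  # 4 * 1024 * 1024 * 1024
```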

Configuration File
There are 7 major stanzas in an Aerospike
configuration file.
• service (required)
• logging (required)
• network (required)
• mod-lua (required for 3.x)
• cluster (optional)
• namespace (at least 1 required)
• xdr (optional)

These will look like this:
service {
    ...
}
Aerospike
Configuration
Server Process
Server Process
This section covers the behavior of the high level
database process.
Topics covered:
• Linux user/group running the process
• Whether or not to run as a daemon
• Single replica limit
• Location of the PID (Process ID) file
• Transaction settings for storage
Linux User/Group
Description

Controls the Linux username/group that runs the Aerospike
database.

Stanza location

service

Config parameters
(defaults)

user (root)
group (root)

Notes

If you set the username/group to a non-root user, you must
make sure that the following are writable by the user/group
you select:
- the log file (/var/log/aerospike/aerospike.log by
default)
- the persistence file (if using RAM + disk for persistence)
- any Flash/SSD devices you are using
- the PID file

Change dynamically

No

Best practices

Most customers run the daemon as root.
You must be careful if you are changing users on an already
running database. The major issue is permissions to files/SSDs.
Be sure to test thoroughly when doing so.

Run as a Daemon
Description

Whether or not the database process will run as a daemon.

Stanza location

service

Config parameters
(defaults)

run-as-daemon

Notes

You MUST remove the parameter completely (or comment it
out) to set as false. Even setting it as “false” will make the
node start up as a daemon.

Change dynamically

No

Best practices

This option is normally used because the node is having issues
starting up. By not running as a daemon, you can see messages
from the console directly. Once the service starts properly,
switch back to running as a daemon.

Single Replica Limit
Description

Sets the cluster size at or below which the cluster will no longer maintain a
replica of the data. This is a safety measure that lets administrators choose
between maintaining replicas and keeping only a single copy when the cluster
shrinks.

Stanza location

service

Config parameters
(defaults)

paxos-single-replica-limit (1)

Notes

If the cluster size is less than or equal to this value, keep only a
single copy of all data in the cluster.

Change dynamically

No

Best practices

There is no single best practice. This depends on what the
administrator believes is the best choice. If you believe that
evicting data and poorer performance is acceptable, set this at
a level consistent with what you believe is a worst (but
possible) case of node loss. If you would prefer to maintain
performance, but are willing to live with possible loss of data,
keep this at 1.
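The rule in the Notes row can be written as a one-line check. This is a sketch of the stated rule only; the function name is hypothetical, and it does not model anything else the server may take into account.

```python
def effective_replication_factor(cluster_size: int,
                                 configured_rf: int,
                                 single_replica_limit: int = 1) -> int:
    """Number of copies kept, per the paxos-single-replica-limit rule."""
    # If the cluster has shrunk to the limit or below, keep a single copy.
    if cluster_size <= single_replica_limit:
        return 1
    return configured_rf


if __name__ == "__main__":
    # A 3-node cluster with replication factor 2 and the default limit of 1.
    print(effective_replication_factor(3, 2))
```

For example, with a limit of 2, a cluster that drops from 3 nodes to 2 would stop keeping replicas, while with the default of 1 it would continue to keep two copies.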

Location of the PID File
Description

Location of the PID (process identifier) file. This simply stores
the PID of the Aerospike database process (asd for version 3.x).

Stanza location

service

Config parameters
(defaults)

pidfile (none)

Notes

File location set to this value. Note that this must be writable
by the Linux user running the process.

Change dynamically

No

Best practices

The file is normally stored in /var/run/asd.pid

Transaction Settings for Storage
Description

Sets configuration for how queues and threads read from
storage

Stanza location

service

Config parameters
(defaults)

transaction-queues (4)
transaction-threads-per-queue (4)

Notes

Changes to the behavior vary greatly. We strongly recommend
sticking to the settings in the “Best practices” section below.

Change dynamically

No

Best practices

You should set both to “4” if using only RAM or RAM +
persistence namespaces. Set both to “8” if using any Flash/SSD
namespaces.

Server Process Example Config
Here is an example of the server process configuration for a standard
production environment with an SSD cluster.
service {
    user root
    group root
    run-as-daemon
    paxos-single-replica-limit 1
    pidfile /var/run/asd.pid
    transaction-queues 8
    transaction-threads-per-queue 8
    ...
}
Aerospike
Configuration
The Network
The Network
Networking is crucial to the function of any
distributed system.
Topics covered:
• File descriptor limit (connection limit)
• The main database service
• Cluster formation (heartbeats)
• The fabric (inter-node communication)
• Direct telnet access
Maximum Number of File Descriptors
Description

This is the maximum number of Linux file descriptors that the
server will be able to open. This is not just the number of
open files, but also the maximum number of connections.

Stanza location

service

Config parameters
(defaults)

proto-fd-max (15000)
proto-fd-idle-ms (600000)

Note

There is also a maximum value that is set by the operating
system. The Aerospike installer normally sets the OS maximum
at 100,000. The proto-fd-max variable is limited by this
number.
proto-fd-idle-ms sets how long, in milliseconds, a connection may sit idle
before the server closes it.

Change dynamically

Yes

Best practices

For production use, this should be set at 15,000. It may be set
as low as 1,000 for development work. With certain client languages,
it should sometimes be set much higher, such as 30,000.
proto-fd-idle-ms should normally be set when you will be using a
client with many short-lived connections, such as PHP; in that case,
set it to 10,000. When it is not set with these languages,
performance will suffer.
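Since proto-fd-max is capped by the operating system limit, it can help to compare the two before editing the file. The sketch below is an assumption-laden illustration: the function name is hypothetical, and `resource.getrlimit` reports the limit of the Python process running the check, which may differ from the limit applied to the Aerospike daemon.

```python
import resource


def fits_os_limit(proto_fd_max, os_limit=None):
    """Return True if proto_fd_max does not exceed the OS file descriptor limit."""
    if os_limit is None:
        # Soft limit for the current process (not necessarily the daemon's).
        os_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if os_limit == resource.RLIM_INFINITY:
        return True
    return proto_fd_max <= os_limit


if __name__ == "__main__":
    print(fits_os_limit(15000))
```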

Main Database Service
Description

This is the configuration for the main database service. This is
the port that applications will use to connect to this node.

Stanza location

network:service

Config parameters
(defaults)

address
access-address
port
reuse-address

Notes

address: this is the IP address that the service will listen on. You may also
specify “any”.
access-address: for servers with multiple IP addresses, this is the one the
node will share with the other nodes to use. This should match the address
that the client applications will use.
port: cannot be blank; the standard value is 3000.
reuse-address: sets whether or not to reuse the address when the
service comes back up. No value is required, but it can be set to true or false.

Change dynamically

No

Best practices

Normally, you will want to set the following:
address any
access-address [IP address used by applications]
port 3000
reuse-address true
It is important that every node (even the first) point to some
other node that will be in the cluster. This allows you to restart
the first server as well.
Cluster Formation
There are 2 different ways that a cluster can
form. One is to use multicast connections, the
other is to use mesh (or unicast).
The basic way this operates is that each node
must send heartbeats that can be heard by other
nodes. When enough of the heartbeats from one
server have been missed by the others, it will be
removed from the cluster.
You must choose one and only one mode for each
cluster.
Cluster Formation
Heartbeat - Multicast

• When starting a multicast cluster,
you start with isolated nodes (4 in
this example).
• Each node will send a heartbeat to a
multicast IP address, so all the
nodes will know of each other.
• The cluster will form with the list of
nodes. This map is also stored in
each client, so they will know where
to go for any given record. One of
the nodes will create the partition
map and will distribute it to the rest
of the nodes in the cluster.

[Figure: a cluster of Node 1 to Node 4, each sending heartbeats to a shared multicast IP address]
Cluster Formation - Multicast
Description

This section controls how the cluster will be formed from
individual nodes.

Stanza location

network:heartbeat

Config parameters
(defaults)

mode multicast
address
port
interval (150)
timeout (10)

Notes

Mode must be multicast to use this mechanism.
There is no default port, but 9918 is standard.
interval is in milliseconds.
timeout is the number of missed heartbeats before the node is
declared dead.

Change dynamically

interval – yes
timeout – yes
others – no

Best practices

For most production uses, use an interval of “150” and a
timeout of “15”. For cloud environments, use “250” and “25”.
However, note that most cloud environments like Amazon EC2
do not allow multicast.
See the “Regarding Multicast” notes for more on multicast.
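Because interval is in milliseconds and timeout counts missed heartbeats, their product gives a rough idea of how long a silent node goes unnoticed. The helper below is a hypothetical back-of-the-envelope calculation, not a description of the server's exact timing.

```python
def detection_time_ms(interval_ms: int, timeout: int) -> int:
    """Approximate time before a silent node is declared dead."""
    return interval_ms * timeout


if __name__ == "__main__":
    for label, interval, timeout in [
        ("defaults", 150, 10),
        ("production", 150, 15),
        ("cloud", 250, 25),
    ]:
        print(f"{label}: ~{detection_time_ms(interval, timeout)} ms")
```

Under this estimate, the recommended production settings tolerate roughly 2.25 seconds of silence and the cloud settings roughly 6.25 seconds, trading slower failure detection for fewer false removals.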

Regarding Multicast
Even in environments where multicast is possible,
there is often some configuration work on the
network devices, such as the switches.
If you find that multicast works for 3-5 minutes but then stops, chances are
you must do one of the following on the switch for the VLAN containing the
nodes:
1. Turn off IGMP snooping
OR
2. Turn on IGMP snooping, and also enable the querier (a.k.a. multicast routing)
Cluster Formation
Heartbeat – Mesh (unicast)
• In the event that multicast is not possible, you can elect to use the
mesh. This uses standard unicast. In this case you will need to
bring up a single node first.
• As you bring up additional nodes, each one will be configured to
communicate with a node that is already a part of the cluster
(usually the first one) and share heartbeats with it.

[Figure: Node 1 to Node 4 joining the mesh one at a time]
Cluster Formation – Mesh (Unicast)
Description

This section controls how the cluster will be formed from
individual nodes.

Stanza location

network:heartbeat

Config parameters
(defaults)

mode mesh
port
mesh-address
mesh-port
interval (150)
timeout (10)

Notes

Mode must be mesh to use this mechanism.
port is the heartbeat port used by this node; the standard port is 3002.
mesh-address and mesh-port are the IP address and port of another node
in the cluster.
interval and timeout are as in multicast.

Change dynamically

interval – yes
timeout – yes
others – no

Best practices

Aerospike has found that this mechanism works in production
with up to 20 nodes.
For most production uses, use an interval of “150” and a
timeout of “15”. For cloud environments, use “250” and “25”.
Note that most cloud environments like Amazon EC2 do not
allow multicast.
Fabric
Description

The fabric controls intra-cluster communication between
nodes.

Stanza location

network:fabric

Config parameters
(defaults)

address
port

Notes

The address should be the IP address that the fabric should
respond on (you may also use “any”)
The port is required and normally set to 3001

Change dynamically

No

Best practices

It is possible to configure the fabric to communicate on a
different network device from the service.

Direct Telnet Access
Description

Aerospike offers a direct telnet connection into the server to
administrate the node when you are having difficulty
communicating through the normal service port (default 3000)

Stanza location

network:info

Config parameters
(defaults)

address
port

Notes

The address should be the IP address that the info service
should respond on (you may also use “any”)
The port is required and normally set to 3003

Change dynamically

No

Best practices

You can use a standard telnet command to the appropriate IP
address and port to issue various commands for debugging.
Please see the Aerospike documentation on how to issue
commands through this interface:
https://docs.aerospike.com/display/AS2/Using+telnet+when+the+Service+Port+is+Busy
Network Example Config (1 of 3)
For the connection settings, both configuration variables default to good
values and can even be left unset in the file. You should only set them if:
• Your node is in a test environment on low-end hardware: set proto-fd-max to 1000.
• Your clients have short-lived connections (such as with PHP): you may want to apply the following:
    proto-fd-max 100000
    proto-fd-idle-ms 10000

service {
    ...
    proto-fd-max 15000
    proto-fd-idle-ms 600000
    ...
}
Network Example Config (2 of 3)
If using multicast for heartbeats on IP address 239.1.99.222, and if you wish for your clients to access this node on the IP address
10.100.1.215, your config file may look like this:
network {
    service {
        address any
        port 3000
        # If this server has multiple IP addresses, answer on this one (access-address)
        access-address 10.100.1.215
        reuse-address
    }
    heartbeat {
        mode multicast
        # This address is the multicast IP address used by all the servers in the cluster
        address 239.1.99.222
        port 9918
        interval 150
        timeout 10
    }
    fabric {
        port 3001
    }
    info {
        port 3003
    }
}
Network Example Config (3 of 3)
If using mesh (unicast) for heartbeats, and if you wish for your clients to access this node on the IP address 10.100.1.215, your config file may look like this:
network {
    service {
        address any
        port 3000
        # If this server has multiple IP addresses, answer on this one (access-address)
        access-address 10.100.1.215
        reuse-address
    }
    heartbeat {
        mode mesh
        port 3002
        # The mesh address is the IP address of another node in the cluster
        mesh-address 10.100.1.214
        mesh-port 3002
        interval 150
        timeout 10
    }
    fabric {
        port 3001
    }
    info {
        port 3003
    }
}
Aerospike
Configuration
Logging
Logging
By default, Aerospike logs all messages in the
main log file.
Topics covered:
• Location of logs
• Log level (changing what is logged)
Log File
Description

This is the location of the actual log file itself.

Stanza location

logging:file

Config parameters
(defaults)

file

Notes

Aerospike normally puts the logs in
/var/log/aerospike/aerospike.log

Change dynamically

No

Best practices

The log file must be writable by the user running the node
process.
For 3.x, you should use /var/log/aerospike/aerospike.log
The log file does not automatically rotate. Instructions for
rotating through the logs can be found at:
https://docs.aerospike.com/display/V3/Logging
Log Level
Description

Sets the log level for different messages.

Stanza location

logging:file

Config parameters
(defaults)

context

Notes

There are different contexts and levels. You can specify
different levels for different contexts.
Contexts: any, batch, info, query, rw, scan, udf
Levels: critical, warning, info, debug, detail

Change dynamically

Yes

Best practices

Set “any” to “info”. Only change to a deeper level when
debugging an issue. Make sure to change back afterwards, in
order to avoid unnecessary logging.

Aerospike
Configuration
UDF Configuration
UDF Configuration
Aerospike has the ability to perform functions on
the server. These are done through functions
called UDFs, which are stored on each node in the
cluster. This feature is only available in Aerospike
3.x.
Topics covered:
• Location of system UDFs (provided by Aerospike)
• Location of user UDFs
• Whether or not to use a cache
System UDF Directory
Description

This is the location where the system will store UDFs

Stanza location

mod-lua

Config parameters
(defaults)

system-path (/opt/aerospike/sys/udf/lua)

Notes

System UDFs are loaded and cached only when the server
starts.

Change dynamically

No

Best practices

There should be no reason for administrators to change the
contents of this directory directly.

User UDF Directory
Description

This is the location where the server will store user created
UDFs.

Stanza location

mod-lua

Config parameters
(defaults)

user-path (/opt/aerospike/usr/udf/lua)

Notes

The contents of this directory should be maintained by the
server. Users should never have to alter the contents manually,
but rather through the Aerospike interfaces.

Change dynamically

No

Best practices

Do not make manual changes to the contents of this directory.

User Cache Setting
Description

This determines whether the server should cache UDFs or load
them at runtime.

Stanza location

mod-lua

Config parameters
(defaults)

cache-enabled (true)

Notes
Change dynamically

Best practices

This should be set to “true” for production use. This will yield
the best performance.
Use “false” to help in debugging issues with UDFs.

Aerospike Configuration
Configuration of:
• Server process
• Logging
• Network
• UDF configuration
• Data storage (covered in Part 2 of Webinar Series)
Thank You
Send all questions/comments/complaints to
YOUNG PAIK
YOUNG@AEROSPIKE.COM

More Related Content

Configuring Aerospike - Part 1

  • 1. Aerospike 3 Configuration: Basic Configuration Young Paik Director of Sales Engineering [email protected] Aerospike aer . o . spike [air-oh- spahyk] noun, 1. tip of a rocket that enhances speed and stability
  • 2. Curriculum This training module is an overview of the Aerospike Database and covers the basic configuration of a single Aerospike Database Cluster. This course has been split into 3 areas: 1. Overview 2. Database Service Configuration 3. Database Storage Configuration © 2014 Aerospike. All rights reserved. Confidential Pg. 2
  • 4. Overview In order to understand how to configure Aerospike, it is important to understand some of the basic concepts and the terminology regarding the database. © 2014 Aerospike. All rights reserved. Confidential Pg. 4
  • 5. Agenda High Level Terminology  Cluster formation  How Data Is Distributed  What Happens When A Node Fails  What Happens When A Node Is Added  The Client  © 2014 Aerospike. All rights reserved. Confidential Pg. 5
  • 6. Terminology High Level 2. The Aerospike database service is provided by a cluster. Your code does not need to be aware of the internal structure. Cluster Node1 Node2 Node4 1. Clients are Web/application servers that have your code (in blue) that uses the Aerospike SDK (in yellow) to make connections to the Aerospike database service. Node3 NodeN 3. A cluster is made up of individual nodes (servers) that store data in a distributed manner. These nodes can store multiple copies of the data, which is the replication factor. For example, Primary + 1 Secondary copy means a replication factor of 2. © 2014 Aerospike. All rights reserved. Confidential Pg. 6
  • 7. Cluster Formation The basic way this operates is that each node must send heartbeats that can be heard by other nodes. When enough of the heartbeats from one server have been missed by the others, it will be removed from the cluster. We will look into this in more detail later. Node 2 Node 3 © 2014 Aerospike. All rights reserved. Confidential Node 1 Node 4 Pg. 7
  • 8. Distributing Data: The Partition Map. Distributing data can be done in many ways. Aerospike has chosen a method that: 1. Automatically balances data across nodes. 2. Makes it easy to migrate (rebalance) should a node crash or be added. 3. Allows for a single network connection from any client to the cluster. 4. Does not require the developer to understand how the data is distributed. 5. Takes into account replica copies of the data.
  • 9. Partitioning Map: For simplicity, let’s take a 3-node cluster with only 9 partitions and a replication factor of 2. Aerospike normally uses 4096 partitions.
  • 10. No Sharding & No Hotspots: Data is distributed randomly, using hashing. Example using 3 nodes in a cluster: the key cookie-abcdefg-12345678 is hashed to 182023kh15hh3kahdjsh. ➤ Every key is hashed into a 20-byte (fixed length) string using a hash function. ➤ This hash plus additional data (a fixed 64 bytes) is stored in DRAM in the index. ➤ 12 bits of this hash are used to compute the partition ID. ➤ There are 4096 partitions. ➤ The partition ID maps to the node ID. Partition map excerpt (Partition ID: master node, replica node): …: 1, 2; 1820: 2, 3; 1821: 3, 2; 4096: 2, 1.
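
The key-to-node lookup described on this slide can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the slide does not name the hash function or say which 12 bits are used, so SHA-1 (which happens to produce 20 bytes) and the low 12 bits of the first two bytes are stand-ins, and the partition map is a made-up example.

```python
import hashlib

N_PARTITIONS = 4096  # from the slide; 2**12, so 12 bits select a partition


def key_digest(key: str) -> bytes:
    """Hash a key into a fixed-length 20-byte digest.

    Assumption: SHA-1 is a stand-in chosen because it yields 20 bytes;
    the slide does not name the hash function.
    """
    return hashlib.sha1(key.encode("utf-8")).digest()


def partition_id(digest: bytes) -> int:
    """Use 12 bits of the digest as the partition ID (0..4095).

    Assumption: which 12 bits are used is not stated on the slide.
    """
    return int.from_bytes(digest[:2], "little") & (N_PARTITIONS - 1)


# Hypothetical partition map for 3 nodes: partition ID -> (master, replica).
partition_map = {
    pid: (pid % 3 + 1, (pid + 1) % 3 + 1) for pid in range(N_PARTITIONS)
}

digest = key_digest("cookie-abcdefg-12345678")
pid = partition_id(digest)
master, replica = partition_map[pid]
print(f"partition {pid}: master node {master}, replica node {replica}")
```

Because the partition ID depends only on the key's hash, any client holding the same partition map resolves a key to the same master node without asking the cluster first.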
  • 11. Losing A Node: Take a 3-node cluster with 12 partitions and a replication factor of 2. When everything is stable, everything will be evenly distributed. [Figure: partition map (partition, master, replica) for partitions A to L, and the master and replica partitions held by nodes N1, N2, and N3.]
  • 12. Losing A Node: So what happens if a node dies? [Figure: the same partition map and node contents.]
  • 13. Losing A Node: Some of the partitions will only have a single copy. [Figure: the same partition map and node contents.]
  • 14. Losing A Node: So the cluster will exclude the missing node and create a new partition map. [Figure: new partition map using only N1 and N2, with the contents of N1 and N2.]
  • 15. Losing A Node: It will then begin to make copies of all the data, one partition at a time. [Figure: partition map using only N1 and N2, with the first partition being copied.]
  • 16. Losing A Node: Once it has completed all the partitions, the cluster will be in a stable state again, with 2 full copies of all data. [Figure: partition map using only N1 and N2, with the final contents of N1 and N2.]
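
The effect of excluding a dead node from the partition map can be sketched as follows. The slides do not describe how Aerospike actually assigns partitions to nodes, so this sketch substitutes a simple hash-based ordering of nodes per partition; `node_order` and `build_partition_map` are hypothetical helpers, not Aerospike functions.

```python
import hashlib


def node_order(partition, nodes):
    """Return a deterministic, partition-specific ordering of the nodes."""
    def score(node):
        return hashlib.sha1(f"{partition}:{node}".encode()).digest()
    return sorted(nodes, key=score)


def build_partition_map(partitions, nodes, replication_factor=2):
    """Map each partition to its first `replication_factor` nodes.

    The first node in each list plays the master role, the rest are replicas.
    """
    copies = min(replication_factor, len(nodes))
    return {p: node_order(p, nodes)[:copies] for p in partitions}


partitions = list("ABCDEFGHIJKL")  # 12 partitions, as on the slides
before = build_partition_map(partitions, ["N1", "N2", "N3"])
after = build_partition_map(partitions, ["N1", "N2"])  # N3 has died

# Partitions that lost a copy with N3 and must be copied to a survivor.
to_copy = [p for p in partitions if "N3" in before[p]]
print("partitions needing a new copy:", to_copy)
```

In this sketch, partitions that never lived on N3 keep exactly the same master and replica, so only the partitions listed in `to_copy` generate migration traffic.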
  • 17. Adding A Node: Now let’s start with the same situation, but add a node this time. The same starting state: 12 partitions, 3 nodes, replication factor of 2. [Figure: partition map for partitions A to L, and the master and replica partitions held by nodes N1, N2, and N3.]
  • 18. Adding A Node: When the new node is added, it starts empty. [Figure: partition map and the contents of N1, N2, N3, and N4; N4 is empty.]
  • 19. Adding A Node: The cluster creates a new partition map, with the new node included. [Figure: new partition map that includes N4.]
  • 20. Adding A Node: The cluster will then migrate (rebalance) the partitions, one at a time, to the new node. During this time it is possible for the partition map to be out of sync with the actual data distribution. In that case, Aerospike nodes will proxy the request. [Figure: new partition map, with the first partition being migrated to N4.]
  • 21. Adding A Node: Once all the partitions have migrated, the database will be in a new stable state, with replicated copies of all data again. [Figure: new partition map and the final contents of N1, N2, N3, and N4.]
  • 22. The Client – Who Does What: Developers don’t have to think about all that happens with this. The Aerospike SDK will automatically handle any rerouting. Your code: • Operate (read/write/update) on a key. Aerospike SDK: • Continually maintain partition map • Hash key, determine master node • Communicate with master node • Optionally communicate with the replica if the master does not respond
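
The division of labor on this slide can be illustrated with a toy client. This is a sketch under assumptions, not the Aerospike SDK: nodes are simulated as plain Python callables, `NodeUnavailable` and `read` are hypothetical names, and SHA-1 stands in for the real hash.

```python
import hashlib


class NodeUnavailable(Exception):
    """Raised by a (simulated) node that does not respond."""


def partition_of(key, n_partitions=4096):
    # Stand-in hash (SHA-1); the SDK's real hashing is not shown on the slide.
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "little") % n_partitions


def read(key, partition_map, nodes):
    """Try the master for the key's partition, then fall back to the replica."""
    master, replica = partition_map[partition_of(key)]
    try:
        return nodes[master](key)
    except NodeUnavailable:
        return nodes[replica](key)


def healthy(name):
    return lambda key: f"{name} served {key}"


def down(key):
    raise NodeUnavailable()


# Hypothetical 2-node map: even partitions are mastered on N1, odd ones on N2.
partition_map = {
    p: ("N1", "N2") if p % 2 == 0 else ("N2", "N1") for p in range(4096)
}
nodes = {"N1": down, "N2": healthy("N2")}  # simulate N1 not responding

print(read("cookie-abcdefg-12345678", partition_map, nodes))
```

Application code only calls `read` with a key; the hashing, the map lookup, and the fallback to the replica all happen inside the client layer.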
  • 23. The Client – Supported Languages: Language (Aerospike 2.x / Aerospike 3.x): Java (✔ / ✔); C (✔ / ✔); C# (✔ / ✔); C libevent (✔ / *); Erlang (✔ / *); PHP (✔ / *); Python (✔ / *). Aerospike 3 supports: ➤ User Defined Functions ➤ Secondary Indexes ➤ Aggregation queries. * Aerospike 2 clients will support Aerospike 2 features in Aerospike 3.
  • 24. Agenda: • High Level Terminology • Cluster Formation • How Data Is Distributed • What Happens When A Node Fails • What Happens When A Node Is Added • The Client
  • 26. Special Note: These training slides move from topic to topic. While each topic generally corresponds to a location (stanza) in the configuration file, this is not always true. Parameters that are most commonly problematic are denoted in RED. Pay special attention to these, since the ramifications of setting them improperly may take months to show up or be difficult to fix once set.
  • 27. Prerequisites: In order to properly configure the database, it is important to have information on the following: • Programming language(s) used by clients. • Network configuration (will you be using unicast or multicast?). • The kind of storage you will be using (RAM, SSD). • Storage volume requirements. • Hardware you will be using.
  • 28. Aerospike Configuration: Administrators must configure Aerospike in many different areas: • Server process • Logging • Network • UDF configuration • Data storage (covered in Part 2 of Webinar Series) • Cross Datacenter Replication (XDR, not covered). Many of the settings in the default configuration file will work on most servers, but the defaults are usually not optimal and will result in poor performance. This training module covers only the most important variables; many more possible configurations are covered in other modules.
  • 29. Aerospike Configuration File Notes: The main Aerospike configuration file contains all the configuration variables for a node. • Located at /etc/aerospike/aerospike.conf on each node. • NOT centrally managed by Aerospike. • Most variables can be changed dynamically while the Aerospike node is up. • If you wish for dynamic changes to be persistent, you must also edit the configuration file manually. • You may choose to use shorthand (K, M, G) to represent large numbers. For example, 4 gigabytes can be represented as 4G, which is mathematically 4*1024*1024*1024.
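
The K/M/G shorthand can be made concrete with a small helper. This is a minimal sketch, assuming (from the 4G example on the slide) that each suffix is a power of 1024; `parse_size` is a hypothetical name, not part of any Aerospike tool.

```python
def parse_size(value: str) -> int:
    """Expand a value such as '4G' into a plain integer.

    Assumption: K, M and G mean 1024, 1024**2 and 1024**3, which matches
    the slide's example of 4G = 4*1024*1024*1024.
    """
    multipliers = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    value = value.strip()
    suffix = value[-1:].upper()
    if suffix in multipliers:
        return int(value[:-1]) * multipliers[suffix]
    return int(value)


print(parse_size("4G"))   # 4294967296
print(parse_size("512"))  # 512
```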
  • 30. Configuration File: There are 7 major stanzas in an Aerospike configuration file: • service (required) • logging (required) • network (required) • mod-lua (required for 3.x) • cluster (optional) • namespace (at least 1 required) • xdr (optional). These will look like this: service { ... }
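
A quick way to catch a missing required stanza is to scan a configuration file for its top-level stanza names. The sketch below is a simplified illustration, not an Aerospike tool: it assumes each stanza opens with `name {` at the end of a line and closes with a `}` on its own line, and the sample text (including the namespace name) is made up.

```python
# Required stanzas as listed on the slide (mod-lua is required for 3.x).
REQUIRED = {"service", "logging", "network", "mod-lua", "namespace"}


def top_level_stanzas(config_text):
    """Return the names of the top-level stanzas, in order of appearance."""
    names, depth = [], 0
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments and blank lines
        if not line:
            continue
        if line.endswith("{"):
            if depth == 0:
                names.append(line.split()[0])
            depth += 1
        elif line == "}":
            depth -= 1
    return names


def missing_required(config_text):
    """List the required stanzas that do not appear at the top level."""
    return sorted(REQUIRED - set(top_level_stanzas(config_text)))


sample = """
service {
    run-as-daemon
}
network {
    service {
        port 3000
    }
}
namespace test {   # hypothetical namespace name
}
"""
print(missing_required(sample))  # ['logging', 'mod-lua']
```

Note that the nested `service` block inside `network` is not counted as a top-level stanza, because the scan tracks brace depth.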
  • 32. Server Process: This section covers the behavior of the high level database process. Topics covered: • Linux user/group running the process • Whether or not to run as a daemon • Single replica limit • Location of the PID (Process ID) file • Transaction settings for storage
  • 33. Linux User/Group. Description: Controls the Linux username/group that runs the Aerospike database. Stanza location: service. Config parameters (defaults): user (root), group (root). Notes: If you set the username/group to a non-root user, you must make sure that the following are writable by the user/group you select: the log file (/var/log/aerospike/aerospike.log by default); the persistence file (if using RAM + disk for persistence); any Flash/SSD devices you are using; the PID file. Change dynamically: No. Best practices: Most customers run the daemon as root. You must be careful if you are changing users on an already running database. The major issue is permissions to files/SSDs. Be sure to test thoroughly when doing so.
  • 34. Run as a Daemon. Description: Whether or not the database process will run as a daemon. Stanza location: service. Config parameters (defaults): run-as-daemon. Notes: You MUST remove the parameter completely (or comment it out) to set it to false. Even setting it to “false” will make the node start up as a daemon. Change dynamically: No. Best practices: Running in the foreground is normally used when the node is having issues starting up. By not running as a daemon, you can see messages on the console directly. Once the service starts properly, switch back to running as a daemon.
  • 35. Single Replica Limit. Description: Sets the cluster size at or below which the cluster will no longer maintain a replica of the data. This is a safety measure that lets administrators choose between the two trade-offs described under Best practices. Stanza location: service. Config parameters (defaults): paxos-single-replica-limit (1). Notes: If the cluster size is less than or equal to this value, the cluster keeps only a single copy of all data. Change dynamically: No. Best practices: There is no single best practice; this depends on what the administrator believes is the best choice. If you believe that evicting data and poorer performance is acceptable, set this at a level consistent with what you believe is a worst (but possible) case of node loss. If you would prefer to maintain performance, but are willing to live with possible loss of data, keep this at 1.
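
The rule in the Notes field can be written as a one-line check. This is a sketch of the rule as stated on the slide, not server code; `effective_replication_factor` is a hypothetical name, and capping the number of copies at the cluster size is an added assumption.

```python
def effective_replication_factor(cluster_size, replication_factor,
                                 single_replica_limit=1):
    """Number of copies the cluster keeps for a given cluster size.

    From the slide: if the cluster size is less than or equal to
    paxos-single-replica-limit, only a single copy is kept.
    Assumption: otherwise the copies are capped at the cluster size.
    """
    if cluster_size <= single_replica_limit:
        return 1
    return min(replication_factor, cluster_size)


# With the default limit of 1, a 2-node cluster still keeps 2 copies.
print(effective_replication_factor(2, 2))                          # 2
# With a limit of 2, shrinking to 2 nodes drops the replica.
print(effective_replication_factor(2, 2, single_replica_limit=2))  # 1
```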
  • 36. Location of the PID File. Description: Location of the PID (process identifier) file. This simply stores the PID of the Aerospike database process (asd for version 3.x). Stanza location: service. Config parameters (defaults): pidfile (none). Notes: The PID file is written to this location, which must be writable by the Linux user running the process. Change dynamically: No. Best practices: The file is normally stored in /var/run/asd.pid.
  • 37. Transaction Settings for Storage. Description: Sets the configuration for how queues and threads read from storage. Stanza location: service. Config parameters (defaults): transaction-queues (4), transaction-threads-per-queue (4). Notes: The effect of changing these values varies greatly. We strongly recommend sticking to the settings in the “Best practices” field. Change dynamically: No. Best practices: You should set both to “4” if using only RAM or RAM + persistence namespaces. Set both to “8” if using any Flash/SSD namespaces.
  • 38. Server Process Example Config: For the server process, here is an example of the configuration for a standard production environment for an SSD cluster.
      service {
        user root
        group root
        run-as-daemon
        paxos-single-replica-limit 1
        pidfile /var/run/asd.pid
        transaction-queues 8
        transaction-threads-per-queue 8
        ...
      }
  • 40. The Network: Networking is crucial to the function of any distributed system. Topics covered: • File descriptor limit (connection limit) • The main database service • Cluster formation (heartbeats) • The fabric (inter-node communication) • Direct telnet access
  • 41. Maximum Number of File Descriptors. Description: This is the maximum number of Linux file descriptors that the server will be able to use. This is not just the number of open files, but also the maximum number of connections. Stanza location: service. Config parameters (defaults): proto-fd-max (15000), proto-fd-idle-ms (600000). Notes: There is also a maximum value that is set by the operating system. The Aerospike installer normally sets the OS maximum at 100,000. The proto-fd-max variable is limited by this number. proto-fd-idle-ms sets the idle timeout, in milliseconds, for client connections. Change dynamically: Yes. Best practices: For production use, proto-fd-max should be set at 15,000. It may be set as low as 1,000 for development work. With certain client languages, it should sometimes be set much higher, such as 30,000. proto-fd-idle-ms should normally be set when you will be using a client with many short-lived connections, such as PHP; in that case set it to 10,000. When it is not set with these languages, performance will suffer.
  • 42. Main Database Service. Description: This is the configuration for the main database service, including the port that applications will use to connect to this node. Stanza location: network:service. Config parameters (defaults): address, access-address, port, reuse-address. Notes: address: this is the IP address that the service will listen on. You may also specify “any”. access-address: for servers with multiple IP addresses, this is the one the node will share with the other nodes to use. It should match the address that the client applications will use. port: cannot be blank; the standard value is 3000. reuse-address: sets whether or not to reuse the address when the service comes back up. No value is required, but it can be set to true or false. Change dynamically: No. Best practices: Normally, you will want to set the following: address any; access-address [IP address used by applications]; port 3000; reuse-address true. It is important that every node (even the first) point to some other node that will be in the cluster. This allows you to restart the first server as well.
  • 43. Cluster Formation: There are 2 different ways that a cluster can form. One is to use multicast connections; the other is to use mesh (or unicast). In both, each node must send heartbeats that can be heard by the other nodes. When the other nodes have missed enough heartbeats from one node, that node is removed from the cluster. You must choose one and only one mode for each cluster.
  • 44. Cluster Formation Heartbeat – Multicast: • When starting a multicast cluster, you start with isolated nodes (4 in this example). • Each node will send a heartbeat to a multicast IP address, so all the nodes will know of each other. • The cluster will form with the list of nodes. One of the nodes will create the partition map and will distribute it to the rest of the nodes in the cluster. This map is also stored in each client, so they will know where to go for any given record. [Diagram: Node 1, Node 2, Node 3, and Node 4 sending heartbeats to a multicast IP and forming a cluster.]
  • 45. Cluster Formation – Multicast. Description: This section controls how the cluster will be formed from individual nodes. Stanza location: network:heartbeat. Config parameters (defaults): mode multicast, address, port, interval (150), timeout (10). Notes: mode must be multicast to use this mechanism. There is no default port, but 9918 is standard. interval is in milliseconds. timeout is the number of missed heartbeats before the node is declared dead. Change dynamically: interval: yes; timeout: yes; others: no. Best practices: For most production uses, use an interval of “150” and a timeout of “15”. For cloud environments, use “250” and “25”. However, note that most cloud environments like Amazon EC2 do not allow multicast. See the following slide for a note on multicast.
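
Since interval is in milliseconds and timeout counts missed heartbeats, their product gives a rough figure for how long a silent node stays in the cluster. The helper below is only worked arithmetic for the values on the slide; `failure_detection_ms` is a hypothetical name, and the real detection time may differ slightly depending on when the last heartbeat arrived.

```python
def failure_detection_ms(interval_ms: int, timeout_missed: int) -> int:
    """Approximate time before a silent node is declared dead.

    interval_ms: heartbeat interval in milliseconds.
    timeout_missed: number of missed heartbeats that are tolerated.
    """
    return interval_ms * timeout_missed


settings = {
    "defaults (150, 10)": (150, 10),
    "production best practice (150, 15)": (150, 15),
    "cloud best practice (250, 25)": (250, 25),
}
for label, (interval, timeout) in settings.items():
    print(f"{label}: about {failure_detection_ms(interval, timeout)} ms")
```

The defaults work out to about 1.5 seconds, the production recommendation to about 2.25 seconds, and the cloud recommendation to about 6.25 seconds.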
  • 46. Regarding Multicast: Even in environments where multicast is possible, there is often some configuration work to do on the network devices, such as the switches. If you find that multicast works for 3-5 minutes but then stops, chances are you must do one of the following on the switch with the VLAN containing the nodes: 1. Turn off IGMP snooping, OR 2. Turn on IGMP snooping, and also enable the querier (a.k.a. multicast routing).
  • 47. Cluster Formation Heartbeat – Mesh (unicast): • In the event that multicast is not possible, you can elect to use the mesh. This uses standard unicast. In this case you will need to bring up a single node first. • As you bring up additional nodes, each one will be configured to communicate with a node that is already a part of the cluster (usually the first one) and share heartbeats with it. [Diagram: Node 1, Node 2, Node 3, Node 4.]
  • 50. Cluster Formation – Mesh (Unicast). Description: This section controls how the cluster will be formed from individual nodes. Stanza location: network:heartbeat. Config parameters (defaults): mode mesh, port, mesh-address, mesh-port, interval (150), timeout (10). Notes: mode must be mesh to use this mechanism. The standard port is 3002; this is the heartbeat port used by this node. mesh-address and mesh-port are the IP address and port of another node in the cluster. interval and timeout are as in multicast. Change dynamically: interval: yes; timeout: yes; others: no. Best practices: Aerospike has found that this mechanism works in production with up to 20 nodes. For most production uses, use an interval of “150” and a timeout of “15”. For cloud environments, use “250” and “25”. Note that most cloud environments like Amazon EC2 do not allow multicast.
  • 51. Fabric. Description: The fabric controls intra-cluster communication between nodes. Stanza location: network:fabric. Config parameters (defaults): address, port. Notes: The address should be the IP address that the fabric should respond on (you may also use “any”). The port is required and is normally set to 3001. Change dynamically: No. Best practices: It is possible to configure the fabric to communicate on a different network device from the service.
  • 52. Direct Telnet Access. Description: Aerospike offers a direct telnet connection into the server to administer the node when you are having difficulty communicating through the normal service port (default 3000). Stanza location: network:info. Config parameters (defaults): address, port. Notes: The address should be the IP address that the info service should respond on (you may also use “any”). The port is required and is normally set to 3003. Change dynamically: No. Best practices: You can use a standard telnet command to the appropriate IP address and port to issue various commands for debugging. Please see the Aerospike documentation on how to issue commands through this interface: https://docs.aerospike.com/display/AS2/Using+telnet+when+the+Service+Port+is+Busy
  • 53. Network Example Config (1 of 3): Both connection variables default to good values and can even be left unset in the file. You should only set them if: • Your node is in a test environment and the node hardware is low-end; in that case set proto-fd-max to 1000. • Your clients have short-lived connections (such as for PHP); in that case you may want to apply proto-fd-max 100000 and proto-fd-idle-ms 10000.
      service {
        ...
        proto-fd-max 15000
        proto-fd-idle-ms 600000
        ...
      }
  • 54. Network Example Config (2 of 3): If using multicast for heartbeats on IP address 239.1.99.222, and if you wish for your clients to access this node on the IP address 10.100.1.215, your config file may look like this:
      network {
        service {
          address any
          port 3000
          # If this server has multiple IP addresses, answer on this one (access-address)
          access-address 10.100.1.215
          reuse-address
        }
        heartbeat {
          mode multicast
          # This address is the multicast IP address used by all the servers in the cluster
          address 239.1.99.222
          port 9918
          interval 150
          timeout 10
        }
        fabric {
          port 3001
        }
        info {
          port 3003
        }
      }
  • 55. Network Example Config (3 of 3): If using mesh (unicast) for heartbeats, and if you wish for your clients to access this node on the IP address 10.100.1.215, your config file may look like this:
      network {
        service {
          address any
          port 3000
          # If this server has multiple IP addresses, answer on this one (access-address)
          access-address 10.100.1.215
          reuse-address
        }
        heartbeat {
          mode mesh
          port 3002
          # The mesh address is the IP address of another node in the cluster
          mesh-address 10.100.1.214
          mesh-port 3002
          interval 150
          timeout 10
        }
        fabric {
          port 3001
        }
        info {
          port 3003
        }
      }
  • 57. Logging: By default, Aerospike logs all messages in the main log file. Topics covered: • Location of logs • Log level (changing what is logged)
  • 58. Log File. Description: This is the location of the actual log file itself. Stanza location: logging:file. Config parameters (defaults): file. Notes: Aerospike normally puts the logs in /var/log/aerospike/aerospike.log. Change dynamically: No. Best practices: The log file must be writable by the user running the node process. For 3.x, you should use /var/log/aerospike/aerospike.log. The log file does not automatically rotate. Instructions for rotating the logs can be found at: https://docs.aerospike.com/display/V3/Logging
  • 59. Log Level. Description: Sets the log level for different messages. Stanza location: logging:file. Config parameters (defaults): context. Notes: There are different contexts and levels. You can specify different levels for different contexts. Contexts: any, batch, info, query, rw, scan, udf. Levels: critical, warning, info, debug, detail. Change dynamically: Yes. Best practices: Set “any” to “info”. Only change to a deeper level when debugging an issue. Make sure to change back afterwards, in order to avoid unnecessary logging.
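
The interaction between contexts and levels can be modeled in a few lines. This is an illustration under assumptions, not the server's logging code: it assumes the levels are ordered from least to most verbose as listed on the slide, and that a context without its own setting falls back to the “any” setting; `is_logged` is a hypothetical name.

```python
# Levels from the slide, assumed to run from least to most verbose.
LEVELS = ["critical", "warning", "info", "debug", "detail"]


def is_logged(message_level, context, settings):
    """Return True if a message at `message_level` in `context` is written.

    Assumption: a context without its own level inherits the "any" level.
    """
    configured = settings.get(context, settings.get("any", "info"))
    return LEVELS.index(message_level) <= LEVELS.index(configured)


production = {"any": "info"}                   # the slide's best practice
debugging_udf = {"any": "info", "udf": "debug"}  # temporary, while debugging

print(is_logged("warning", "rw", production))    # True
print(is_logged("debug", "udf", production))     # False
print(is_logged("debug", "udf", debugging_udf))  # True
```

Raising only the one context under investigation keeps the extra log volume limited to that area, which fits the slide's advice to change back afterwards.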
  • 61. UDF Configuration: Aerospike has the ability to run functions on the server. These are called UDFs (User Defined Functions) and are stored on each node in the cluster. This feature is only available in Aerospike 3.x. Topics covered: • Location of system UDFs (provided by Aerospike) • Location of user UDFs • Whether or not to use a cache
  • 62. System UDF Directory. Description: This is the location where the system will store UDFs. Stanza location: mod-lua. Config parameters (defaults): system-path (/opt/aerospike/sys/udf/lua). Notes: System UDFs are only cached, and are loaded when the server starts. Change dynamically: No. Best practices: There should be no reason for administrators to change the contents of this directory directly.
  • 63. User UDF Directory. Description: This is the location where the server will store user-created UDFs. Stanza location: mod-lua. Config parameters (defaults): user-path (/opt/aerospike/usr/udf/lua). Notes: The contents of this directory should be maintained by the server. Users should never have to alter the contents manually, but rather through the Aerospike interfaces. Change dynamically: No. Best practices: Do not make manual changes to the contents of this directory.
  • 64. User Cache Setting. Description: This determines whether the server should cache UDFs or load them at runtime. Stanza location: mod-lua. Config parameters (defaults): cache-enabled (true). Notes: (blank). Change dynamically: (blank). Best practices: This should be set to “true” for production use. This will yield the best performance. Use “false” to help in debugging issues with UDFs.
  • 65. Aerospike Configuration: Configuration of: • Server process • Logging • Network • UDF configuration • Data storage (covered in Part 2 of Webinar Series)
  • 66. Thank You: Send all questions/comments/complaints to Young Paik, young@aerospike.com

Editor's Notes

  • #2: Fastest. Best uptime. Predictable performance. Consistency.
  • #45: Automatic multicast gossip protocol for node discovery. Paxos consensus algorithm determines nodes in cluster. Ordered list of nodes determines data location. Data partitions balanced for minimal data motion. Vote initiated and terminated in 100 milliseconds.