Couchbase Capella™ Architecture Overview
Table of Contents
Introduction
Database-as-a-Service
Mobile App Services
Database-as-a-Service
Core Database Design
JSON Document Data Model
Data Access Methods
Organizing Concepts for Documents
Deployment Design Concepts
Services
Distributed Design
As-a-Service Aspects
Architecture
Management
Deployment
Development
Connecting
Operations
Security
Summary
Resources
WHITEPAPER 2
Introduction
These modern requirements have driven Couchbase's development from inception to:
Database-as-a-Service
Capella is the fastest, easiest, and most affordable way to start with Couchbase. As
a fully managed DBaaS, it automates setup, configuration, replication, and ongoing
operations so you can focus on development and improve your time to market.
And Capella puts industry-leading performance, flexibility, and cloud scalability
at your fingertips.
Database-as-a-Service
Couchbase has multiple data access models including key-value, SQL++ query service,
full-text search service, eventing service, operational analytics aggregation service, and
backup service. In the Couchbase design, every access model can simultaneously utilize
the cluster’s data.
For cloud deployments, it is advantageous from a cost perspective to run existing
infrastructure instances at full capacity (red-line them) before adding more, and to
avoid idle and underutilized node instances. Couchbase transparently manages the
topology, process management, statistics gathering, high availability, and data movement
between these services.
Traditional databases increase latency and block application operations while running
synchronous operations (for example, while persisting data to disk or maintaining
indexes). Couchbase allows write operations to happen at memory and network speeds
while asynchronously processing replication, persistence, and index management. Spikes
in write operations don't block read or query operations, while background processes
will persist data as fast as possible without slowing down the rest of the system. ACID
transactions are available to the developer to ensure durability and consistency while
data is in flight. Multiple transaction options allow the developer to decide when and
where to increase latency in exchange for durability and consistency of transactions.
Somewhat higher latency can be anticipated when multi-document and cross-collection
transactions are implemented.
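The write path described above can be pictured with a small, self-contained sketch.
It is a toy model only, not Couchbase code: the class name WriteBehindStore, the
in-process queue, and the dictionary standing in for disk are all assumptions made for
illustration. Writes are acknowledged once they are in memory, a background thread
persists them later, and a caller who wants durability can choose to wait.

```python
import queue
import threading


class WriteBehindStore:
    """Toy store: writes are acknowledged from memory and persisted later."""

    def __init__(self):
        self.memory = {}   # in-memory cache; reads and writes are served from here
        self.disk = {}     # stand-in for durable storage (hypothetical)
        self._pending = queue.Queue()
        worker = threading.Thread(target=self._persist_loop, daemon=True)
        worker.start()

    def write(self, key, value):
        # Return as soon as the value is in memory; persistence is only queued.
        self.memory[key] = value
        self._pending.put((key, value))

    def read(self, key):
        # Reads never wait on the persistence queue.
        return self.memory.get(key)

    def flush(self):
        # Opt-in durability: block until every queued write has been persisted.
        self._pending.join()

    def _persist_loop(self):
        while True:
            key, value = self._pending.get()
            self.disk[key] = value
            self._pending.task_done()


store = WriteBehindStore()
store.write("user::1", {"name": "Ada"})
print(store.read("user::1"))   # served from memory right away
store.flush()                  # trade latency for durability only when needed
print(store.disk["user::1"])
```

Calling flush() only where it matters mirrors the idea of letting the developer decide
when to accept extra latency in exchange for durability.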
JSON document flexibility
In the Couchbase document model, a schema is the result of an application’s
structuring of its documents and their containment structures such as buckets, scopes,
and collections. Schemas can be defined by application developers and managed by
applications. This is in contrast to the relational model where the database (and the
database administrator) manages the schema. Couchbase created the bucket-scope-
collection-document organizational hierarchy (further explained below) to allow
maximum flexibility in defining application data metamodels. A single JSON document’s
structure offers even more flexibility for the developer beyond the dynamic nature of
scopes and collections. A JSON document’s structure consists of its inner arrangement
of attribute-value pairs. How the documents are designed or updated over time is
up to the application developer. They can be normalized, denormalized, or a hybrid
depending on the needs and evolution of the application. Using JSON, the developer
can avoid the lengthy schema design, testing, and deployment cycles of traditional
RDBMS-based systems.
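As a rough illustration of that flexibility, the sketch below models the same order in
a denormalized and a normalized shape, then adds a field to a newer document without
touching older ones. All document IDs and field names are invented for the example.

```python
import json

# Denormalized: the order embeds its customer and line items in one document.
order_embedded = {
    "type": "order",
    "id": "order::1001",
    "customer": {"name": "Ada Lovelace", "email": "ada@example.com"},
    "items": [{"sku": "A-1", "qty": 2}, {"sku": "B-7", "qty": 1}],
}

# Normalized: the order refers to a separate customer document by its ID.
customers = {
    "customer::42": {
        "type": "customer",
        "name": "Ada Lovelace",
        "email": "ada@example.com",
    }
}
order_referenced = {
    "type": "order",
    "id": "order::1002",
    "customer_id": "customer::42",
    "items": [{"sku": "A-1", "qty": 2}],
}

# Schema evolution: a newer document gains a field; older documents stay valid.
order_v2 = dict(order_referenced, id="order::1003", gift_note="Happy birthday")


def customer_name(order, customer_docs):
    """Read the customer name from either document shape."""
    if "customer" in order:
        return order["customer"]["name"]
    return customer_docs[order["customer_id"]]["name"]


print(customer_name(order_embedded, customers))
print(customer_name(order_referenced, customers))
print(json.dumps(order_v2, indent=2))
```

The application, not the database, decides which shape to use and how to read both
during a transition.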
Access method | Description
Organizing Concepts for Documents
Couchbase offers a flexible multi-level data containment and organization structure to
organize documents, optimize cluster performance, and facilitate horizontal scaling.
This data containment model consists of four levels: buckets, scopes, collections, and
documents. This model maps easily to the familiar RDBMS constructs of databases, schemas,
tables, and rows.
• Documents—Documents are stored within buckets, but can also be organized within
scopes and collections.
• Services—The core of Couchbase is the Data Service that feeds and supports all the
other systems and data access methods. Multiple services that offer different types
of data access or processing include Query, Indexing, Backup, Search, Analytics,
and Eventing. A service is an isolated set of processes dedicated to particular tasks.
For example, indexing, full-text search, and query are each managed as separate
services. One or more services can be run on one or more nodes as needed.
• Nodes—Capella nodes are virtual machines that host single instances of Couchbase
Server within a cloud service provider. Nodes can be added or removed easily
through the Capella Control Plane and data is then automatically redistributed
evenly across all nodes.
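The four-level containment model described earlier (bucket, scope, collection, document)
can be sketched as nested dictionaries addressed by a dotted keyspace. This is a toy
model for orientation only; the ToyDatabase class and the "travel.inventory.airline"
keyspace are hypothetical names, not part of any Couchbase API.

```python
from collections import defaultdict


class ToyDatabase:
    """bucket -> scope -> collection -> {document ID: document}."""

    def __init__(self):
        self._data = defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))

    @staticmethod
    def _split(keyspace):
        # A keyspace names one collection as "bucket.scope.collection".
        bucket, scope, collection = keyspace.split(".")
        return bucket, scope, collection

    def upsert(self, keyspace, doc_id, doc):
        bucket, scope, collection = self._split(keyspace)
        self._data[bucket][scope][collection][doc_id] = doc

    def get(self, keyspace, doc_id):
        bucket, scope, collection = self._split(keyspace)
        return self._data[bucket][scope][collection].get(doc_id)


db = ToyDatabase()
# Rough RDBMS analogy: bucket ~ database, scope ~ schema,
# collection ~ table, document ~ row.
db.upsert("travel.inventory.airline", "airline_10", {"name": "Example Air"})
print(db.get("travel.inventory.airline", "airline_10"))
print(db.get("travel.inventory.hotel", "airline_10"))   # other collection: None
```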
Services
Each service has its own resource quotas, and where applicable, related indexing and
inter-node communication capabilities. This provides several very flexible methods to
scale services when needed. In addition to scaling up to larger machines or scaling
out to more nodes, Couchbase also provides the ability to scale specific services
independently from one another using multi-dimensional scaling (MDS). MDS is the
foundation for Couchbase workload isolation and is covered in more detail below.
Couchbase is different from other platforms where a monolithic set of services is
installed on every node in a cluster. Instead, Couchbase uses a core data capability that
feeds all the other services and a shared-nothing architecture that allows developer
control over workload isolation. Small-scale environments can share the same workloads
across one or more nodes, while higher scale and performance can be achieved with
dedicated nodes to handle specific workloads. This provides the ultimate in scale-out
flexibility. The cluster can be scaled in or out and its service topology can be changed on
demand with zero interruption or change to the application.
Applications communicate directly with each service through a common SDK that is
always aware of the topology of the cluster and how services are configured.
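To make the idea of a topology-aware client concrete, here is a minimal sketch under
stated assumptions: the cluster map, node names, and Router class are hypothetical, and
real SDKs obtain and refresh the topology from the cluster itself. The sketch only shows
how requests for one service can be routed to the nodes that host it.

```python
import itertools

# Hypothetical cluster map: node name -> services hosted on that node.
CLUSTER_MAP = {
    "node-1": {"data"},
    "node-2": {"data"},
    "node-3": {"query", "index"},
    "node-4": {"search"},
}


def nodes_for(service, cluster_map):
    """Return the sorted names of all nodes that run the given service."""
    return sorted(n for n, services in cluster_map.items() if service in services)


class Router:
    """Round-robin over the nodes that host each service."""

    def __init__(self, cluster_map):
        self._map = cluster_map
        self._cycles = {}

    def pick(self, service):
        if service not in self._cycles:
            nodes = nodes_for(service, self._map)
            if not nodes:
                raise LookupError(f"no node runs the {service!r} service")
            self._cycles[service] = itertools.cycle(nodes)
        return next(self._cycles[service])


router = Router(CLUSTER_MAP)
print(router.pick("data"), router.pick("data"), router.pick("query"))
```

Moving a service to dedicated nodes changes only the map, not the application code,
which is the property workload isolation relies on.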
• Data Service—The Data Service, or key-value (KV) engine, is the foundation for
storing data and must run on at least one node of every database. It is responsible
for caching, persisting, and serving data to applications and other services. The
cache provides consistent low latency for individual document read and write
operations and streams documents to other services via Database Change Protocol
(DCP). Due to their simplicity, KV operations execute with extremely low (often
sub-millisecond) latency. The KV store is accessed using simple CRUD (create, read,
update, delete) APIs, and provides the simplest interface when accessing documents
using their IDs.
• Query Service—An engine for processing SQL++ queries. SQL++ combines the
flexibility of JSON with the expressive power of SQL. It provides a rich set of features
and familiar data definition language (DDL), data manipulation language (DML), and
query language statements, but can operate in the face of NoSQL database features
such as key-value storage, multi-valued attributes, and nested objects. Also, users can
define ACID transactions within SQL++ for one or more documents across collections
and nodes. Transactions in SQL++ have adopted a nearly identical syntax to SQL for
relational databases. The Query Service uses a cost-based query optimizer, patented
in 2021, to take advantage of indexes that are available.
• Index Service—Indexes are an important part of making queries run efficiently, and
they update themselves as data mutates. This service supports multiple index types and
includes an Index Advisor that recommends specific indexes to build based on query
statements and data structure.
• Search Service—An engine for performing full-text searches on stored JSON data.
Users can create and query inverted indexes for searching of free-form text within
a document. Customers using the search service often no longer need a third-party
search tool.
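The inverted indexes mentioned for the Search Service can be illustrated with a few
lines of plain Python. This is a generic textbook sketch, not the Search Service
implementation: the tokenizer, the sample hotel documents, and the all-terms-must-match
rule are assumptions chosen to keep the example short.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def iter_strings(value):
    """Yield every string found anywhere inside a JSON-like value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def build_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for text in iter_strings(doc):
            for term in tokenize(text):
                index[term].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    "hotel::1": {"name": "Seaside Inn",
                 "reviews": [{"text": "Quiet rooms near the beach"}]},
    "hotel::2": {"name": "City Lodge",
                 "reviews": [{"text": "Noisy but close to the beach bars"}]},
}
index = build_index(docs)
print(sorted(search(index, "beach")))
print(sorted(search(index, "quiet beach")))
```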
Distributed Design
Capella’s distributed nature makes high availability, scaling, and disaster recovery easier.
• Data transport via DCP—As data mutates, changes are replicated from memory over the
DCP stream, both to keep data current within Capella and to feed external services such
as Spark or Kafka.
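A change stream of this kind can be modeled as an ordered log of mutations that each
consumer reads from the last sequence number it has seen. The sketch below is a
conceptual toy, not the DCP wire protocol; the ChangeFeed class and its sequence
numbering are assumptions for illustration.

```python
class ChangeFeed:
    """Toy ordered mutation log that consumers can resume from any point."""

    def __init__(self):
        self._log = []   # entries are (seqno, key, value); seqno starts at 1

    def mutate(self, key, value):
        seqno = len(self._log) + 1
        self._log.append((seqno, key, value))
        return seqno

    def stream(self, since=0):
        """Yield mutations whose sequence number is greater than `since`."""
        yield from self._log[since:]


def apply_changes(feed, replica, last_seen):
    """Bring a replica up to date and return the new high-water mark."""
    for seqno, key, value in feed.stream(since=last_seen):
        replica[key] = value
        last_seen = seqno
    return last_seen


feed = ChangeFeed()
feed.mutate("a", 1)
feed.mutate("b", 2)

replica = {}
mark = apply_changes(feed, replica, 0)      # initial catch-up
feed.mutate("a", 3)
mark = apply_changes(feed, replica, mark)   # resume where the consumer left off
print(replica, mark)
```

Because each consumer tracks its own position, an index, a replica, and an external
connector can all read the same stream at their own pace.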
As-a-Service Aspects
Architecture
Capella’s core architectural feature is the split between the Control Plane, a web UI
designed for management, and the Data Plane, which handles data storage.
[Figure: Capella architecture. Labels: Web Users, MFA, SSO, Capella Control Plane,
Web UI & External APIs, Couchbase Database 1, Couchbase Database 2]
Control Plane
The Control Plane is a web UI that manages the cloud orchestration Infrastructure-as-a-
Service (IaaS), monitoring, alerting, security, access, billing, and support capabilities. It’s
the access point for your organization's users and also allows access to infrastructure-
as-code (IaC) tools such as Terraform. Backend services can be accessed by REST-based
tools via a management API.
Data Plane
The Data Plane is where you manage your Capella databases. A database resides in a
single region (distributed across multiple availability zones) within a single cloud service
provider (CSP), but the Control Plane can control multiple databases across various
cloud service providers. Furthermore, data can be replicated between databases, with
that replication configured within the Control Plane. From a security perspective, the
Data Plane has no internet access unless IPs are specifically allowed or a connection is
established through VPC peering or an AWS PrivateLink connection. Disk data, backups,
and all traffic are encrypted.
Management
Users, projects, and RBAC
An organization is the top-level organizational element for managing users, projects,
and databases. By default, organizational members do not have data access, but instead
have access to the Control Plane. A project is used to organize groups of databases.
People must be added to projects and assigned access to databases within a project.
Organizations can also add SSO groups (teams) with a project role to access databases
within a project.
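The relationship between organizations, projects, and database access can be sketched
as follows. This is a simplified model with hypothetical class and method names, not the
Capella role system: organization membership grants Control Plane access only, and data
access requires an explicit grant on a database inside a project.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    name: str
    databases: set = field(default_factory=set)
    # user -> databases in this project that the user may access
    database_access: dict = field(default_factory=dict)

    def grant(self, user, database):
        if database not in self.databases:
            raise ValueError(f"{database!r} is not in project {self.name!r}")
        self.database_access.setdefault(user, set()).add(database)


@dataclass
class Organization:
    name: str
    members: set = field(default_factory=set)
    projects: dict = field(default_factory=dict)

    def can_use_control_plane(self, user):
        return user in self.members

    def can_access_database(self, user, project_name, database):
        project = self.projects.get(project_name)
        if project is None or user not in self.members:
            return False
        return database in project.database_access.get(user, set())


org = Organization("acme", members={"alice", "bob"})
org.projects["retail"] = Project("retail", databases={"orders-db"})
org.projects["retail"].grant("alice", "orders-db")

print(org.can_use_control_plane("bob"))                       # member: True
print(org.can_access_database("bob", "retail", "orders-db"))  # no grant: False
print(org.can_access_database("alice", "retail", "orders-db"))
```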
Deployment
Database deployment
When deploying a database, you must choose a CSP, a region, and a CIDR block. (A
default is provided, but can be changed before deploying.) Couchbase is constantly
adding regions, and up-to-date regions can be found on the Couchbase Documentation
pages for AWS, GCP, and Azure. You can select the version of the Couchbase Server
you would like to use and assign services to nodes. These services can be changed after
deployment. You can also choose your support plan and availability mode (single or
multi-availability zones), and can choose to purchase credits on a prepaid or pay-as-
you-go basis.
Storage engine
Capella supports two different backend storage mechanisms, which are set per bucket.
A single Capella database can have a mix of Couchstore and Magma buckets.
• Couchstore—Couchstore is the default bucket storage engine that has been in use
for more than 10 years. It’s optimized for high performance with large datasets
while using fewer system resources. (The minimum bucket size for the Couchstore
backend is 100 MiB.) If you have a small dataset that can fit in memory, then you
should consider using Couchstore.
• Magma—Capella’s latest storage engine is designed for high performance with very
large datasets that don’t fit in memory. It’s ideal for use cases that rely primarily on
disk access. The performance of disk access will be as good as the underlying disk
subsystems. Magma can work with very low amounts of memory for large datasets
(e.g., for a node holding 5 TiB of data, Magma can be used with only 64 GiB RAM).
It’s especially suited for datasets that won’t fit into available memory.
You can learn more about Capella’s storage engines in our Couchbase documentation.
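The guidance above reduces to a simple rule of thumb, sketched here with hypothetical
function names. It deliberately ignores everything except whether the dataset fits in
memory, so treat it as a starting point for discussion rather than a sizing tool. The
last line works through the 5 TiB and 64 GiB example from the text.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4


def suggest_storage_engine(dataset_bytes, memory_bytes):
    """Rule of thumb from the text: Couchstore if the data fits in memory."""
    return "couchstore" if dataset_bytes <= memory_bytes else "magma"


def memory_to_data_ratio(memory_bytes, dataset_bytes):
    """Fraction of the dataset size that is available as memory."""
    return memory_bytes / dataset_bytes


print(suggest_storage_engine(20 * GIB, 64 * GIB))         # couchstore
print(suggest_storage_engine(5 * TIB, 64 * GIB))          # magma
# 64 GiB / 5 TiB = 64 / 5120 = 0.0125
print(f"{memory_to_data_ratio(64 * GIB, 5 * TIB):.2%}")   # 1.25%
```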
Development
Couchbase provides several tools for developers:
Playground
This tool is integrated into Capella and comes with an SDK tutorial and a SQL++ tutorial.
The SDK playground offers examples of multiple SDK languages. SQL++ gives examples
of the Couchbase query language. Both tutorials are designed to guide a new developer
through several chapters from basics to more advanced concepts.
Query Workbench
The Query Workbench allows users to access data via SQL++ and see data in JSON
and tabular formats. The tool provides a built-in index advice feature that tells users
what indexes are needed to optimize queries. Inverted search indexes can be created
to support search, and JavaScript-based user-defined functions can be used to
manipulate data.
Couchbase Shell
Couchbase Shell (cbsh) is a modern, productive shell that provides CLI access to
Capella. It can be obtained via GitHub.
Connecting
[Figure: three connection options into the ACME Data Plane VPC; two of them are labeled
"No need to IP Allowlist"]
Couchbase SDKs
Capella works with the latest versions of all supported Couchbase SDKs. Developers can
choose from over 10 SDKs for their favorite programming languages.
Connectors
To exchange data with other platforms, we offer various big data connectors for
products like Kafka, Spark, and Elasticsearch.
REST API
Couchbase provides a series of RESTful APIs that enable you to integrate with Capella
to perform operations such as:
Application connection
For connecting applications to Capella, you have several options:
• Public connection—This is the simplest option and requires the use of IP addresses
for encrypted data to traverse the public internet. Public connections should not be
used for production environments.
• VPC peering—The CSP backbone contains both the connection and traffic. This
option reduces networking costs compared to a public connection.
Operations
Scaling
Capella makes it easy to evolve your configuration. You can add or remove nodes at
any time and change the amount of RAM, vCPUs, disk space, and type of disk volume
(general purpose or high performance). Changes are made with a few clicks, and Capella
automatically rebalances data to the new configuration.
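One way to picture a rebalance is as a change to a partition-to-node map while the
key-to-partition mapping stays fixed. The sketch below is an illustrative assumption,
not a description of Capella internals: the partition count, the CRC32 hash, and the
round-robin assignment are all chosen for brevity, and a production rebalancer would
also try to minimize how many partitions move.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 64   # illustrative value only


def partition_for(key):
    """Map a document key to a partition; independent of the node count."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def assign_partitions(nodes):
    """Spread partitions round-robin so node shares differ by at most one."""
    return {p: nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)}


def load(assignment):
    """Count how many partitions each node owns."""
    return Counter(assignment.values())


def node_for(key, assignment):
    return assignment[partition_for(key)]


before = assign_partitions(["n1", "n2", "n3"])
after = assign_partitions(["n1", "n2", "n3", "n4"])   # one node added

print(load(before))   # shares of 22, 21, and 21 partitions
print(load(after))    # 16 partitions on each of the four nodes
print(node_for("user::1", before), node_for("user::1", after))
```

A client that holds the current map can keep locating documents after the node count
changes, because only the map is replaced.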
Multi-dimensional scaling
MDS allows you to further optimize your configuration as application needs evolve over
time. MDS allows workloads to be scaled independently and hardware usage to be
optimized to help drive down total cost of ownership.
Geo-replication
Capella’s cross data center replication (XDCR) technology replicates data between
databases in different regions. XDCR provides an easy way to replicate active data
in-memory to multiple geographically diverse data centers either for disaster recovery
or for high availability. It can be set up on a per-bucket or per-collection basis and can
be unidirectional or bidirectional. It also provides built-in conflict resolution if the same
document was mutated on a separate database before it was replicated.
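For bidirectional replication to converge, both databases must pick the same winner
when a document has been changed on each side. The sketch below shows one such
deterministic rule. The policy used here (highest revision count wins, ties broken by
timestamp) is an assumption chosen for illustration; the text above does not specify
which policy Capella applies, so consult the product documentation for actual behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: dict
    revision: int      # how many times the document has been mutated
    timestamp: float   # time of the last mutation


def resolve(local, remote):
    """Pick the same winner regardless of which side runs the comparison."""
    return max(local, remote, key=lambda v: (v.revision, v.timestamp))


east = Version({"status": "shipped"}, revision=4, timestamp=90.0)
west = Version({"status": "packed"}, revision=3, timestamp=100.0)

# Both databases evaluate the conflict independently and agree on the result.
print(resolve(east, west).value)
print(resolve(west, east).value)
```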
Security
Couchbase supports the most critical and sensitive workloads for industry-leading
businesses every day. Capella’s security architecture is based on industry best
practices for security and three key pillars: verify explicitly, least privilege, and platform
monitoring. You can learn more and get detailed security whitepapers and information
about compliance at our Trust Center.
Infrastructure security
The foundation of security in a cloud database is a hardened environment that
removes nonessential software, roles, and ports while leveraging an IaaS provider's
alerting and auditing services. Trusted and immutable operating system (OS) images
are used to protect the OS, with verification upon deployment and ongoing scanning
for vulnerabilities afterward. Additionally, end-to-end configuration is automated
via templates to ensure consistency. Monitoring is also in place to identify
potential misconfigurations.
Networking security
By default, the Data Plane only allows clusters to connect to trusted IP addresses that
have been defined within the Control Plane. Any attempted connection from an IP
address not in a cluster’s list of allowed IP addresses will be denied. With VPC peering,
traffic never crosses the public internet, which reduces threat vectors and DDoS
attacks. If you’re using AWS for your data plane, you can further enhance security by
using PrivateLink to contain traffic within the CSP backbone with unidirectional access.
Alternatively, you can set up Capella as a private service that functions as if it were
hosted directly within a team's Amazon VPC. This allows access to a specific service or
application, and only private endpoints can initiate a connection.
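The allowlist rule described above (deny any address that is not on the list) can be
sketched with Python's standard ipaddress module. The function name and the sample
addresses, which come from documentation-reserved ranges, are assumptions for the
example; in Capella the list itself is managed through the Control Plane.

```python
import ipaddress


def is_allowed(client_ip, allowlist):
    """Return True only if the client address falls inside an allowed entry."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(entry) for entry in allowlist)


# Hypothetical allowlist: one single host and one /24 range.
ALLOWED = ["203.0.113.10/32", "198.51.100.0/24"]

print(is_allowed("203.0.113.10", ALLOWED))    # listed host
print(is_allowed("198.51.100.77", ALLOWED))   # inside the allowed range
print(is_allowed("192.0.2.1", ALLOWED))       # not listed, so denied
```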
Access security
To bolster access security, Capella is designed so that the Control Plane and Data Plane
live in separate VPCs. Access to data is separate from access to the Control Plane, and
specific credentials must be established for application access. All admins, users, and
applications must authenticate in order to gain access and then be authorized with
specific access rights. Multi-factor authentication is possible and recommended.
Data security
Data is encrypted in transit and at rest. In transit, data is encrypted via TLS, which
cannot be turned off. If you want to extend data storage encryption within the database,
this can be done at the field level within JSON documents. Also, backup data is written
to encrypted disks using the cloud provider's native encryption process. Capella creates,
manages, and controls cryptographic keys using a CSP’s key management system
(e.g., KMS for AWS).
Vulnerability management
Capella protects against threats like brute force attacks, rate-limiting attacks, cross-
site request forgery, and more. Capella maintains centralized logs securely and alerts
Couchbase site reliability engineers (SREs) of operational concerns should they arise.
To reduce potential vulnerabilities, patching is automated and includes monitoring alerts
and management reviews. Couchbase has established a formal Incident Response Policy
to inform you in the event of a security-related incident.
App Services for Mobile and IoT
App Services also manages secure data access with role-based access control, providing
authentication for mobile users. These key capabilities in Capella are offered as a ready-
to-use service for mobile and IoT developers, making it faster and easier than ever to
build highly performant and reliable applications.
More importantly, when multiple Couchbase Lite clients are close to one another, but
have no internet, they can still do peer-to-peer syncing. This is a feature unique to
Couchbase that enables offline-first collaboration without the need for any central
control point.
App Services Architecture
[Figure: App Services architecture. Labels: Web UI; encrypted connections with mTLS
auth; TLS; App Endpoint 1, App Endpoint 2, App Endpoint 3]
When you create an App Service and associate it with a Couchbase Server cluster, you
are effectively extending or enabling it for data sync. A Couchbase Server cluster can
only be linked to one App Service.
An App Service can handle multiple client applications, each represented by an App
Endpoint. Conceptually, an App Endpoint represents the instance of your application on
the App Service. Each App Endpoint is backed by a server bucket. If you have multiple
applications, each will have its own App Endpoint.
Mobile, desktop, and web client apps can access and sync data by connecting to the
corresponding App Endpoint.
[Figure: App Endpoint connection points. Labels: Couchbase Cluster; App Services; App
Endpoint 1; Load Balancer; Admin REST URL and Metrics URL (admin credentials, locked IP
access); Public REST URL (app user credentials, secure public data access); WebSockets
URL (secure data sync)]
Secure public REST API
Applications can also access data securely over a public REST endpoint. This is useful
when there is reliable network connectivity and no need for offline storage, or when
the apps are running on hardware that doesn’t have local storage for running a local
embedded database like Couchbase Lite.
User Journey
[Figure: user journey. Recoverable labels: Prepare (security, set up auth provider);
step 5: Monitoring, Scaling, Reconfiguration, Decommissioning]
Prerequisite: App Services requires a Couchbase Capella server cluster. Follow these
steps to create a Capella server cluster and set up a bucket.
Prepare
Authentication provider
Authentication providers define how users are authenticated with App Services.
A default auth provider, Basic Auth, is selected for you during App Endpoint creation,
so you can skip this configuration if the default works for your application.
• Basic Auth—This is where the app users are authenticated using username and
password credentials that are Base64 encoded and passed in as part of the
authorization header of an HTTP request.
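The header format described above follows the standard HTTP Basic scheme, which the
following sketch builds with the Python standard library. The helper name is
hypothetical, and the credentials are the well-known sample pair from the HTTP
specification, not real ones.

```python
import base64


def basic_auth_header(username, password):
    """Build an HTTP Basic Authorization header from a username and password."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


headers = basic_auth_header("Aladdin", "open sesame")
print(headers)   # {'Authorization': 'Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=='}

# Base64 is an encoding, not encryption: anyone who sees the header can decode it.
decoded = base64.b64decode(headers["Authorization"].split(" ", 1)[1])
print(decoded.decode("utf-8"))
```

Because the encoding is reversible, this scheme depends on the connection itself being
encrypted.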
User management
With the exception of “Anonymous” mode, all client-side access must be authenticated
with suitable user credentials. The choice of how users (and roles) are created depends
on the Authentication Provider that is configured.
• Basic Auth—Users are created via the Capella web UI or via Admin REST endpoint.
Access control
Access control is implemented using the channel-based access control model of
Couchbase Mobile. Access control specifies who has access to what data. This is
specified via a JavaScript access control function. Read access control is at the
granularity of a document, while write access control is at the granularity of a field.
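Document-level read access through channels can be modeled in a few lines. In Capella
the access control function is written in JavaScript; the Python below is only a
conceptual stand-in with hypothetical names, and it covers read access alone. It
assumes that each document lists its channels in a "channels" field, which is a
convention chosen for this example.

```python
def access_function(doc):
    """Toy stand-in for the access control function: route a document to channels."""
    return set(doc.get("channels", []))


def readable_docs(user_channels, docs):
    """Return the IDs of documents that share a channel with the user."""
    granted = set(user_channels)
    return {doc_id for doc_id, doc in docs.items() if access_function(doc) & granted}


docs = {
    "report::1": {"title": "Q1 sales", "channels": ["sales"]},
    "report::2": {"title": "Payroll", "channels": ["hr"]},
    "report::3": {"title": "Town hall notes", "channels": ["sales", "hr"]},
    "draft::1": {"title": "Unrouted draft"},
}

print(sorted(readable_docs(["sales"], docs)))
print(sorted(readable_docs(["hr"], docs)))
print(sorted(readable_docs([], docs)))
```

A document that is routed to no channel is visible to no one in this model, which keeps
the default closed.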
Connect
After completing the security setup for the App Endpoint, unpause the App Endpoint
to bring it online. Once online, apps can be connected using any connection points
discussed earlier.
Operate
Once your App Service is operational, you can administer the App Service and App
Endpoints and change the configuration to meet the evolving needs of the apps.
Monitoring
Metrics dashboards provide insights into resource utilization of the App Service as well
as the operational state of the App Endpoints. These include stats such as the number
of documents read/written, error counts, number of active replications, etc.
Activity log
All key system events of type info, warning, and error are recorded in the activity center.
Users are also alerted to key events that may need attention, such as significantly high
memory utilization over an extended period of time.
On-demand scaling
To keep up with the evolving needs of the app, users can scale App Services horizontally
and/or vertically by changing the number of nodes and/or compute type.
Summary
• High-performance operations
Resources
• Couchbase Capella product page
• Capella Documentation
Modern customer experiences need a flexible database platform that can
power applications spanning from cloud to edge and everything in between.
Couchbase’s mission is to simplify how developers and architects develop, deploy
and consume modern applications wherever they are. We have reimagined the
database with our fast, flexible and affordable cloud database platform Capella,
allowing organizations to quickly build applications that deliver premium experiences
to their customers—all with best-in-class price performance. More than 30% of
the Fortune 100 trust Couchbase to power their modern applications.