KAAS Design

As an application developer, I want Confluent to provide me with a fully managed Kafka environment
(KaaS) so that I can point my current Kafka producer and consumer clients at KaaS and forget about
the complexities of operating a stateful system.

Functional requirements:
1. The system must support multiple tenants.
2. Clients can register on Confluent with initial information such as topic names, data retention
   policy and replication policy.
3. Clients can tune the provided initial config.
4. Clients can create a topic and publish data into it.
5. Clients can subscribe to a topic and read data from it.
6. Clients can provision more capacity for reads and writes.
7. Clients can view important metrics via dashboards.
8. Clients are metered and billed according to usage.

Assumptions:

High level architecture:

Components

Confluent front end:
The front-end systems act as a proxy fleet and are the entry point to the application. They are
responsible for routing requests to the appropriate service, authenticating users, rate limiting,
request validation and request deduplication.
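The rate limiting and deduplication duties of the front end can be illustrated with a minimal sketch. It assumes a token bucket per userKey and a client-supplied request id for deduplication; neither mechanism is specified in this design, and the class names, the `admit` method and the unbounded in-memory `seen` set are hypothetical simplifications.

```python
class TokenBucket:
    """Allows bursts up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, now: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = now

    def allow(self, now: float) -> bool:
        # Refill for the elapsed time, capped at the bucket capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class FrontEnd:
    """Hypothetical admission check: deduplicate first, then rate limit per tenant."""

    def __init__(self, rate: float, capacity: float):
        self._rate = rate
        self._capacity = capacity
        self._buckets = {}   # userKey -> TokenBucket
        self._seen = set()   # (userKey, request id) pairs already accepted

    def admit(self, user_key: str, request_id: str, now: float) -> str:
        if (user_key, request_id) in self._seen:
            return "duplicate"
        if user_key not in self._buckets:
            self._buckets[user_key] = TokenBucket(self._rate, self._capacity, now)
        if not self._buckets[user_key].allow(now):
            return "rate_limited"
        self._seen.add((user_key, request_id))
        return "accepted"
```

Keeping one bucket per userKey means a noisy tenant exhausts only its own budget. A production front end would also need to expire old request ids and share this state across the proxy fleet.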

Control plane:
The system handles roughly two types of requests: the regular requests to publish and subscribe, and
the occasional requests that change meta information about the system, such as registering topics or
provisioning more hardware. The latter are handled by the components in the control plane and the
former by the components in the data plane (addressed in the next section). Because the two request
types have different availability and scalability requirements, a clean separation is maintained
between them. Background activities such as resource cleanup and automation are also orchestrated by
the control plane. The necessary infrastructure can be expressed in YAML files so that resources are
easy to reproduce.
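The split between the two request types can be sketched as a small routing function. The operation names are the ones listed under APIs in this document; the `route` function and the plane labels it returns are hypothetical and only illustrate the separation.

```python
# Operation names come from the APIs section; the plane labels are illustrative.
CONTROL_PLANE_OPS = frozenset({"createTopic", "updateConfiguration", "deleteTopic"})
DATA_PLANE_OPS = frozenset({"publishData", "batchPublish", "readData"})


def route(operation: str) -> str:
    """Return the plane that should handle the given API operation."""
    if operation in CONTROL_PLANE_OPS:
        return "control-plane"
    if operation in DATA_PLANE_OPS:
        return "data-plane"
    raise ValueError(f"unknown operation: {operation}")
```

Because the routing decision depends only on the operation name, each plane can be scaled and deployed independently behind the front end.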

Data plane:
The data plane hosts the components that are on the primary paths for publishing and subscribing to
messages: the data storage, data ingestion, key management and data serving components.

Detailed design

Key decisions
Separation of control activities from the core activities
Control activities such as provisioning more hardware, registering new topics, registering new
subscriptions, deleting topics and cleanup activities that address fault tolerance are separated from
the core publish and subscribe paths because they have different scalability and availability
requirements. Background activities such as cluster cleanup, machine health monitoring, security
patching and disaster recovery are orchestrated by the control plane’s infra management service.

Security and compliance
All communication channels between clients and KAAS are secured with TLS. Data stored on the file
system is encrypted with keys obtained from the key management service. Keys are rotated
periodically.
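Periodic key rotation can be sketched as a versioned key ring. This is an illustration under assumptions: the `KeyRing` class, its method names and the locally generated 32-byte keys are hypothetical stand-ins, since in this design keys would come from the key management service and the rotation interval is not specified.

```python
import secrets


class KeyRing:
    """Versioned data keys; old versions are kept so existing data stays readable."""

    def __init__(self, rotation_interval_s: float, now: float):
        self._interval = rotation_interval_s
        self._keys = {}       # version -> key bytes
        self._current = 0
        self._rotated_at = now
        self._add_key(now)

    def _add_key(self, now: float) -> None:
        # Stand-in for requesting a fresh key from the key management service.
        self._current += 1
        self._keys[self._current] = secrets.token_bytes(32)
        self._rotated_at = now

    def current(self, now: float):
        """Return (version, key) for new writes, rotating first if the key is too old."""
        if now - self._rotated_at >= self._interval:
            self._add_key(now)
        return self._current, self._keys[self._current]

    def lookup(self, version: int) -> bytes:
        """Return the key for data that was written under an older version."""
        return self._keys[version]
```

Storing the key version next to each encrypted segment lets readers call `lookup` for old data while all new writes use the latest key.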

Multi tenant clients
Clients can choose either to host their data in a separate VPC or to share infrastructure with other
tenants. When infrastructure is shared, additional reliability is achieved with shuffle sharding:
errors and failures are contained to part of a client’s data or to a specific client.
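Shuffle sharding can be sketched as follows: each tenant is deterministically assigned a small subset of the shared brokers, so two tenants rarely share their entire subset. The function name, the broker names and the SHA-256 based seeding are assumptions made for illustration, not part of this design.

```python
import hashlib
import random


def shuffle_shard(tenant_id: str, brokers: list[str], shard_size: int) -> list[str]:
    """Pick a deterministic, tenant-specific subset of `shard_size` brokers."""
    if not 0 < shard_size <= len(brokers):
        raise ValueError("shard_size must be between 1 and the number of brokers")
    # Seed a private RNG from the tenant id so the same tenant always gets the same shard.
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    # Sort first so the result does not depend on the order of the input list.
    return sorted(rng.sample(sorted(brokers), shard_size))
```

With 8 brokers and a shard size of 2 there are 28 possible shards, so a fault triggered by one tenant fully overlaps with another tenant's shard only when both happen to draw the same pair.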

Redundancy
Redundancy is achieved by provisioning additional backup machines. Data is replicated to all replica
machines, and by default replication can be enabled across up to 3 availability zones. When there
are multiple consumers or read throughput increases, additional read replicas can be provisioned.
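One way to spread replicas across availability zones is sketched below. It assumes data is split into numbered partitions and that each replica lands in a different zone; the function name, the zone and broker names and the round-robin choice are hypothetical. Only the limit of 3 zones comes from the text above.

```python
MAX_ZONES = 3  # replication across up to 3 availability zones, per the text above


def place_replicas(partition: int, brokers_by_zone: dict[str, list[str]],
                   replication_factor: int) -> list[tuple[str, str]]:
    """Return (zone, broker) pairs, one replica per distinct zone."""
    zones = sorted(z for z, brokers in brokers_by_zone.items() if brokers)
    if not 1 <= replication_factor <= min(MAX_ZONES, len(zones)):
        raise ValueError("replication factor exceeds the available zones")
    placement = []
    for i in range(replication_factor):
        # Rotate the starting zone by partition so load spreads across zones.
        zone = zones[(partition + i) % len(zones)]
        brokers = brokers_by_zone[zone]
        placement.append((zone, brokers[partition % len(brokers)]))
    return placement
```

Placing each replica in a different zone means the loss of a single zone removes at most one copy of any partition.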

APIs
Control plane APIs:
1. createTopic
   a. Request – userKey, topicName
   b. Response – HTTP response code
2. updateConfiguration
   a. Request – userKey, config to change (throughput, data retention) and new value
   b. Response – HTTP response code
3. deleteTopic
   a. Request – userKey, topicName
   b. Response – HTTP response code
Data plane APIs:
1. publishData
   a. Request – userKey, topicName, payload
   b. Response – HTTP response code
2. batchPublish
   a. Request – userKey, topicName, batch payload
   b. Response – HTTP response code
3. readData
   a. Request – userKey, topicName, offset
   b. Response – HTTP response code
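The data plane calls listed above can be illustrated with an in-memory stand-in. This is a sketch under assumptions: the API list only specifies an HTTP response code, so the returned offsets and records, the `max_records` parameter and the implicit creation of topics on first publish are all hypothetical.

```python
from collections import defaultdict


class InMemoryDataPlane:
    """Toy append-only log per (userKey, topicName); not a real broker."""

    def __init__(self):
        self._logs = defaultdict(list)

    def publish_data(self, user_key: str, topic_name: str, payload: bytes) -> int:
        """Append one payload and return the offset it was stored at."""
        log = self._logs[(user_key, topic_name)]
        log.append(payload)
        return len(log) - 1

    def batch_publish(self, user_key: str, topic_name: str, payloads: list) -> list:
        """Append several payloads in order and return their offsets."""
        return [self.publish_data(user_key, topic_name, p) for p in payloads]

    def read_data(self, user_key: str, topic_name: str, offset: int,
                  max_records: int = 100) -> list:
        """Return (offset, payload) pairs starting at `offset`."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        log = self._logs.get((user_key, topic_name), [])
        return list(enumerate(log[offset:offset + max_records], start=offset))
```

Keying each log by both userKey and topicName keeps tenants isolated even when two tenants pick the same topic name.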

Data model

ClientInfo Table
1. Client id
2. Topic name
3. Data retention time
4. Other meta information

Subscriptions Table
1. Subscription Id
2. Topic name
3. Consumer Client Id

ConsumerInfo Table
1. Subscription Id
2. Offset

Data (Embedded database in broker machines)
1. Offset
2. Payload
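The tables above could be expressed as a relational schema along the following lines. This is only a sketch using SQLite: the column types, the primary keys, the column names `consumer_offset` and `record_offset`, and the choice to create all tables in one database are assumptions; in this design the Data table lives in an embedded database on the broker machines.

```python
import sqlite3

# Column types and keys are assumptions; the source lists only the field names.
SCHEMA = """
CREATE TABLE client_info (
    client_id              TEXT NOT NULL,
    topic_name             TEXT NOT NULL,
    data_retention_seconds INTEGER NOT NULL,
    other_meta             TEXT,
    PRIMARY KEY (client_id, topic_name)
);
CREATE TABLE subscriptions (
    subscription_id    TEXT PRIMARY KEY,
    topic_name         TEXT NOT NULL,
    consumer_client_id TEXT NOT NULL
);
CREATE TABLE consumer_info (
    subscription_id TEXT PRIMARY KEY REFERENCES subscriptions(subscription_id),
    consumer_offset INTEGER NOT NULL
);
CREATE TABLE data (
    record_offset INTEGER PRIMARY KEY,
    payload       BLOB NOT NULL
);
"""


def open_metadata_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the sketch schema in a new SQLite database and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Keeping the consumer offset in its own table lets a consumer advance its position with a single-row update that never touches the subscription record.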