| 1 | +# GEP-3539: ClusterIP Gateway - Gateway API to Expose Pods on Cluster-Internal IP Address |
| 2 | + |
| 3 | +* Issue: [#3539](https://github.com/kubernetes-sigs/gateway-api/issues/3539) |
| 4 | +* Status: Experimental |
| 5 | + |
| 6 | +## TLDR |
| 7 | + |
| 8 | +Gateway API enables advanced traffic routing and can be used to expose a |
| 9 | +logical set of pods on a single IP address within a cluster. It can be seen |
| 10 | +as the next-generation ClusterIP, providing more flexibility and composability |
| 11 | +than the Service API. This comes at the expense of some additional configuration |
| 12 | +and management overhead. |
| 13 | + |
| 14 | +## Goals |
| 15 | + |
| 16 | +* Define Gateway API usage to accomplish ClusterIP Service style behavior |
| 17 | +* Propose DNS layout and record format for ClusterIP Gateway |
| 18 | +* Extend the use of Gateway API to provide functionality equivalent to |
| 19 | + NodePort and LoadBalancer Services |
| 20 | + |
| 21 | +## Non-Goals |
| 22 | + |
| 23 | +* Make significant changes to Gateway API |
| 24 | +* Provide a path for existing ClusterIP Services in a cluster to migrate to |
| 25 | + the Gateway API model |
| 26 | + |
| 27 | +## API Changes |
| 28 | + |
| 29 | +* EndpointSelector is recognized as a backend |
| 30 | +* DNS record format for ClusterIP Gateways |
| 31 | + |
| 32 | +## Introduction |
| 33 | + |
| 34 | +Gateway API provides a generic and composable model for defining L4 and L7 |
| 35 | +routing in Kubernetes. Very simply, it describes how to get traffic into pods. |
| 36 | +A ClusterIP Service provides similar functionality: an ingress point for routing traffic |
| 37 | +into pods. As Gateway API has evolved, there have been discussions around whether |
| 38 | +it can be a substitute for the increasingly complex and overloaded Service API. This |
| 39 | +document aims to describe what this could look like in practice, with a focus on |
| 40 | +ClusterIP and a brief commentary on how the design can be extended to |
| 41 | +accommodate LoadBalancer and NodePort Services. |
| 42 | + |
| 43 | +## Overview |
| 44 | + |
| 45 | +Gateway API can be thought of as decomposing Service API into multiple separable |
| 46 | +components that allow for definition of the ClusterIP address and listener configuration |
| 47 | +(Gateway resource), implementation specifics and common configuration (GatewayClass |
| 48 | +resource), and routing traffic to backends (Route resource). |
| 49 | + |
| 50 | +### Limitations of Service API |
| 51 | + |
| 52 | +Service API maintainability, evolvability, and complexity concerns have been discussed in the |
| 53 | +past (see https://www.youtube.com/watch?v=Oslwx3hj2Eg). Beyond those, we ran into additional |
| 54 | +practical concerns that rendered Service API insufficient for the needs at hand. |
| 55 | + |
| 56 | +Service IPs can only be assigned out of the ServiceCIDR range configured for the API server. |
| 57 | +While Kubernetes 1.31 added a Beta feature that allows extending Service IP ranges, |
| 58 | +there are use cases where multi-NIC pods (pods with multiple network interfaces) require |
| 59 | +the flexibility of specifying different ServiceCIDR ranges for the ClusterIP Services |
| 60 | +that correspond to the different networks. Strict traffic splitting and network |
| 61 | +isolation requirements demand non-overlapping ServiceCIDR ranges for per-network ClusterIP |
| 62 | +Service groups. Because Service definition and IP address allocation are tightly |
| 63 | +coupled in the API server, it is not possible to achieve this model with the current Service API |
| 64 | +without resorting to inelegant, kludgy implementations. |
| 65 | + |
| 66 | +Gateway API also satisfies, in a user-friendly and uncomplicated manner, the need for advanced |
| 67 | +routing and load balancing capabilities in order to enable canary rollouts, weighted traffic |
| 68 | +distribution, and isolation of access and configuration. |
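
As a rough illustration of weighted traffic distribution, a canary rollout could be expressed with the standard HTTPRoute `weight` field. This is a sketch rather than part of the proposal: the Gateway name `example-cluster-ip-gateway` and both Service names are hypothetical.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: canary-route                    # hypothetical name
spec:
  parentRefs:
  - name: example-cluster-ip-gateway    # hypothetical ClusterIP Gateway
  rules:
  - backendRefs:
    # Weights are proportional: about 90% of requests go to the stable backend...
    - name: app-stable                  # hypothetical Service
      port: 8080
      weight: 90
    # ...and about 10% go to the canary.
    - name: app-canary                  # hypothetical Service
      port: 8080
      weight: 10
```

Shifting more traffic to the canary only requires changing the two `weight` values; the Gateway and its address stay untouched.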
| 69 | + |
| 70 | +### Service Model to Gateway API Model |
| 71 | + |
| 72 | + |
| 73 | + |
| 74 | +### EndpointSelector as Backend |
| 75 | + |
| 76 | +A Route can forward traffic to the endpoints selected via selector rules defined in EndpointSelector. |
| 77 | +While Service is the default resource kind of the referent in backendRef, EndpointSelector is |
| 78 | +suggested as an example of a custom resource that implementations could provide to attach pods (or |
| 79 | +potentially other resource kinds) directly to a Route via backendRef. |
| 80 | + |
| 81 | +```yaml |
| 82 | +{% include 'standard/clusterip-gateway/tcproute-with-endpointselector.yaml' %} |
| 83 | +``` |
| 84 | + |
| 85 | +The EndpointSelector object is defined as follows. It allows the user to specify which endpoints |
| 86 | +should be targeted for the Route. |
| 87 | + |
| 88 | +```yaml |
| 89 | +{% include 'standard/clusterip-gateway/endpointselector.yaml' %} |
| 90 | +``` |
| 91 | + |
| 92 | +To allow more granular control over traffic routing, there have been discussions around adding |
| 93 | +support for using Kubernetes resources besides Service (or external endpoints) directly as backendRefs. |
| 94 | +Gateway API allows for this flexibility, so having a generic EndpointSelector resource supported as a |
| 95 | +backendRef would be a good evolutionary step. |
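
For readers viewing this without the included files, the following is only a guess at the general shape of a Route that targets an EndpointSelector. The included YAML files remain authoritative. The API group `example.com`, the resource names, and the entire EndpointSelector schema (including `spec.selector.matchLabels`) are assumptions made for illustration.

```yaml
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TCPRoute
metadata:
  name: tcp-route                       # hypothetical name
spec:
  parentRefs:
  - name: example-cluster-ip-gateway    # hypothetical Gateway
  rules:
  - backendRefs:
    # group/kind point at a custom resource instead of the default Service.
    - group: example.com                # hypothetical API group
      kind: EndpointSelector
      name: front-end-pods
      port: 8080
---
# Hypothetical custom resource; this schema is an assumption, not an upstream API.
apiVersion: example.com/v1alpha1
kind: EndpointSelector
metadata:
  name: front-end-pods
spec:
  selector:
    matchLabels:
      app: frontend
```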
| 96 | + |
| 97 | +### User Journey |
| 98 | + |
| 99 | +The infrastructure provider supplies a GatewayClass corresponding to the type of service-like behavior to |
| 100 | +be supported. |
| 101 | + |
| 102 | +Below is an example of a GatewayClass for ClusterIP support: |
| 103 | +```yaml |
| 104 | +{% include 'standard/clusterip-gateway/clusterip-gatewayclass.yaml' %} |
| 105 | +``` |
| 106 | + |
| 107 | +The user must then create a Gateway in order to configure and enable the behavior as per their intent: |
| 108 | +```yaml |
| 109 | +{% include 'standard/clusterip-gateway/clusterip-gateway.yaml' %} |
| 110 | +``` |
| 111 | + |
| 112 | +By default, IP address(es) from a pool specified by a CIDR block will be assigned unless a static IP is |
| 113 | +configured in the _addresses_ field as shown above. The CIDR block may be configured using a custom resource. |
| 114 | +Subject to further discussion, it may make sense to have a GatewayCIDR resource available upstream to |
| 115 | +specify an IP address range for Gateway IP allocation. |
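
A minimal sketch of what the GatewayClass and a Gateway with a static address might look like, using only standard Gateway API fields. The class name `cluster-ip`, the controller name, the namespace, and the IP address are all hypothetical values chosen for illustration.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: cluster-ip                      # hypothetical class name
spec:
  controllerName: example.com/cluster-ip-controller   # hypothetical controller
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: example-cluster-ip-gateway      # hypothetical name
  namespace: my-namespace
spec:
  gatewayClassName: cluster-ip
  # Optional static address; omit to have one assigned from the configured pool.
  addresses:
  - type: IPAddress
    value: 10.12.0.15                   # hypothetical address
  listeners:
  - name: example-service
    protocol: TCP
    port: 8080
    allowedRoutes:
      namespaces:
        from: Same                      # only Routes in the Gateway's namespace
      kinds:
      - kind: TCPRoute
```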
| 116 | + |
| 117 | +Finally, the specific Route and EndpointSelector resources must be created in order to set up the backend |
| 118 | +pods for the configured ClusterIP. |
| 119 | +```yaml |
| 120 | +{% include 'standard/clusterip-gateway/customroute.yaml' %} |
| 121 | +``` |
| 122 | + |
| 123 | +### Backends on Listeners |
| 124 | + |
| 125 | +As seen above, Gateway API requires at least three CRs to be defined. This introduces some complexity. |
| 126 | +GEP-1713 proposes the addition of a ListenerSet resource to allow sets of listeners to attach to a Gateway. |
| 127 | +As a part of discussions around this topic, the idea of directly adding backendRefs to listeners has come |
| 128 | +up. Allowing backendRefs directly on the listeners eliminates the need to have Route objects for simple |
| 129 | +cases. More complex traffic splitting and advanced load balancing cases can still use Route attachments via |
| 130 | +allowedRoutes. |
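
To make the idea concrete, a listener carrying its own backend might look roughly like the sketch below. This is purely illustrative: Gateway API listeners have no `backendRefs` field today, so this manifest would not validate against the current API. The names and the `example.com` group are hypothetical.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: example-cluster-ip-gateway      # hypothetical name
spec:
  gatewayClassName: cluster-ip          # hypothetical class name
  listeners:
  - name: example-service
    protocol: TCP
    port: 8080
    # HYPOTHETICAL field: not part of Gateway API; shown only to illustrate
    # the idea of skipping the Route object in simple cases.
    backendRefs:
    - group: example.com                # hypothetical API group
      kind: EndpointSelector
      name: front-end-pods
      port: 8080
```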
| 131 | + |
| 132 | +### DNS |
| 133 | + |
| 134 | +ClusterIP Gateways in the cluster need to have consistent DNS names assigned to allow ClusterIP lookup by |
| 135 | +name rather than IP address. DNS A and/or AAAA record creation needs to happen when Kubernetes publishes |
| 136 | +information about Gateways, in a manner similar to ClusterIP Service creation behavior. DNS nameservers |
| 137 | +in pods’ /etc/resolv.conf need to be programmed accordingly by kubelet. The proposed record name format is: |
| 138 | + |
| 139 | +``` |
| 140 | +<name of gateway>.<gateway-namespace>.gw.cluster.local |
| 141 | +``` |
| 142 | + |
| 143 | +This results in the following search option entries in Pods’ /etc/resolv.conf: |
| 144 | +``` |
| 145 | +search <ns>.gw.cluster.local gw.cluster.local cluster.local |
| 146 | +``` |
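
One way to picture the effect of these search entries is the existing Pod `dnsConfig.searches` field, which merges extra search domains into those generated by the pod's DNS policy. The sketch below assumes a namespace called `my-namespace` and the cluster domain `cluster.local`. It does not create any Gateway DNS records, which would still have to be published by the cluster DNS implementation.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dns-example                     # hypothetical name
  namespace: my-namespace
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9
  dnsConfig:
    # Appended to the search list that the DNS policy already generates.
    searches:
    - my-namespace.gw.cluster.local
    - gw.cluster.local
```

With such entries in place, a lookup of a short name like `example-cluster-ip-gateway` would also be attempted as `example-cluster-ip-gateway.my-namespace.gw.cluster.local`.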
| 147 | + |
| 148 | +### Cross-namespace References |
| 149 | + |
| 150 | +Gateway API allows for Routes in different namespaces to attach to the Gateway. |
| 151 | + |
| 152 | +When modeling ClusterIP service networking, the simplest recommendation might be to keep the Gateway and its Routes |
| 153 | +within the same namespace. While cross-namespace routing would work and allow for evolved functionality, |
| 154 | +it may make supporting certain cases tricky. One specific example is pod DNS resolution |
| 155 | +in the following format: |
| 156 | + |
| 157 | +``` |
| 158 | +pod-ipv4-address.gateway-name.my-namespace.gw.cluster-domain.example |
| 159 | +``` |
| 160 | + |
| 161 | +If Gateway and Routes (and hence the backing pods) are in different namespaces, there arises ambiguity in |
| 162 | +whether and how to support this pod DNS resolution format. |
| 163 | + |
| 164 | +## LoadBalancer and NodePort Services |
| 165 | + |
| 166 | +Extending the concept further to LoadBalancer and NodePort type services follows a similar pattern. The idea |
| 167 | +is to have a GatewayClass corresponding to each type of service networking behavior that needs to be modeled |
| 168 | +and supported. |
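
The one-class-per-behavior pattern could be sketched as follows. The class names and controller names are hypothetical; the proposal does not mandate these identifiers.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: node-port                       # hypothetical class name
spec:
  controllerName: example.com/node-port-controller       # hypothetical controller
---
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: load-balancer                   # hypothetical class name
spec:
  controllerName: example.com/load-balancer-controller   # hypothetical controller
```

A Gateway then selects the desired behavior simply by setting `gatewayClassName` to one of these classes.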
| 169 | + |
| 170 | + |
| 171 | + |
| 172 | +Note that Gateway API allows flexibility and clear separation of concerns so that one would not need to |
| 173 | +configure cluster-ip and node-port when configuring a load-balancer. |
| 174 | + |
| 175 | +But for completeness, the case shown below demonstrates how load balancer functionality analogous to |
| 176 | +LoadBalancer Service API can be achieved using Gateway API. |
| 177 | + |
| 178 | + |
| 179 | + |
| 180 | +## Additional Service API Features |
| 181 | + |
| 182 | +Services natively provide additional features as listed below (not an exhaustive list). Gateway API can be |
| 183 | +extended to provide some of these features natively, while others may be left up to the specifics of |
| 184 | +implementations. |
| 185 | + |
| 186 | +| Feature | Service API options | Gateway API possibilities | |
| 187 | +|---|---|---| |
| 188 | +| sessionAffinity | ClientIP <br /> None | Route level | |
| 189 | +| allocateLoadBalancerNodePorts | True <br /> False | Not supported for ClusterIP Gateway <br /> Supported for LoadBalancer Gateway | |
| 190 | +| externalIPs | List of externalIPs for service | Not supported? | |
| 191 | +| externalTrafficPolicy | Local <br /> Cluster | Supported for LB Gateways only, Route level | |
| 192 | +| internalTrafficPolicy | Local <br /> Cluster | Supported for ClusterIP Gateways only, Route level | |
| 193 | +| ipFamily | IPv4 <br /> IPv6 | Route level | |
| 194 | +| publishNotReadyAddresses | True <br /> False | Route or EndpointSelector level | |
| 195 | +| ClusterIP (headless service) | IPAddress <br /> None | GatewayClass definition for Headless Service type | |
| 196 | +| externalName | External name reference <br /> (e.g. DNS CNAME) | GatewayClass definition for ExternalName Service type | |
| 197 | + |
| 198 | +## References |
| 199 | + |
| 200 | +* [Original Doc](https://docs.google.com/document/d/1N-C-dBHfyfwkKufknwKTDLAw4AP2BnJlnmx0dB-cC4U/edit) |