Commit 0743da8

WG serving proposal
1 parent 2d34a98 commit 0743da8

14 files changed

+200
-0
lines changed

OWNERS_ALIASES

+4
@@ -142,6 +142,10 @@ aliases:
142142
- JimBugwadia
143143
- poonam-lamba
144144
- sudermanjr
145+
wg-serving-leads:
146+
- ArangoGutierrez
147+
- SergeyKanzhelev
148+
- terrytangyuan
145149
wg-structured-logging-leads:
146150
- mengjiao-liu
147151
- pohly

liaisons.md

+1
@@ -60,6 +60,7 @@ members will assume one of the departing members groups.
6060
| [WG Device Management](wg-device-management/README.md) | Patrick Ohly (**[@pohly](https://github.com/pohly)**) |
6161
| [WG LTS](wg-lts/README.md) | Nabarun Pal (**[@palnabarun](https://github.com/palnabarun)**) |
6262
| [WG Policy](wg-policy/README.md) | Patrick Ohly (**[@pohly](https://github.com/pohly)**) |
63+
| [WG Serving](wg-serving/README.md) | Maciej Szulik (**[@soltysh](https://github.com/soltysh)**) |
6364
| [WG Structured Logging](wg-structured-logging/README.md) | Nabarun Pal (**[@palnabarun](https://github.com/palnabarun)**) |
6465
| [Committee Code of Conduct](committee-code-of-conduct/README.md) | Nabarun Pal (**[@palnabarun](https://github.com/palnabarun)**) |
6566
| [Committee Security Response](committee-security-response/README.md) | Stephen Augustus (**[@justaugustus](https://github.com/justaugustus)**) |

sig-apps/README.md

+1
@@ -59,6 +59,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.
5959
The following [working groups][working-group-definition] are sponsored by sig-apps:
6060
* [WG Batch](/wg-batch)
6161
* [WG Data Protection](/wg-data-protection)
62+
* [WG Serving](/wg-serving)
6263

6364

6465
## Subprojects

sig-architecture/README.md

+1
@@ -60,6 +60,7 @@ The following [working groups][working-group-definition] are sponsored by sig-ar
6060
* [WG Device Management](/wg-device-management)
6161
* [WG LTS](/wg-lts)
6262
* [WG Policy](/wg-policy)
63+
* [WG Serving](/wg-serving)
6364
* [WG Structured Logging](/wg-structured-logging)
6465

6566

sig-autoscaling/README.md

+1
@@ -48,6 +48,7 @@ The Chairs of the SIG run operations and processes governing the SIG.
4848
The following [working groups][working-group-definition] are sponsored by sig-autoscaling:
4949
* [WG Batch](/wg-batch)
5050
* [WG Device Management](/wg-device-management)
51+
* [WG Serving](/wg-serving)
5152

5253

5354
## Subprojects

sig-instrumentation/README.md

+1
@@ -53,6 +53,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.
5353
## Working Groups
5454

5555
The following [working groups][working-group-definition] are sponsored by sig-instrumentation:
56+
* [WG Serving](/wg-serving)
5657
* [WG Structured Logging](/wg-structured-logging)
5758

5859

sig-list.md

+1
@@ -67,6 +67,7 @@ When the need arises, a [new SIG can be created](sig-wg-lifecycle.md)
6767
|[Device Management](wg-device-management/README.md)|[device-management](https://github.com/kubernetes/kubernetes/labels/wg%2Fdevice-management)|* Architecture<br>* Autoscaling<br>* Network<br>* Node<br>* Scheduling<br>|* [John Belamaric](https://github.com/johnbelamaric), Google<br>* [Kevin Klues](https://github.com/klueska), NVIDIA<br>* [Patrick Ohly](https://github.com/pohly), Intel<br>|* [Slack](https://kubernetes.slack.com/messages/wg-device-management)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-device-management)|* Regular WG Meeting: [Tuesdays at 8:30 PT (Pacific Time) (biweekly)](TBD)<br>
6868
|[LTS](wg-lts/README.md)|[lts](https://github.com/kubernetes/kubernetes/labels/wg%2Flts)|* Architecture<br>* Cluster Lifecycle<br>* K8s Infra<br>* Release<br>* Security<br>* Testing<br>|* [Jeremy Rickard](https://github.com/jeremyrickard), Microsoft<br>* [Jordan Liggitt](https://github.com/liggitt), Google<br>* [Micah Hausler](https://github.com/micahhausler), Amazon<br>|* [Slack](https://kubernetes.slack.com/messages/wg-lts)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-lts)|* Regular WG Meeting: [Tuesdays at 07:00 PT (Pacific Time) (biweekly)](https://zoom.us/j/92480197536?pwd=dmtSMGJRQmNYYTIyZkFlQ25JRngrdz09)<br>
6969
|[Policy](wg-policy/README.md)|[policy](https://github.com/kubernetes/kubernetes/labels/wg%2Fpolicy)|* Architecture<br>* Auth<br>* Multicluster<br>* Network<br>* Node<br>* Scheduling<br>* Storage<br>|* [Jim Bugwadia](https://github.com/JimBugwadia), Kyverno/Nirmata<br>* [Poonam Lamba](https://github.com/poonam-lamba), Google<br>* [Andy Suderman](https://github.com/sudermanjr), Fairwinds<br>|* [Slack](https://kubernetes.slack.com/messages/wg-policy)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-policy)|* Regular WG Meeting: [Wednesdays at 8:00 PT (Pacific Time) (semimonthly)](https://zoom.us/j/7375677271)<br>
70+
|[Serving](wg-serving/README.md)|[serving](https://github.com/kubernetes/kubernetes/labels/wg%2Fserving)|* Apps<br>* Architecture<br>* Autoscaling<br>* Instrumentation<br>* Network<br>* Node<br>* Scheduling<br>* Storage<br>|* [Eduardo Arango](https://github.com/ArangoGutierrez), NVIDIA<br>* [Sergey Kanzhelev](https://github.com/SergeyKanzhelev), Google<br>* [Yuan Tang](https://github.com/terrytangyuan), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/wg-serving)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-serving)|* WG Serving weekly meeting ([Calendar](https://calendar.google.com/calendar/embed?src=e896b769743f3877edfab2d4c6a14132b2aa53287021e9bbf113cab676da54ba%40group.calendar.google.com)): [Wednesdays at 9:00 AM PT (Pacific Time) (weekly)](https://zoom.us/j/93517402529?pwd=RnkwUUQ4L3J2QmNYYlNBcnZGbXcvQT09)<br>
7071
|[Structured Logging](wg-structured-logging/README.md)|[structured-logging](https://github.com/kubernetes/kubernetes/labels/wg%2Fstructured-logging)|* API Machinery<br>* Architecture<br>* Cloud Provider<br>* Instrumentation<br>* Network<br>* Node<br>* Scheduling<br>* Storage<br>|* [Mengjiao Liu](https://github.com/mengjiao-liu), DaoCloud<br>* [Patrick Ohly](https://github.com/pohly), Intel<br>|* [Slack](https://kubernetes.slack.com/messages/wg-structured-logging)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-structured-logging)|
7172

7273
### Committees

sig-network/README.md

+1
@@ -74,6 +74,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.
7474
The following [working groups][working-group-definition] are sponsored by sig-network:
7575
* [WG Device Management](/wg-device-management)
7676
* [WG Policy](/wg-policy)
77+
* [WG Serving](/wg-serving)
7778
* [WG Structured Logging](/wg-structured-logging)
7879

7980

sig-node/README.md

+1
@@ -55,6 +55,7 @@ The following [working groups][working-group-definition] are sponsored by sig-no
5555
* [WG Batch](/wg-batch)
5656
* [WG Device Management](/wg-device-management)
5757
* [WG Policy](/wg-policy)
58+
* [WG Serving](/wg-serving)
5859
* [WG Structured Logging](/wg-structured-logging)
5960

6061

sig-scheduling/README.md

+1
@@ -65,6 +65,7 @@ The following [working groups][working-group-definition] are sponsored by sig-sc
6565
* [WG Batch](/wg-batch)
6666
* [WG Device Management](/wg-device-management)
6767
* [WG Policy](/wg-policy)
68+
* [WG Serving](/wg-serving)
6869
* [WG Structured Logging](/wg-structured-logging)
6970

7071

sig-storage/README.md

+1
@@ -58,6 +58,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.
5858
The following [working groups][working-group-definition] are sponsored by sig-storage:
5959
* [WG Data Protection](/wg-data-protection)
6060
* [WG Policy](/wg-policy)
61+
* [WG Serving](/wg-serving)
6162
* [WG Structured Logging](/wg-structured-logging)
6263

6364

sigs.yaml

+44
@@ -3481,6 +3481,50 @@ workinggroups:
34813481
liaison:
34823482
github: pohly
34833483
name: Patrick Ohly
3484+
- dir: wg-serving
3485+
name: Serving
3486+
mission_statement: >
3487+
Discuss and enhance the support of inference serving for accelerated workloads
3488+
in Kubernetes. Make Kubernetes the natural choice for hosting production inference
3489+
reliably, and improve all serving workloads along the way.
3490+
3491+
charter_link: charter.md
3492+
stakeholder_sigs:
3493+
- Apps
3494+
- Architecture
3495+
- Autoscaling
3496+
- Instrumentation
3497+
- Network
3498+
- Node
3499+
- Scheduling
3500+
- Storage
3501+
label: serving
3502+
leadership:
3503+
chairs:
3504+
- github: ArangoGutierrez
3505+
name: Eduardo Arango
3506+
company: NVIDIA
3507+
- github: SergeyKanzhelev
3508+
name: Sergey Kanzhelev
3509+
company: Google
3510+
- github: terrytangyuan
3511+
name: Yuan Tang
3512+
company: Red Hat
3513+
meetings:
3514+
- description: WG Serving weekly meeting ([Calendar](https://calendar.google.com/calendar/embed?src=e896b769743f3877edfab2d4c6a14132b2aa53287021e9bbf113cab676da54ba%40group.calendar.google.com))
3515+
day: Wednesday
3516+
time: 9:00 AM
3517+
tz: PT (Pacific Time)
3518+
frequency: weekly
3519+
url: https://zoom.us/j/93517402529?pwd=RnkwUUQ4L3J2QmNYYlNBcnZGbXcvQT09
3520+
archive_url: https://docs.google.com/document/d/1aExJFtaLnO-TM6_2uILgI8NI0IjOm7FcwLABBKEMEo0/edit
3521+
recordings_url: https://www.youtube.com/playlist?list=TODO
3522+
contact:
3523+
slack: wg-serving
3524+
mailing_list: https://groups.google.com/a/kubernetes.io/g/wg-serving
3525+
liaison:
3526+
github: soltysh
3527+
name: Maciej Szulik
34843528
- dir: wg-structured-logging
34853529
name: Structured Logging
34863530
mission_statement: >

wg-serving/README.md

+44
@@ -0,0 +1,44 @@
1+
<!---
2+
This is an autogenerated file!
3+
4+
Please do not edit this file directly, but instead make changes to the
5+
sigs.yaml file in the project root.
6+
7+
To understand how this file is generated, see https://git.k8s.io/community/generator/README.md
8+
--->
9+
# Serving Working Group
10+
11+
Discuss and enhance the support of inference serving for accelerated workloads in Kubernetes. Make Kubernetes the natural choice for hosting production inference reliably, and improve all serving workloads along the way.
12+
13+
The [charter](charter.md) defines the scope and governance of the Serving Working Group.
14+
15+
## Stakeholder SIGs
16+
* [SIG Apps](/sig-apps)
17+
* [SIG Architecture](/sig-architecture)
18+
* [SIG Autoscaling](/sig-autoscaling)
19+
* [SIG Instrumentation](/sig-instrumentation)
20+
* [SIG Network](/sig-network)
21+
* [SIG Node](/sig-node)
22+
* [SIG Scheduling](/sig-scheduling)
23+
* [SIG Storage](/sig-storage)
24+
25+
## Meetings
26+
*Joining the [mailing list](https://groups.google.com/a/kubernetes.io/g/wg-serving) for the group will typically add invites for the following meetings to your calendar.*
27+
* WG Serving weekly meeting ([Calendar](https://calendar.google.com/calendar/embed?src=e896b769743f3877edfab2d4c6a14132b2aa53287021e9bbf113cab676da54ba%40group.calendar.google.com)): [Wednesdays at 9:00 AM PT (Pacific Time)](https://zoom.us/j/93517402529?pwd=RnkwUUQ4L3J2QmNYYlNBcnZGbXcvQT09) (weekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=9:00 AM&tz=PT%20%28Pacific%20Time%29).
28+
* [Meeting notes and Agenda](https://docs.google.com/document/d/1aExJFtaLnO-TM6_2uILgI8NI0IjOm7FcwLABBKEMEo0/edit).
29+
* [Meeting recordings](https://www.youtube.com/playlist?list=TODO).
30+
31+
## Organizers
32+
33+
* Eduardo Arango (**[@ArangoGutierrez](https://github.com/ArangoGutierrez)**), NVIDIA
34+
* Sergey Kanzhelev (**[@SergeyKanzhelev](https://github.com/SergeyKanzhelev)**), Google
35+
* Yuan Tang (**[@terrytangyuan](https://github.com/terrytangyuan)**), Red Hat
36+
37+
## Contact
38+
- Slack: [#wg-serving](https://kubernetes.slack.com/messages/wg-serving)
39+
- [Mailing list](https://groups.google.com/a/kubernetes.io/g/wg-serving)
40+
- [Open Community Issues/PRs](https://github.com/kubernetes/community/labels/wg%2Fserving)
41+
- Steering Committee Liaison: Maciej Szulik (**[@soltysh](https://github.com/soltysh)**)
42+
<!-- BEGIN CUSTOM CONTENT -->
43+
44+
<!-- END CUSTOM CONTENT -->

wg-serving/charter.md

+98
@@ -0,0 +1,98 @@
1+
# WG Serving Charter
2+
3+
This charter adheres to the conventions described in the [Kubernetes Charter README] and uses
4+
the Roles and Organization Management outlined in [wg-governance].
5+
6+
[Kubernetes Charter README]: /committee-steering/governance/README.md
7+
8+
## Scope
9+
10+
Discuss and enhance serving workloads on Kubernetes, specifically focusing on
11+
hardware-accelerated AI/ML inference. The working group will focus on the novel
12+
challenges of compute-intensive online inference. Scenarios solving use cases
13+
involving non-fungible accelerators will be prioritized over solutions for
14+
generic CPU. However, all improvements should, where possible, benefit other
15+
serving workloads like web services or stateful databases, be usable as
16+
primitives by multiple ecosystem projects, and compose well into the workflows
17+
of those deploying models to production. The Working Group Batch has a similar
18+
scope. By a simplified definition, the difference is that the Serving WG
19+
will generally concentrate on the workloads where Pods are running with
20+
restartPolicy=Always, while WG Batch will generally be looking at Pods with the
21+
restartPolicy=OnFailure. There are edge cases to this definition, but it creates
22+
an easy enough framework to differentiate the scope of these two Working Groups.
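
As an illustration of the simplified `restartPolicy` distinction above, here is a minimal sketch that is not part of the charter. All names and images (`example-model-server`, `example-batch-inference`, `registry.example.com/...`) are hypothetical placeholders. It relies on the fact that a Deployment's Pod template only permits `restartPolicy: Always`, while a Job's Pod template must use `OnFailure` or `Never`.

```yaml
# Serving-style workload (WG Serving scope): Pods are restarted indefinitely.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-model-server          # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-model-server
  template:
    metadata:
      labels:
        app: example-model-server
    spec:
      restartPolicy: Always           # the default, and the only value a Deployment allows
      containers:
        - name: server
          image: registry.example.com/model-server:latest   # placeholder image
          ports:
            - containerPort: 8080
---
# Batch-style workload (WG Batch scope): Pods run to completion.
apiVersion: batch/v1
kind: Job
metadata:
  name: example-batch-inference       # hypothetical name
spec:
  backoffLimit: 3                     # retry a failed Pod up to three times
  template:
    spec:
      restartPolicy: OnFailure        # Jobs accept only OnFailure or Never
      containers:
        - name: worker
          image: registry.example.com/batch-inference:latest   # placeholder image
```

As the charter notes, there are edge cases to this definition; the manifests only show the common case on each side.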
23+
24+
### In scope
25+
26+
- Gather requirements for serving workloads (inference primarily, but benefiting
27+
other non-batch use cases where possible) that have broad community alignment
28+
from practitioners, distros, and vendors. Provide concrete input to other SIGs
29+
and WGs around needs for identified requirements. Do it in partnership
30+
with existing ecosystem projects like KServe, Seldon, Kaito, and
31+
others to identify common shared problems and extract or implement solutions (like Kueue
32+
abstracted deferred scheduling for multiple batch frameworks).
33+
- Specific areas of improvement include:
34+
- Directly improve key Kubernetes workload controllers when used with
35+
accelerators and the most common inference serving frameworks and model
36+
servers.
37+
- Explore new projects that improve orchestration, scaling, and load balancing
38+
of inference workloads and compose well with other workloads on Kubernetes
39+
- Run serving workloads safely while giving up
40+
available slack capacity to batch frameworks
41+
42+
### Out of scope
43+
44+
- Training and batch inference, which are covered by WG Batch.
45+
- The ability to describe workflows for serving workloads is out of scope;
46+
Kubernetes will offer building blocks to MLOps platforms to build those.
47+
48+
## Stakeholders
49+
50+
Stakeholders in this working group span multiple SIGs that own parts of the
51+
code in core Kubernetes components and addons.
52+
53+
- SIG Apps as a primary SIG
54+
- SIG Architecture
55+
- SIG Node
56+
- SIG Scheduling
57+
- SIG Autoscaling
58+
- SIG Network
59+
- SIG Instrumentation
60+
- SIG Storage
61+
62+
## Deliverables
63+
64+
The list of deliverables includes the following high-level features:
65+
66+
- To SIG Apps:
67+
- Ability to express model serving workloads with easy-to-understand logical
68+
objects with the ability to scale to multi-host
69+
- To SIG Scheduling and Autoscaling:
70+
- Faster scaling up and down
71+
- Ability to preempt workloads
72+
- To SIG Node:
73+
- Runtime support for Pod preemption
74+
- Runtime support for device partitioning
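
For context on the preemption deliverables above, Kubernetes already exposes Pod priority and preemption through the `PriorityClass` API. The sketch below is an assumption about how latency-sensitive serving Pods might be ranked above opportunistic batch Pods using that existing primitive; it is not a design proposed by the WG. The class names, values, and descriptions are hypothetical.

```yaml
# Hypothetical class for serving Pods: may preempt lower-priority Pods.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: example-serving-high
value: 100000
globalDefault: false
preemptionPolicy: PreemptLowerPriority   # the default policy
description: "Hypothetical class for latency-sensitive inference serving Pods."
---
# Hypothetical class for batch Pods that use slack capacity and never preempt others.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: example-batch-low
value: 1000
globalDefault: false
preemptionPolicy: Never                  # these Pods wait in the queue instead of preempting
description: "Hypothetical class for batch Pods running on slack capacity."
```

A Pod opts in by setting `spec.priorityClassName` to one of these names.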
75+
76+
## Roles and Organization Management
77+
78+
This WG adheres to the Roles and Organization Management outlined in [wg-governance]
79+
and opts in to updates and modifications to [wg-governance].
80+
81+
[wg-governance]: /committee-steering/governance/wg-governance.md
82+
83+
Additionally, the WG commits to:
84+
85+
- maintain a solid communication line between the Kubernetes groups and the wider CNCF community;
86+
- submit a proposal to the KubeCon/CloudNativeCon maintainers track.
87+
88+
## Timelines and Disbanding
89+
90+
As a first mandate, the WG will define a roadmap in the first quarter of operation.
91+
We believe there will be a set of features the Working Group can identify and deliver
92+
that will enable the majority of frameworks to operate natively on Kubernetes.
93+
94+
Achieving the aforementioned deliverables, also mentioned in the `In Scope`
95+
section, will allow us to decide when to disband this WG. There is no
96+
expectation that the Working Group will be converted into a SIG long term;
97+
however, there is a chance that a separate project or a sizeable sub-component
98+
of SIG Apps can be created as a result of the Working Group.
