Kubernetes API Server架构设计深度剖析 - 请求处理流程与核心机制

API Server架构设计深度剖析

整体架构图

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
graph TB
subgraph "Client Layer"
CLI[kubectl]
WEB[Web Dashboard]
SDK[Client SDK]
end

subgraph "API Server"
subgraph "HTTP Handler"
ROUTER[Request Router]
CORS[CORS Handler]
TIMEOUT[Timeout Handler]
end

subgraph "Authentication"
CERT[Certificate Auth]
TOKEN[Token Auth]
OIDC[OIDC Auth]
WEBHOOK[Auth Webhook]
end

subgraph "Authorization"
RBAC[RBAC]
ABAC[ABAC]
WEBHOOK2[Authz Webhook]
end

subgraph "Admission Control"
MUTATE[Mutating Admissions]
VALIDATE[Validating Admissions]
BUILT[Built-in Admissions]
end

subgraph "API Machinery"
REG[Schema Registry]
CONVERT[Version Converter]
VALIDATE2[Object Validator]
end

subgraph "Storage Layer"
GENERIC[Generic Store]
CACHE[Cache Layer]
ETCD3[etcd v3 Client]
end
end

CLI --> ROUTER
WEB --> ROUTER
SDK --> ROUTER

ROUTER --> CORS
CORS --> TIMEOUT
TIMEOUT --> CERT
CERT --> TOKEN
TOKEN --> OIDC
OIDC --> WEBHOOK
WEBHOOK --> RBAC
RBAC --> ABAC
ABAC --> WEBHOOK2
WEBHOOK2 --> MUTATE
MUTATE --> VALIDATE
VALIDATE --> BUILT
BUILT --> REG
REG --> CONVERT
CONVERT --> VALIDATE2
VALIDATE2 --> GENERIC
GENERIC --> CACHE
CACHE --> ETCD3

核心模块详解

1. HTTP服务层

Request Router (请求路由)

负责将HTTP请求路由到相应的处理器:

1
2
3
4
5
6
7
8
9
10
11
12
13
// API路由示例结构
type APIHandler struct {
group string // API组 (如 "apps")
version string // 版本 (如 "v1")
resource string // 资源类型 (如 "deployments")
namespace string // 命名空间
name string // 资源名称
}

// 路由规则
/api/v1/namespaces/{namespace}/pods/{name}
/apis/apps/v1/namespaces/{namespace}/deployments
/apis/apiextensions.k8s.io/v1/customresourcedefinitions

HTTP中间件链

1
2
3
4
5
6
7
8
9
10
graph LR
REQ[HTTP Request]
--> PANIC[Panic Recovery]
--> CORS[CORS Headers]
--> TIMEOUT[Request Timeout]
--> AUDIT[Audit Logging]
--> AUTH[Authentication]
--> AUTHZ[Authorization]
--> ADMIT[Admission]
--> HANDLER[Resource Handler]

2. 认证模块 (Authentication)

多种认证方式支持

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
graph TB
REQUEST[Incoming Request]

subgraph "认证插件链"
CERT[X.509 Certificate]
TOKEN[Bearer Token]
BASIC[Basic Auth]
OIDC[OpenID Connect]
WEBHOOK[Authentication Webhook]
ANON[Anonymous]
end

REQUEST --> CERT
CERT --> TOKEN
TOKEN --> BASIC
BASIC --> OIDC
OIDC --> WEBHOOK
WEBHOOK --> ANON

认证流程代码示例

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
// 认证接口定义
type Authenticator interface {
AuthenticateRequest(req *http.Request) (*Response, bool, error)
}

// 认证响应
type Response struct {
User *user.DefaultInfo
Groups []string
}

// X.509证书认证示例
func (ca *x509Authenticator) AuthenticateRequest(req *http.Request) (*Response, bool, error) {
// 提取客户端证书
clientCerts := req.TLS.PeerCertificates
if len(clientCerts) == 0 {
return nil, false, nil
}

// 验证证书链
cert := clientCerts[0]
if err := ca.verifier.Verify(cert); err != nil {
return nil, false, err
}

// 提取用户信息
user := &user.DefaultInfo{
Name: cert.Subject.CommonName,
UID: cert.Subject.SerialNumber,
Groups: cert.Subject.Organization,
}

return &Response{User: user}, true, nil
}

3. 授权模块 (Authorization)

RBAC授权架构

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
graph TB
subgraph "RBAC模型"
USER[User/ServiceAccount]
ROLE[Role/ClusterRole]
BINDING[RoleBinding/ClusterRoleBinding]

USER --> BINDING
ROLE --> BINDING
end

subgraph "权限检查"
REQUEST[Request]
SUBJECT[Subject]
RESOURCE[Resource]
VERB[Verb]

REQUEST --> SUBJECT
REQUEST --> RESOURCE
REQUEST --> VERB
end

BINDING --> ALLOW{允许访问}
SUBJECT --> ALLOW
RESOURCE --> ALLOW
VERB --> ALLOW

授权决策流程

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
// 授权接口
type Authorizer interface {
Authorize(ctx context.Context, a Attributes) (Decision, string, error)
}

// 授权属性
type Attributes interface {
GetUser() user.Info
GetVerb() string
GetResource() string
GetNamespace() string
GetName() string
GetAPIGroup() string
GetAPIVersion() string
}

// RBAC授权实现
func (r *RBACAuthorizer) Authorize(ctx context.Context, requestAttributes authorizer.Attributes) (authorizer.Decision, string, error) {
rules, err := r.GetRules(requestAttributes.GetUser())
if err != nil {
return authorizer.DecisionDeny, "", err
}

for _, rule := range rules {
if ruleAllows(requestAttributes, &rule) {
return authorizer.DecisionAllow, "", nil
}
}

return authorizer.DecisionDeny, "", nil
}

4. 准入控制模块 (Admission Control)

准入控制器链

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
graph LR
REQ[Request Object]

subgraph "Mutating Admission"
NS[NamespaceLifecycle]
LIMIT[LimitRanger]
SC[ServiceAccount]
WEBHOOK1[MutatingWebhook]
end

subgraph "Validating Admission"
QUOTA[ResourceQuota]
POLICY[PodSecurity]
WEBHOOK2[ValidatingWebhook]
end

REQ --> NS
NS --> LIMIT
LIMIT --> SC
SC --> WEBHOOK1
WEBHOOK1 --> QUOTA
QUOTA --> POLICY
POLICY --> WEBHOOK2
WEBHOOK2 --> STORE[(etcd)]

关键准入控制器

控制器 类型 功能
NamespaceLifecycle Built-in 命名空间生命周期管理
LimitRanger Built-in 资源限制检查
ResourceQuota Built-in 资源配额控制
ServiceAccount Built-in 自动注入ServiceAccount
PodSecurity Built-in Pod安全策略
MutatingAdmissionWebhook Dynamic 动态对象修改
ValidatingAdmissionWebhook Dynamic 动态验证检查

5. API机制层

版本转换机制

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
graph TB
subgraph "版本转换"
EXT[External Version<br/>v1beta1]
INT[Internal Version<br/>__internal]
STOR[Storage Version<br/>v1]

EXT <--> INT
INT <--> STOR
end

subgraph "转换注册"
SCHEME[Scheme Registry]
CONVERT[Converter]
VALIDATE[Validator]

SCHEME --> CONVERT
CONVERT --> VALIDATE
end

对象序列化/反序列化

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
// 序列化器接口
type Serializer interface {
Encode(obj runtime.Object, w io.Writer) error
Decode(data []byte, gvk *schema.GroupVersionKind, obj runtime.Object) (runtime.Object, error)
}

// JSON序列化器
type jsonSerializer struct {
scheme *runtime.Scheme
}

func (s *jsonSerializer) Encode(obj runtime.Object, w io.Writer) error {
// 获取对象的GroupVersionKind
gvk := obj.GetObjectKind().GroupVersionKind()

// 转换为外部版本
external, err := s.scheme.ConvertToVersion(obj, schema.GroupVersion{
Group: gvk.Group,
Version: gvk.Version,
})
if err != nil {
return err
}

// JSON编码
return json.NewEncoder(w).Encode(external)
}

6. 存储层架构

通用存储接口

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
graph TB
subgraph "Storage Interface"
STORE[Storage Interface]
GET[Get]
LIST[List]
CREATE[Create]
UPDATE[Update]
DELETE[Delete]
WATCH[Watch]

STORE --> GET
STORE --> LIST
STORE --> CREATE
STORE --> UPDATE
STORE --> DELETE
STORE --> WATCH
end

subgraph "Implementation"
ETCD3[etcd3 Store]
CACHE[Cached Store]
TRANSFORM[Transformation]

GET --> CACHE
LIST --> CACHE
CREATE --> ETCD3
UPDATE --> ETCD3
DELETE --> ETCD3
WATCH --> CACHE

CACHE --> TRANSFORM
TRANSFORM --> ETCD3
end

etcd存储实现

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
// etcd存储接口
type etcdStore struct {
client etcdclient.Client
pathPrefix string
keyFunc func(obj runtime.Object) (string, error)
transformer Transformer
}

// 创建对象
func (s *etcdStore) Create(ctx context.Context, key string, obj runtime.Object) error {
// 序列化对象
data, err := s.transformer.TransformToStorage(obj)
if err != nil {
return err
}

// 存储到etcd
key = path.Join(s.pathPrefix, key)
_, err = s.client.Put(ctx, key, string(data))
return err
}

// Watch实现
func (s *etcdStore) Watch(ctx context.Context, prefix string, opts storage.ListOptions) (watch.Interface, error) {
watchKey := path.Join(s.pathPrefix, prefix)
watcher := s.client.Watch(ctx, watchKey, clientv3.WithPrefix())

return newEtcdWatcher(watcher, s.transformer), nil
}

数据流分析

创建Pod的完整数据流

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
sequenceDiagram
participant C as kubectl
participant R as Router
participant A as Auth/Authz
participant M as Admission
participant S as Storage
participant E as etcd

C->>R: POST /api/v1/namespaces/default/pods
R->>A: 认证+授权检查
A->>M: 准入控制处理
M->>M: Mutating Admission (修改对象)
M->>M: Validating Admission (验证对象)
M->>S: 调用Storage.Create
S->>S: 序列化对象
S->>E: etcd Put操作
E->>S: 返回存储结果
S->>M: 返回创建结果
M->>A: 返回最终对象
A->>R: 返回HTTP响应
R->>C: 201 Created + Pod对象

Watch事件流

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
sequenceDiagram
participant C as Client
participant W as Watch Handler
participant S as Storage
participant E as etcd

C->>W: GET /api/v1/pods?watch=true
W->>S: 建立Watch连接
S->>E: etcd Watch

Note over E: Pod状态变更
E->>S: Watch Event
S->>S: 反序列化对象
S->>W: 发送Event
W->>C: Server-Sent Events

性能优化设计

1. 缓存策略

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
graph TB
subgraph "多层缓存"
L1[Informer Cache<br/>内存缓存]
L2[etcd Watch Cache<br/>近期变更]
L3[etcd Storage<br/>持久化存储]

L1 --> L2
L2 --> L3
end

subgraph "缓存更新"
EVENT[etcd Events]
UPDATE[Cache Update]
NOTIFY[Client Notify]

EVENT --> UPDATE
UPDATE --> NOTIFY
end

2. 请求限流

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
// 限流配置
type FlowControlConfig struct {
MaxRequestsInFlight int
MaxRequestWaitTime time.Duration
RequestTimeout time.Duration
}

// 限流器实现
type requestLimiter struct {
semaphore chan struct{}
timeout time.Duration
}

func (rl *requestLimiter) Limit(handler http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
select {
case rl.semaphore <- struct{}{}:
defer func() { <-rl.semaphore }()
handler.ServeHTTP(w, r)
case <-time.After(rl.timeout):
http.Error(w, "Request timeout", http.StatusTooManyRequests)
}
})
}

高可用架构

多实例部署

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
graph TB
subgraph "Load Balancer"
LB[HAProxy/nginx]
end

subgraph "API Server Cluster"
API1[kube-apiserver-1]
API2[kube-apiserver-2]
API3[kube-apiserver-3]
end

subgraph "etcd Cluster"
ETCD1[(etcd-1)]
ETCD2[(etcd-2)]
ETCD3[(etcd-3)]
end

LB --> API1
LB --> API2
LB --> API3

API1 --> ETCD1
API1 --> ETCD2
API1 --> ETCD3

API2 --> ETCD1
API2 --> ETCD2
API2 --> ETCD3

API3 --> ETCD1
API3 --> ETCD2
API3 --> ETCD3

故障转移机制

  • 健康检查: /healthz 端点监控
  • 自动恢复: 重启失败的API Server实例
  • 会话亲和: 避免Watch连接中断

关键设计思想

1. 插件化架构

API Server的认证、授权、准入控制都采用插件化设计,支持:

  • 多种认证方式并存
  • 可配置的授权策略
  • 动态的准入控制器

2. 分层抽象

从HTTP层到存储层,每层都有清晰的职责边界:

  • HTTP层: 协议处理
  • 安全层: 认证授权
  • 业务层: 准入控制
  • 存储层: 数据持久化

3. 事件驱动

通过Watch机制实现事件驱动的架构,确保:

  • 实时状态同步
  • 减少轮询开销
  • 支持大规模集群

这是API Server架构深度解析,展现了Kubernetes控制平面的核心设计思想。接下来我们将探讨具体的技术挑战和性能优化策略。

系列文章导航: