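The node controller contains little logic. Its main job concerns nodes where the virt-handler Pod is unhealthy while the DaemonSet itself is healthy: on such a node, every VMI that is still in the Running state but whose Pod is no longer healthy is set to Failed. A "node" here can be thought of as the handler Pod on that node. The controller has a nodeInformer and a vmiInformer; add, update, and delete events from either one put the relevant node on the work queue.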
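The enqueueing described above can be pictured with a small standard-library sketch. Everything in it is hypothetical and only illustrates the idea: nodeQueue stands in for the controller's work queue, and onNodeEvent and onVMIEvent stand in for the informer callbacks. It also assumes that a VMI event is mapped to the name of the node the VMI runs on.

```go
package main

import "fmt"

// nodeQueue is a hypothetical stand-in for the controller's work queue:
// it keeps insertion order and ignores keys that are already pending.
type nodeQueue struct {
	pending map[string]bool
	order   []string
}

func newNodeQueue() *nodeQueue {
	return &nodeQueue{pending: map[string]bool{}}
}

// Add enqueues a node key unless it is empty or already waiting.
func (q *nodeQueue) Add(key string) {
	if key == "" || q.pending[key] {
		return
	}
	q.pending[key] = true
	q.order = append(q.order, key)
}

// Get pops the oldest pending key; ok is false when the queue is empty.
func (q *nodeQueue) Get() (key string, ok bool) {
	if len(q.order) == 0 {
		return "", false
	}
	key = q.order[0]
	q.order = q.order[1:]
	delete(q.pending, key)
	return key, true
}

// vmiEvent is a simplified VMI notification (assumption: only the name of
// the node hosting the VMI matters for enqueueing).
type vmiEvent struct {
	Name     string
	NodeName string
}

// Both callbacks end up enqueueing a node key, so all work is done per node.
func onNodeEvent(q *nodeQueue, nodeName string) { q.Add(nodeName) }
func onVMIEvent(q *nodeQueue, e vmiEvent)       { q.Add(e.NodeName) }

func main() {
	q := newNodeQueue()
	onNodeEvent(q, "node-a")
	onVMIEvent(q, vmiEvent{Name: "vmi-1", NodeName: "node-a"}) // already pending, ignored
	onVMIEvent(q, vmiEvent{Name: "vmi-2", NodeName: "node-b"})
	for key, ok := q.Get(); ok; key, ok = q.Get() {
		fmt.Println("processing", key)
	}
}
```

Keying the queue by node name means several events for the same node collapse into one pass of the processing function.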

Figure 1: Overall flow of the nodeController
The entry point is execute in pkg/virt-controller/watch/node.go.
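After a node object is taken from the queue, the controller first checks whether the VirtHandlerHeartbeat annotation in node.Annotations has expired; virt-handler is expected to refresh it periodically. If it has expired, unresponsive is set to true and the logic described next runs. Whether or not the heartbeat has expired, requeueIfExists puts the node back on the queue (provided it still exists) so that it is checked again after a set period.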
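The heartbeat check can be sketched with the standard library alone. This is an illustration, not the code of isNodeUnresponsive: the annotation key heartbeatKey, the RFC 3339 timestamp format, the defaultTimeout value, and the decision to treat a node without a heartbeat as responsive are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// heartbeatKey is a hypothetical annotation key used only in this sketch.
const heartbeatKey = "example.io/heartbeat"

// defaultTimeout is an assumed timeout; the real value is configuration.
const defaultTimeout = 5 * time.Minute

// mustParse turns an RFC 3339 string into a time.Time and panics on bad input.
func mustParse(raw string) time.Time {
	t, err := time.Parse(time.RFC3339, raw)
	if err != nil {
		panic(err)
	}
	return t
}

// isUnresponsive reports whether the heartbeat stored in annotations is
// older than timeout, measured against now.
func isUnresponsive(annotations map[string]string, now time.Time, timeout time.Duration) (bool, error) {
	raw, ok := annotations[heartbeatKey]
	if !ok {
		// Assumption of this sketch: no heartbeat means "not unresponsive".
		return false, nil
	}
	last, err := time.Parse(time.RFC3339, raw)
	if err != nil {
		return false, fmt.Errorf("parsing heartbeat %q: %w", raw, err)
	}
	return now.Sub(last) > timeout, nil
}

func main() {
	now := mustParse("2024-01-01T12:00:00Z")
	stale := map[string]string{heartbeatKey: "2024-01-01T11:50:00Z"}
	fresh := map[string]string{heartbeatKey: "2024-01-01T11:59:30Z"}
	for _, annotations := range []map[string]string{stale, fresh} {
		unresponsive, err := isUnresponsive(annotations, now, defaultTimeout)
		fmt.Println(unresponsive, err)
	}
}
```

Passing now as a parameter instead of calling time.Now inside the function keeps the check deterministic and easy to test.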
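For an unresponsive node, the controller first checks whether the node is still labeled NodeSchedulable = true. If it is, markNodeAsUnresponsive patches the label to false so that the node is no longer marked schedulable. The controller then calls checkNodeForOrphanedAndErroredVMIs to deal with the affected Pods and VMIs.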
func (c *NodeController) execute(key string) error {
    logger := log.DefaultLogger()

    // Look the node up in the informer cache; it may have been deleted.
    obj, nodeExists, err := c.nodeInformer.GetStore().GetByKey(key)
    if err != nil {
        return err
    }

    var node *v1.Node
    if nodeExists {
        node = obj.(*v1.Node)
        logger = logger.Object(node)
    } else {
        logger = logger.Key(key, "Node")
    }

    // Decide from the virt-handler heartbeat whether the node is unresponsive.
    unresponsive, err := isNodeUnresponsive(node, c.heartBeatTimeout)
    if err != nil {
        logger.Reason(err).Error("Failed to determine if node is responsive, will not reenqueue")
        return nil
    }

    if unresponsive {
        // Only patch the node if it is still marked as schedulable.
        if nodeIsSchedulable(node) {
            if err := c.markNodeAsUnresponsive(node, logger); err != nil {
                return err
            }
        }

        // Handle the Pods and VMIs left on the unresponsive node.
        err = c.checkNodeForOrphanedAndErroredVMIs(key, node, logger)
        if err != nil {
            return err
        }
    }

    // Check the node again later, as long as it still exists.
    c.requeueIfExists(key, node)
    return nil
}
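As an illustration of the schedulable check and the label update in execute, the sketch below builds a JSON merge patch and applies it to a plain label map. It assumes the label key is kubevirt.io/schedulable and that the patch sets it to "false". The helper names isSchedulable, unschedulablePatch, and applyLabelPatch are hypothetical, and applyLabelPatch only emulates locally what the API server would do with such a patch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// schedulableLabel is the label key assumed for this sketch.
const schedulableLabel = "kubevirt.io/schedulable"

// isSchedulable reports whether the labels still mark the node schedulable.
func isSchedulable(labels map[string]string) bool {
	return labels[schedulableLabel] == "true"
}

// unschedulablePatch builds a JSON merge patch that sets the schedulable
// label to "false" and mentions no other label.
func unschedulablePatch() ([]byte, error) {
	patch := map[string]interface{}{
		"metadata": map[string]interface{}{
			"labels": map[string]string{schedulableLabel: "false"},
		},
	}
	return json.Marshal(patch)
}

// applyLabelPatch emulates, on a plain map, how the labels part of such a
// merge patch is applied: patched keys are overwritten, others are kept.
func applyLabelPatch(labels map[string]string, patch []byte) (map[string]string, error) {
	var parsed struct {
		Metadata struct {
			Labels map[string]string `json:"labels"`
		} `json:"metadata"`
	}
	if err := json.Unmarshal(patch, &parsed); err != nil {
		return nil, err
	}
	out := make(map[string]string, len(labels))
	for k, v := range labels {
		out[k] = v
	}
	for k, v := range parsed.Metadata.Labels {
		out[k] = v
	}
	return out, nil
}

func main() {
	labels := map[string]string{schedulableLabel: "true", "zone": "a"}
	if isSchedulable(labels) {
		patch, err := unschedulablePatch()
		if err != nil {
			panic(err)
		}
		fmt.Println(string(patch))
		if labels, err = applyLabelPatch(labels, patch); err != nil {
			panic(err)
		}
	}
	fmt.Println(isSchedulable(labels), labels["zone"])
}
```

Checking the label before patching avoids sending a patch on every pass once the node has already been marked.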
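checkNodeForOrphanedAndErroredVMIs first lists the VMIs on the node that are in the Running or Scheduled phase. It then looks at the virt-handler Pod on the node and checks whether the DaemonSet is healthy, and finally emits the following event: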
virt-handler is not present, there are orphaned vmis on this node. Run virt-handler on this node to migrate or remove them.
Next, from the launcher Pods on the node, it keeps only those that are still healthy:
for i := range list.Items {
    pod := &list.Items[i]
    if controllerRef := controller.GetControllerOf(pod); !isControlledByVMI(controllerRef) {
        continue
    }

    // Some pods get stuck in a pending Termination during shutdown
    // due to virt-handler not being available to unmount container disk
    // mount propagation. A pod with all containers terminated is not
    // considered alive
    allContainersTerminated := false
    if len(pod.Status.ContainerStatuses) > 0 {
        allContainersTerminated = true
        for _, status := range pod.Status.ContainerStatuses {
            if status.State.Terminated == nil {
                allContainersTerminated = false
                break
            }
        }
    }

    phase := pod.Status.Phase
    toAppendPod := !allContainersTerminated && phase != v1.PodFailed && phase != v1.PodSucceeded
    if toAppendPod {
        pods = append(pods, pod)
        continue
    }
}
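Taking the difference between the VMIs on the node and these Pods yields the VMIs whose Pod is no longer healthy. These VMIs are set to Failed, and the vmi-controller then carries out the follow-up handling.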
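The difference step can be sketched as a set subtraction keyed by namespace and name. The vmi and pod types below are simplified, hypothetical stand-ins for the real API objects, and OwnerVMI is an assumed field standing in for the name taken from the Pod's controller reference.

```go
package main

import "fmt"

// vmi and pod are simplified stand-ins for the real API types.
type vmi struct {
	Namespace string
	Name      string
	Phase     string
}

type pod struct {
	Namespace string
	OwnerVMI  string // assumed: name of the VMI that controls this pod
}

// orphanedVMIs returns the VMIs that have no alive pod, i.e. the
// difference between the node's VMIs and the VMIs owning alive pods.
func orphanedVMIs(vmis []vmi, alivePods []pod) []vmi {
	alive := make(map[string]bool, len(alivePods))
	for _, p := range alivePods {
		alive[p.Namespace+"/"+p.OwnerVMI] = true
	}
	var orphaned []vmi
	for _, v := range vmis {
		if !alive[v.Namespace+"/"+v.Name] {
			orphaned = append(orphaned, v)
		}
	}
	return orphaned
}

func main() {
	vmis := []vmi{
		{Namespace: "default", Name: "vm-a", Phase: "Running"},
		{Namespace: "default", Name: "vm-b", Phase: "Running"},
		{Namespace: "test", Name: "vm-a", Phase: "Scheduled"},
	}
	alivePods := []pod{{Namespace: "default", OwnerVMI: "vm-a"}}

	for _, v := range orphanedVMIs(vmis, alivePods) {
		v.Phase = "Failed" // the real controller would update the VMI status
		fmt.Printf("%s/%s -> %s\n", v.Namespace, v.Name, v.Phase)
	}
}
```

Including the namespace in the key keeps VMIs with the same name in different namespaces apart.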