expose kubernetes node name/etc. in jenkins build logs


Details

• Type: Story
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels: None
• Story Points: 2.5
• Team: SQuaRE

Description

At present, when jenkins build failures are suspected to be caused by a specific kubernetes node, the process for determining the node name is tedious. It requires determining the name of the jenkins agent(s) the build was scheduled on, then manually describing the running k8s pod(s) to see which node(s) those pod(s) are currently scheduled on. If the pod(s) have been killed/restarted since the suspect build, this information is essentially lost to the SQRE team (though it could perhaps be reverse engineered from kubelet logs).
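For reference, the manual lookup described above goes roughly like the following sketch. The namespace and pod name here are hypothetical, and the script skips the cluster query when kubectl is not available:

```shell
# Sketch of the manual node lookup; namespace and pod name are hypothetical.
NS=jenkins
POD=swarm-agent-0

if command -v kubectl >/dev/null 2>&1; then
  # Which node is the agent pod currently scheduled on?
  NODE=$(kubectl -n "$NS" get pod "$POD" -o jsonpath='{.spec.nodeName}')
else
  NODE="(kubectl not available)"
fi
echo "node: $NODE"
```

If the pod has already been deleted, this query returns nothing, which is exactly the information-loss problem this ticket addresses.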

Attachments

1. screenshot-1.png (58 kB)

Activity

Joshua Hoblitt added a comment -

After some research, I discovered that the "downward API" could provide access to the k8s node name, along with resource limits information for all of the containers in the pod.

See:
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.15/#downwardapivolumefile-v1-core
https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/

The following env vars were added to the deployment:

 / $ printenv | grep K8S | sort
 K8S_DIND_LIMITS_CPU=8
 K8S_DIND_LIMITS_MEMORY_GI=64
 K8S_DIND_REQUESTS_CPU=8
 K8S_DIND_REQUESTS_MEMORY_GI=64
 K8S_DOCKER_GC_LIMITS_CPU_M=500
 K8S_DOCKER_GC_LIMITS_MEMORY_MI=512
 K8S_DOCKER_GC_REQUESTS_CPU_M=500
 K8S_DOCKER_GC_REQUESTS_MEMORY_MI=512
 K8S_NODE_NAME=lsst-kub017
 K8S_POD_IP=10.41.0.28
 K8S_POD_NAMESPACE=jenkins-jhoblitt-curly
 K8S_SWARM_LIMITS_CPU=1
 K8S_SWARM_LIMITS_MEMORY_GI=2
 K8S_SWARM_REQUESTS_CPU=1
 K8S_SWARM_REQUESTS_MEMORY_GI=2

However, the terraform kubernetes provider did not have support for the resourceFieldSelector divisor key. I added support to my working fork of the provider and opened an upstream PR: https://github.com/terraform-providers/terraform-provider-kubernetes/pull/538

In order to get this information into the console log, it needs to be printed from inside of a jenkins pipeline node block, so that the env vars from the swarm container are visible, but outside of a docker block, as the env vars aren't present in the dind container, nor would they be present inside of a container. This was accomplished by adding a wrapper method around the node step, named util.nodeWrap(), and all existing pipelines were updated to use this wrapper method where appropriate. The console output of most jobs that run an agent on top of k8s should have output similar to the following (see the attached screenshot-1.png):
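The downward API wiring behind those env vars can be sketched as follows. This is not the actual deployment manifest; the pod name, image, and resource values are illustrative, but the fieldRef/resourceFieldRef structure (including the divisor key the terraform provider was missing) is the standard k8s API shape:

```yaml
# Minimal sketch: exposing the node name and a container's resource
# limits via the downward API. Names and values are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: example-agent
spec:
  containers:
    - name: swarm
      image: example/jenkins-swarm-agent   # hypothetical image
      resources:
        requests:
          cpu: "1"
          memory: 2Gi
        limits:
          cpu: "1"
          memory: 2Gi
      env:
        - name: K8S_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: K8S_SWARM_LIMITS_MEMORY_GI
          valueFrom:
            resourceFieldRef:
              containerName: swarm
              resource: limits.memory
              divisor: 1Gi   # the key lacking terraform provider support
```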
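The actual util.nodeWrap() implementation is not shown in this ticket; the idea behind it can be sketched roughly like this shared-library style helper (a sketch, not the real code):

```groovy
// Sketch of the idea behind util.nodeWrap(): run the body inside a
// node block and print the downward-API env vars before any docker
// block hides them.
def nodeWrap(String label = null, Closure body) {
  node(label) {
    // The env vars come from the swarm container, so print them here,
    // outside of any docker { } block.
    sh 'printenv | grep K8S | sort || true'
    body()
  }
}
```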

People

Assignee:
Unassigned
Reporter:
Joshua Hoblitt
Watchers:
Adam Thornton, Joshua Hoblitt