I have a 5-node production cluster. 2 nodes are dedicated to nginx ingress controller pods, and the nginx deployment is set to 4 replicas.
Everything runs fine for about a week, serving half a million requests (it's a SaaS). Then, at random, I get the leader election shown below, where a new leader is selected from one of my inactive ingress nodes, and this causes 502 errors for my clients.
What triggers an election? I use loggly.com/APM for analysis and I see nothing that would have caused k8s to elect a new leader (load is less than 10% per node).
I am running Kubernetes v1.18 on linode.com. The log entries below are listed newest first.
2020-12-29 23:54:36.000
I1230 04:54:36.394834 6 leaderelection.go:252] successfully acquired lease default/ingress-controller-leader-nginx
{ syslog: { severity: "Error", appName: "default[nginx-ingress-controller-7d849b6c8d-722kx]", host: "nginx-ingress-controller", pid: 23298, priority: "11", facility: "user-level messages", timestamp: "2020-12-30T04:54:36Z" } }
2020-12-29 23:53:56.000
I1230 04:53:56.166120 6 leaderelection.go:242] attempting to acquire leader lease default/ingress-controller-leader-nginx...
{ syslog: { severity: "Error", appName: "default[nginx-ingress-controller-7d849b6c8d-722kx]", host: "nginx-ingress-controller", pid: 23298, priority: "11", facility: "user-level messages", timestamp: "2020-12-30T04:53:56Z" } }
2020-12-29 23:53:56.000
I1230 04:53:56.164251 6 leaderelection.go:277] failed to renew lease default/ingress-controller-leader-nginx: timed out waiting for the condition
{ syslog: { severity: "Error", appName: "default[nginx-ingress-controller-7d849b6c8d-722kx]", host: "nginx-ingress-controller", pid: 23298, priority: "11", facility: "user-level messages", timestamp: "2020-12-30T04:53:56Z" } }
2020-12-29 23:53:56.000
E1230 04:53:56.161159 6 leaderelection.go:320] error retrieving resource lock default/ingress-controller-leader-nginx: Get "https://10.128.0.1:443/api/v1/namespaces/default/configmaps/ingress-controller-leader-nginx": context deadline exceeded
{ syslog: { severity: "Error", appName: "default[nginx-ingress-controller-7d849b6c8d-722kx]", host: "nginx-ingress-controller", pid: 23298, priority: "11", facility: "user-level messages", timestamp: "2020-12-30T04:53:56Z" } }
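To make the sequence in these entries easier to read, here is a minimal Python sketch that parses the klog-formatted lines and measures the time between "failed to renew lease" and "successfully acquired lease". The helper names (`parse_klog`, `leaderless_gap`) and the hard-coded year are my own assumptions for illustration, not part of any Kubernetes tooling. It also reads the klog severity letter from each line (`I` for info, `E` for error), which can differ from the severity the syslog wrapper attaches.

```python
import re
from datetime import datetime

# klog header layout: severity letter, MMDD, HH:MM:SS.micros, thread id, file:line], message
KLOG = re.compile(
    r"^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ ([\w.]+):(\d+)\] (.*)$"
)

def parse_klog(line, year=2020):
    """Parse one klog line into a dict, or return None if it does not match.

    klog does not print the year, so it is passed in (assumed 2020 here).
    """
    m = KLOG.match(line.strip())
    if m is None:
        return None
    severity, mmdd, clock, src, lineno, message = m.groups()
    ts = datetime.strptime(f"{year}{mmdd} {clock}", "%Y%m%d %H:%M:%S.%f")
    return {
        "severity": severity,
        "time": ts,
        "source": f"{src}:{lineno}",
        "message": message,
    }

def leaderless_gap(lines):
    """Seconds between losing the lease and re-acquiring it, or None."""
    lost = acquired = None
    for line in lines:
        entry = parse_klog(line)
        if entry is None:
            continue  # skip the syslog JSON wrappers and local timestamps
        if "failed to renew lease" in entry["message"]:
            lost = entry["time"]
        elif "successfully acquired lease" in entry["message"]:
            acquired = entry["time"]
    if lost is None or acquired is None:
        return None
    return (acquired - lost).total_seconds()

# The two relevant lines copied from the log output above.
SAMPLE = [
    "I1230 04:54:36.394834 6 leaderelection.go:252] successfully acquired lease default/ingress-controller-leader-nginx",
    "I1230 04:53:56.164251 6 leaderelection.go:277] failed to renew lease default/ingress-controller-leader-nginx: timed out waiting for the condition",
]

if __name__ == "__main__":
    print(leaderless_gap(SAMPLE))
```

Run against the entries above, this reports a gap of roughly 40 seconds between the failed renewal and the new lease. The `E` line just before it shows the request to the API server at 10.128.0.1:443 hitting a deadline, so the renewal failure in this excerpt coincides with an API server timeout rather than with load on the ingress pods.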