Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

in Technique[技术] by (71.8m points)

nginx - Random new leader election in Kubernetes cluster

I have a 5-node production cluster. 2 nodes are reserved solely for nginx ingress controller pods. The nginx deployment is set to 4 replicas.

Everything runs fine for a week, after serving half a million requests (it's a SaaS). But then, at random, I get the election of a new ingress leader shown below, from one of my inactive ingress nodes, which causes 502 errors for my clients.

What triggers an election? I use loggly.com/APM for analysis, and I see nothing that would have caused k8s to elect a new leader (load is less than 10% per node).
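As a rough mental model (not the actual client-go code): in lease-based leader election the current leader has to keep renewing a shared lock through the API server, and if it cannot renew within a deadline it gives up leadership and the replicas compete again. The sketch below simulates only that rule. The names `simulate_leader`, `renew_deadline`, `retry_period` and `api_outage`, and all timing values, are illustrative assumptions and are not read from this cluster.

```python
from typing import Callable, List, Tuple


def simulate_leader(
    api_ok: Callable[[float], bool],
    renew_deadline: float = 15.0,  # assumed value, in seconds
    retry_period: float = 5.0,     # assumed value, in seconds
    run_for: float = 120.0,
) -> List[Tuple[float, str]]:
    """Return (time, event) pairs for a single leader replica.

    api_ok(t) says whether a renew call made at time t reaches the API server.
    """
    events: List[Tuple[float, str]] = []
    last_renew = 0.0
    t = 0.0
    while t <= run_for:
        if api_ok(t):
            # Successful renew: the lease is extended from this moment.
            last_renew = t
        elif t - last_renew >= renew_deadline:
            # No successful renew for a whole deadline: stop leading.
            events.append((t, "failed to renew lease, leadership released"))
            return events
        else:
            events.append((t, "renew attempt failed, will retry"))
        t += retry_period
    events.append((run_for, "still leader"))
    return events


def api_outage(t: float) -> bool:
    # Hypothetical outage: API server unreachable from t=40s up to t=70s.
    return not (40.0 <= t < 70.0)


for when, what in simulate_leader(api_outage):
    print(f"{when:6.1f}s  {what}")
```

In this model node load plays no role. What matters is whether the renew call reaches the API server in time, which would be consistent with the `context deadline exceeded` error on the request to `https://10.128.0.1:443` in the logs below.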

I am running Kubernetes v1.18 on linode.com

2020-12-29 23:54:36.000
I1230 04:54:36.394834 6 leaderelection.go:252] successfully acquired lease default/ingress-controller-leader-nginx
{ syslog: { severity: "Error", appName: "default[nginx-ingress-controller-7d849b6c8d-722kx]", host: "nginx-ingress-controller", pid: 23298, priority: "11", facility: "user-level messages", timestamp: "2020-12-30T04:54:36Z" } }
2020-12-29 23:53:56.000
I1230 04:53:56.166120 6 leaderelection.go:242] attempting to acquire leader lease default/ingress-controller-leader-nginx...
{ syslog: { severity: "Error", appName: "default[nginx-ingress-controller-7d849b6c8d-722kx]", host: "nginx-ingress-controller", pid: 23298, priority: "11", facility: "user-level messages", timestamp: "2020-12-30T04:53:56Z" } }
2020-12-29 23:53:56.000
I1230 04:53:56.164251 6 leaderelection.go:277] failed to renew lease default/ingress-controller-leader-nginx: timed out waiting for the condition
{ syslog: { severity: "Error", appName: "default[nginx-ingress-controller-7d849b6c8d-722kx]", host: "nginx-ingress-controller", pid: 23298, priority: "11", facility: "user-level messages", timestamp: "2020-12-30T04:53:56Z" } }
2020-12-29 23:53:56.000
E1230 04:53:56.161159 6 leaderelection.go:320] error retrieving resource lock default/ingress-controller-leader-nginx: Get "https://10.128.0.1:443/api/v1/namespaces/default/configmaps/ingress-controller-leader-nginx": context deadline exceeded
{ syslog: { severity: "Error", appName: "default[nginx-ingress-controller-7d849b6c8d-722kx]", host: "nginx-ingress-controller", pid: 23298, priority: "11", facility: "user-level messages", timestamp: "2020-12-30T04:53:56Z" } }
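The excerpt above is listed newest-first, which makes the order of events easy to misread. A small sketch that parses the four klog lines and sorts them chronologically may help. The messages are shortened here for width, the regular expression is an assumption about the klog header layout as it appears in this excerpt, and the year 2020 is taken from the Loggly timestamps because klog does not print one.

```python
import re
from datetime import datetime

# The four klog lines from the excerpt, with messages shortened.
LOG = """\
I1230 04:54:36.394834 6 leaderelection.go:252] successfully acquired lease
I1230 04:53:56.166120 6 leaderelection.go:242] attempting to acquire leader lease
I1230 04:53:56.164251 6 leaderelection.go:277] failed to renew lease
E1230 04:53:56.161159 6 leaderelection.go:320] error retrieving resource lock
"""

# Header layout: <level><mmdd> <hh:mm:ss.uuuuuu> <thread id> <file>:<line>] <message>
KLOG = re.compile(
    r"^(?P<level>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"\d+ (?P<src>[\w.]+:\d+)\] (?P<msg>.*)$"
)


def parse(line):
    """Return (timestamp, level, source, message), or None if the line does not match."""
    m = KLOG.match(line)
    if m is None:
        return None
    ts = datetime.strptime(
        "2020" + m["date"] + " " + m["time"], "%Y%m%d %H:%M:%S.%f"
    )
    return ts, m["level"], m["src"], m["msg"]


# Sort oldest-first by timestamp.
entries = sorted(e for e in map(parse, LOG.splitlines()) if e)
for ts, level, src, msg in entries:
    print(ts.time(), level, src, msg)

gap = (entries[-1][0] - entries[0][0]).total_seconds()
print(f"first error to lease re-acquired: {gap:.3f}s")
```

Read oldest-first, the sequence is: the request for the resource lock times out, the renew fails, the pod starts competing for the lease again, and about 40 seconds later it holds the lease. All four lines come from the same pod (`nginx-ingress-controller-7d849b6c8d-722kx`), so at least in this excerpt one replica loses and re-acquires the lease, rather than a different replica taking over.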


1 Reply

0 votes
by (71.8m points)
Waiting for an expert to reply.
