Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

amazon web services - AWS RDS SQL Server failover really slow

We recently resized one of our Multi-AZ RDS SQL Server instances, with the goal of aligning our resource needs with instance capabilities. In summary, we were cruising along on a bigger instance than we needed and saw the obvious cost savings. Our business is seasonal, and we expect we may need to up-size again later in the year.

According to AWS, the failover process typically takes around 60-120 seconds. We knew our application would not be able to communicate with the database during this window. Realistically we were expecting less than 5 minutes; however, it actually took around 20 minutes before our applications were able to connect and query the databases. This is in line with the time at which AWS reported the failover as complete.

I am interested to know if anyone else has had similar experiences. Were you able to change anything in your setup to improve the failover time? Are there factors under our control that can reduce the failover time?

Further details: We have approximately 25 databases hosted on this instance. We had been on a db.m4.2xlarge instance.

question from:https://stackoverflow.com/questions/65941486/aws-rds-sql-server-failover-really-slow


1 Reply

Not an expert on SQL Server at all, so take this with a grain of salt.

RDS for SQL Server works differently from the other RDS database engines, and the article you linked to is very generic. RDS for SQL Server uses either Always On Availability Groups or Database Mirroring depending on your setup, so this could be a factor that influences the failover time. The documentation for RDS SQL Server Multi-AZ setups is quite extensive, so you might want to review that.

The only hint mentioned there is:

  • Failover times are affected by the time it takes to complete the recovery process. Large transactions increase the failover time.

If you have an AWS support plan on your account, I'd ask them for the details. They have more insight into the underlying issues and should be able to tell you more. It's a managed service after all, and if it doesn't deliver on the typical 60-120 second failover times, I'd ask the service provider what went wrong. That doesn't mean it's necessarily their fault, but they can at least point you to the root cause :-)
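
Whoever ends up investigating will probably want your own numbers for how long the endpoint was unreachable. Here is a minimal sketch in Python for collecting them, using only the standard library. The helper names (`can_connect`, `probe`, `longest_outage`) and the hostname in the usage note are made up for illustration, and port 1433 is just the SQL Server default, so adjust it if your instance uses another port. Note that an open TCP port only shows the listener is back; it does not prove every database has finished recovery, so treat the result as a lower bound.

```python
import socket
import time
from typing import Iterable, List, Optional, Tuple


def can_connect(host: str, port: int = 1433, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def probe(host: str, port: int = 1433, interval: float = 5.0,
          count: int = 360) -> List[Tuple[float, bool]]:
    """Probe the endpoint `count` times, `interval` seconds apart.

    Returns (monotonic timestamp, reachable) pairs.
    """
    samples = []
    for _ in range(count):
        samples.append((time.monotonic(), can_connect(host, port)))
        time.sleep(interval)
    return samples


def longest_outage(samples: Iterable[Tuple[float, bool]]) -> float:
    """Longest gap, in seconds, between a first failed probe and the next success.

    An outage still in progress at the end of the samples is not counted.
    """
    longest = 0.0
    outage_start: Optional[float] = None
    for timestamp, ok in samples:
        if not ok and outage_start is None:
            outage_start = timestamp  # first failed probe of a new outage
        elif ok and outage_start is not None:
            longest = max(longest, timestamp - outage_start)
            outage_start = None
    return longest
```

You would start something like `samples = probe("your-instance.example.rds.amazonaws.com")` (a placeholder hostname) shortly before the resize or failover, then pass the result to `longest_outage(samples)`. The resolution is limited by the probe interval, so a shorter interval gives a tighter measurement.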

