grpc/sa: Implement deep health checks #6928

Merged Jun 12, 2023 (10 commits)

@beautifulentropy (Member) commented Jun 2, 2023

Add the necessary scaffolding for deep health checking of our various gRPC components. Each component implementation that also implements the grpc.checker interface will be checked periodically, and the health status of the component will be updated accordingly.

Add the necessary methods to SA to implement the grpc.checker interface and register these new health checks with Consul.

Additionally:

  • Update the entry point script to check for ProxySQL readiness.
  • Shorten the polling interval for gRPC Consul checks from 5s to 2s to reduce startup DNS failures caused by failing checks.
  • Change the Consul log level from INFO to ERROR; the logs were full of transport failures from Consul gRPC checks firing before the SAs are up.

Fixes #6878
Part of #6795

@beautifulentropy force-pushed the sa-deep-health-check branch 7 times, most recently from 0074147 to 5f34810, June 7, 2023 19:44
@beautifulentropy marked this pull request as ready for review June 8, 2023 17:13
@beautifulentropy requested a review from a team as a code owner June 8, 2023 17:13
@pgporada previously approved these changes Jun 8, 2023
@pgporada pgporada dismissed their stale review June 8, 2023 20:48

I thought I understood this, but I don't yet.

@beautifulentropy (Member, Author) commented

@pgporada and I had a call to chat through his questions.

@beautifulentropy changed the title from "sa/grpc: Implement deep health check of database connection" to "grpc/sa: Implement deep health check of database connection" Jun 8, 2023
@beautifulentropy changed the title from "grpc/sa: Implement deep health check of database connection" to "grpc/sa: Implement deep health checks" Jun 8, 2023
@beautifulentropy (Member, Author) commented Jun 8, 2023

Took some screenshots and ran through a few scenarios so folks are clear on how this is going to work:

  • grpc will always be green if the grpc server is up
  • an sa_ro failure is a failure for both sa and sa_ro
  • an sa failure is a failure for sa only

The DB and ProxySQL are both up, everything is green:
Screenshot 2023-06-08 at 6 24 55 PM

If you take down ProxySQL (or the DB), sa and saro are red, but grpc is green:
Screenshot 2023-06-08 at 6 25 12 PM

If you remove the sa_ro user from ProxySQL, sa and saro are red, but grpc is green:
Screenshot 2023-06-08 at 6 32 38 PM

If you remove the sa user from ProxySQL, sa is red, but saro and grpc are green:
Screenshot 2023-06-08 at 6 33 34 PM

@aarongable (Contributor) left a comment

Mostly tiny code organization comments. Last top-level comment: it would be great to have a unit test that exercises the check closure: it sets up a health server, supplies a tiny fake health check function, and ensures that the health server's status is updated appropriately. This may require busting check all the way out to being a method on serverBuilder or something like that, rather than being a closure =/

@aarongable previously approved these changes Jun 9, 2023
@beautifulentropy beautifulentropy merged commit 124c4cc into main Jun 12, 2023
11 checks passed
@beautifulentropy beautifulentropy deleted the sa-deep-health-check branch June 12, 2023 17:58
Linked issue: sa: include a ping to the database in health checking