-
Notifications
You must be signed in to change notification settings - Fork 3k
Core: introduce shared authentication refresh executor #12563
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions. |
69ab7f4 to
9587a2e
Compare
|
@nastra @danielcweeks can we consider this for 1.10 please? |
|
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions. |
|
Pinging @nastra and @danielcweeks again: do you think we can consider this one for 1.10? |
|
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions. |
|
@nastra @danielcweeks : ping for review, please. |
| * A shared {@link ScheduledExecutorService} that REST catalogs can use for refreshing their | ||
| * authentication data. | ||
| */ | ||
| public static ScheduledExecutorService getAuthRefreshPool() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| public static ScheduledExecutorService getAuthRefreshPool() { | |
| public static ScheduledExecutorService authRefreshPool() { |
I'm aware that we already have some methods in this class with a get prefix, but those methods have been added at the very beginning and haven't been renamed
| /** | ||
| * An {@link AuthManager} that provides machinery for refreshing authentication data asynchronously, | ||
| * using a background thread pool. | ||
| * @deprecated since 1.9.0, will be removed in 1.10.0; use {@link ThreadPools#getAuthRefreshPool()}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
version info needs to be updated here, since the deprecation applies with 1.10.0 and we can only remove with 1.11.0
| * using a background thread pool. | ||
| * @deprecated since 1.9.0, will be removed in 1.10.0; use {@link ThreadPools#getAuthRefreshPool()}. | ||
| */ | ||
| @Deprecated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we're deprecating this class, then I don't think we should be removing functionality here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in fact I believe we should only be deprecating this here without updating the functionality (as that would be a breaking change). Once RefreshingAuthManager is removed, we can switch the OAuth2Manager to use the new auth refresh pool, but I don't think we can deprecate + switch within the same release
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nastra I don't mind doing the change in 2 steps, but in which case do you think this would be a breaking change? The API is not affected (and revapi didn't complain).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if somebody else from the community is extending RefreshingAuthManager and relying on its functionality? We're effectively removing the "refresh" part, which is a breaking change in behavior, even though RevAPI wasn't complaining
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What we can do though, is revert the changes to RefreshingAuthManager just in case someone is extending that class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I commented before I saw your comment 😅 – I agree that's an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What we can do though, is revert the changes to
RefreshingAuthManagerjust in case someone is extending that class.
Yes that's what I was proposing in my earlier comment
Currently, each `AuthManager` instance manages its own refresh pool. During the lifecycle of the catalog session, a few `AuthManager` instances may be created: one for the catalog itself, and a few more in components like the S3 signer client, and credential refreshers. Each one of them is currently creating their own pool, which is a waste of resources. This PR introduces a shared refresh pool for use by `AuthManager`s, thus allowing to minimize the amount of threads allocated for auth session refreshes. The shared pool is closed by a shutdown hook.
9587a2e to
6f256de
Compare
|
FYI I rebased the PR since it's been a while. |
| private long startTimeMillis; | ||
| private OAuthTokenResponse authResponse; | ||
| private AuthSessionCache sessionCache; | ||
| private ScheduledExecutorService refreshExecutor; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need any changes to this class. We would update it once RefreshingAuthManager has been removed right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, for me there wasn't any risk of breaking change here – but OK. Let me change this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| new ConfigEntry<>( | ||
| "iceberg.rest.auth.refresh.num-threads", | ||
| "ICEBERG_AUTH_REFRESH_NUM_THREADS", | ||
| 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given that we eventually want to share this thread pool, I wonder whether the default should be higher
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think 1 is fine. Most of the time, we'll have only 1 auth manager per JVM, and sometimes 2 (e.g. request signing + credential refreshing) – but realistically, I think we won't see more than 2 per JVM. Given that the typical usage of this executor is to schedule token refreshes every hour or so, 1 is probably fine (also: this is the core pool size, not the maximum pool size). Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah right, yeah ok I think 1 should be fine then
| assertThat(validatingSigner.icebergSigner) | ||
| .extracting("authManager") | ||
| .extracting("refreshExecutor") | ||
| assertThat(ThreadPools.authRefreshPool()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should roll back these test changes here and in the other test class, since nothing really changed in terms of the implementation
| try (OAuth2Manager manager = new OAuth2Manager("test"); | ||
| OAuth2Util.AuthSession session = manager.initSession(client, properties)) { | ||
| assertThat(session.headers()).isEmpty(); | ||
| assertThat(manager) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see my other comment (#12563 (comment)), so we should be rolling back the changes here
nastra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I'll also wait a bit in case @danielcweeks wants to review this
|
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions. |
|
@nastra @danielcweeks can we try to get this in please? It was already approved. |
Currently, each
AuthManagerinstance manages its own refresh pool. During the lifecycle of the catalog session, a fewAuthManagerinstances may be created: one for the catalog itself, and a few more in components like the S3 signer client, and credential refreshers. Each one of them is currently creating their own pool, which is a waste of resources.This PR introduces a shared refresh pool for use by
AuthManagers, thus allowing to minimize the amount of threads allocated for auth session refreshes. The shared pool is closed by a shutdown hook.