@AnatolyPopov
Contributor

When S3InputStreamReadFully is called, it does not expose the underlying InputStream to the caller and does not close it on its own. This leads to a connection leak in S3Client, since the underlying connection is not returned to the connection pool in the Apache HTTP client.
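To illustrate the leak described above, here is a minimal sketch that is independent of the Iceberg code base. `TrackingStream`, `leakyRead`, and `safeRead` are hypothetical names, and the local `readFully` helper is only assumed to behave like `IOUtil.readFully` in the sense that it fills the buffer without closing the stream. A `ByteArrayInputStream` subclass stands in for the S3 response body so the close call can be observed.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class ReadFullySketch {

  // Hypothetical stand-in for a ranged response body; records whether close() was called.
  static class TrackingStream extends ByteArrayInputStream {
    boolean closed = false;

    TrackingStream(byte[] data) {
      super(data);
    }

    @Override
    public void close() throws IOException {
      closed = true;
      super.close();
    }
  }

  // Fills buffer[offset, offset + length) or throws EOFException. It never closes the stream.
  static void readFully(InputStream in, byte[] buffer, int offset, int length)
      throws IOException {
    int total = 0;
    while (total < length) {
      int n = in.read(buffer, offset + total, length - total);
      if (n < 0) {
        throw new EOFException("stream ended after " + total + " bytes");
      }
      total += n;
    }
  }

  // Pattern before the fix: the stream is consumed but nobody closes it.
  static void leakyRead(TrackingStream stream, byte[] buffer) throws IOException {
    readFully(stream, buffer, 0, buffer.length);
  }

  // Pattern after the fix: try-with-resources closes the stream, also on failure.
  static void safeRead(TrackingStream stream, byte[] buffer) throws IOException {
    try (InputStream in = stream) {
      readFully(in, buffer, 0, buffer.length);
    }
  }

  public static void main(String[] args) throws IOException {
    TrackingStream leaky = new TrackingStream(new byte[] {1, 2, 3, 4});
    leakyRead(leaky, new byte[4]);

    TrackingStream safe = new TrackingStream(new byte[] {1, 2, 3, 4});
    safeRead(safe, new byte[4]);

    System.out.println("leaky closed=" + leaky.closed + ", safe closed=" + safe.closed);
  }
}
```

With a real HTTP-backed stream, the unclosed case is what would keep a pooled connection checked out; the in-memory stream here only makes the missing `close()` visible.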

@github-actions github-actions bot added the AWS label Aug 22, 2025
  String range = String.format("bytes=%s-%s", position, position + length - 1);

- IOUtil.readFully(readRange(range), buffer, offset, length);
+ try (InputStream stream = readRange(range)) {
Contributor

The readTail method has the same problem, no?

Contributor

Also, it seems that the same pattern is present in ADLS and GCS input streams!

Contributor Author

Yes. I was starting small to see if that's the correct approach. I will do follow-up PRs for the other places.

Contributor

Nice find @nandorKollar (bad assumption that readFully would close the stream). I would agree there's the same issue for readTail and the other implementations.
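Since the same pattern reportedly shows up in readTail and in the other stream implementations, one way to avoid repeating the mistake is to put the open, read, and close steps behind a single helper. The sketch below is hypothetical and is not the code from this PR or its follow-ups: `RangeOpener`, `FakeStore`, `readRangeFully`, `absoluteRange`, and `suffixRange` are illustrative names. The absolute range format is copied from the diff above; the suffix form `bytes=-N` is standard HTTP range syntax, and whether readTail builds its range this way is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class RangeReadSketch {

  // Hypothetical abstraction over "open a stream for this byte range".
  @FunctionalInterface
  interface RangeOpener {
    InputStream open(String range) throws IOException;
  }

  // In-memory stand-in for an object store; counts opened and closed streams.
  static class FakeStore implements RangeOpener {
    final byte[] data;
    int opened = 0;
    int closed = 0;
    String lastRange = null;

    FakeStore(byte[] data) {
      this.data = data;
    }

    @Override
    public InputStream open(String range) {
      opened++;
      lastRange = range;
      // Serves the whole payload regardless of range: enough to observe open/close pairing.
      return new ByteArrayInputStream(data) {
        @Override
        public void close() throws IOException {
          closed++;
          super.close();
        }
      };
    }
  }

  // Inclusive byte range for [position, position + length), as in the diff above.
  static String absoluteRange(long position, int length) {
    return String.format("bytes=%s-%s", position, position + length - 1);
  }

  // Suffix range requesting the last `length` bytes (assumed shape for a tail read).
  static String suffixRange(int length) {
    return String.format("bytes=-%s", length);
  }

  // Opens the range, fills the buffer, and always closes the stream before returning.
  static void readRangeFully(
      RangeOpener opener, String range, byte[] buffer, int offset, int length)
      throws IOException {
    try (InputStream in = opener.open(range)) {
      int n = in.readNBytes(buffer, offset, length); // Java 9+
      if (n < length) {
        throw new EOFException("expected " + length + " bytes, got " + n);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    FakeStore store = new FakeStore(new byte[] {1, 2, 3, 4});
    byte[] buffer = new byte[4];
    readRangeFully(store, absoluteRange(0, 4), buffer, 0, 4);
    readRangeFully(store, suffixRange(4), buffer, 0, 4);
    try {
      readRangeFully(store, absoluteRange(0, 8), new byte[8], 0, 8);
    } catch (EOFException e) {
      // Short read: the stream was still closed by try-with-resources.
    }
    System.out.println("opened=" + store.opened + " closed=" + store.closed);
  }
}
```

The point of the design is that callers never hold the stream, so every open is paired with a close on both the success path and the failure path.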

Contributor

@danielcweeks danielcweeks left a comment

+1 (pending checks)

@AnatolyPopov AnatolyPopov force-pushed the s3stream-connection-leak branch from c682668 to 894e108 August 22, 2025 17:21
@AnatolyPopov
Contributor Author

@danielcweeks could you please re-approve the GH actions run? It failed on spotless check on the first attempt. Sorry, my bad.

Contributor

@nandorKollar nandorKollar left a comment

LGTM, though the other cases (readTail and ADLS/GCS) mentioned before also might need attention, maybe in a separate PR?

@AnatolyPopov
Contributor Author

> LGTM, though the other cases (readTail and ADLS/GCS) mentioned before also might need attention, maybe in a separate PR?

That's exactly what I mentioned above. I'll do another PR for those. Thanks!

@danielcweeks danielcweeks merged commit 0e6633b into apache:main Aug 22, 2025
42 checks passed
@danielcweeks
Contributor

Thanks @AnatolyPopov!

AnatolyPopov added a commit to AnatolyPopov/iceberg that referenced this pull request Aug 22, 2025
A follow-up to apache#13899 with more fixes of the same issue.
amogh-jahagirdar pushed a commit that referenced this pull request Aug 25, 2025