Explicit HTTP content copy/retain #116115


Merged: 41 commits merged into elastic:main on Nov 21, 2024

Conversation

@mhl-b (Contributor) commented Nov 1, 2024

Today we implicitly copy HTTP content before parsing, except in two RestHandlers: RestSearchAction and RestBulkAction. The original problem comes from Netty's pooled byte buffers: Netty tries to reduce allocations and GC pressure by manual reference counting, but there is no mechanism in RestHandler that lets us keep track of the refCount, so to prevent leaking buffers we copy the whole HTTP body and let the GC take care of it.

The RestHandler#allowsUnsafeBuffers() flag was introduced to indicate that a handler will not leak the content at the end of the request, so that it can be cleaned up properly. That flag is removed in this PR.

This PR removes the implicit HTTP content copy. A RestHandler must now say explicitly in prepareRequest() how the content is to be handled: either copied or retained. The content's reference count is decremented after RestHandler.handleRequest, and the content is released if the count reaches 0.

The major change is in RestRequest and the "rest" package. A few new methods are introduced to help with content handling:

  1. RestRequest.content() allocates a new GC-managed ("safe") buffer and copies the content into it.
  2. RestRequest.requiredContent() is built on content(), so it also implies a copy.
  3. Two new counterpart methods, releasableContent() and requiredReleasableContent(), return a ReleasableBytesReference backed by a Netty ByteBuf. These are the recommended way to handle content without copying it.
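
As a rough illustration of the two styles (a hedged fragment, not code from this PR; dispatchAsync and responseListener are placeholders for whatever the handler actually does), the body of a prepareRequest() might look like this:

// Option 1 -- safe copy: content()/requiredContent() return a GC-managed ("safe") copy,
// so the handler never has to think about Netty ref counts.
BytesReference copy = request.requiredContent();
// ... parse `copy` whenever convenient, even after the request buffers are released ...

// Option 2 -- zero copy: releasableContent()/requiredReleasableContent() return a
// ReleasableBytesReference backed by the pooled Netty ByteBuf. The framework drops its own
// reference after handleRequest, so a handler that needs the bytes beyond that point must
// retain them explicitly and arrange for a matching release.
ReleasableBytesReference pooled = request.requiredReleasableContent();
return channel -> {
    pooled.mustIncRef(); // take our own reference before handing the bytes to async work
    // hypothetical dispatch; releaseAfter releases `pooled` exactly once when the listener
    // completes, whether it succeeds or fails
    dispatchAsync(pooled, ActionListener.releaseAfter(responseListener(channel), pooled));
};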

Additionally, a new ByteBufAllocator is introduced: TrashingByteBufAllocator. It allocates buffers that trash (fill with zeros) their content on release. This allocator is used only in tests, wrapping our current allocator. Our byte-buffer abstraction, ReleasableBytesReference, can expose the underlying byte arrays, effectively escaping the ref-count lifecycle; that means we might still access bytes after release and it would go undetected. Filling the buffer with zeros increases the chance of catching such use-after-free bugs in tests. Netty4TrashingAllocatorIT shows how a RestHandler can leak buffers.
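
The release-time behaviour boils down to roughly this (a simplified sketch of the idea in plain Netty terms, not the PR's actual allocator code; compare the release() overrides in the review diff later in this conversation):

import io.netty.buffer.ByteBuf;

// Simplified sketch: if this release is about to drop the last reference, overwrite the
// buffer first so that any use-after-free read sees zeros instead of stale request bytes
// and is much more likely to fail loudly in a test.
final class TrashingRelease {
    private TrashingRelease() {}

    static boolean trashAndRelease(ByteBuf buf) {
        if (buf.refCnt() == 1) {
            buf.setZero(0, buf.capacity()); // "trash" the whole buffer before the final release
        }
        return buf.release();
    }
}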

@mhl-b added the :Distributed Coordination/Network, Team:Distributed (Obsolete), v8.16.0, and >enhancement labels on Nov 2, 2024
@elasticsearchmachine (Collaborator):

Hi @mhl-b, I've created a changelog YAML for you.

@mhl-b marked this pull request as ready for review on November 2, 2024 01:47
@elasticsearchmachine (Collaborator):

Pinging @elastic/es-distributed (Team:Distributed)

Review thread on this diff:

* Release underlying HTTP request and related buffers.
*/
@Override
public void close() {
A Contributor commented:

Is this only for the testPipelineOverflow workaround? If so, I'd rather we used a package-private method instead of making this public and Releasable. The eventual goal would be to release these buffers up-front and then we can remove the workaround, and in the meantime we don't want to accumulate other spots that explicitly close the request.

@mhl-b (Author):

I can add a method that returns HttpBody and then we can close the body explicitly. Right now RestRequest exposes public access to close pooled/streamed content.

@DaveCTurner (Contributor) commented:

Armin raised a concern that with this change we'll be retaining these Netty buffers for much longer than we do today, which might result in much more stress on the Netty allocator (especially since the Netty buffers are oversized). Could we move to releasing these things on return from BaseRestHandler#handleRequest() already?

@elasticsearchmachine added the needs:triage label and removed the Team:Distributed (Obsolete) label on Nov 6, 2024
@mhl-b changed the title from "Allow http unsafe buffers by default" to "Explicit HTTP content copy/retain" on Nov 6, 2024
@mhl-b added the Team:Distributed Coordination label and removed the needs:triage label on Nov 6, 2024
@elasticsearchmachine (Collaborator):

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@mhl-b requested a review from DaveCTurner on November 20, 2024 23:08
@mhl-b (Author) commented Nov 20, 2024:

@DaveCTurner, I addressed the feedback in 226060e.

@DaveCTurner (Contributor) left a review:

LGTM - two optional comments that don't block this change.

@@ -102,9 +103,11 @@ public RestChannelConsumer prepareRequest(final RestRequest request, final NodeC
boolean defaultRequireDataStream = request.paramAsBoolean(DocWriteRequest.REQUIRE_DATA_STREAM, false);
bulkRequest.timeout(request.paramAsTime("timeout", BulkShardRequest.DEFAULT_TIMEOUT));
bulkRequest.setRefreshPolicy(request.param("refresh"));
ReleasableBytesReference content = request.requiredReleasableContent();
content.mustIncRef();
A Contributor commented:

nit: I think the exception handling here is correct, but I wonder if we should use the same pattern as in RestIndexAction where we call mustIncRef within the lambda, just before client.bulk. That will be more robust to future changes that might introduce more exception paths around here.
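
The suggested placement would look roughly like this (a hedged sketch; bulkResponseListener is a placeholder for the real response listener, not copied from the PR):

// Build the bulk request without retaining the pooled content: if anything throws during
// prepareRequest, the framework's own reference is still the only one and gets cleaned up.
ReleasableBytesReference content = request.requiredReleasableContent();
// ... configure bulkRequest from `content` ...

return channel -> {
    // Retain only once we are committed to the async call, immediately before client.bulk,
    // and pair it with a release that runs when the listener completes either way.
    content.mustIncRef();
    client.bulk(bulkRequest, ActionListener.releaseAfter(bulkResponseListener(channel), content));
};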

Comment on lines 375 to 377
if (refCnt() == 1) {
trashContent();
}
@DaveCTurner (Contributor) commented Nov 21, 2024:

Yeah it's not valid but it's also not obvious from the code that this isn't valid. Suggest a comment like this:

diff --git a/modules/transport-netty4/src/main/java/org/elasticsearch/transport/netty4/NettyAllocator.java b/modules/transport-netty4/src/main/java/org/elasticsearch/transport/netty4/NettyAllocator.java
index 7fd092c1964..791d55481d5 100644
--- a/modules/transport-netty4/src/main/java/org/elasticsearch/transport/netty4/NettyAllocator.java
+++ b/modules/transport-netty4/src/main/java/org/elasticsearch/transport/netty4/NettyAllocator.java
@@ -373,6 +373,7 @@ public class NettyAllocator {
         @Override
         public boolean release() {
             if (refCnt() == 1) {
+                // see [NOTE on racy trashContent() calls]
                 trashContent();
             }
             return super.release();
@@ -381,11 +382,20 @@ public class NettyAllocator {
         @Override
         public boolean release(int decrement) {
             if (refCnt() == decrement && refCnt() > 0) {
+                // see [NOTE on racy trashContent() calls]
                 trashContent();
             }
             return super.release(decrement);
         }

+        // [NOTE on racy trashContent() calls]: We trash the buffer content _before_ reducing the ref count to zero, which looks racy
+        // because in principle a concurrent caller could come along and successfully retain() this buffer to keep it alive after it's been
+        // trashed. Such a caller would sometimes get an IllegalReferenceCountException ofc but that's something it could handle - see for
+        // instance org.elasticsearch.transport.netty4.Netty4Utils.ByteBufRefCounted.tryIncRef. Yet in practice this should never happen,
+        // we only ever retain() these buffers while we know them to be alive (i.e. via RefCounted#mustIncRef or its moral equivalents) so
+        // it'd be a bug for a caller to retain() a buffer whose ref count is heading to zero and whose contents we've already decided to
+        // trash.
+
         private void trashContent() {
             if (trashed == false) {
                 trashed = true;

@@ -117,10 +120,16 @@ public RestChannelConsumer prepareRequest(final RestRequest request, final NodeC
request.getRestApiVersion()
);
} catch (Exception e) {
content.close();
A Contributor commented:

We shouldn't close the content on exception here any more.

@mhl-b (Author) replied:

oops, thank you!

@DaveCTurner (Contributor) left a review:

LGTM again

@mhl-b added the auto-backport label (automatically create backport pull requests when merged) on Nov 21, 2024
@mhl-b merged commit 7aa07f1 into elastic:main on Nov 21, 2024
16 checks passed
@elasticsearchmachine (Collaborator):

💔 Backport failed

Branch 8.18: The branch "8.18" is invalid or doesn't exist

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 116115

mhl-b added a commit to mhl-b/elasticsearch that referenced this pull request Nov 21, 2024
elasticsearchmachine pushed a commit that referenced this pull request Nov 21, 2024
* backport explicit http content copy/retain #116115

* spotless
smalyshev pushed a commit to smalyshev/elasticsearch that referenced this pull request Nov 22, 2024
alexey-ivanov-es pushed a commit to alexey-ivanov-es/elasticsearch that referenced this pull request Nov 28, 2024
Labels
auto-backport, :Distributed Coordination/Network, >enhancement, Team:Distributed Coordination, v8.18.0, v9.0.0
7 participants