Explicit HTTP content copy/retain #116115


Merged: 41 commits merged into elastic:main on Nov 21, 2024

Conversation

@mhl-b (Contributor) commented Nov 1, 2024

Today we implicitly copy HTTP content before parsing, except in two RestHandlers: RestSearchAction and RestBulkAction. The original problem comes from Netty's pooled byte buffers: Netty tries to reduce allocations and GC pressure by manual reference counting, but there is no mechanism in RestHandler that lets us keep track of the refCount, so to prevent leaking buffers we copy the whole HTTP body and let the GC take care of it.

The RestHandler#allowsUnsafeBuffers() flag was introduced to indicate that a handler will not leak the content at the end of the request, so that it can be cleaned up properly. That flag is removed in this PR.

This PR removes the implicit HTTP content copy. A RestHandler must now say explicitly in prepareRequest() how the content is to be handled: either copied or retained. The content's reference count is decremented after RestHandler.handleRequest, and the content is released if the count reaches 0.

The major change is in RestRequest and the "rest" package. A few new methods are introduced to help with content handling:

  1. RestRequest.content() allocates a new GC-managed ("safe") buffer and copies the content into it.
  2. RestRequest.requiredContent() is built on content(), so it also implies a copy.
  3. Two new counterpart methods, releasableContent() and requiredReleasableContent(), return a ReleasableBytesReference backed by a Netty ByteBuf. These are the recommended way to handle content without copying it.
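
As a rough illustration of the two styles (a hedged fragment, not code from this PR; dispatchAsync and responseListener are placeholders for whatever the handler actually does), the body of a prepareRequest() might look like this:

// Option 1 -- safe copy: content()/requiredContent() return a GC-managed ("safe") copy,
// so the handler never has to think about Netty ref counts.
BytesReference copy = request.requiredContent();
// ... parse `copy` whenever convenient, even after the request buffers are released ...

// Option 2 -- zero copy: releasableContent()/requiredReleasableContent() return a
// ReleasableBytesReference backed by the pooled Netty ByteBuf. The framework drops its own
// reference after handleRequest, so a handler that needs the bytes beyond that point must
// retain them explicitly and arrange for a matching release.
ReleasableBytesReference pooled = request.requiredReleasableContent();
return channel -> {
    pooled.mustIncRef(); // take our own reference before handing the bytes to async work
    // hypothetical dispatch; releaseAfter releases `pooled` exactly once when the listener
    // completes, whether it succeeds or fails
    dispatchAsync(pooled, ActionListener.releaseAfter(responseListener(channel), pooled));
};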

Additionally, a new ByteBufAllocator is introduced: TrashingByteBufAllocator. It allocates buffers that trash (fill with zeros) their content on release. This allocator is used only in tests, wrapping our current allocator. Our byte-buffer abstraction, ReleasableBytesReference, can expose the underlying byte arrays, effectively escaping the ref-count lifecycle; that means we might still access bytes after release and it would go undetected. Filling the buffer with zeros increases the chance of catching such use-after-free bugs in tests. Netty4TrashingAllocatorIT shows how a RestHandler can leak buffers.
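
The release-time behaviour boils down to roughly this (a simplified sketch of the idea in plain Netty terms, not the PR's actual allocator code; compare the release() overrides in the review diff later in this conversation):

import io.netty.buffer.ByteBuf;

// Simplified sketch: if this release is about to drop the last reference, overwrite the
// buffer first so that any use-after-free read sees zeros instead of stale request bytes
// and is much more likely to fail loudly in a test.
final class TrashingRelease {
    private TrashingRelease() {}

    static boolean trashAndRelease(ByteBuf buf) {
        if (buf.refCnt() == 1) {
            buf.setZero(0, buf.capacity()); // "trash" the whole buffer before the final release
        }
        return buf.release();
    }
}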

@mhl-b added the :Distributed Coordination/Network, Team:Distributed (Obsolete), v8.16.0, and >enhancement labels on Nov 2, 2024
@elasticsearchmachine (Collaborator):

Hi @mhl-b, I've created a changelog YAML for you.

@mhl-b marked this pull request as ready for review on November 2, 2024 01:47
@elasticsearchmachine (Collaborator):

Pinging @elastic/es-distributed (Team:Distributed)

Review thread on this diff:

* Release underlying HTTP request and related buffers.
*/
@Override
public void close() {
A Contributor commented:

Is this only for the testPipelineOverflow workaround? If so, I'd rather we used a package-private method instead of making this public and Releasable. The eventual goal would be to release these buffers up-front and then we can remove the workaround, and in the meantime we don't want to accumulate other spots that explicitly close the request.

@mhl-b (Author):

I can add a method that returns HttpBody and then we can close the body explicitly. Right now RestRequest exposes public access to close pooled/streamed content.

@DaveCTurner (Contributor) commented:

Armin raised a concern that with this change we'll be retaining these Netty buffers for much longer than we do today, which might result in much more stress on the Netty allocator (especially since the Netty buffers are oversized). Could we move to releasing these things on return from BaseRestHandler#handleRequest() already?

@elasticsearchmachine added the needs:triage label and removed the Team:Distributed (Obsolete) label on Nov 6, 2024
@mhl-b changed the title from "Allow http unsafe buffers by default" to "Explicit HTTP content copy/retain" on Nov 6, 2024
@mhl-b added the Team:Distributed Coordination label and removed the needs:triage label on Nov 6, 2024
@elasticsearchmachine (Collaborator):

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@mhl-b requested a review from DaveCTurner on November 20, 2024 23:08
@mhl-b (Author) commented Nov 20, 2024:

@DaveCTurner, I addressed the feedback in 226060e.

@DaveCTurner (Contributor) left a review:

LGTM - two optional comments that don't block this change.

@@ -102,9 +103,11 @@ public RestChannelConsumer prepareRequest(final RestRequest request, final NodeC
boolean defaultRequireDataStream = request.paramAsBoolean(DocWriteRequest.REQUIRE_DATA_STREAM, false);
bulkRequest.timeout(request.paramAsTime("timeout", BulkShardRequest.DEFAULT_TIMEOUT));
bulkRequest.setRefreshPolicy(request.param("refresh"));
ReleasableBytesReference content = request.requiredReleasableContent();
content.mustIncRef();
A Contributor commented:

nit: I think the exception handling here is correct, but I wonder if we should use the same pattern as in RestIndexAction where we call mustIncRef within the lambda, just before client.bulk. That will be more robust to future changes that might introduce more exception paths around here.
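
The suggested placement would look roughly like this (a hedged sketch; bulkResponseListener is a placeholder for the real response listener, not copied from the PR):

// Build the bulk request without retaining the pooled content: if anything throws during
// prepareRequest, the framework's own reference is still the only one and gets cleaned up.
ReleasableBytesReference content = request.requiredReleasableContent();
// ... configure bulkRequest from `content` ...

return channel -> {
    // Retain only once we are committed to the async call, immediately before client.bulk,
    // and pair it with a release that runs when the listener completes either way.
    content.mustIncRef();
    client.bulk(bulkRequest, ActionListener.releaseAfter(bulkResponseListener(channel), content));
};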

Comment on lines 375 to 377
if (refCnt() == 1) {
trashContent();
}
@DaveCTurner (Contributor) commented Nov 21, 2024:

Yeah it's not valid but it's also not obvious from the code that this isn't valid. Suggest a comment like this:

diff --git a/modules/transport-netty4/src/main/java/org/elasticsearch/transport/netty4/NettyAllocator.java b/modules/transport-netty4/src/main/java/org/elasticsearch/transport/netty4/NettyAllocator.java
index 7fd092c1964..791d55481d5 100644
--- a/modules/transport-netty4/src/main/java/org/elasticsearch/transport/netty4/NettyAllocator.java
+++ b/modules/transport-netty4/src/main/java/org/elasticsearch/transport/netty4/NettyAllocator.java
@@ -373,6 +373,7 @@ public class NettyAllocator {
         @Override
         public boolean release() {
             if (refCnt() == 1) {
+                // see [NOTE on racy trashContent() calls]
                 trashContent();
             }
             return super.release();
@@ -381,11 +382,20 @@ public class NettyAllocator {
         @Override
         public boolean release(int decrement) {
             if (refCnt() == decrement && refCnt() > 0) {
+                // see [NOTE on racy trashContent() calls]
                 trashContent();
             }
             return super.release(decrement);
         }

+        // [NOTE on racy trashContent() calls]: We trash the buffer content _before_ reducing the ref count to zero, which looks racy
+        // because in principle a concurrent caller could come along and successfully retain() this buffer to keep it alive after it's been
+        // trashed. Such a caller would sometimes get an IllegalReferenceCountException ofc but that's something it could handle - see for
+        // instance org.elasticsearch.transport.netty4.Netty4Utils.ByteBufRefCounted.tryIncRef. Yet in practice this should never happen,
+        // we only ever retain() these buffers while we know them to be alive (i.e. via RefCounted#mustIncRef or its moral equivalents) so
+        // it'd be a bug for a caller to retain() a buffer whose ref count is heading to zero and whose contents we've already decided to
+        // trash.
+
         private void trashContent() {
             if (trashed == false) {
                 trashed = true;

@@ -117,10 +120,16 @@ public RestChannelConsumer prepareRequest(final RestRequest request, final NodeC
request.getRestApiVersion()
);
} catch (Exception e) {
content.close();
A Contributor commented:

We shouldn't close the content on exception here any more.

@mhl-b (Author) replied:

oops, thank you!

@DaveCTurner (Contributor) left a review:

LGTM again

@mhl-b added the auto-backport label (automatically create backport pull requests when merged) on Nov 21, 2024
@mhl-b merged commit 7aa07f1 into elastic:main on Nov 21, 2024
16 checks passed
@elasticsearchmachine (Collaborator):

💔 Backport failed

Branch 8.18: The branch "8.18" is invalid or doesn't exist

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 116115

mhl-b added a commit to mhl-b/elasticsearch that referenced this pull request Nov 21, 2024
elasticsearchmachine pushed a commit that referenced this pull request Nov 21, 2024
* backport explicit http content copy/retain #116115

* spotless
smalyshev pushed a commit to smalyshev/elasticsearch that referenced this pull request Nov 22, 2024
alexey-ivanov-es pushed a commit to alexey-ivanov-es/elasticsearch that referenced this pull request Nov 28, 2024
Labels
auto-backport, :Distributed Coordination/Network, >enhancement, Team:Distributed Coordination, v8.18.0, v9.0.0
7 participants