Core: Add param to limit manifest parallel reader queue size #7844
Conversation
Force-pushed 982cf30 to 45493ed (compare)
    import org.apache.iceberg.relocated.com.google.common.collect.Iterables;

    public class ParallelIterable<T> extends CloseableGroup implements CloseableIterable<T> {
      public static final int MANIFEST_READER_QUEUE_SIZE =
Why do we need this, and why expose it as public?
     * The size of the queue in ParallelIterable. This queue limits the memory usage of manifest
     * reader.
     */
    public static final ConfigEntry<Integer> MANIFESTS_READER_QUEUE_SIZE =
ParallelIterable is a common util; is this config targeted at manifest reading only?
I only found it used for reading manifests.
It doesn't matter that it is only used for one purpose right now. The iterable should be kept generic by passing in configuration.
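A minimal sketch of what passing the configuration in could look like: the queue bound becomes a per-instance constructor argument, so the utility stays generic and the manifest-reading call site supplies its manifest-specific value. The class and constructor shown here are illustrative, not the PR's actual code.

```java
import java.util.concurrent.ExecutorService;

// Illustrative only: a ParallelIterable-like class that takes its queue bound per instance
// instead of reading a global static config.
class BoundedIterableExample<T> {
  private final Iterable<? extends Iterable<T>> iterables;
  private final ExecutorService workerPool;
  private final int maxQueueSize;

  BoundedIterableExample(
      Iterable<? extends Iterable<T>> iterables, ExecutorService workerPool, int maxQueueSize) {
    this.iterables = iterables;
    this.workerPool = workerPool;
    this.maxQueueSize = maxQueueSize; // per-instance bound instead of a process-wide constant
  }
}

// The manifest-reading call site would then pass its own setting, e.g. (using the config
// entry introduced in this PR's diff):
//   new BoundedIterableExample<>(readers, pool, SystemConfigs.MANIFESTS_READER_QUEUE_SIZE.value());
```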
    private final Future<?>[] taskFutures;
    private final ConcurrentLinkedQueue<T> queue = new ConcurrentLinkedQueue<>();
    private final LinkedBlockingQueue<T> queue =
        new LinkedBlockingQueue<>(MANIFEST_READER_QUEUE_SIZE);
This config seems to affect all the instances.
Iceberg does not seem to provide a more flexible configuration mechanism than environment variables. Can we turn this into a table-level configuration?
@Heltman, we typically avoid system or environment config. A more appropriate place for this is in the engine's config. Then it can pass that configuration down. For example, Flink passes threadpools into the scan API rather than using the common worker pool.
How do I pass a thread pool to Iceberg from Trino? Trino only uses plan() to get FileScanTasks.
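For reference, a minimal sketch of what passing a pool into the scan API can look like from an engine's planning code. It assumes the scan API's planWith(ExecutorService) method; the pool name and size are made-up values, not Trino's actual configuration.

```java
import java.util.concurrent.ExecutorService;
import org.apache.iceberg.FileScanTask;
import org.apache.iceberg.Table;
import org.apache.iceberg.io.CloseableIterable;
import org.apache.iceberg.util.ThreadPools;

class ScanPlanningSketch {
  CloseableIterable<FileScanTask> planTasks(Table table) {
    // a dedicated planning pool instead of the common worker pool
    ExecutorService planningPool = ThreadPools.newWorkerPool("engine-iceberg-planning", 8);
    return table.newScan()
        .planWith(planningPool) // the engine-owned pool is used for parallel manifest reading
        .planFiles();
  }
}
```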
Force-pushed 45493ed to ac85817 (compare)
        (iterable instanceof Closeable) ? (Closeable) iterable : () -> {}) {
      for (T item : iterable) {
        queue.add(item);
        queue.put(item);
@ConeyLiu I found that future.cancel can't exit this loop, because adding to a ConcurrentLinkedQueue never blocks, so the interrupt is never observed. We need to check for close here to avoid a memory leak.
Actually, I found that when Trino kills a query, this loop keeps adding to the queue until it finishes.
We need to use LinkedBlockingQueue, or check whether the iterator is closed, or both, like below:
for (T item : iterable) {
  if (closed) {
    queue.clear();
    return;
  }
  queue.put(item);
}
@Heltman can you please add a few more tests to TestParallelIterable that would exercise this exact condition you're describing?
Close already clears the queue and cancels tasks. I don't think that we need to modify this.
Cancel will not work if we use ConcurrentLinkedQueue. If we don't change to LinkedBlockingQueue, we need to check closed every time before we add to the queue.
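A small sketch (not the PR's code) of the behavior difference under discussion: put() on a bounded LinkedBlockingQueue blocks when the queue is full, so Future.cancel(true) can interrupt the producer, while ConcurrentLinkedQueue.add() never blocks and therefore never observes the interrupt.

```java
import java.util.concurrent.LinkedBlockingQueue;

class ProducerTask<T> implements Runnable {
  private final Iterable<T> source;
  private final LinkedBlockingQueue<T> queue; // bounded, e.g. new LinkedBlockingQueue<>(10_000)

  ProducerTask(Iterable<T> source, LinkedBlockingQueue<T> queue) {
    this.source = source;
    this.queue = queue;
  }

  @Override
  public void run() {
    try {
      for (T item : source) {
        queue.put(item); // blocks when full; exits promptly via interrupt on cancel
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // cancelled: stop producing
    }
  }
}
```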
Force-pushed ac85817 to 9014674 (compare)
    } catch (IOException e) {
      throw new RuntimeIOException(e, "Failed to close iterable");
    } catch (InterruptedException e) {
      throw new RuntimeException(
Do we actually want to throw here or just log? It seems like you'll have to deal with an unnecessary/non-actionable exception when you're really just trying to cancel the iteration. Maybe just turn this into an info log message?
@danielcweeks If we have already closed this iterator, it doesn't matter whether we throw or log. I didn't use a log only because this class doesn't have a logger; I just followed the existing style. Maybe we can add a logger to this class.
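A minimal sketch, assuming an SLF4J logger were added to the class, of the "log instead of throw" handling being discussed for InterruptedException:

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class InterruptHandling {
  private static final Logger LOG = LoggerFactory.getLogger(InterruptHandling.class);

  static void onInterrupted(InterruptedException e) {
    // restore the flag so callers still see that the thread was interrupted
    Thread.currentThread().interrupt();
    // the interruption usually just means the iteration was cancelled, so log at info
    LOG.info("Iteration interrupted; treating as cancellation", e);
  }
}
```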
One minor comment but otherwise looks good to me. @nastra thoughts?
nastra left a comment
My biggest concern right now is that there aren't enough tests in TestParallelIterable to raise confidence in the changes being proposed.
    import org.apache.iceberg.relocated.com.google.common.collect.Iterables;

    public class ParallelIterable<T> extends CloseableGroup implements CloseableIterable<T> {
      public static final int ITERATOR_QUEUE_SIZE = SystemConfigs.ITERATOR_QUEUE_SIZE.value();
any particular reason to make this public? This doesn't seem to be used anywhere outside of this class. Also you probably could just use SystemConfigs.ITERATOR_QUEUE_SIZE.value() directly in L59
It's a mistake.
     */
    public static final ConfigEntry<Integer> ITERATOR_QUEUE_SIZE =
        new ConfigEntry<>(
            "iceberg.iterator.queue-size",
the config naming implies (at least to me) that this is being applied to all iterators being used in Iceberg, which isn't the case
This class cannot use a blocking queue with the worker pool, so I'm -1 on this change. The problem is that planning uses a shared threadpool. Using a blocking queue would cause tasks to stall, which would then tie up the threads in the shared pool and cause all planning to halt. If you want to limit memory consumption here, then you need to add a new variant of this class that bounds its queue and runs its producers on a separate pool, rather than changing this one.
Once that new variant is in, we can look at how to use it from the scan API. Feel free to contact me to review this, since this is a part of the code where bad changes can cause a lot of trouble!
rdblue left a comment
See my comment above.
We have been using a blocking queue and shared thread pool for a while. We did hit a "deadlock" issue when running multi-stage queries with Trino, because the Iceberg split source generates splits synchronously. We fixed it by making it async, and so far everything works fine. I think using an isolated pool would be safer, but it may hurt performance.
I will add some changes to fix the memory leak, and think about creating a BlockingParallelIterable instead of changing ParallelIterable.
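A rough sketch of what such a BlockingParallelIterable-style variant could look like: a bounded queue fed by a dedicated producer pool, so a full queue blocks only its own threads and never the shared planning pool. All names and the exact structure here are illustrative assumptions, not an agreed design.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedParallelIterable<T> implements Iterable<T> {
  private final Iterable<? extends Iterable<T>> sources;
  private final int queueSize;
  private final int producerThreads;

  BoundedParallelIterable(Iterable<? extends Iterable<T>> sources, int queueSize, int producerThreads) {
    this.sources = sources;
    this.queueSize = queueSize;
    this.producerThreads = producerThreads;
  }

  @Override
  public Iterator<T> iterator() {
    LinkedBlockingQueue<T> queue = new LinkedBlockingQueue<>(queueSize);
    // dedicated pool: producers blocked on a full queue tie up only these threads
    ExecutorService pool = Executors.newFixedThreadPool(producerThreads);
    AtomicInteger running = new AtomicInteger(0);

    for (Iterable<T> source : sources) {
      running.incrementAndGet();
      pool.submit(() -> {
        try {
          for (T item : source) {
            queue.put(item); // blocks when the bounded queue is full
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          running.decrementAndGet();
        }
      });
    }
    pool.shutdown();

    return new Iterator<T>() {
      private T next = null;

      @Override
      public boolean hasNext() {
        try {
          while (next == null && (running.get() > 0 || !queue.isEmpty())) {
            next = queue.poll(10, TimeUnit.MILLISECONDS);
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return false;
        }
        return next != null;
      }

      @Override
      public T next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        T item = next;
        next = null;
        return item;
      }
    };
  }
}
```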
Good point.
From the Trino perspective it would be most convenient to be able to provide an
I created a PR aiming to make the queue bounded, but without requiring a separate executor pool. The change is effectively transparent to class consumers. Please see #10691 and let me know what you think of that approach.
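For context, a sketch of one general way to bound the queue without blocking shared-pool threads: a producer task that sees a full queue returns instead of blocking, and the consumer resubmits it once there is room. This is only an illustration of the idea; it is not necessarily how #10691 implements it.

```java
import java.util.Iterator;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

class YieldingProducer<T> implements Runnable {
  private final Iterator<T> source;
  private final ConcurrentLinkedQueue<T> queue;
  private final int maxQueueSize;
  private final AtomicBoolean finished = new AtomicBoolean(false);

  YieldingProducer(Iterator<T> source, ConcurrentLinkedQueue<T> queue, int maxQueueSize) {
    this.source = source;
    this.queue = queue;
    this.maxQueueSize = maxQueueSize;
  }

  @Override
  public void run() {
    while (source.hasNext()) {
      if (queue.size() >= maxQueueSize) {
        return; // yield the shared-pool thread; the consumer resubmits this task later
      }
      queue.offer(source.next());
    }
    finished.set(true);
  }

  boolean isFinished() {
    return finished.get();
  }

  // Consumer side: once the queue has drained below the bound, re-submit unfinished producers.
  static <T> Future<?> resubmitIfNeeded(
      ExecutorService pool, YieldingProducer<T> producer, ConcurrentLinkedQueue<T> queue, int maxQueueSize) {
    if (!producer.isFinished() && queue.size() < maxQueueSize) {
      return pool.submit(producer);
    }
    return null;
  }
}
```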
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.
This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.
fixes #7843