Flink: add an option to set monitoring snapshot number #4943
Conversation
kbendick left a comment:
Thanks for this contribution @chenjunjiedada! I know that especially the case of streaming a table from the start is difficult and can cause high memory pressure.
I would also love to get @stevenzwu's input here.
private static final ConfigOption<Integer> MONITOR_SNAPSHOT_NUMBER =
    ConfigOptions.key("monitor-snapshot-number").intType().defaultValue(Integer.MAX_VALUE);
Nit: Is there perhaps a more descriptive name for this? This is the number of snapshots to consider within each monitor interval loop, correct?
Kafka has more or less the same concept in its max.poll.interval and other consumer-related configuration properties around polling.
Maybe we can take some inspiration from that naming. Thinking off the top of my head, but maybe monitor-max-snapshots-per-interval or something like that would be more instructive to the user? Given we already have monitor-interval as a configuration property as well.
cc @stevenzwu for your thoughts as well
+1 to use "max" word. How about max-snapshots-per-monitor-interval?
Yeah I’m good with that. That’s much more clear to me what that is, especially as we have a monitor-interval in the configs already.
So users will go to look up what that is - which we should ensure is documented as a follow up.
If you could please rename the code instances from monitorNumber that would be great. That name is much more descriptive.
List<List<Record>> recordsList = generateRecordsAndCommitTxn(10);

for (int monitorNumber = 1; monitorNumber < 11; monitorNumber = monitorNumber + 1) {
  ScanContext scanContext = ScanContext.builder()
      .monitorInterval(Duration.ofMillis(100))
      .monitorSnapshotNumber(monitorNumber)
      .build();
Are there any assertions we can apply (that wouldn't be too flakey) for this whole outer loop? Seems like we should have 10 splits total, correct?
Also nit on starting the for loop at 0 vs 1 if possible.
Let me try to tune the input split size and add more assertions.
Since using 0 as the max monitor number makes no sense, I plan to add a condition check in the ScanContext constructor to reject invalid configurations. How about adding that check plus an assertion that it throws a check exception? Does that make sense to you?
Yes definitely. A Precondition on an invalid configuration is much preferred. 👍
It’s always better in my opinion to throw on an invalid configuration vs try to adjust for the user’s behavior (unless we introduced the bug in which case we should take that case by case).
We should check that it’s non-negative in a Precondition check.
We should also be testing that at most 10 * max-snapshots-per-monitor-interval snapshots are processed at the end of the large loop.
That should be true, right?
Instead of asserting outside the loop, I compare the exact number of splits to the planning result inside the loop. Does that make sense to you?
Yes that makes sense. Thank you @chenjunjiedada
} else {
  List<Long> snapshotIds = SnapshotUtil.snapshotIdsBetween(table, lastSnapshotId, snapshot.snapshotId());
  if (snapshotIds.size() < scanContext.monitorSnapshotNumber()) {
    snapshotId = snapshot.snapshotId();
  } else {
    snapshotId = snapshotIds.get(snapshotIds.size() - scanContext.monitorSnapshotNumber());
  }
Can you elaborate on this logic here / walk me through an example case where snapshotId needs to be determined because it's equal to (or possibly greater than?) the monitorSnapshotNumber?
The snapshotIdsBetween function returns a list of snapshot IDs in the range lastSnapshotId (exclusive) to currentSnapshotId (inclusive), ordered by commit time descending, so the latest snapshot is the first item in the list.
Consider the following two cases:
- When monitorSnapshotNumber > snapshotIds.size(), snapshotId should be the ID of the latest snapshot.
- When monitorSnapshotNumber < snapshotIds.size(), snapshotId is computed with a reversed index because of the descending order of the list.
When monitorSnapshotNumber is equal to snapshotIds.size(), the snapshotId value is the same in the if and else blocks.
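To make the reversed-index arithmetic concrete, here is a small standalone illustration (the snapshot IDs are invented; only the indexing mirrors the logic described above):

import java.util.Arrays;
import java.util.List;

public class ReverseIndexExample {
  public static void main(String[] args) {
    // Pretend snapshotIdsBetween(...) returned these IDs, newest first (index 0 = current snapshot).
    List<Long> snapshotIds = Arrays.asList(105L, 104L, 103L, 102L, 101L);
    int monitorSnapshotNumber = 2;

    // size - monitorSnapshotNumber = 5 - 2 = 3, which selects 102L; consuming up to 102
    // covers exactly the two oldest unconsumed snapshots (101 and 102).
    long snapshotId = snapshotIds.get(snapshotIds.size() - monitorSnapshotNumber);
    System.out.println(snapshotId); // prints 102
  }
}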
Instead, I suggest using lastSnapshotId and monitorSnapshotNumber directly to get to the destination snapshot. Specifically, there are two ways:
- Use lastSnapshotId and monitorSnapshotNumber to get snapshotId, instead of getting currentSnapshotId and comparing.
- Use lastSnapshotId and monitorSnapshotNumber directly to get the snapshot list.
The above may require modifications to the core module, but I think it will make the logic more intuitive.
I’m thinking through these situations now.
But users would never enter this block if they don’t opt into the new behavior, is that correct?
If so, can we add a conditional so that this block won't be entered unless the user has a non-INT_MAX value (e.g. we add a Precondition whose message starts with [bug] this shouldn't happen <because>)? We have one or two other places that use the same [bug] syntax, and this new logic change would ideally not apply to users who keep the default behavior unless they encounter a bug.
Just to be extra cautious. Or for my own understanding while I review these scenarios 🙂
I do like the idea of using the snapshot ID though generally speaking (in this case I need to review).
However, it's possible to turn off snapshot ID inheritance. So we'd need to consider that.
EDIT - We have assertions on snapshot ID inheritance within this class already so that’s fair to consider imo 👍
Also, can we make this its own method? It should be skipped entirely if the user doesn't have a configured value (e.g. they have INT_MAX). They don't need any of this processing.
Since we already consider the inheritance in state initialization, do we need another check?
@hililiwei I agree computing using lastConsumedSnapshotId and maxSnapshotsPerMonitorInterval is more accurate. But I haven't found direct methods or utils to compute that; maybe we need more utils in SnapshotUtil as you said. Can we have cases in which the last consumed snapshot is not an ancestor of the latest snapshot?
I wonder if that would be the case after supporting Branch and Tag feature?
I'm not sure whether it can route to different branches without changing table states or properties. Like git, if we don't explicitly checkout a branch we should be able to traverse the history, right?
The numbers in the photos do look really good, but I'd be interested in seeing the same screenshot with monitor snapshot number applied vs not applied at around the same time into the job.
}

Builder monitorSnapshotNumber(int newMonitorSnapshotNumber) {
  this.monitorSnapshotNumber = newMonitorSnapshotNumber;
Should we add a check here? It should be greater than 0, right?
+1. A precondition check. And INT_MAX can be the value that disables this behavior.
Or a negative value (eg -1). A negative value is probably more in-line with what we typically do and more cross-language friendly (as Iceberg table format is a specification first and foremost… it should be able to be rewritten in a language that doesn’t have JVM INT_MAX).
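A minimal sketch of the precondition being discussed for this builder method (it assumes the relocated Guava Preconditions class already used across Iceberg is importable here; the message text is illustrative):

Builder monitorSnapshotNumber(int newMonitorSnapshotNumber) {
  // Fail fast on non-positive values instead of silently adjusting the user's configuration.
  Preconditions.checkArgument(newMonitorSnapshotNumber > 0,
      "The max monitor snapshot number must be positive: %s", newMonitorSnapshotNumber);
  this.monitorSnapshotNumber = newMonitorSnapshotNumber;
  return this;
}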
public void run(SourceContext<FlinkInputSplit> ctx) throws Exception {
  this.sourceContext = ctx;
  while (isRunning) {
    LOG.info("Start polling snapshots from snapshot id: {}, monitor snapshot number {}", lastSnapshotId,
Use debug to prevent too many logs?
Yeah this could get very excessive very quickly.
This should be a debug log as users can opt in via their logging configuration if need be.
It might be a good idea (in a follow up) to make a metric that monitors how many snapshots are processed per monitor interval. Also, since this grabs the checkpoint lock, will it possibly not be an even multiple when the checkpoint happens? If so, we should add a metric that tracks that as well (again, in a follow up). Does that interest you @hililiwei? Or @chenjunjiedada? If so, feel free to make the ticket and we’ll deal with that as soon as we can 🙂
  return latestSnapshotId;
} else {
  // This doesn't consider snapshot inheritance since it is already checked in state initialization.
  return snapshotIds.get(snapshotIds.size() - maxSnapshotsPerMonitorInterval);
This logic seems incorrect to me. SnapshotUtil.snapshotIdsBetween returns the snapshot ids in the reverse order (most recent snapshot first). I think we need to use the reversed list.
I got it now; it is actually correct. We should improve the comment: because it is a reversed list, (list size - maxSnapshotsPerMonitorInterval) will actually point to the snapshot that results in a (fromSnapshotId, toSnapshotId] range containing maxSnapshotsPerMonitorInterval snapshots.
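One possible rewording of that comment, shown in place (a sketch mirroring the snippet above, not the merged code):

} else {
  // snapshotIdsBetween returns (lastSnapshotId, latestSnapshotId] ordered newest-first, so the
  // element at (size - maxSnapshotsPerMonitorInterval) is the end snapshot of an incremental
  // scan covering exactly maxSnapshotsPerMonitorInterval of the oldest unconsumed snapshots.
  return snapshotIds.get(snapshotIds.size() - maxSnapshotsPerMonitorInterval);
}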
    monitorAndForwardSplits();
  }
}
LOG.debug("Forwarded splits from {}(exclusive) to {}(inclusive), time elapsed {}ms",
I think it is better to log the durations for the two steps separately if we want to have better understanding of the bottleneck
The logging seems wrong to me. both startSnapshotId and lastSnapshotId are pointing to the same snapshotId. Moving the logging inside maxReachableSnapshotId, we can correctly apply the endSnapshotId calculated from maxReachableSnapshotId.
lastSnapshotId is updated in monitorAndForwardSplits. Does that make sense to you?
Yes, I missed that lastSnapshotId was updated by monitorAndForwardSplits. It also means the code is difficult to read: we are relying on the side effect of the monitorAndForwardSplits method, and it is clearer to avoid such side effects. Plus, it is better to measure the latency separately: planning vs emitting.
@chenjunjiedada I prefer we address the above comment regarding logging
Will do.
 */
public class FlinkConfigOptions {

  public static final int MAX_SNAPSHOTS_PER_MONITOR_INTERVAL_DEFAULT = -1;
We can default it to Integer.MAX_VALUE; then there is no need to check the default value.
@stevenzwu , Kyle suggested using -1 here: #4943 (comment). I think that makes sense to me as well.
I see ScanSummary also uses Integer.MAX_VALUE. A -1 default seems to be used mostly for invalid IDs (like snapshotId, fieldId, etc.):
private int limit = Integer.MAX_VALUE;
@kbendick What do you think for this?
I’m ok with INT_MAX if that’s what’s used elsewhere in ScanSummary. I suggested -1 as we use -1 to disable cache expiration in the caching catalog and because it’s easier to use in things like Python where INT_MAX is less commonly used. -1 is also easier to use as in-line SQL option imo.
But if INT_MAX is already used in the ScanSummary, I suggest we keep that consistent.
We can then later on consider using -1 in both places within Flink (or more places). But consistency is better imo.
In general, I think this is the right direction. It is not mutually exclusive with PR #4911 (reduce the checkpoint lock scope); we should have both. This is focused on making the plan smaller and faster, while PR #4911 avoids holding the lock beyond what is actually necessary. For the new FLIP-27 source, I have been thinking about something very similar. There is no point in eagerly discovering all splits/snapshots if the Flink job is falling behind too much. We need to throttle the split discovery. In addition to limiting the number of snapshots per discovery cycle, I am also thinking that we should pause/skip the split discovery if the number of pending splits is over a certain threshold. It is like a backpressure mechanism and can help control the memory footprint. This won't be in the MVP version of the FLIP-27 source; we can follow up on the optimization after the MVP version is merged.
    .maxSnapshotsPerMonitorInterval(maxSnapshotsNum)
    .build();

FlinkInputSplit[] expectedSplits = FlinkSplitPlanner
I think we should directly define the number of expected splits and avoid using FlinkSplitPlanner. If FlinkSplitPlanner is not honoring maxSnapshotsPerMonitorInterval, the assertion later will still pass.
FlinkSplitPlanner here is using the same scanContext as StreamingMonitorFunction, will it still produce different splits?
If FlinkSplitPlanner didn't honor the maxSnapshotsPerMonitorInterval option correctly, this unit test won't be able to detect it. Both the expected and actual splits use the same planner and will generate the same planning result
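A sketch of the kind of independent assertion being suggested; the variable names (maxSnapshotsNum, sourceContext.splits) are illustrative, and it assumes each of the 10 commits yields exactly one split:

// Derive the expectation from the test setup itself rather than from FlinkSplitPlanner,
// so a planner that ignores the option would make this assertion fail.
int expectedSplitCount = Math.min(maxSnapshotsNum, 10);
Assert.assertEquals("Unexpected number of discovered splits",
    expectedSplitCount, sourceContext.splits.size());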
    ConfigOptions.key("include-column-stats").booleanType().defaultValue(false);

private static final ConfigOption<Integer> MAX_SNAPSHOTS_PER_MONITOR_INTERVAL =
    ConfigOptions.key("max-snapshot-per-monitor-interval").intType()
it will be more clear if the config is defined as max-snapshot-count-per-monitor-interval or max-snapshot-count-per-incremental-scan
Sounds interesting. If you don't mind, I'll try to follow up.
@kbendick @stevenzwu @hililiwei I rebased and addressed comments, PTAL.
public void testConsumeWithMaxSnapshotCountPerMonitorInterval() throws Exception {
  List<List<Record>> recordsList = generateRecordsAndCommitTxn(10);

  final ScanContext scanContext1 = ScanContext.builder()
it is better to move the invalid config to a separate test method
Done.
TestSourceContext sourceContext = new TestSourceContext(latch);
runSourceFunctionInTask(sourceContext, function);
// Ensure the first loop in monitoring finished
Thread.sleep(100);
Depending on sleep can lead to flaky test. We can probably wait and check the condition of expected splits on sourceContext.splits
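A sketch of polling for the condition instead of a fixed sleep (sourceContext.splits and expectedSplits are illustrative names; a library such as Awaitility could be used instead of the manual loop):

// Wait until the expected splits show up or a generous deadline passes, then assert once.
long deadline = System.currentTimeMillis() + 30_000L;
while (sourceContext.splits.size() < expectedSplits && System.currentTimeMillis() < deadline) {
  Thread.sleep(10);
}
Assert.assertEquals(expectedSplits, sourceContext.splits.size());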
  }
}

Assert.assertTrue("Should have expected elements.",
I feel it is unnecessary to wait for all the snapshots. Should we make monitorAndForwardSplits package-private so that it can be called directly in this test class? We can use AbstractStreamOperatorTestHarness#getOperator.
We just need to call monitorAndForwardSplits once and verify it limits the number of discovered snapshots/splits to maxSnapshotsNum. We can just take a few values of maxSnapshotsNum (like 3, 9, 12); there is no need to loop from 1 to 15.
@stevenzwu I removed lock/wait/sleep logic and kept the loop from 1 to 15 since the total execution time of the unit test is just a few seconds.
A few seconds is long for a unit test. We just need to test the three scenarios of maxSnapshotsNum: less, equal, greater.
Updated.
  newScanContext = scanContext.copyWithSnapshotId(snapshotId);
  LOG.debug("Start generating splits for {}", snapshotId);
} else {
  if (scanContext.maxSnapshotCountPerMonitorInterval() ==
Related to an earlier comment: if the default value is Integer.MAX_VALUE, we won't need the if-else.
Updated.
        scanContext.maxSnapshotCountPerMonitorInterval());
  }
  newScanContext = scanContext.copyWithAppendsBetween(lastSnapshotId, snapshotId);
  LOG.debug("Start generating splits from {}(exclusive) to {}(inclusive),", lastSnapshotId, snapshotId);
Nit: why not move the debug log outside the if-else block? To me, it is ok to use the same log line for the if case too.
Also maybe is "discover" more accurate than "generate" in the log lines?
Do you mean it is ok to show from -1 to $snapshotId?
Done.
// Use the oldest snapshot as starting to avoid the initial case.
long oldestSnapshotId = SnapshotUtil.oldestAncestor(table).snapshotId();

ScanContext scanContext3 = ScanContext.builder()
Nit / non-blocking: As these are in different functions, do they need to be named scanContext1 ... scanContext3? They aren't in the same scope, are they?
This doesn't need to be changed, but for my own understanding.
Not necessary actually, updated.
@Test
public void testInvalidMaxSnapshotCountPerMonitorInterval() {
  final ScanContext scanContext1 = ScanContext.builder()
Nit: We normally don't use final for variables inside of methods. Is this usage of final necessary?
Not necessary, removed.
kbendick left a comment:
A few non-blocking nits but overall this looks good to me. Thanks @chenjunjiedada!
private void monitorAndForwardSplits() {
private long maxReachableSnapshotId(long lastConsumedSnapshotId, long latestSnapshotId,
    int maxSnapshotCountPerMonitorInterval) {
  // This doesn't consider snapshot inheritance since it is already checked in state initialization.
What do you mean by "snapshot inheritance" here?
The latest table snapshot id might not be the ancestor of the last consumed one. Let me delete this comment to avoid confusion.
private void monitorAndForwardSplits() {
private long maxReachableSnapshotId(long lastConsumedSnapshotId, long latestSnapshotId,
    int maxSnapshotCountPerMonitorInterval) {
I think this should be maxSnapshots
Changed to snapshotLimit.
  newScanContext = scanContext.copyWithAppendsBetween(lastSnapshotId, snapshotId);
}

LOG.debug("Start discovering splits from {}(exclusive) to {}(inclusive),", lastSnapshotId, snapshotId);
This is missing spaces between the snapshot IDs and the inclusive or exclusive clarification.
    ConfigOptions.key("include-column-stats").booleanType().defaultValue(false);

private static final ConfigOption<Integer> MAX_SNAPSHOT_COUNT_PER_MONITOR_INTERVAL =
    ConfigOptions.key("max-snapshot-count-per-monitor-interval").intType().defaultValue(Integer.MAX_VALUE);
I don't think it is very clear what this is setting from the name. Not many people know Flink internals well enough to understand what the "monitor interval" is. Is there a simpler name?
What about "max-planning-group-size" or "snapshot-group-limit"?
max-planning-group-size lacks information about the group items, snapshot-group-limit looks better to me. "group" is more concise than "per-monitor-interval".
I agree with @chenjunjiedada that group is vague. what about max-planning-snapshot-count?
I meant group is better than per-monitor-interval. I'm OK with both since I think we definitely need a doc to describe what it is and how it impacts the planning and the backpressure.
I find snapshot-group-limit less intuitive; what does "group" mean in this context? monitor-interval is an existing config that users are already familiar with.
I do agree that we should make the config name clear. Just brainstorming more names. what about max-snapshot-count-per-incremental-scan or max-snapshot-count-per-planning?
    ConfigOptions.key("include-column-stats").booleanType().defaultValue(false);

private static final ConfigOption<Integer> SNAPSHOT_GROUP_LIMIT =
    ConfigOptions.key("snapshot-group-limit").intType().defaultValue(Integer.MAX_VALUE);
I talked with @stevenzwu about this and the best name we could come up with was max-planning-snapshot-count. What do you think, @chenjunjiedada? Could you rename this?
+1 to this name. Config keys should ideally be concise as well as being as short as possible and max-planning-snapshot-count achieves that.
The documentation can potentially use the language Maximum number of snapshots to consume and plan per group in each iteration of an incremental scan or something similar (might need to work on that language too but the language from the other ideas can be used in the docs possibly).
Done.
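For reference, the resulting option definition might look roughly like this (a sketch; the description text is illustrative, not taken from the PR):

private static final ConfigOption<Integer> MAX_PLANNING_SNAPSHOT_COUNT =
    ConfigOptions.key("max-planning-snapshot-count")
        .intType()
        .defaultValue(Integer.MAX_VALUE)
        .withDescription("Maximum number of snapshots to plan in a single incremental scan per monitor interval.");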
}

private void monitorAndForwardSplits() {
private long maxReachableSnapshotId(long lastConsumedSnapshotId, long latestSnapshotId, int maxSnapshotCount) {
nit: maybe call the args fromSnapshotIdExclusive and toSnapshotIdInclusive to be consistent with recent incremental API naming
actually maybe call this method as toSnapshotIdInclusive(long lastConsumedSnapshotId, long currentSnapshotId, int maxPlanningSnapshotCount)
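Putting the naming suggestions together, the helper could end up looking roughly like the sketch below (illustrative, not the merged code):

private long toSnapshotIdInclusive(long fromSnapshotIdExclusive, long currentSnapshotId,
                                   int maxPlanningSnapshotCount) {
  List<Long> snapshotIds = SnapshotUtil.snapshotIdsBetween(table, fromSnapshotIdExclusive, currentSnapshotId);
  if (snapshotIds.size() <= maxPlanningSnapshotCount) {
    // Few enough new snapshots: consume all of them in this cycle.
    return currentSnapshotId;
  } else {
    // The list is ordered newest-first, so this index caps the incremental scan at
    // maxPlanningSnapshotCount of the oldest unconsumed snapshots.
    return snapshotIds.get(snapshotIds.size() - maxPlanningSnapshotCount);
  }
}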
private void monitorAndForwardSplits() {
private long maxReachableSnapshotId(long lastConsumedSnapshotId, long latestSnapshotId, int maxSnapshotCount) {
  List<Long> snapshotIds = SnapshotUtil.snapshotIdsBetween(table, lastConsumedSnapshotId, latestSnapshotId);
nit: we add empty line after control block (not before)
Thanks, @chenjunjiedada! Nice work.
This adds an option to control how many snapshots to monitor at once when using iceberg table as a Flink source.
Currently, the monitor operator generates file splits from the last consumed snapshot to the latest snapshot, which may lead to backpressure when the consumer lags behind, as the following image shows. We can reduce the checkpoint lock scope (#4911) or increase the network buffers to mitigate the situation, but the problem still cannot be completely avoided since the number of splits is unknown, especially when starting a consumer for the first time.

With the option, the user can tune the monitoring flow according to backpressure and busy metrics.
