@manuzhang
Member

@manuzhang manuzhang commented Jul 29, 2024

fixes #10480
fixes #10569

@github-actions github-actions bot added the spark label Jul 29, 2024
@manuzhang manuzhang closed this Jul 30, 2024
@manuzhang manuzhang reopened this Jul 30, 2024
@manuzhang manuzhang closed this Jul 30, 2024
@manuzhang manuzhang reopened this Jul 30, 2024
@manuzhang manuzhang closed this Jul 30, 2024
@manuzhang manuzhang reopened this Jul 30, 2024
@manuzhang
Member Author

This PR didn't fix the flaky test; instead it exposed more underlying errors, as seen in https://github.com/apache/iceberg/actions/runs/10156307633/job/28084490263?pr=10811:

TestDataFrameWrites > testFaultToleranceOnWrite() > format = avro FAILED
    org.apache.commons.io.IOExceptionList: 1 exception(s): [org.apache.commons.io.IOIndexedException: IOException #1: Cannot delete file: /tmp/junit60135975027943685/parquet/test/data]
        at org.apache.commons.io.IOExceptionList.checkEmpty(IOExceptionList.java:49)
        at org.apache.commons.io.function.IOStream.forAll(IOStream.java:352)
        at org.apache.commons.io.function.IOStreams.forAll(IOStreams.java:42)
        at org.apache.commons.io.function.IOStreams.forAll(IOStreams.java:36)
        at org.apache.commons.io.function.IOConsumer.forAll(IOConsumer.java:80)
        at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:333)
        at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1192)
        at org.apache.iceberg.spark.source.TestDataFrameWrites.testFaultToleranceOnWrite(TestDataFrameWrites.java:424)

        Caused by:
        org.apache.commons.io.IOIndexedException: IOException #1: Cannot delete file: /tmp/junit60135975027943685/parquet/test/data
            at org.apache.commons.io.function.IOStream.lambda$forAll$11(IOStream.java:347)
            at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
            at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647)
            at org.apache.commons.io.function.IOStream.forAll(IOStream.java:338)
            ... 6 more

            Caused by:
            java.io.IOException: Cannot delete file: /tmp/junit60135975027943685/parquet/test/data
                at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:1340)
                at org.apache.commons.io.function.IOStream.lambda$forAll$11(IOStream.java:340)
                ... 9 more

                Caused by:
                java.nio.file.NoSuchFileException: /tmp/junit60135975027943685/parquet/test/data/.00008-134-fed69c74-d8da-41c3-937f-ea0ab632d7d0-0-00001.parquet.crc
                    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
                    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
                    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
                    at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
                    at sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
                    at sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
                    at java.nio.file.Files.readAttributes(Files.java:1737)
                    at java.nio.file.FileTreeWalker.getAttributes(FileTreeWalker.java:219)
                    at java.nio.file.FileTreeWalker.visit(FileTreeWalker.java:276)
                    at java.nio.file.FileTreeWalker.next(FileTreeWalker.java:372)
                    at java.nio.file.Files.walkFileTree(Files.java:2706)
                    at java.nio.file.Files.walkFileTree(Files.java:2742)
                    at org.apache.commons.io.file.PathUtils.visitFileTree(PathUtils.java:1654)
                    at org.apache.commons.io.file.PathUtils.deleteDirectory(PathUtils.java:517)
                    at org.apache.commons.io.file.PathUtils.delete(PathUtils.java:476)
                    at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:1337)
                    ... 10 more
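
One reading of the innermost cause: the `NoSuchFileException` is raised while the tree walker reads attributes of a `.crc` file, which suggests the file disappeared between the directory listing and the attribute lookup, i.e. something else was still removing or renaming files under the directory. A minimal sketch of a recursive delete that tolerates this case, assuming plain `java.nio.file` rather than commons-io (the class and method names `TolerantDelete` and `deleteTree` are hypothetical and not part of this PR):

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper, for illustration only.
public class TolerantDelete {

  // Recursively deletes `root`, skipping entries that vanish while the walk is in progress.
  public static void deleteTree(Path root) throws IOException {
    if (Files.notExists(root)) {
      return;
    }
    Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
      @Override
      public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
        // deleteIfExists does not fail if a concurrent actor already removed the file.
        Files.deleteIfExists(file);
        return FileVisitResult.CONTINUE;
      }

      @Override
      public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
        // The entry was listed but is gone by the time its attributes are read.
        if (exc instanceof NoSuchFileException) {
          return FileVisitResult.CONTINUE;
        }
        throw exc;
      }

      @Override
      public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
        if (exc != null) {
          throw exc;
        }
        Files.deleteIfExists(dir);
        return FileVisitResult.CONTINUE;
      }
    });
  }
}
```

This only hides the symptom for vanished files; if a writer is still creating files, the final directory delete can still fail, so it does not replace finding out what is touching the directory.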

@manuzhang
Member Author

@Fokko @dramaticlly PTAL. This PR now proactively deletes the temporary directory in the test, catching exceptions and retrying until the directory is cleaned up.
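
For readers skimming the thread, the retry idea can be sketched roughly as follows. This is not the code in the PR: `RetryingDelete`, `deleteWithRetry`, and the attempt and sleep parameters are hypothetical, and the sketch bounds the number of attempts so a test cannot spin forever.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, for illustration only.
public class RetryingDelete {

  // Tries to delete `dir` recursively, retrying after a failure.
  // Returns true if the directory no longer exists when the method returns.
  public static boolean deleteWithRetry(Path dir, int maxAttempts, long sleepMillis)
      throws InterruptedException {
    for (int attempt = 1; attempt <= maxAttempts && Files.exists(dir); attempt++) {
      try {
        deleteRecursively(dir);
      } catch (IOException | UncheckedIOException e) {
        // Another thread may still be touching files under `dir`; wait and try again.
        Thread.sleep(sleepMillis);
      }
    }
    return !Files.exists(dir);
  }

  private static void deleteRecursively(Path dir) throws IOException {
    List<Path> ordered;
    try (Stream<Path> paths = Files.walk(dir)) {
      // Reverse order puts children before their parent directories.
      ordered = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path path : ordered) {
      Files.deleteIfExists(path);
    }
  }
}
```

A test would call something like `deleteWithRetry(tempDir, 10, 100L)` and assert on the returned flag, which makes a cleanup that never succeeds show up as a failure instead of a hang.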

@manuzhang
Member Author

manuzhang commented Sep 11, 2024

cc @nastra please help review

assertThat(snapshotBeforeFailingWrite).isEqualTo(snapshotAfterFailingWrite);
assertThat(resultBeforeFailingWrite).isEqualTo(resultAfterFailingWrite);

while (location.exists()) {
Contributor

This seems like a workaround. https://stackoverflow.com/questions/56290320/junit-cannot-delete-tempdir-with-file-created-by-spark-structured-streaming indicates that the issue might be due to a resource not being closed, so maybe we should investigate the actual root cause of this issue.

Contributor

@Fokko Fokko left a comment

I'm seeing this failure quite often. I would suggest merging this and creating an issue to track the underlying problem.

Successfully merging this pull request may close these issues.

[SPARK] Fix flakey test
Flaky test due to failing to delete temp directory

5 participants