test: add test case 'TestResetAfterWrite' #520



ahrtr
Member

@ahrtr ahrtr commented Jun 1, 2023

There are a couple of data corruption cases in which some pages are reset; in other words, all data in those pages are zero values. I am not sure whether this has anything to do with the code below on some platforms (e.g. Windows):

bbolt/tx.go

Lines 489 to 491 in b31e3ec

for i := range buf {
	buf[i] = 0
}

Anyway, added case TestResetAfterWrite to double check.
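For readers following along, here is a minimal sketch of the kind of check described above. It is not the test added in this PR: `resetAfterWrite` and the temp-file name pattern are hypothetical, and it only uses `os.File.WriteAt` and `os.File.ReadAt` from the standard library. It writes one page, zeroes the source buffer right after the write returns, and reads the page back to see whether the data survived.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// resetAfterWrite writes one page to a temp file, zeroes the source buffer,
// and reads the page back. It reports whether the data read back still
// matches what was originally written.
func resetAfterWrite(pageSize int) (bool, error) {
	f, err := os.CreateTemp("", "reset-after-write-*")
	if err != nil {
		return false, err
	}
	defer os.Remove(f.Name()) // runs after Close below
	defer f.Close()

	buf := bytes.Repeat([]byte{0xAB}, pageSize)
	want := append([]byte(nil), buf...) // independent copy for comparison

	if _, err := f.WriteAt(buf, 0); err != nil {
		return false, err
	}

	// Reset the buffer immediately after the write returns,
	// mirroring the zeroing loop quoted from tx.go.
	for i := range buf {
		buf[i] = 0
	}

	got := make([]byte, pageSize)
	if _, err := f.ReadAt(got, 0); err != nil {
		return false, err
	}
	return bytes.Equal(got, want), nil
}

func main() {
	ok, err := resetAfterWrite(4096)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("data survived buffer reset:", ok)
}
```

On a local filesystem this is expected to report that the data survived; a different result would point at the platform or filesystem rather than at the zeroing loop itself.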

cc @ptabor @cenkalti @tjungblu

}

tmpBuf := make([]byte, 4096)
if n, err = f.ReadAt(tmpBuf, 0); err != nil {
Contributor

Do you think that f.writeAt doesn't actually persist the data, so that zeroing the page afterwards causes an empty page to be written?

Sounds like a specific network file system again ;-)

Member Author

Yes, that's my guess. In theory it's possible, but in practice it's unlikely because, by default, there is a datasync before the buf is reset.

If it happens, then it means that:

  • the datasync is disabled by users or it did NOT work;
  • and the buf was reset before the writeAt really copied the data from the buf
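The ordering discussed here (write, then datasync unless disabled, then reset the buf) can be sketched roughly as follows. This is a simplified illustration, not bbolt's code: `flushPages`, `flushDemo`, `pagePool`, and the `noSync` parameter are hypothetical names, and `f.Sync()` (an fsync) stands in for the datasync step.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

const pageSize = 4096

// pagePool is a hypothetical stand-in for a pool of reusable page buffers.
var pagePool = sync.Pool{New: func() interface{} { return make([]byte, pageSize) }}

// flushPages writes each page at its offset, syncs unless noSync is set,
// and only then zeroes the buffers and returns them to the pool.
func flushPages(f *os.File, pages [][]byte, noSync bool) error {
	for i, p := range pages {
		if _, err := f.WriteAt(p, int64(i)*pageSize); err != nil {
			return err
		}
	}
	if !noSync {
		// Stand-in for the datasync step; Sync issues an fsync.
		if err := f.Sync(); err != nil {
			return err
		}
	}
	for _, p := range pages {
		for i := range p {
			p[i] = 0
		}
		pagePool.Put(p)
	}
	return nil
}

// flushDemo fills two pages from the pool, flushes them to a temp file,
// and returns the resulting file size.
func flushDemo(noSync bool) (int64, error) {
	f, err := os.CreateTemp("", "flush-pages-*")
	if err != nil {
		return 0, err
	}
	defer os.Remove(f.Name()) // runs after Close below
	defer f.Close()

	pages := make([][]byte, 2)
	for i := range pages {
		p := pagePool.Get().([]byte)
		for j := range p {
			p[j] = byte(i + 1)
		}
		pages[i] = p
	}
	if err := flushPages(f, pages, noSync); err != nil {
		return 0, err
	}
	st, err := f.Stat()
	if err != nil {
		return 0, err
	}
	return st.Size(), nil
}

func main() {
	size, err := flushDemo(false)
	fmt.Println("file size:", size, "err:", err)
}
```

In this sketch the reset can only race with the write if the write call returns before it has consumed the buffer, which is the second condition in the list above.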

Member

I think it's not possible, because WriteAt is a synchronous call that uses the pwrite syscall on Linux. Before the function returns, it copies the user-provided buffer into the OS buffer, so resetting the buffer after WriteAt returns has no effect.

Member

Plus, it doesn't matter whether fsync is called or not: the OS always looks at the page cache first when reading.
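A small sketch of that point, assuming a local filesystem with ordinary buffered I/O (it says nothing about NFS or other network filesystems): data written through one handle, with no Sync call at all, is read back through a second, independently opened handle. `readWithoutSync` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"os"
)

// readWithoutSync writes data through one file handle, never calls Sync,
// and reads it back through a second, independently opened handle.
func readWithoutSync(data []byte) ([]byte, error) {
	w, err := os.CreateTemp("", "no-sync-*")
	if err != nil {
		return nil, err
	}
	defer os.Remove(w.Name()) // runs after both handles are closed
	defer w.Close()

	if _, err := w.WriteAt(data, 0); err != nil {
		return nil, err
	}

	// Deliberately no w.Sync() here.
	r, err := os.Open(w.Name())
	if err != nil {
		return nil, err
	}
	defer r.Close()

	got := make([]byte, len(data))
	if _, err := r.ReadAt(got, 0); err != nil {
		return nil, err
	}
	return got, nil
}

func main() {
	got, err := readWithoutSync([]byte("page contents"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("read back without sync: %q\n", got)
}
```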

Contributor

On "normal" filesystems you have the read-after-pwrite guarantee, yes. Hence my stab at NFS earlier. I think it's difficult to reproduce this without a clearer picture of the hardware/OS setup for this test.

@ahrtr ahrtr marked this pull request as draft June 6, 2023 08:12
@ahrtr ahrtr deleted the branch etcd-io:master December 12, 2023 10:26
@ahrtr ahrtr closed this Dec 12, 2023