[BREAKING]: Change how Badger handles WAL #1555

Merged
merged 90 commits into from
Oct 7, 2020

manishrjain (Contributor) commented Oct 4, 2020

This PR significantly improves Badger's disk usage behavior.

Breaking: This PR increases the magic version from 7 to 8, so directories created by older Badger versions will not work with this change.

With this PR, we no longer use the value log as a write-ahead log. Instead, each MemTable has its own WAL. Value logs now store only values larger than ValueThreshold, while each MemTable WAL stores smaller values and value pointers.

On a crash and restart, the MemTable WALs are replayed to apply updates to the Skiplist. When MemTables are flushed to L0, the corresponding WALs are deleted.
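
The write path and recovery described above can be sketched with a toy in-memory model. This is an illustration under stated assumptions, not Badger's implementation: the `store`, `walRecord`, and `valuePointer` types and all method names are hypothetical, a map stands in for the Skiplist, and byte slices stand in for on-disk files.

```go
package main

import "fmt"

// valuePointer is a hypothetical stand-in for a value's location in the value log.
type valuePointer struct {
	Offset int
	Len    int
}

// walRecord is what the toy MemTable WAL stores: an inline value or a pointer.
type walRecord struct {
	Key     string
	Inline  []byte
	Pointer *valuePointer
}

type store struct {
	threshold int                  // plays the role of ValueThreshold
	vlog      []byte               // toy value log: large values only
	wal       []walRecord          // toy MemTable WAL
	mem       map[string]walRecord // stands in for the Skiplist
}

func newStore(threshold int) *store {
	return &store{threshold: threshold, mem: make(map[string]walRecord)}
}

// put sends values larger than the threshold to the value log and records
// only a pointer in the WAL; smaller values are written to the WAL inline.
func (s *store) put(key string, value []byte) {
	rec := walRecord{Key: key}
	if len(value) > s.threshold {
		rec.Pointer = &valuePointer{Offset: len(s.vlog), Len: len(value)}
		s.vlog = append(s.vlog, value...)
	} else {
		rec.Inline = append([]byte(nil), value...)
	}
	s.wal = append(s.wal, rec)
	s.mem[key] = rec
}

// get resolves a key, following the value pointer when there is one.
func (s *store) get(key string) ([]byte, bool) {
	rec, ok := s.mem[key]
	if !ok {
		return nil, false
	}
	if p := rec.Pointer; p != nil {
		return s.vlog[p.Offset : p.Offset+p.Len], true
	}
	return rec.Inline, true
}

// replay rebuilds the in-memory table from the WAL, as after a crash.
func (s *store) replay() {
	s.mem = make(map[string]walRecord)
	for _, rec := range s.wal {
		s.mem[rec.Key] = rec
	}
}

// flush models a MemTable flush to L0 (not modeled here): the WAL is deleted.
func (s *store) flush() {
	s.wal = nil
	s.mem = make(map[string]walRecord)
}

func main() {
	s := newStore(4)
	s.put("small", []byte("hi"))
	s.put("large", []byte("a large value"))
	s.mem = nil // simulate losing the in-memory table in a crash
	s.replay()
	v, ok := s.get("large")
	fmt.Println(string(v), ok, "vlog bytes:", len(s.vlog), "wal records:", len(s.wal))
}
```

Note that in this model the value log is never replayed to rebuild the table; only the WAL is, which is the point of the split.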

This PR makes big changes to how value log GC works:

  • Discard stats are now stored in a separate file instead of within the LSM tree.
  • GC picks value logs based only on discard stats.
  • GC no longer does sampling; it uses discard stats to decide when a value log needs to be GCed.
  • The value log no longer grows indefinitely, because of the shift to MemTable WALs.
  • The badger gc tool is removed.
  • Value log head pointer tracking is removed.
  • Only the last value log file is replayed on every start, and truncated as necessary.
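
As a rough illustration of discard-stats-driven selection, the sketch below tracks discarded bytes per value log file and picks the file with the most discarded data, provided it crosses a ratio of the file's size. The `discardStats` type, its methods, and the ratio rule are assumptions made for this example; the PR does not spell out the exact selection policy or the on-disk format of the stats file.

```go
package main

import "fmt"

// discardStats is a hypothetical in-memory view of per-file discard counters,
// keyed by value log file ID.
type discardStats struct {
	discarded map[uint32]int64
}

func newDiscardStats() *discardStats {
	return &discardStats{discarded: make(map[uint32]int64)}
}

// update records that n more bytes in file fid are now stale.
func (d *discardStats) update(fid uint32, n int64) {
	d.discarded[fid] += n
}

// pickForGC returns the file with the most discarded bytes, but only if those
// bytes make up at least ratio of the file's size. No sampling is involved.
// Ties are broken by the lower file ID so the result is deterministic.
func (d *discardStats) pickForGC(fileSize map[uint32]int64, ratio float64) (uint32, bool) {
	var best uint32
	var bestBytes int64 = -1
	for fid, n := range d.discarded {
		if n > bestBytes || (n == bestBytes && fid < best) {
			best, bestBytes = fid, n
		}
	}
	if bestBytes < 0 {
		return 0, false // no stats recorded yet
	}
	size := fileSize[best]
	if size == 0 || float64(bestBytes) < ratio*float64(size) {
		return 0, false // not enough stale data to justify a rewrite
	}
	return best, true
}

func main() {
	stats := newDiscardStats()
	stats.update(1, 10)
	stats.update(2, 60)
	sizes := map[uint32]int64{1: 100, 2: 100}

	fid, ok := stats.pickForGC(sizes, 0.5)
	fmt.Println("candidate:", fid, "eligible:", ok)
}
```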

This PR also makes a bunch of other changes:

  • Removes ValueLogLoadingMode (always uses mmap now).
  • Removes TableLoadingMode (always uses mmap now).
  • Removes Truncate option.
  • Removes KeepL0InMemory option.


Ibrahim Jarif added 9 commits October 5, 2020 11:55
…ed (#1549)"

This reverts commit 5d1bab4.
We're reverting this commit because it seems to cause a strange issue while
writing data. Running `go run -tags main.go benchmark write --sorted
--compression=true --block-cache-mb=100` creates a directory which does
not have all the keys. I don't know why this would fix the issue but the
test works fine after reverting this commit.

CLAassistant commented Oct 6, 2020

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
4 out of 5 committers have signed the CLA.

✅ manishrjain
✅ jarifibrahim
✅ NamanJain8
✅ martinmr
❌ asdf


asdf seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.

manishrjain merged commit e3a0d29 into master Oct 7, 2020
manishrjain deleted the mrjn/wal branch October 7, 2020 01:41

kode54 commented Feb 3, 2022

This renders Badger completely unusable on iOS, as it attempts to mmap 2GB at once, and fails.


baryluk commented Jun 18, 2023

FYI: this is not mentioned in the PR description or commit message, but this change also removed the LoadBloomsOnOpen option.

mangalaman93 pushed a commit that referenced this pull request Jul 17, 2024
This is a leftover from when Badger supported different modes
(#1555).