feat(dropPrefix): add DropPrefixNonBlocking API #1698

Merged
NamanJain8 merged 7 commits into master from naman/drop-prefix-nb on May 4, 2021


@NamanJain8 NamanJain8 commented Apr 29, 2021

Related to DGRAPH-3319

This PR adds a DropPrefixNonBlocking API that logically deletes the data for the specified prefixes at a given timestamp. It also adds DropPrefixBlocking, which is equivalent to the older DropPrefix.
DropPrefix now chooses between the two based on the Badger option AllowStopTheWorld, whose default selects DropPrefixBlocking.
With DropPrefixNonBlocking, the data is not cleared from the LSM tree immediately; it is deleted eventually through compactions.
This is useful when we don't want to block writes while the prefixes are being deleted (the blocking variant blocks writes).
It works as follows:

  • Stream the keys under the given prefixes.
  • Write delete markers for them to a skiplist and hand that skiplist over to the DB.
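
The two steps above can be pictured with a toy, in-memory model. This is only a sketch under stated assumptions: `toyDB`, `entry`, and `dropPrefixNonBlocking` are hypothetical names, a sorted slice stands in for the skiplist, and none of this is Badger's actual implementation.

```go
package main

import (
	"sort"
	"strings"
)

// entry is a hypothetical stand-in for one versioned record.
type entry struct {
	key     string
	version uint64
	deleted bool // true for a delete marker
}

// toyDB is a hypothetical store. live holds existing data; handedOver holds
// batches that were given to the store without blocking writers.
type toyDB struct {
	live       []entry
	handedOver [][]entry
}

// dropPrefixNonBlocking mimics the two steps in the description:
//  1. stream the keys under each prefix,
//  2. write one delete marker per key at timestamp ts into a sorted batch
//     (standing in for a skiplist) and hand that batch over to the store.
//
// Existing entries are not touched; in a real LSM tree they would be
// discarded later by compaction.
func (db *toyDB) dropPrefixNonBlocking(ts uint64, prefixes ...string) int {
	seen := make(map[string]bool)
	var batch []entry
	for _, e := range db.live {
		for _, p := range prefixes {
			if strings.HasPrefix(e.key, p) && !seen[e.key] {
				seen[e.key] = true
				batch = append(batch, entry{key: e.key, version: ts, deleted: true})
			}
		}
	}
	sort.Slice(batch, func(i, j int) bool { return batch[i].key < batch[j].key })
	if len(batch) > 0 {
		db.handedOver = append(db.handedOver, batch)
	}
	return len(batch)
}
```

Calling `db.dropPrefixNonBlocking(5, "a/")` on a store that holds keys under `a/` leaves `db.live` unchanged and appends one sorted batch of markers to `db.handedOver`, which is the property that lets writers keep going.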


@NamanJain8 NamanJain8 marked this pull request as ready for review April 30, 2021 12:46

@manishrjain manishrjain left a comment


:lgtm: Approved, but address the comments carefully.

Reviewed 1 of 1 files at r2, 1 of 1 files at r3, 1 of 1 files at r4.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @ahsanbarkati, @jarifibrahim, and @NamanJain8)


db.go, line 1879 at r4 (raw file):

				list.Kv = append(list.Kv, kv)

				if db.opt.NumVersionsToKeep == 1 {

If you just have a delete marker at higher timestamp, everything below it would be considered deleted anyway.

For NumVersionsToKeep > 1 (in Dgraph), you'd continue to iterate upon it and generate more delete markers. You just need to generate one per key.
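
To illustrate why one marker per key is enough, here is a toy model of versioned reads. It assumes the usual multi-version rule that a read at a given timestamp sees the newest version at or below that timestamp; `version`, `visibleAt`, and `withDeleteMarker` are hypothetical names, not Badger APIs.

```go
package main

// version is a hypothetical record: one write of a key at a timestamp.
type version struct {
	ts      uint64
	deleted bool // true for a delete marker
}

// visibleAt reports whether a key with the given versions is readable at
// readTs. The newest version with ts <= readTs decides the outcome, so a
// single delete marker hides every older version beneath it.
func visibleAt(versions []version, readTs uint64) bool {
	var newest *version
	for i := range versions {
		v := &versions[i]
		if v.ts > readTs {
			continue
		}
		if newest == nil || v.ts > newest.ts {
			newest = v
		}
	}
	return newest != nil && !newest.deleted
}

// withDeleteMarker returns a copy of versions plus exactly one marker at ts,
// regardless of how many older versions the key has.
func withDeleteMarker(versions []version, ts uint64) []version {
	out := append([]version(nil), versions...)
	return append(out, version{ts: ts, deleted: true})
}
```

With three live versions at timestamps 1, 2, and 3 and one marker at 10, a read at 10 or later sees nothing, while a read at 9 still sees the version written at 3.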


db.go, line 1883 at r4 (raw file):

				}

				if item.DiscardEarlierVersions() {

Don't care about this. Both of these if conds are not required.


db.go, line 1890 at r4 (raw file):

		}

		var wg sync.WaitGroup

move this outside.


db.go, line 1894 at r4 (raw file):

		initSize := int64(float64(db.opt.MemTableSize) * 1.1)

		handover := func(force bool) error {

if cbuf.Len > whatever, then sort, and create a skiplist, handover.


db.go, line 1897 at r4 (raw file):

			for id, b := range builderMap {
				sl := b.Skiplist()
				if force || sl.MemSize() > db.opt.MemTableSize {

don't need to check this. Just use cbuf size.


db.go, line 1908 at r4 (raw file):

			return nil
		}

var cbuf *z.Buffer


db.go, line 1910 at r4 (raw file):

		stream.Send = func(buf *z.Buffer) error {
			err := buf.SliceIterate(func(s []byte) error {

You have a cbuf outside, and you append buf to cbuf. And then sort cbuf by keys. Create a skiplist out of it using builder, and then handover.
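
A minimal sketch of the buffer-then-sort flow suggested here, using only the standard library. The `collector` type, its `threshold` field, and the `handover` callback are hypothetical stand-ins for cbuf, the size check, and the skiplist handover; the sketch assumes the builder wants its input in sorted key order.

```go
package main

import (
	"bytes"
	"sort"
)

// collector is a hypothetical stand-in for the shared buffer (cbuf).
// It accumulates keys across many Send calls.
type collector struct {
	keys      [][]byte
	size      int                   // total bytes buffered so far
	threshold int                   // flush once size exceeds this
	handover  func(sorted [][]byte) // receives each sorted batch
}

// send appends one streamed batch and flushes once enough bytes accumulated.
func (c *collector) send(batch [][]byte) {
	for _, k := range batch {
		c.keys = append(c.keys, k)
		c.size += len(k)
	}
	if c.size > c.threshold {
		c.flush()
	}
}

// flush sorts the buffered keys, hands them over, and resets the buffer.
// It is a no-op when nothing is buffered.
func (c *collector) flush() {
	if len(c.keys) == 0 {
		return
	}
	sort.Slice(c.keys, func(i, j int) bool {
		return bytes.Compare(c.keys[i], c.keys[j]) < 0
	})
	c.handover(c.keys)
	c.keys, c.size = nil, 0
}
```

The caller invokes `flush` once more after the stream finishes so that a final, partially filled buffer is also handed over.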


db.go, line 1918 at r4 (raw file):

					builderMap[kv.StreamId] = skl.NewBuilder(initSize)
				}
				builderMap[kv.StreamId].Add(y.KeyWithTs(kv.Key, ts), y.ValueStruct{Meta: bitDelete})

this would be expensive. You'd rather use skiplist builder. And for that, use the idea above.


db.go, line 1942 at r4 (raw file):

			return errors.Wrapf(err, "While dropping prefix: %#x", prefix)
		}
	}

wg.Wait() here.


@manishrjain manishrjain left a comment


:lgtm_cancel: let's review this after changes.

Reviewable status: all files reviewed, 13 unresolved discussions (waiting on @ahsanbarkati, @jarifibrahim, and @NamanJain8)


db.go, line 1843 at r1 (raw file):

Previously, NamanJain8 (Naman Jain) wrote…

Done.

This should be done via an option. DropPrefix should work in either blocking or non-blocking mode.

You could still have these public funcs, but DropPrefix should choose one based on Badger.Options.

DropPrefixNonBlocking and DropPrefixBlocking.
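
A sketch of the dispatch being asked for, on toy types rather than Badger's real `DB` and `Options`. The option name `AllowStopTheWorld` is the one the PR description mentions; the `path` field and the stub method bodies are hypothetical and exist only to show which branch runs.

```go
package main

// Options is a hypothetical, trimmed-down options struct.
type Options struct {
	AllowStopTheWorld bool
}

// DefaultOptions mirrors the default described in the PR: blocking drops.
func DefaultOptions() Options { return Options{AllowStopTheWorld: true} }

// DB is a toy database that only records which code path ran.
type DB struct {
	opt  Options
	path string
}

// DropPrefixBlocking stands in for the older, write-blocking behavior.
func (db *DB) DropPrefixBlocking(prefixes ...[]byte) error {
	db.path = "blocking"
	return nil
}

// DropPrefixNonBlocking stands in for the new, logical delete.
func (db *DB) DropPrefixNonBlocking(prefixes ...[]byte) error {
	db.path = "non-blocking"
	return nil
}

// DropPrefix picks one of the two public functions based on the option,
// so callers get either mode without changing their call site.
func (db *DB) DropPrefix(prefixes ...[]byte) error {
	if db.opt.AllowStopTheWorld {
		return db.DropPrefixBlocking(prefixes...)
	}
	return db.DropPrefixNonBlocking(prefixes...)
}
```

Keeping both variants public means a caller that needs a specific mode can still call it directly, while `DropPrefix` follows the configured default.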


db.go, line 1877 at r4 (raw file):

				kv := y.NewKV(a)
				kv.Key = ka
				list.Kv = append(list.Kv, kv)

You know the version to use here.


@NamanJain8 NamanJain8 left a comment


Dismissed @jarifibrahim and @manishrjain from 9 discussions.
Reviewable status: 1 of 4 files reviewed, 4 unresolved discussions (waiting on @ahsanbarkati, @jarifibrahim, and @manishrjain)


db.go, line 1843 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

This should be done via an option. DropPrefix should work in either blocking or non-blocking mode.

You could still have these public funcs, but DropPrefix should choose one based on Badger.Options.

DropPrefixNonBlocking and DropPrefixBlocking.

Done. We have BlockWritesOnDrop DB option.


db.go, line 1845 at r1 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

return error

Done.


db.go, line 1877 at r4 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

You know the version to use here.

Done.


db.go, line 1879 at r4 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

If you just have a delete marker at higher timestamp, everything below it would be considered deleted anyway.

For NumVersionsToKeep > 1 (in Dgraph), you'd continue to iterate upon it and generate more delete markers. You just need to generate one per key.

Done.


db.go, line 1883 at r4 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Don't care about this. Both of these if conds are not required.

Done.


db.go, line 1890 at r4 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

move this outside.

Done.


db.go, line 1894 at r4 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

if cbuf.Len > whatever, then sort, and create a skiplist, handover.

Done.


db.go, line 1897 at r4 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

don't need to check this. Just use cbuf size.

Done.


db.go, line 1908 at r4 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

var cbuf *z.Buffer

Done.


db.go, line 1910 at r4 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

You have a cbuf outside, and you append buf to cbuf. And then sort cbuf by keys. Create a skiplist out of it using builder, and then handover.

Done.


db.go, line 1918 at r4 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

this would be expensive. You'd rather use skiplist builder. And for that, use the idea above.

Done


db.go, line 1942 at r4 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

wg.Wait() here.

Done.


@manishrjain manishrjain left a comment


Reviewed 3 of 3 files at r5.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @ahsanbarkati, @jarifibrahim, and @NamanJain8)


db.go, line 1864 at r5 (raw file):

		}

		var kvs []*pb.KV

Anytime Go heap memory gets involved, it kills performance.


db.go, line 1865 at r5 (raw file):

		var kvs []*pb.KV
		err := cbuf.SliceIterate(func(slice []byte) error {

cbuf.SliceSort.


db.go, line 1884 at r5 (raw file):

			return bytes.Compare(kvs[i].Key, kvs[j].Key) < 0
		})
		for _, kv := range kvs {

iterate over cbuf and do the generation of the skiplist.


db.go, line 1885 at r5 (raw file):

		})
		for _, kv := range kvs {
			b.Add(y.KeyWithTs(kv.Key, kv.Version+1), y.ValueStruct{Meta: bitDelete})

Just use the same Version.


db.go, line 1917 at r5 (raw file):

			kv.Key = ka
			kv.Version = item.Version()
			list.Kv = append(list.Kv, kv)

You could just generate a key with ts.


db.go, line 1924 at r5 (raw file):

		stream.Send = func(buf *z.Buffer) error {
			sz := buf.LenNoPadding()
			dst := cbuf.Allocate(sz)

You could iterate over buf and do slice allocations on cbuf, writing only the key with ts.
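
A sketch of writing only timestamped keys into one flat buffer instead of building a slice of KV structs. Everything here is an assumption for illustration: `flatBuf` is a hypothetical stand-in for cbuf, the 4-byte length header is a guess at a length-prefixed layout, and the inverted 8-byte timestamp suffix is modeled loosely on Badger-style versioned keys rather than copied from its source.

```go
package main

import (
	"encoding/binary"
	"math"
)

// flatBuf is a hypothetical stand-in for cbuf: one contiguous byte slice
// holding length-prefixed records, so no per-record structs are allocated.
type flatBuf struct{ data []byte }

// sliceAllocate reserves n bytes preceded by a 4-byte length header and
// returns the reserved region for the caller to fill in place.
func (b *flatBuf) sliceAllocate(n int) []byte {
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], uint32(n))
	b.data = append(b.data, hdr[:]...)
	start := len(b.data)
	b.data = append(b.data, make([]byte, n)...)
	return b.data[start:]
}

// appendKeyWithTs writes key followed by an 8-byte suffix of MaxUint64-ts
// directly into the buffer (assumed encoding, see the lead-in).
func (b *flatBuf) appendKeyWithTs(key []byte, ts uint64) {
	dst := b.sliceAllocate(len(key) + 8)
	copy(dst, key)
	binary.BigEndian.PutUint64(dst[len(key):], math.MaxUint64-ts)
}

// sliceIterate calls fn for every record in insertion order.
func (b *flatBuf) sliceIterate(fn func(rec []byte)) {
	for off := 0; off < len(b.data); {
		n := int(binary.BigEndian.Uint32(b.data[off:]))
		off += 4
		fn(b.data[off : off+n])
		off += n
	}
}

// parseTs recovers the timestamp from a record written by appendKeyWithTs.
func parseTs(rec []byte) uint64 {
	return math.MaxUint64 - binary.BigEndian.Uint64(rec[len(rec)-8:])
}
```

Because each record carries its own timestamp, a later pass can sort and replay the buffer without keeping any separate version field around.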


db.go, line 1925 at r5 (raw file):

			sz := buf.LenNoPadding()
			dst := cbuf.Allocate(sz)
			y.AssertTrue(sz == copy(dst, buf.Bytes()))

Iterate over buf here, and use the same KV.


options.go, line 690 at r5 (raw file):

//
// The default value of BlockWritesOnDrop is true.
func (opt Options) WithBlockWritesOnDrop(b bool) Options {

WithBlockingDrops(b bool)


@manishrjain manishrjain left a comment


Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @ahsanbarkati, @jarifibrahim, and @NamanJain8)


options.go, line 690 at r5 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

WithBlockingDrops(b bool)

A better name would be AllowStopTheWorld: true by default, and false for Dgraph.


@manishrjain manishrjain left a comment


:lgtm: Approved. Nice work!

Reviewed 5 of 5 files at r6.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @ahsanbarkati, @jarifibrahim, and @NamanJain8)

@NamanJain8 NamanJain8 merged commit da5f789 into master May 4, 2021
@NamanJain8 NamanJain8 deleted the naman/drop-prefix-nb branch May 4, 2021 19:25
joshua-goldstein added a commit that referenced this pull request Jan 5, 2023
mangalaman93 added a commit that referenced this pull request Feb 14, 2023
mangalaman93 added a commit that referenced this pull request Feb 15, 2023
mangalaman93 added a commit that referenced this pull request Feb 18, 2023
mangalaman93 added a commit that referenced this pull request Feb 21, 2023
fredcarle pushed a commit to fredcarle/badger that referenced this pull request Aug 1, 2023
This PR adds the DropPrefixNonBlocking and DropPrefixBlocking APIs, which can be used to logically delete the data for specified prefixes.
DropPrefix now makes its decision based on the Badger option AllowStopTheWorld, whose default is to use DropPrefixBlocking.
With DropPrefixNonBlocking, the data is not cleared from the LSM tree immediately; it is deleted eventually through compactions.

Co-authored-by: Rohan Prasad <[email protected]>

5 participants