badger

package module

Version: v1.6.1-fix | Published: Aug 25, 2020 | License: Apache-2.0 | Imports: 33 | Imported by: 0

Repository

github.com/meeyio/badger

README ¶

BadgerDB

BadgerDB is an embeddable, persistent and fast key-value (KV) database written in pure Go. It is the underlying database for Dgraph, a fast, distributed graph database. It's meant to be a performant alternative to non-Go-based key-value stores like RocksDB.

Project Status [Jun 26, 2019]

Badger is stable and is being used to serve data sets worth hundreds of terabytes. Badger supports concurrent ACID transactions with serializable snapshot isolation (SSI) guarantees. A Jepsen-style bank test runs nightly for 8h, with --race flag and ensures the maintenance of transactional guarantees. Badger has also been tested to work with filesystem level anomalies, to ensure persistence and consistency.

Badger v1.0 was released in Nov 2017, and the latest version that is data-compatible with v1.0 is v1.6.0.

Badger v2.0, an upcoming release, will use a new storage format that is not compatible with any v1.x release. The Changelog is kept fairly up-to-date.

For more details on our version naming schema please read Choosing a version.

Getting Started

Installing

To start using Badger, install Go 1.11 or above and run go get:

$ go get github.com/meeyio/badger/...

This will retrieve the library and install the badger command line utility into your $GOBIN path.

Choosing a version

BadgerDB is unusual in that the most important change we can make to it is not to its API but to how data is stored on disk.

This is why we follow a version naming schema that differs from Semantic Versioning.

New major versions are released when the data format on disk changes in an incompatible way.
New minor versions are released whenever the API changes but data compatibility is maintained. Note that the changes on the API could be backward-incompatible - unlike Semantic Versioning.
New patch versions are released when there are no changes to the data format or the API.

Following these rules:

v1.5.0 and v1.6.0 can be used on top of the same files without any concerns, as their major version is the same, therefore the data format on disk is compatible.
v1.6.0 and v2.0.0 are data incompatible as their major version implies, so files created with v1.6.0 will need to be converted into the new format before they can be used by v2.0.0.
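
The rule above reduces to comparing major versions. The following is a minimal sketch of that check; the helper names majorOf and dataCompatible are hypothetical and are not part of Badger.

```go
package main

import (
	"fmt"
	"strings"
)

// majorOf returns the major component of a version string such as "v1.6.0".
// Hypothetical helper for illustration only.
func majorOf(version string) string {
	return strings.SplitN(strings.TrimPrefix(version, "v"), ".", 2)[0]
}

// dataCompatible reports whether two versions share an on-disk format
// under the naming schema described above (same major version).
func dataCompatible(a, b string) bool {
	return majorOf(a) == majorOf(b)
}

func main() {
	fmt.Println(dataCompatible("v1.5.0", "v1.6.0")) // true
	fmt.Println(dataCompatible("v1.6.0", "v2.0.0")) // false
}
```

Note that this says nothing about API compatibility, which under this schema may break between minor versions.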

For a longer explanation on the reasons behind using a new versioning naming schema, you can read VERSIONING.md.

Opening a database

The top-level object in Badger is a DB. It represents multiple files on disk in specific directories, which contain the data for a single database.

To open your database, use the badger.Open() function, with the appropriate options. The Dir and ValueDir options are mandatory and must be specified by the client. They can be set to the same value to simplify things.

package main

import (
	"log"

	badger "github.com/meeyio/badger"
)

func main() {
  // Open the Badger database located in the /tmp/badger directory.
  // It will be created if it doesn't exist.
  db, err := badger.Open(badger.DefaultOptions("/tmp/badger"))
  if err != nil {
    log.Fatal(err)
  }
  defer db.Close()
  // Your code here…
}

Please note that Badger obtains a lock on the directories so multiple processes cannot open the same database at the same time.

Transactions

Read-only transactions

To start a read-only transaction, you can use the DB.View() method:

err := db.View(func(txn *badger.Txn) error {
  // Your code here…
  return nil
})

You cannot perform any writes or deletes within this transaction. Badger ensures that you get a consistent view of the database within this closure. Any writes that happen elsewhere after the transaction has started, will not be seen by calls made within the closure.

Read-write transactions

To start a read-write transaction, you can use the DB.Update() method:

err := db.Update(func(txn *badger.Txn) error {
  // Your code here…
  return nil
})

All database operations are allowed inside a read-write transaction.

Always check the returned error value. If you return an error within your closure it will be passed through.

An ErrConflict error will be reported in case of a conflict. Depending on the state of your application, you have the option to retry the operation if you receive this error.
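
One possible shape for such a retry is sketched below. To keep the sketch runnable without Badger, errConflict stands in for badger.ErrConflict and the update argument stands in for a call such as db.Update(...); the name retryOnConflict and the attempt limit are assumptions, not part of the library.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict is a stand-in for badger.ErrConflict in this sketch.
var errConflict = errors.New("transaction conflict")

// retryOnConflict calls update up to maxAttempts times and retries only
// when update reports a conflict. Any other result is returned as-is.
func retryOnConflict(maxAttempts int, update func() error) error {
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		err = update()
		if !errors.Is(err, errConflict) {
			return err // success, or an error that a retry will not fix
		}
	}
	return err // still conflicting after maxAttempts
}

func main() {
	calls := 0
	err := retryOnConflict(5, func() error {
		calls++
		if calls < 3 {
			return errConflict // simulate two conflicting attempts
		}
		return nil
	})
	fmt.Println(calls, err)
}
```

Whether a retry is appropriate depends on the application: the closure is re-run from scratch, so it should re-read any values it depends on.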

An ErrTxnTooBig will be reported in case the number of pending writes/deletes in the transaction exceeds a certain limit. In that case, it is best to commit the transaction and start a new transaction immediately. Here is an example (we are not checking for errors in some places for simplicity):

updates := make(map[string]string)
txn := db.NewTransaction(true)
for k, v := range updates {
  if err := txn.Set([]byte(k), []byte(v)); err == badger.ErrTxnTooBig {
    _ = txn.Commit()
    txn = db.NewTransaction(true)
    _ = txn.Set([]byte(k), []byte(v))
  }
}
_ = txn.Commit()

Managing transactions manually

The DB.View() and DB.Update() methods are wrappers around the DB.NewTransaction() and Txn.Commit() methods (or Txn.Discard() in case of read-only transactions). These helper methods will start the transaction, execute a function, and then safely discard your transaction if an error is returned. This is the recommended way to use Badger transactions.

However, sometimes you may want to manually create and commit your transactions. You can use the DB.NewTransaction() function directly, which takes in a boolean argument to specify whether a read-write transaction is required. For read-write transactions, it is necessary to call Txn.Commit() to ensure the transaction is committed. For read-only transactions, calling Txn.Discard() is sufficient. Txn.Commit() also calls Txn.Discard() internally to clean up the transaction, so just calling Txn.Commit() is sufficient for a read-write transaction. However, if your code doesn’t call Txn.Commit() for some reason (e.g., it returns prematurely with an error), then please make sure you call Txn.Discard() in a defer block. Refer to the code below.

// Start a writable transaction.
txn := db.NewTransaction(true)
defer txn.Discard()

// Use the transaction...
err := txn.Set([]byte("answer"), []byte("42"))
if err != nil {
    return err
}

// Commit the transaction and check for error.
if err := txn.Commit(); err != nil {
    return err
}

The first argument to DB.NewTransaction() is a boolean stating if the transaction should be writable.

Badger allows an optional callback to the Txn.Commit() method. Normally, the callback can be set to nil, and the method will return after all the writes have succeeded. However, if this callback is provided, the Txn.Commit() method returns as soon as it has checked for any conflicts. The actual writing to the disk happens asynchronously, and the callback is invoked once the writing has finished, or an error has occurred. This can improve the throughput of the application in some cases. But it also means that a transaction is not durable until the callback has been invoked with a nil error value.

Using key/value pairs

To save a key/value pair, use the Txn.Set() method:

err := db.Update(func(txn *badger.Txn) error {
  err := txn.Set([]byte("answer"), []byte("42"))
  return err
})

A key/value pair can also be saved by first creating an Entry and then setting it with Txn.SetEntry(). Entry also exposes methods to set properties on it.

err := db.Update(func(txn *badger.Txn) error {
  e := badger.NewEntry([]byte("answer"), []byte("42"))
  err := txn.SetEntry(e)
  return err
})

This will set the value of the "answer" key to "42". To retrieve this value, we can use the Txn.Get() method:

err := db.View(func(txn *badger.Txn) error {
  item, err := txn.Get([]byte("answer"))
  handle(err)

  var valNot, valCopy []byte
  err = item.Value(func(val []byte) error {
    // This func with val would only be called if item.Value encounters no error.

    // Accessing val here is valid.
    fmt.Printf("The answer is: %s\n", val)

    // Copying or parsing val is valid.
    valCopy = append([]byte{}, val...)

    // Assigning val slice to another variable is NOT OK.
    valNot = val // Do not do this.
    return nil
  })
  handle(err)

  // DO NOT access val here. It is the most common cause of bugs.
  fmt.Printf("NEVER do this. %s\n", valNot)

  // You must copy it to use it outside item.Value(...).
  fmt.Printf("The answer is: %s\n", valCopy)

  // Alternatively, you could also use item.ValueCopy().
  valCopy, err = item.ValueCopy(nil)
  handle(err)
  fmt.Printf("The answer is: %s\n", valCopy)

  return nil
})

Txn.Get() returns ErrKeyNotFound if the value is not found.

Please note that values returned from Get() are only valid while the transaction is open. If you need to use a value outside of the transaction then you must use copy() to copy it to another byte slice.
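
The difference between keeping a slice and keeping a copy can be shown without Badger. The sketch below is a toy model that assumes the storage layer later reuses the buffer behind a returned value; it does not reproduce Badger's internals, and aliasVersusCopy is a hypothetical name.

```go
package main

import "fmt"

// aliasVersusCopy models a storage layer that reuses its internal buffer
// after the caller is done with it. Toy illustration only.
func aliasVersusCopy() (alias, copied string) {
	buf := []byte("42") // the value as seen while it is still valid

	kept := buf                      // shares memory with buf: unsafe to keep
	safe := append([]byte{}, buf...) // independent copy: safe to keep

	copy(buf, "99") // the buffer is reused for something else

	return string(kept), string(safe)
}

func main() {
	alias, copied := aliasVersusCopy()
	fmt.Println(alias, copied) // 99 42
}
```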

Use the Txn.Delete() method to delete a key.

Monotonically increasing integers

To get unique monotonically increasing integers with strong durability, you can use the DB.GetSequence method. This method returns a Sequence object, which is thread-safe and can be used concurrently via various goroutines.

Badger would lease a range of integers to hand out from memory, with the bandwidth provided to DB.GetSequence. The frequency at which disk writes are done is determined by this lease bandwidth and the frequency of Next invocations. Setting a bandwidth too low would do more disk writes, setting it too high would result in wasted integers if Badger is closed or crashes. To avoid wasted integers, call Release before closing Badger.

seq, err := db.GetSequence(key, 1000)
defer seq.Release()
for {
  num, err := seq.Next()
}
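
The bandwidth trade-off can be illustrated with a toy model. The leasedCounter type below is hypothetical and only simulates the idea of persisting a lease every bandwidth numbers; it is not Badger's Sequence implementation.

```go
package main

import "fmt"

// leasedCounter is a toy model of a leased sequence. It counts how often a
// new lease would have to be persisted and how many numbers remain unused.
type leasedCounter struct {
	bandwidth uint64
	next      uint64 // next number to hand out
	leased    uint64 // exclusive upper bound of the current lease
	writes    int    // simulated disk writes
}

// Next hands out the next integer, extending the lease when it runs out.
func (c *leasedCounter) Next() uint64 {
	if c.next >= c.leased {
		c.leased += c.bandwidth // simulated: persist the new upper bound
		c.writes++
	}
	n := c.next
	c.next++
	return n
}

// wasted reports how many leased integers would be lost if the process
// stopped now without releasing the lease.
func (c *leasedCounter) wasted() uint64 {
	return c.leased - c.next
}

func main() {
	for _, bw := range []uint64{10, 1000} {
		c := &leasedCounter{bandwidth: bw}
		for i := 0; i < 2500; i++ {
			c.Next()
		}
		fmt.Printf("bandwidth=%d writes=%d wasted=%d\n", bw, c.writes, c.wasted())
	}
}
```

In this model, 2500 calls with a bandwidth of 10 cause 250 simulated writes and waste nothing, while a bandwidth of 1000 causes 3 writes and leaves 500 integers unused.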

Merge Operations

Badger provides support for ordered merge operations. You can define a func of type MergeFunc which takes in an existing value, and a value to be merged with it. It returns a new value which is the result of the merge operation. All values are specified as byte slices. For example, here is a merge function (add) which appends a []byte value to an existing []byte value.

// Merge function to append one byte slice to another
func add(originalValue, newValue []byte) []byte {
  return append(originalValue, newValue...)
}

This function can then be passed to the DB.GetMergeOperator() method, along with a key, and a duration value. The duration specifies how often the merge function is run on values that have been added using the MergeOperator.Add() method.

MergeOperator.Get() method can be used to retrieve the cumulative value of the key associated with the merge operation.

key := []byte("merge")

m := db.GetMergeOperator(key, add, 200*time.Millisecond)
defer m.Stop()

m.Add([]byte("A"))
m.Add([]byte("B"))
m.Add([]byte("C"))

res, _ := m.Get() // res should have value ABC encoded

Example: Merge operator which increments a counter

func uint64ToBytes(i uint64) []byte {
  var buf [8]byte
  binary.BigEndian.PutUint64(buf[:], i)
  return buf[:]
}

func bytesToUint64(b []byte) uint64 {
  return binary.BigEndian.Uint64(b)
}

// Merge function to add two uint64 numbers
func add(existing, new []byte) []byte {
  return uint64ToBytes(bytesToUint64(existing) + bytesToUint64(new))
}

It can be used as

key := []byte("merge")

m := db.GetMergeOperator(key, add, 200*time.Millisecond)
defer m.Stop()

m.Add(uint64ToBytes(1))
m.Add(uint64ToBytes(2))
m.Add(uint64ToBytes(3))

res, _ := m.Get() // res should have value 6 encoded
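
A merge function can be checked in isolation before handing it to DB.GetMergeOperator(). The sketch below repeats the counter helpers with their imports and folds the function over a few values; the fold helper is hypothetical and only approximates what a merge operator does, without any database or timing involved.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

func uint64ToBytes(i uint64) []byte {
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], i)
	return buf[:]
}

func bytesToUint64(b []byte) uint64 {
	return binary.BigEndian.Uint64(b)
}

// add merges two big-endian uint64 values by summing them.
func add(existing, incoming []byte) []byte {
	return uint64ToBytes(bytesToUint64(existing) + bytesToUint64(incoming))
}

// fold applies merge to the values in order. Hypothetical test helper.
func fold(merge func(existing, incoming []byte) []byte, values ...[]byte) []byte {
	if len(values) == 0 {
		return nil
	}
	acc := values[0]
	for _, v := range values[1:] {
		acc = merge(acc, v)
	}
	return acc
}

func main() {
	res := fold(add, uint64ToBytes(1), uint64ToBytes(2), uint64ToBytes(3))
	fmt.Println(bytesToUint64(res)) // 6
}
```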

Setting Time To Live (TTL) and User Metadata on Keys

Badger allows setting an optional Time to Live (TTL) value on keys. Once the TTL has elapsed, the key will no longer be retrievable and will be eligible for garbage collection. A TTL can be set as a time.Duration value using the Entry.WithTTL() and Txn.SetEntry() API methods.

err := db.Update(func(txn *badger.Txn) error {
  e := badger.NewEntry([]byte("answer"), []byte("42")).WithTTL(time.Hour)
  err := txn.SetEntry(e)
  return err
})

An optional user metadata value can be set on each key. A user metadata value is represented by a single byte. It can be used to set certain bits along with the key to aid in interpreting or decoding the key-value pair. User metadata can be set using Entry.WithMeta() and Txn.SetEntry() API methods.

err := db.Update(func(txn *badger.Txn) error {
  e := badger.NewEntry([]byte("answer"), []byte("42")).WithMeta(byte(1))
  err := txn.SetEntry(e)
  return err
})
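
Because the user metadata is a single byte, it can carry up to eight independent flag bits. The sketch below shows one way to pack and test such bits before passing the byte to Entry.WithMeta(); the flag names are invented for illustration and carry no meaning inside Badger.

```go
package main

import "fmt"

// Hypothetical flag bits for the single user-metadata byte.
const (
	metaCompressed byte = 1 << iota // 0x01
	metaEncrypted                   // 0x02
	metaJSON                        // 0x04
)

// hasFlag reports whether flag is set in meta.
func hasFlag(meta, flag byte) bool {
	return meta&flag != 0
}

func main() {
	meta := metaCompressed | metaJSON // value to pass as the metadata byte
	fmt.Println(hasFlag(meta, metaCompressed)) // true
	fmt.Println(hasFlag(meta, metaEncrypted))  // false
	fmt.Println(hasFlag(meta, metaJSON))       // true
}
```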

The Entry APIs can be combined to set both user metadata and a TTL on the same key. The resulting Entry can then be set using Txn.SetEntry().

err := db.Update(func(txn *badger.Txn) error {
  e := badger.NewEntry([]byte("answer"), []byte("42")).WithMeta(byte(1)).WithTTL(time.Hour)
  err := txn.SetEntry(e)
  return err
})

Iterating over keys

To iterate over keys, we can use an Iterator, which can be obtained using the Txn.NewIterator() method. Iteration happens in byte-wise lexicographical sorting order.

err := db.View(func(txn *badger.Txn) error {
  opts := badger.DefaultIteratorOptions
  opts.PrefetchSize = 10
  it := txn.NewIterator(opts)
  defer it.Close()
  for it.Rewind(); it.Valid(); it.Next() {
    item := it.Item()
    k := item.Key()
    err := item.Value(func(v []byte) error {
      fmt.Printf("key=%s, value=%s\n", k, v)
      return nil
    })
    if err != nil {
      return err
    }
  }
  return nil
})

The iterator allows you to move to a specific point in the list of keys and move forward or backward through the keys one at a time.

By default, Badger prefetches the values of the next 100 items. You can adjust that with the IteratorOptions.PrefetchSize field. However, setting it to a value higher than GOMAXPROCS (which we recommend to be 128 or higher) shouldn’t give any additional benefits. You can also turn off the fetching of values altogether. See section below on key-only iteration.
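
Byte-wise lexicographic order has a practical consequence for key design: decimal strings do not sort numerically, while fixed-width big-endian integers do. The sketch below demonstrates this with the standard library only; the helper names are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"sort"
)

// less compares keys byte by byte, the same ordering an iterator follows.
func less(a, b []byte) bool {
	return bytes.Compare(a, b) < 0
}

// sortKeys sorts keys in byte-wise lexicographic order.
func sortKeys(keys [][]byte) {
	sort.Slice(keys, func(i, j int) bool { return less(keys[i], keys[j]) })
}

// bigEndianKey encodes n so that byte order matches numeric order.
func bigEndianKey(n uint64) []byte {
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, n)
	return b
}

func main() {
	keys := [][]byte{[]byte("2"), []byte("10"), []byte("1")}
	sortKeys(keys)
	for _, k := range keys {
		fmt.Printf("%s\n", k) // "1", then "10", then "2"
	}
	fmt.Println(less(bigEndianKey(2), bigEndianKey(10))) // true
}
```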

Prefix scans

To iterate over a key prefix, you can combine Seek() and ValidForPrefix():

db.View(func(txn *badger.Txn) error {
  it := txn.NewIterator(badger.DefaultIteratorOptions)
  defer it.Close()
  prefix := []byte("1234")
  for it.Seek(prefix); it.ValidForPrefix(prefix); it.Next() {
    item := it.Item()
    k := item.Key()
    err := item.Value(func(v []byte) error {
      fmt.Printf("key=%s, value=%s\n", k, v)
      return nil
    })
    if err != nil {
      return err
    }
  }
  return nil
})
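
The seek-then-check pattern can be modelled on a plain sorted slice, which may help when reasoning about which keys a prefix scan visits. This is a toy model using the standard library, not Badger's iterator; prefixScan is a hypothetical name.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// prefixScan seeks to the first key >= prefix in a sorted list and collects
// keys for as long as they still start with prefix.
func prefixScan(sorted []string, prefix string) []string {
	var out []string
	start := sort.SearchStrings(sorted, prefix) // models Seek(prefix)
	for i := start; i < len(sorted) && strings.HasPrefix(sorted[i], prefix); i++ {
		out = append(out, sorted[i]) // loop condition models ValidForPrefix
	}
	return out
}

func main() {
	keys := []string{"1233", "1234", "12345", "1235"}
	fmt.Println(prefixScan(keys, "1234")) // [1234 12345]
}
```

Because matching keys are contiguous in sorted order, the scan can stop at the first key that no longer has the prefix.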

Key-only iteration

Badger supports a unique mode of iteration called key-only iteration. It is several orders of magnitude faster than regular iteration, because it involves access to the LSM-tree only, which is usually resident entirely in RAM. To enable key-only iteration, you need to set the IteratorOptions.PrefetchValues field to false. This can also be used to do sparse reads for selected keys during an iteration, by calling item.Value() only when required.

err := db.View(func(txn *badger.Txn) error {
  opts := badger.DefaultIteratorOptions
  opts.PrefetchValues = false
  it := txn.NewIterator(opts)
  defer it.Close()
  for it.Rewind(); it.Valid(); it.Next() {
    item := it.Item()
    k := item.Key()
    fmt.Printf("key=%s\n", k)
  }
  return nil
})

Stream

Badger provides a Stream framework, which concurrently iterates over all or a portion of the DB, converting data into custom key-values, and streams it out serially to be sent over network, written to disk, or even written back to Badger. This is a much faster way to iterate over Badger than using a single Iterator. Stream supports Badger in both managed and normal mode.

Stream uses the natural boundaries created by SSTables within the LSM tree, to quickly generate key ranges. Each goroutine then picks a range and runs an iterator to iterate over it. Each iterator iterates over all versions of values and is created from the same transaction, thus working over a snapshot of the DB. Every time a new key is encountered, it calls ChooseKey(item), followed by KeyToList(key, itr). This allows a user to select or reject that key, and if selected, convert the value versions into custom key-values. The goroutine batches up 4MB worth of key-values, before sending it over to a channel. Another goroutine further batches up data from this channel using a smart batching algorithm and calls Send serially.

This framework is designed for high throughput key-value iteration, spreading the work of iteration across many goroutines. DB.Backup uses this framework to provide full and incremental backups quickly. Dgraph is a heavy user of this framework. In fact, this framework was developed and used within Dgraph, before getting ported over to Badger.

stream := db.NewStream()
// db.NewStreamAt(readTs) for managed mode.

// -- Optional settings
stream.NumGo = 16                     // Set number of goroutines to use for iteration.
stream.Prefix = []byte("some-prefix") // Leave nil for iteration over the whole DB.
stream.LogPrefix = "Badger.Streaming" // For identifying stream logs. Outputs to Logger.

// ChooseKey is called concurrently for every key. If left nil, assumes true by default.
stream.ChooseKey = func(item *badger.Item) bool {
  return bytes.HasSuffix(item.Key(), []byte("er"))
}

// KeyToList is called concurrently for chosen keys. This can be used to convert
// Badger data into custom key-values. If nil, uses stream.ToList, a default
// implementation, which picks all valid key-values.
stream.KeyToList = nil

// -- End of optional settings.

// Send is called serially, while Stream.Orchestrate is running.
stream.Send = func(list *pb.KVList) error {
  return proto.MarshalText(w, list) // Write to w.
}

// Run the stream
if err := stream.Orchestrate(context.Background()); err != nil {
  return err
}
// Done.

Garbage Collection

Badger values need to be garbage collected, for two reasons:

Badger keeps values separately from the LSM tree. This means that the compaction operations that clean up the LSM tree do not touch the values at all. Values need to be cleaned up separately.
Concurrent read/write transactions could leave behind multiple values for a single key, because they are stored with different versions. These could accumulate, and take up unneeded space beyond the time these older versions are needed.

Badger relies on the client to perform garbage collection at a time of their choosing. It provides the following method, which can be invoked at an appropriate time:

DB.RunValueLogGC(): This method is designed to do garbage collection while Badger is online. Along with randomly picking a file, it uses statistics generated by the LSM-tree compactions to pick files that are likely to lead to maximum space reclamation. It is recommended to be called during periods of low activity in your system, or periodically. One call results in the removal of at most one log file. As an optimization, you could also immediately re-run it whenever it returns a nil error (indicating a successful value log GC), as shown below.
```
ticker := time.NewTicker(5 * time.Minute)
defer ticker.Stop()
for range ticker.C {
again:
	err := db.RunValueLogGC(0.7)
	if err == nil {
		goto again
	}
}
```
DB.PurgeOlderVersions(): This method is DEPRECATED since v1.5.0. Now, Badger's LSM tree automatically discards older/invalid versions of keys.

Note: The RunValueLogGC method would not garbage collect the latest value log.

Database backup

There are two public API methods DB.Backup() and DB.Load() which can be used to do online backups and restores. Badger v0.9 provides a CLI tool badger, which can do offline backup/restore. Make sure you have $GOPATH/bin in your PATH to use this tool.

The command below will create a version-agnostic backup of the database, to a file badger.bak in the current working directory:

badger backup --dir <path/to/badgerdb>

To restore badger.bak in the current working directory to a new database:

badger restore --dir <path/to/badgerdb>

See badger --help for more details.

If you have a Badger database that was created using v0.8 (or below), you can use the badger_backup tool provided in v0.8.1, and then restore it using the command above to upgrade your database to work with the latest version.

badger_backup --dir <path/to/badgerdb> --backup-file badger.bak

We recommend that all users use the Backup and Restore APIs and tools. However, Badger is also rsync-friendly because all files are immutable, barring the latest value log which is append-only. So, rsync can be used as a rudimentary way to perform a backup. In the following script, we repeat rsync to ensure that the LSM tree remains consistent with the MANIFEST file while doing a full backup.

#!/bin/bash
set -o history
set -o histexpand
# Makes a complete copy of a Badger database directory.
# Repeat rsync if the MANIFEST and SSTables are updated.
rsync -avz --delete db/ dst
while !! | grep -q -E "(MANIFEST|\.sst)$"; do :; done

Memory usage

Badger's memory usage can be managed by tweaking several options available in the Options struct that is passed in when opening the database using badger.Open().

Options.ValueLogLoadingMode can be set to options.FileIO (instead of the default options.MemoryMap) to avoid memory-mapping log files. This can be useful in environments with low RAM.
Number of memtables (Options.NumMemtables)
- If you modify Options.NumMemtables, also adjust Options.NumLevelZeroTables and Options.NumLevelZeroTablesStall accordingly.
Number of concurrent compactions (Options.NumCompactors)
Mode in which LSM tree is loaded (Options.TableLoadingMode)
Size of table (Options.MaxTableSize)
Size of value log file (Options.ValueLogFileSize)

If you want to decrease the memory usage of a Badger instance, tweak these options (ideally one at a time) until you achieve the desired memory usage.

Statistics

Badger records metrics using the expvar package, which is included in the Go standard library. All the metrics are documented in the y/metrics.go file.

The expvar package adds a handler to the default HTTP server (which has to be started explicitly), and serves up the metrics at the /debug/vars endpoint. These metrics can then be collected by a system like Prometheus, to get better visibility into what Badger is doing.
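
The sketch below shows the expvar mechanism on its own, using only the standard library. The demo_puts counter is a hypothetical variable, not one of Badger's metrics, and the request is served in-process through the default mux instead of a real listener so that the example runs anywhere.

```go
package main

import (
	"expvar"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// demoPuts is a hypothetical counter used only for this illustration.
var demoPuts = expvar.NewInt("demo_puts")

// debugVars renders /debug/vars through http.DefaultServeMux, where
// importing expvar registers its handler.
func debugVars() string {
	req := httptest.NewRequest(http.MethodGet, "/debug/vars", nil)
	rec := httptest.NewRecorder()
	http.DefaultServeMux.ServeHTTP(rec, req)
	return rec.Body.String()
}

// hasVar reports whether a published variable appears in the output.
func hasVar(name string) bool {
	return strings.Contains(debugVars(), `"`+name+`"`)
}

func main() {
	demoPuts.Add(1)
	fmt.Println(hasVar("demo_puts")) // true

	// In a real program, start the default server explicitly, for example:
	//   log.Fatal(http.ListenAndServe("localhost:8080", nil))
}
```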

Resources

Blog Posts

Design

Badger was written with these design goals in mind:

Write a key-value database in pure Go.
Use latest research to build the fastest KV database for data sets spanning terabytes.
Optimize for SSDs.

Badger’s design is based on a paper titled WiscKey: Separating Keys from Values in SSD-conscious Storage.

Comparisons

Feature	Badger	RocksDB	BoltDB
Design	LSM tree with value log	LSM tree only	B+ tree
High Read throughput	Yes	No	Yes
High Write throughput	Yes	Yes	No
Designed for SSDs	Yes (with latest research ¹)	Not specifically ²	No
Embeddable	Yes	Yes	Yes
Sorted KV access	Yes	Yes	Yes
Pure Go (no Cgo)	Yes	No	Yes
Transactions	Yes, ACID, concurrent with SSI³	Yes (but non-ACID)	Yes, ACID
Snapshots	Yes	Yes	Yes
TTL support	Yes	Yes	No
3D access (key-value-version)	Yes⁴	No	No

¹ The WISCKEY paper (on which Badger is based) saw big wins with separating values from keys, significantly reducing the write amplification compared to a typical LSM tree.

² RocksDB is an SSD optimized version of LevelDB, which was designed specifically for rotating disks. As such RocksDB's design isn't aimed at SSDs.

³ SSI: Serializable Snapshot Isolation. For more details, see the blog post Concurrent ACID Transactions in Badger

⁴ Badger provides direct access to value versions via its Iterator API. Users can also specify how many versions to keep per key via Options.

Benchmarks

We have run comprehensive benchmarks against RocksDB, Bolt and LMDB. The benchmarking code, and the detailed logs for the benchmarks can be found in the badger-bench repo. More explanation, including graphs, can be found in the blog posts (linked above).

Other Projects Using Badger

Below is a list of known projects that use Badger:

0-stor - Single device object store.
Dgraph - Distributed graph database.
Jaeger - Distributed tracing platform.
TalariaDB - Distributed, low latency time-series database.
Dispatch Protocol - Blockchain protocol for distributed application data analytics.
Sandglass - distributed, horizontally scalable, persistent, time sorted message queue.
Usenet Express - Serving over 300TB of data with Badger.
go-ipfs - Go client for the InterPlanetary File System (IPFS), a new hypermedia distribution protocol.
gorush - A push notification server written in Go.
emitter - Scalable, low latency, distributed pub/sub broker with message storage, uses MQTT, gossip and badger.
GarageMQ - AMQP server written in Go.
RedixDB - A real-time persistent key-value store with the same redis protocol.
BBVA - Raft backend implementation using BadgerDB for Hashicorp raft.
Riot - An open-source, distributed search engine.
Fantom - aBFT Consensus platform for distributed applications.
decred - An open, progressive, and self-funding cryptocurrency with a system of community-based governance integrated into its blockchain.
OpenNetSys - Create useful dApps in any software language.
HoneyTrap - An extensible and opensource system for running, monitoring and managing honeypots.
Insolar - Enterprise-ready blockchain platform.
IoTeX - The next generation of the decentralized network for IoT powered by scalability- and privacy-centric blockchains.
go-sessions - The sessions manager for Go net/http and fasthttp.
Babble - BFT Consensus platform for distributed applications.
Tormenta - Embedded object-persistence layer / simple JSON database for Go projects.
BadgerHold - An embeddable NoSQL store for querying Go types built on Badger
Goblero - Pure Go embedded persistent job queue backed by BadgerDB
Surfline - Serving global wave and weather forecast data with Badger.
Cete - Simple and highly available distributed key-value store built on Badger. Makes it easy bringing up a cluster of Badger with Raft consensus algorithm by hashicorp/raft.
Volument - A new take on website analytics backed by Badger.
Sloop - Kubernetes History Visualization.
KVdb - Hosted key-value store and serverless platform built on top of Badger.
Dkron - Distributed, fault tolerant job scheduling system.

If you are using Badger in a project please send a pull request to add it to the list.

Frequently Asked Questions

My writes are getting stuck. Why?

Update: With the new Value(func(v []byte)) API, this deadlock can no longer happen.

The following is true for users on Badger v1.x.

This can happen if a long-running iteration has Prefetch set to false, but an Item::Value call is made inside the loop. That causes Badger to acquire read locks over the value log files to avoid value log GC removing the file from underneath. As a side effect, this also blocks a new value log GC file from being created, when the value log file boundary is hit.

Please see Github issues #293 and #315.

There are multiple workarounds during iteration:

Use Item::ValueCopy instead of Item::Value when retrieving value.
Set Prefetch to true. Badger would then copy over the value and release the file lock immediately.
When Prefetch is false, don't call Item::Value and do a pure key-only iteration. This might be useful if you just want to delete a lot of keys.
Do the writes in a separate transaction after the reads.

My writes are really slow. Why?

Are you creating a new transaction for every single key update, and waiting for it to Commit fully before creating a new one? This will lead to very low throughput.

We have created the WriteBatch API, which provides a way to batch up many updates into a single transaction and Commit that transaction using callbacks to avoid blocking. This amortizes the cost of a transaction really well, and provides the most efficient way to do bulk writes.

wb := db.NewWriteBatch()
defer wb.Cancel()

for i := 0; i < N; i++ {
  err := wb.Set(key(i), value(i), 0) // Will create txns as needed.
  handle(err)
}
handle(wb.Flush()) // Wait for all txns to finish.

Note that WriteBatch API does not allow any reads. For read-modify-write workloads, you should be using the Transaction API.

I don't see any disk writes. Why?

If you're using Badger with SyncWrites=false, then your writes might not be written to the value log and won't get synced to disk immediately. Writes to the LSM tree are done in memory first, before they get compacted to disk. The compaction would only happen once MaxTableSize has been reached. So, if you're doing a few writes and then checking, you might not see anything on disk. Once you Close the database, you'll see these writes on disk.

Reverse iteration doesn't give me the right results.

Just as forward iteration goes to the first key which is greater than or equal to the SEEK key, reverse iteration goes to the first key which is less than or equal to the SEEK key. Therefore, the SEEK key would not be part of the results. You can typically add a 0xff byte as a suffix to the SEEK key to include it in the results. See the following issues: #436 and #347.
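
The ordering argument behind the 0xff suffix can be seen in a toy model of a sorted key list. The sketch below covers only lexicographic ordering, namely that longer keys sharing the SEEK key as a prefix sort after it; it does not model Badger's iterator or its internal key encoding, and reversePrefixScan is a hypothetical name.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// reversePrefixScan seeks to the last key <= seek in a sorted list, then
// walks backward while keys still start with prefix.
func reversePrefixScan(sorted []string, prefix, seek string) []string {
	// Index of the first key > seek, minus one, is the last key <= seek.
	i := sort.Search(len(sorted), func(i int) bool { return sorted[i] > seek }) - 1
	var out []string
	for ; i >= 0 && strings.HasPrefix(sorted[i], prefix); i-- {
		out = append(out, sorted[i])
	}
	return out
}

func main() {
	keys := []string{"1233", "1234", "12345", "1235"}
	fmt.Println(reversePrefixScan(keys, "1234", "1234"))     // [1234]
	fmt.Println(reversePrefixScan(keys, "1234", "1234\xff")) // [12345 1234]
}
```

In this model, seeking to the bare prefix starts below "12345" and misses it, while the 0xff suffix moves the starting point past every key with that prefix.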

Which instances should I use for Badger?

We recommend using instances which provide local SSD storage, without any limit on the maximum IOPS. In AWS, these are storage optimized instances like i3. They provide local SSDs which clock 100K IOPS over 4KB blocks easily.

I'm getting a closed channel error. Why?

panic: close of closed channel
panic: send on closed channel

If you're seeing panics like above, this would be because you're operating on a closed DB. This can happen, if you call Close() before sending a write, or multiple times. You should ensure that you only call Close() once, and all your read/write operations finish before closing.
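
One way to guard against closing more than once is to wrap the close call in sync.Once. The sketch below is a general Go pattern, not a Badger API: onceCloser is a hypothetical type, and closeFn stands in for a call such as db.Close.

```go
package main

import (
	"fmt"
	"sync"
)

// onceCloser runs closeFn at most once, however often Close is called.
type onceCloser struct {
	once    sync.Once
	closeFn func() error // stand-in for db.Close in this sketch
	err     error
}

// Close invokes closeFn on the first call and returns the same result on
// every later call.
func (c *onceCloser) Close() error {
	c.once.Do(func() { c.err = c.closeFn() })
	return c.err
}

func main() {
	closed := 0
	c := &onceCloser{closeFn: func() error {
		closed++
		return nil
	}}
	_ = c.Close()
	_ = c.Close()
	fmt.Println(closed) // 1
}
```

This only addresses repeated calls to Close(). Making sure that in-flight reads and writes have finished first still has to be handled separately, for example with a sync.WaitGroup.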

Are there any Go specific settings that I should use?

We highly recommend setting a high number for GOMAXPROCS, which allows Go to observe the full IOPS throughput provided by modern SSDs. In Dgraph, we have set it to 128. For more details, see this thread.
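
GOMAXPROCS can be set through the GOMAXPROCS environment variable or at program start with the runtime package. A minimal sketch of the latter follows, using 128 as mentioned above; the wrapper function names are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
)

// raiseGOMAXPROCS sets GOMAXPROCS to n and returns the previous setting.
func raiseGOMAXPROCS(n int) (previous int) {
	return runtime.GOMAXPROCS(n)
}

// currentGOMAXPROCS reads the setting without changing it.
func currentGOMAXPROCS() int {
	return runtime.GOMAXPROCS(0)
}

func main() {
	prev := raiseGOMAXPROCS(128)
	fmt.Printf("GOMAXPROCS: %d -> %d\n", prev, currentGOMAXPROCS())
}
```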

Are there any Linux specific settings that I should use?

We recommend setting max file descriptors to a high number depending upon the expected size of your data. On Linux and Mac, you can check the file descriptor limit with ulimit -n -H for the hard limit and ulimit -n -S for the soft limit. A soft limit of 65535 is a good lower bound. You can adjust the limit as needed.

I see "manifest has unsupported version: X (we support Y)" error.

This error means you have a badger directory which was created by an older version of badger and you're trying to open in a newer version of badger. The underlying data format can change across badger versions and users will have to migrate their data directory. Badger data can be migrated from version X of badger to version Y of badger by following the steps listed below. Assume you were on badger v1.6.0 and you wish to migrate to v2.0.0 version.

Install badger version v1.6.0
- cd $GOPATH/src/github.com/meeyio/badger
- git checkout v1.6.0
- cd badger && go install
  
  This should install the old badger binary in your $GOBIN.
Create Backup
- badger backup --dir path/to/badger/directory -f badger.backup
Install badger version v2.0.0
- cd $GOPATH/src/github.com/meeyio/badger
- git checkout v2.0.0
- cd badger && go install
  
  This should install new badger binary in your $GOBIN
Restore data from backup
- badger restore --dir path/to/new/badger/directory -f badger.backup
  
  This will create a new directory on path/to/new/badger/directory and add badger data in newer format to it.

NOTE - The above steps shouldn't cause any data loss but please ensure the new data is valid before deleting the old badger directory.

Contact

Please use discuss.dgraph.io for questions, feature requests and discussions.
Please use the GitHub issue tracker for filing bugs or feature requests.
Follow us on Twitter @dgraphlabs.

Documentation ¶

Overview ¶

Package badger implements an embeddable, simple and fast key-value database, written in pure Go. It is designed to be highly performant for both reads and writes simultaneously. Badger uses Multi-Version Concurrency Control (MVCC), and supports transactions. It runs transactions concurrently, with serializable snapshot isolation guarantees.

Badger uses an LSM tree along with a value log to separate keys from values, hence reducing both write amplification and the size of the LSM tree. This allows the LSM tree to be served entirely from RAM, while the values are served from SSD.

Usage ¶

Badger has the following main types: DB, Txn, Item and Iterator. DB contains keys that are associated with values. It must be opened with the appropriate options before it can be accessed.

All operations happen inside a Txn. Txn represents a transaction, which can be read-only or read-write. Read-only transactions can read values for a given key (which are returned inside an Item), or iterate over a set of key-value pairs using an Iterator (which are returned as Item type values as well). Read-write transactions can also update and delete keys from the DB.

See the examples for more usage details.

Index ¶

Constants
Variables
type DB
- func Open(opt Options) (db *DB, err error)
- func OpenManaged(opts Options) (*DB, error)
- func (db *DB) Backup(w io.Writer, since uint64) (uint64, error)
- func (db *DB) Close() error
- func (db *DB) DropAll() error
- func (db *DB) DropPrefix(prefix []byte) error
- func (db *DB) Flatten(workers int) error
- func (db *DB) GetMergeOperator(key []byte, f MergeFunc, dur time.Duration) *MergeOperator
- func (db *DB) GetSequence(key []byte, bandwidth uint64) (*Sequence, error)
- func (db *DB) KeySplits(prefix []byte) []string
- func (db *DB) Load(r io.Reader, maxPendingWrites int) error
- func (db *DB) MaxBatchCount() int64
- func (db *DB) MaxBatchSize() int64
- func (db *DB) NewKVLoader(maxPendingWrites int) *KVLoader
- func (db *DB) NewStream() *Stream
- func (db *DB) NewStreamAt(readTs uint64) *Stream
- func (db *DB) NewStreamWriter() *StreamWriter
- func (db *DB) NewTransaction(update bool) *Txn
- func (db *DB) NewTransactionAt(readTs uint64, update bool) *Txn
- func (db *DB) NewWriteBatch() *WriteBatch
- func (db *DB) NewWriteBatchAt(commitTs uint64) *WriteBatch
- func (db *DB) PrintHistogram(keyPrefix []byte)
- func (db *DB) RunValueLogGC(discardRatio float64) error
- func (db *DB) SetDiscardTs(ts uint64)
- func (db *DB) Size() (lsm, vlog int64)
- func (db *DB) Subscribe(ctx context.Context, cb func(kv *KVList) error, prefixes ...[]byte) error
- func (db *DB) Sync() error
- func (db *DB) Tables(withKeysCount bool) []TableInfo
- func (db *DB) Update(fn func(txn *Txn) error) error
- func (db *DB) View(fn func(txn *Txn) error) error
type Entry
- func NewEntry(key, value []byte) *Entry
- func (e *Entry) WithDiscard() *Entry
- func (e *Entry) WithMeta(meta byte) *Entry
- func (e *Entry) WithTTL(dur time.Duration) *Entry
type Item
- func (item *Item) DiscardEarlierVersions() bool
- func (item *Item) EstimatedSize() int64
- func (item *Item) ExpiresAt() uint64
- func (item *Item) IsDeletedOrExpired() bool
- func (item *Item) Key() []byte
- func (item *Item) KeyCopy(dst []byte) []byte
- func (item *Item) KeySize() int64
- func (item *Item) String() string
- func (item *Item) UserMeta() byte
- func (item *Item) Value(fn func(val []byte) error) error
- func (item *Item) ValueCopy(dst []byte) ([]byte, error)
- func (item *Item) ValueSize() int64
- func (item *Item) Version() uint64
type Iterator
- func (it *Iterator) Close()
- func (it *Iterator) Item() *Item
- func (it *Iterator) Next()
- func (it *Iterator) Rewind()
- func (it *Iterator) Seek(key []byte)
- func (it *Iterator) Valid() bool
- func (it *Iterator) ValidForPrefix(prefix []byte) bool
type IteratorOptions
type KVList
type KVLoader
- func (l *KVLoader) Finish() error
- func (l *KVLoader) Set(kv *pb.KV) error
type Logger
type Manifest
- func ReplayManifestFile(fp *os.File) (Manifest, int64, error)
type MergeFunc
type MergeOperator
- func (op *MergeOperator) Add(val []byte) error
- func (op *MergeOperator) Get() ([]byte, error)
- func (op *MergeOperator) Stop()
type Options
- func DefaultOptions(path string) Options
- func LSMOnlyOptions(path string) Options
- func (opt *Options) Debugf(format string, v ...interface{})
- func (opt *Options) Errorf(format string, v ...interface{})
- func (opt *Options) Infof(format string, v ...interface{})
- func (opt *Options) Warningf(format string, v ...interface{})
- func (opt Options) WithBypassLockGuard(b bool) Options
- func (opt Options) WithCompactL0OnClose(val bool) Options
- func (opt Options) WithDir(val string) Options
- func (opt Options) WithEventLogging(enabled bool) Options
- func (opt Options) WithLevelOneSize(val int64) Options
- func (opt Options) WithLevelSizeMultiplier(val int) Options
- func (opt Options) WithLogRotatesToFlush(val int32) Options
- func (opt Options) WithLogger(val Logger) Options
- func (opt Options) WithMaxLevels(val int) Options
- func (opt Options) WithMaxTableSize(val int64) Options
- func (opt Options) WithNumCompactors(val int) Options
- func (opt Options) WithNumLevelZeroTables(val int) Options
- func (opt Options) WithNumLevelZeroTablesStall(val int) Options
- func (opt Options) WithNumMemtables(val int) Options
- func (opt Options) WithNumVersionsToKeep(val int) Options
- func (opt Options) WithReadOnly(val bool) Options
- func (opt Options) WithSyncWrites(val bool) Options
- func (opt Options) WithTableLoadingMode(val options.FileLoadingMode) Options
- func (opt Options) WithTruncate(val bool) Options
- func (opt Options) WithValueDir(val string) Options
- func (opt Options) WithValueLogFileSize(val int64) Options
- func (opt Options) WithValueLogLoadingMode(val options.FileLoadingMode) Options
- func (opt Options) WithValueLogMaxEntries(val uint32) Options
- func (opt Options) WithValueThreshold(val int) Options
- func (opt Options) WithVerifyValueChecksum(val bool) Options
type Sequence
- func (seq *Sequence) Next() (uint64, error)
- func (seq *Sequence) Release() error
type Stream
- func (stream *Stream) Backup(w io.Writer, since uint64) (uint64, error)
- func (st *Stream) Orchestrate(ctx context.Context) error
- func (st *Stream) ToList(key []byte, itr *Iterator) (*pb.KVList, error)
type StreamWriter
- func (sw *StreamWriter) Flush() error
- func (sw *StreamWriter) Prepare() error
- func (sw *StreamWriter) Write(kvs *pb.KVList) error
type TableInfo
type TableManifest
type Txn
- func (txn *Txn) Commit() error
- func (txn *Txn) CommitAt(commitTs uint64, callback func(error)) error
- func (txn *Txn) CommitWith(cb func(error))
- func (txn *Txn) Delete(key []byte) error
- func (txn *Txn) Discard()
- func (txn *Txn) Get(key []byte) (item *Item, rerr error)
- func (txn *Txn) NewIterator(opt IteratorOptions) *Iterator
- func (txn *Txn) NewKeyIterator(key []byte, opt IteratorOptions) *Iterator
- func (txn *Txn) ReadTs() uint64
- func (txn *Txn) Set(key, val []byte) error
- func (txn *Txn) SetEntry(e *Entry) error
type WriteBatch
- func (wb *WriteBatch) Cancel()
- func (wb *WriteBatch) Delete(k []byte) error
- func (wb *WriteBatch) Error() error
- func (wb *WriteBatch) Flush() error
- func (wb *WriteBatch) Set(k, v []byte) error
- func (wb *WriteBatch) SetEntry(e *Entry) error
- func (wb *WriteBatch) SetMaxPendingTxns(max int)

Constants ¶

const (
	// ManifestFilename is the filename for the manifest file.
	ManifestFilename = "MANIFEST"
)

const (
	// ValueThresholdLimit is the maximum permissible value of opt.ValueThreshold.
	ValueThresholdLimit = math.MaxUint16 - 16 + 1
)

Variables ¶

var (
	// ErrValueLogSize is returned when opt.ValueLogFileSize option is not within the valid
	// range.
	ErrValueLogSize = errors.New("Invalid ValueLogFileSize, must be between 1MB and 2GB")

	// ErrValueThreshold is returned when ValueThreshold is set to a value close to or greater than
	// uint16.
	ErrValueThreshold = errors.Errorf(
		"Invalid ValueThreshold, must be less than %d", ValueThresholdLimit)

	// ErrKeyNotFound is returned when key isn't found on a txn.Get.
	ErrKeyNotFound = errors.New("Key not found")

	// ErrTxnTooBig is returned if too many writes are fit into a single transaction.
	ErrTxnTooBig = errors.New("Txn is too big to fit into one request")

	// ErrConflict is returned when a transaction conflicts with another transaction. This can
	// happen if the read rows had been updated concurrently by another transaction.
	ErrConflict = errors.New("Transaction Conflict. Please retry")

	// ErrReadOnlyTxn is returned if an update function is called on a read-only transaction.
	ErrReadOnlyTxn = errors.New("No sets or deletes are allowed in a read-only transaction")

	// ErrDiscardedTxn is returned if a previously discarded transaction is re-used.
	ErrDiscardedTxn = errors.New("This transaction has been discarded. Create a new one")

	// ErrEmptyKey is returned if an empty key is passed on an update function.
	ErrEmptyKey = errors.New("Key cannot be empty")

	// ErrInvalidKey is returned if the key has a special !badger! prefix,
	// reserved for internal usage.
	ErrInvalidKey = errors.New("Key is using a reserved !badger! prefix")

	// ErrRetry is returned when a log file containing the value is not found.
	// This usually indicates that it may have been garbage collected, and the
	// operation needs to be retried.
	ErrRetry = errors.New("Unable to find log file. Please retry")

	// ErrThresholdZero is returned if threshold is set to zero, and value log GC is called.
	// In such a case, GC can't be run.
	ErrThresholdZero = errors.New(
		"Value log GC can't run because threshold is set to zero")

	// ErrNoRewrite is returned if a call for value log GC doesn't result in a log file rewrite.
	ErrNoRewrite = errors.New(
		"Value log GC attempt didn't result in any cleanup")

	// ErrRejected is returned if a value log GC is called either while another GC is running, or
	// after DB::Close has been called.
	ErrRejected = errors.New("Value log GC request rejected")

	// ErrInvalidRequest is returned if the user request is invalid.
	ErrInvalidRequest = errors.New("Invalid request")

	// ErrManagedTxn is returned if the user tries to use an API which isn't
	// allowed due to external management of transactions, when using ManagedDB.
	ErrManagedTxn = errors.New(
		"Invalid API request. Not allowed to perform this action using ManagedDB")

	// ErrInvalidDump if a data dump made previously cannot be loaded into the database.
	ErrInvalidDump = errors.New("Data dump cannot be read")

	// ErrZeroBandwidth is returned if the user passes in zero bandwidth for sequence.
	ErrZeroBandwidth = errors.New("Bandwidth must be greater than zero")

	// ErrInvalidLoadingMode is returned when opt.ValueLogLoadingMode option is not
	// within the valid range
	ErrInvalidLoadingMode = errors.New("Invalid ValueLogLoadingMode, must be FileIO or MemoryMap")

	// ErrReplayNeeded is returned when opt.ReadOnly is set but the
	// database requires a value log replay.
	ErrReplayNeeded = errors.New("Database was not properly closed, cannot open read-only")

	// ErrWindowsNotSupported is returned when opt.ReadOnly is used on Windows
	ErrWindowsNotSupported = errors.New("Read-only mode is not supported on Windows")

	// ErrTruncateNeeded is returned when the value log gets corrupt, and requires truncation of
	// corrupt data to allow Badger to run properly.
	ErrTruncateNeeded = errors.New(
		"Value log truncate required to run DB. This might result in data loss")

	// ErrBlockedWrites is returned if the user called DropAll. During the process of dropping all
	// data from Badger, we stop accepting new writes, by returning this error.
	ErrBlockedWrites = errors.New("Writes are blocked, possibly due to DropAll or Close")

	// ErrNilCallback is returned when subscriber's callback is nil.
	ErrNilCallback = errors.New("Callback cannot be nil")
)

var DefaultIteratorOptions = IteratorOptions{
	PrefetchValues: true,
	PrefetchSize:   100,
	Reverse:        false,
	AllVersions:    false,
}

DefaultIteratorOptions contains default options when iterating over Badger key-value stores.

var ErrUnsortedKey = errors.New("Keys not in sorted order")

ErrUnsortedKey is returned when any out of order key arrives at sortedWriter during call to Add.

Functions ¶

This section is empty.

Types ¶

type DB ¶

type DB struct {
	sync.RWMutex // Guards list of inmemory tables, not individual reads and writes.
	// contains filtered or unexported fields
}

DB provides the various functions required to interact with Badger. DB is thread-safe.

func Open ¶

func Open(opt Options) (db *DB, err error)

Open returns a new DB object.

Example ¶

dir, err := ioutil.TempDir("", "badger-test")
if err != nil {
	panic(err)
}
defer removeDir(dir)
db, err := Open(DefaultOptions(dir))
if err != nil {
	panic(err)
}
defer db.Close()

err = db.View(func(txn *Txn) error {
	_, err := txn.Get([]byte("key"))
	// We expect ErrKeyNotFound
	fmt.Println(err)
	return nil
})

if err != nil {
	panic(err)
}

txn := db.NewTransaction(true) // Read-write txn
err = txn.SetEntry(NewEntry([]byte("key"), []byte("value")))
if err != nil {
	panic(err)
}
err = txn.Commit()
if err != nil {
	panic(err)
}

err = db.View(func(txn *Txn) error {
	item, err := txn.Get([]byte("key"))
	if err != nil {
		return err
	}
	val, err := item.ValueCopy(nil)
	if err != nil {
		return err
	}
	fmt.Printf("%s\n", string(val))
	return nil
})

if err != nil {
	panic(err)
}

Output:

Key not found
value

func OpenManaged ¶

func OpenManaged(opts Options) (*DB, error)

OpenManaged returns a new DB, which allows more control over setting transaction timestamps, aka managed mode.

This is only useful for databases built on top of Badger (like Dgraph), and can be ignored by most users.

func (*DB) Backup ¶

func (db *DB) Backup(w io.Writer, since uint64) (uint64, error)

Backup is a wrapper function over Stream.Backup to generate full and incremental backups of the DB. For more control over how many goroutines are used to generate the backup, or if you wish to backup only a certain range of keys, use Stream.Backup directly.

func (*DB) Close ¶

func (db *DB) Close() error

Close closes a DB. It's crucial to call it to ensure all the pending updates make their way to disk. Calling DB.Close() multiple times would still only close the DB once.

func (*DB) DropAll ¶

func (db *DB) DropAll() error

DropAll would drop all the data stored in Badger. It does this in the following way:

Stop accepting new writes.
Pause memtable flushes and compactions.
Pick all tables from all levels, create a changeset to delete all these tables and apply it to manifest.
Pick all log files from value log, and delete all of them. Restart value log files from zero.
Resume memtable flushes and compactions.

NOTE: DropAll is resilient to concurrent writes, but not to reads. It is up to the user to not do any reads while DropAll is going on, otherwise they may result in panics. Ideally, both reads and writes are paused before running DropAll, and resumed after it is finished.
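
One possible way to pause reads and writes around DropAll is a read-write lock in the application layer. The sketch below is an assumption about how a caller might do this, not something Badger provides: gate, Do and fakeDB are hypothetical names, and the dropper interface only mirrors the DropAll() error method listed above.

```go
package main

import (
	"fmt"
	"sync"
)

// dropper is the single method of *badger.DB that this sketch relies on.
type dropper interface {
	DropAll() error
}

// gate is a hypothetical wrapper (not part of Badger). Regular operations
// share a read lock; DropAll takes the write lock, so it waits for running
// operations and holds back new ones until it returns.
type gate struct {
	mu sync.RWMutex
	db dropper
}

// Do runs a read or write operation while holding the shared lock.
func (g *gate) Do(op func() error) error {
	g.mu.RLock()
	defer g.mu.RUnlock()
	return op()
}

// DropAll runs the underlying DropAll with all other operations paused.
func (g *gate) DropAll() error {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.db.DropAll()
}

// fakeDB stands in for *badger.DB so the sketch runs without a database.
type fakeDB struct{ drops int }

func (f *fakeDB) DropAll() error { f.drops++; return nil }

func main() {
	db := &fakeDB{}
	g := &gate{db: db}
	_ = g.Do(func() error { return nil }) // a db.View or db.Update call would go here
	if err := g.DropAll(); err != nil {
		fmt.Println("drop failed:", err)
	}
	fmt.Println("drops:", db.drops)
}
```

Calling g.DropAll from inside an op passed to g.Do would deadlock, so the drop has to be issued from outside any gated operation.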

func (*DB) DropPrefix ¶

func (db *DB) DropPrefix(prefix []byte) error

DropPrefix would drop all the keys with the provided prefix. It does this in the following way:

Stop accepting new writes.
Stop memtable flushes before acquiring the lock. We acquire the lock here, and a memtable flush stalls waiting for it, which would otherwise lead to a deadlock.
Flush out all memtables, skipping over keys with the given prefix, Kp.
Write out the value log header to memtables when flushing, so we don't accidentally bring Kp back after a restart.
Stop compaction.
Compact L0->L1, skipping over Kp.
Compact rest of the levels, Li->Li, picking tables which have Kp.
Resume memtable flushes, compactions and writes.

func (*DB) Flatten ¶

func (db *DB) Flatten(workers int) error

Flatten can be used to force compactions on the LSM tree so all the tables fall on the same level. This ensures that all the versions of keys are colocated and not split across multiple levels, which is necessary after a restore from backup. During Flatten, live compactions are stopped. Ideally, no writes are going on during Flatten. Otherwise, it would create competition between flattening the tree and new tables being created at level zero.

func (*DB) GetMergeOperator ¶

func (db *DB) GetMergeOperator(key []byte,
	f MergeFunc, dur time.Duration) *MergeOperator

GetMergeOperator creates a new MergeOperator for a given key and returns a pointer to it. It also fires off a goroutine that performs a compaction using the merge function that runs periodically, as specified by dur.

func (*DB) GetSequence ¶

func (db *DB) GetSequence(key []byte, bandwidth uint64) (*Sequence, error)

GetSequence would initiate a new sequence object, generating it from the stored lease, if available, in the database. Sequence can be used to get a list of monotonically increasing integers. Multiple sequences can be created by providing different keys. Bandwidth sets the size of the lease, determining how many Next() requests can be served from memory.

GetSequence is not supported on ManagedDB. Calling this would result in a panic.

func (*DB) KeySplits ¶

func (db *DB) KeySplits(prefix []byte) []string

KeySplits can be used to get rough key ranges to divide up iteration over the DB.

func (*DB) Load ¶

func (db *DB) Load(r io.Reader, maxPendingWrites int) error

Load reads a protobuf-encoded list of all entries from a reader and writes them to the database. This can be used to restore the database from a backup made by calling DB.Backup(). If more complex logic is needed to restore a badger backup, the KVLoader interface should be used instead.

DB.Load() should be called on a database that is not running any other concurrent transactions while it is running.

func (*DB) MaxBatchCount ¶

func (db *DB) MaxBatchCount() int64

MaxBatchCount returns the maximum possible number of entries in a batch.

func (*DB) MaxBatchSize ¶

func (db *DB) MaxBatchSize() int64

MaxBatchSize returns the maximum possible batch size.

func (*DB) NewKVLoader ¶

func (db *DB) NewKVLoader(maxPendingWrites int) *KVLoader

NewKVLoader returns a new instance of KVLoader.

func (*DB) NewStream ¶

func (db *DB) NewStream() *Stream

NewStream creates a new Stream.

func (*DB) NewStreamAt ¶

func (db *DB) NewStreamAt(readTs uint64) *Stream

NewStreamAt creates a new Stream at a particular timestamp. Should only be used with managed DB.

func (*DB) NewStreamWriter ¶

func (db *DB) NewStreamWriter() *StreamWriter

NewStreamWriter creates a StreamWriter. Right after creating StreamWriter, Prepare must be called. The memory usage of a StreamWriter is directly proportional to the number of streams possible. So, efforts must be made to keep the number of streams low. Stream framework would typically use 16 goroutines and hence create 16 streams.

func (*DB) NewTransaction ¶

func (db *DB) NewTransaction(update bool) *Txn

NewTransaction creates a new transaction. Badger supports concurrent execution of transactions, providing serializable snapshot isolation, avoiding write skews. Badger achieves this by tracking the keys read and at Commit time, ensuring that these read keys weren't concurrently modified by another transaction.

For read-only transactions, set update to false. In this mode, we don't track the rows read for any changes. Thus, any long running iterations done in this mode wouldn't pay this overhead.

Running transactions concurrently is OK. However, a transaction itself isn't thread safe, and should only be run serially. It doesn't matter if a transaction is created by one goroutine and passed down to another, as long as the Txn APIs are called serially.

When you create a new transaction, it is absolutely essential to call Discard(). This should be done irrespective of what the update param is set to. Commit API internally runs Discard, but running it twice wouldn't cause any issues.

txn := db.NewTransaction(false)
defer txn.Discard()
// Call various APIs.

func (*DB) NewTransactionAt ¶

func (db *DB) NewTransactionAt(readTs uint64, update bool) *Txn

NewTransactionAt follows the same logic as DB.NewTransaction(), but uses the provided read timestamp.

This is only useful for databases built on top of Badger (like Dgraph), and can be ignored by most users.

func (*DB) NewWriteBatch ¶

func (db *DB) NewWriteBatch() *WriteBatch

NewWriteBatch creates a new WriteBatch. This provides a way to conveniently do a lot of writes, batching them up as tightly as possible in a single transaction and using callbacks to avoid waiting for them to commit, thus achieving good performance. This API hides away the logic of creating and committing transactions. Due to the nature of SSI guarantees provided by Badger, blind writes can never encounter transaction conflicts (ErrConflict).

func (*DB) NewWriteBatchAt ¶

func (db *DB) NewWriteBatchAt(commitTs uint64) *WriteBatch

NewWriteBatchAt is similar to NewWriteBatch but it allows the user to set the commit timestamp. NewWriteBatchAt is supposed to be used only in the managed mode.

func (*DB) PrintHistogram ¶

func (db *DB) PrintHistogram(keyPrefix []byte)

PrintHistogram builds and displays the key-value size histogram. When keyPrefix is set, only the keys that have the prefix keyPrefix are considered for creating the histogram.

func (*DB) RunValueLogGC ¶

func (db *DB) RunValueLogGC(discardRatio float64) error

RunValueLogGC triggers a value log garbage collection.

It picks value log files to perform GC based on statistics that are collected during compactions. If no such statistics are available, then log files are picked in random order. The process stops as soon as the first log file is encountered which does not result in garbage collection.

When a log file is picked, it is first sampled. If the sample shows that we can discard at least discardRatio space of that file, it would be rewritten.

If a call to RunValueLogGC results in no rewrites, then ErrNoRewrite is returned, indicating that the call resulted in no file rewrites.

We recommend setting discardRatio to 0.5, thus indicating that a file be rewritten if half the space can be discarded. This results in a lifetime value log write amplification of 2 (1 from original write + 0.5 rewrite + 0.25 + 0.125 + ... = 2). Setting it to a higher value would result in fewer space reclaims, while setting it to a lower value would result in more space reclaims at the cost of increased activity on the LSM tree. discardRatio must be in the range (0.0, 1.0), both endpoints excluded, otherwise an ErrInvalidRequest is returned.
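
The series in the paragraph above can be checked numerically. The sketch below generalizes it under a stated assumption that is not from the Badger documentation: every rewrite happens exactly at the threshold, so a fraction (1 - discardRatio) of the file survives each round. The function name lifetimeWriteAmp is hypothetical.

```go
package main

import "fmt"

// lifetimeWriteAmp sums 1 + keep + keep^2 + ... over the given number of
// rewrite rounds, where keep = 1 - discardRatio is the fraction of a file
// that survives each rewrite. This models the worst case in which every
// rewrite happens exactly at the threshold.
func lifetimeWriteAmp(discardRatio float64, rounds int) float64 {
	keep := 1 - discardRatio
	total, term := 0.0, 1.0
	for i := 0; i <= rounds; i++ {
		total += term
		term *= keep
	}
	return total
}

func main() {
	for _, r := range []float64{0.3, 0.5, 0.7} {
		fmt.Printf("discardRatio=%.1f -> write amplification ~%.2f\n", r, lifetimeWriteAmp(r, 200))
	}
}
```

Under this model the sum converges to 1/discardRatio, which gives 2 for the recommended value of 0.5.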

Only one GC is allowed at a time. If another value log GC is running, or DB has been closed, this would return an ErrRejected.

Note: Every time GC is run, it would produce a spike of activity on the LSM tree.

func (*DB) SetDiscardTs ¶

func (db *DB) SetDiscardTs(ts uint64)

SetDiscardTs sets a timestamp at or below which, any invalid or deleted versions can be discarded from the LSM tree, and thence from the value log to reclaim disk space. Can only be used with managed transactions.

func (*DB) Size ¶

func (db *DB) Size() (lsm, vlog int64)

Size returns the size of lsm and value log files in bytes. It can be used to decide how often to call RunValueLogGC.

func (*DB) Subscribe ¶

func (db *DB) Subscribe(ctx context.Context, cb func(kv *KVList) error, prefixes ...[]byte) error

Subscribe can be used to watch key changes for the given key prefixes. At least one prefix should be passed, or an error will be returned. You can use an empty prefix to monitor all changes to the DB. This function blocks until the given context is done or an error occurs. The given function will be called with a new KVList containing the modified keys and the corresponding values.

func (*DB) Sync ¶

func (db *DB) Sync() error

Sync syncs database content to disk. This function gives the user more control over when data is synced.

func (*DB) Tables ¶

func (db *DB) Tables(withKeysCount bool) []TableInfo

Tables gets the TableInfo objects from the level controller. If withKeysCount is true, TableInfo objects also contain counts of keys for the tables.

func (*DB) Update ¶

func (db *DB) Update(fn func(txn *Txn) error) error

Update executes a function, creating and managing a read-write transaction for the user. Error returned by the function is relayed by the Update method. Update cannot be used with managed transactions.

func (*DB) View ¶

func (db *DB) View(fn func(txn *Txn) error) error

View executes a function creating and managing a read-only transaction for the user. Error returned by the function is relayed by the View method. If View is used with managed transactions, it would assume a read timestamp of MaxUint64.

type Entry ¶

type Entry struct {
	Key       []byte
	Value     []byte
	UserMeta  byte
	ExpiresAt uint64 // time.Unix
	// contains filtered or unexported fields
}

Entry provides Key, Value, UserMeta and ExpiresAt. This struct can be used by the user to set data.

func NewEntry ¶

func NewEntry(key, value []byte) *Entry

NewEntry creates a new entry with key and value passed in args. This newly created entry can be set in a transaction by calling txn.SetEntry(). All other properties of Entry can be set by calling WithMeta, WithDiscard, WithTTL methods on it. This function uses key and value reference, hence users must not modify key and value until the end of transaction.

func (*Entry) WithDiscard ¶

func (e *Entry) WithDiscard() *Entry

WithDiscard adds a marker to Entry e. This means all the previous versions of the key (of the Entry) will be eligible for garbage collection. This method is only useful if you have set a higher limit for options.NumVersionsToKeep. The default setting is 1, in which case, this function doesn't add any more benefit. If however, you have a higher setting for NumVersionsToKeep (in Dgraph, we set it to infinity), you can use this method to indicate that all the older versions can be discarded and removed during compactions.

func (*Entry) WithMeta ¶

func (e *Entry) WithMeta(meta byte) *Entry

WithMeta adds meta data to Entry e. This byte is stored alongside the key and can be used as an aid to interpret the value or store other contextual bits corresponding to the key-value pair of entry.

func (*Entry) WithTTL ¶

func (e *Entry) WithTTL(dur time.Duration) *Entry

WithTTL adds time to live duration to Entry e. Entry stored with a TTL would automatically expire after the time has elapsed, and will be eligible for garbage collection.

type Item ¶

type Item struct {
	// contains filtered or unexported fields
}

Item is returned during iteration. Both the Key() and Value() output is only valid until iterator.Next() is called.

func (*Item) DiscardEarlierVersions ¶

func (item *Item) DiscardEarlierVersions() bool

DiscardEarlierVersions returns whether the item was created with the option to discard earlier versions of a key when multiple are available.

func (*Item) EstimatedSize ¶

func (item *Item) EstimatedSize() int64

EstimatedSize returns the approximate size of the key-value pair.

This can be called while iterating through a store to quickly estimate the size of a range of key-value pairs (without fetching the corresponding values).

func (*Item) ExpiresAt ¶

func (item *Item) ExpiresAt() uint64

ExpiresAt returns a Unix time value indicating when the item will be considered expired. 0 indicates that the item will never expire.

func (*Item) IsDeletedOrExpired ¶

func (item *Item) IsDeletedOrExpired() bool

IsDeletedOrExpired returns true if item contains deleted or expired value.

func (*Item) Key ¶

func (item *Item) Key() []byte

Key returns the key.

Key is only valid as long as item is valid, or transaction is valid. If you need to use it outside its validity, please use KeyCopy.

func (*Item) KeyCopy ¶

func (item *Item) KeyCopy(dst []byte) []byte

KeyCopy returns a copy of the key of the item, writing it to dst slice. If nil is passed, or capacity of dst isn't sufficient, a new slice would be allocated and returned.

func (*Item) KeySize ¶

func (item *Item) KeySize() int64

KeySize returns the size of the key. The exact size of the stored key is the key plus 8 bytes of timestamp.

func (*Item) String ¶

func (item *Item) String() string

String returns a string representation of Item

func (*Item) UserMeta ¶

func (item *Item) UserMeta() byte

UserMeta returns the userMeta set by the user. Typically, this byte, optionally set by the user, is used to interpret the value.

func (*Item) Value ¶

func (item *Item) Value(fn func(val []byte) error) error

Value retrieves the value of the item from the value log.

This method must be called within a transaction. Calling it outside a transaction is considered undefined behavior. If an iterator is being used, then Item.Value() is defined in the current iteration only, because items are reused.

If you need to use a value outside a transaction, please use Item.ValueCopy instead, or copy it yourself. Value might change once discard or commit is called. Use ValueCopy if you want to do a Set after Get.

func (*Item) ValueCopy ¶

func (item *Item) ValueCopy(dst []byte) ([]byte, error)

ValueCopy returns a copy of the value of the item from the value log, writing it to dst slice. If nil is passed, or capacity of dst isn't sufficient, a new slice would be allocated and returned. Tip: It might make sense to reuse the returned slice as dst argument for the next call.

This function is useful in long running iterate/update transactions to avoid a write deadlock. See Github issue: https://github.com/meeyio/badger/issues/315
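
The tip about reusing the returned slice can be sketched as follows. The valueCopier interface mirrors the ValueCopy signature above; valueReader and fakeItem are hypothetical names, and fakeItem only imitates the documented copy behavior so the example runs without a database.

```go
package main

import "fmt"

// valueCopier is the single method of *badger.Item that this sketch needs.
type valueCopier interface {
	ValueCopy(dst []byte) ([]byte, error)
}

// valueReader keeps one buffer and hands it back as dst on every call, so
// the buffer is reallocated only when a value exceeds its capacity.
type valueReader struct{ buf []byte }

// read copies the item's value into the shared buffer. The returned slice
// is valid only until the next call to read.
func (r *valueReader) read(it valueCopier) ([]byte, error) {
	var err error
	r.buf, err = it.ValueCopy(r.buf)
	return r.buf, err
}

// fakeItem stands in for *badger.Item. It copies into dst when capacity
// allows and lets append allocate otherwise.
type fakeItem []byte

func (f fakeItem) ValueCopy(dst []byte) ([]byte, error) {
	return append(dst[:0], f...), nil
}

func main() {
	r := &valueReader{}
	for _, it := range []valueCopier{fakeItem("a longer first value"), fakeItem("short")} {
		v, err := r.read(it)
		if err != nil {
			fmt.Println("read failed:", err)
			return
		}
		fmt.Printf("%q (buffer capacity %d)\n", v, cap(r.buf))
	}
}
```

With a real iterator, read would be called once per item inside the iteration loop, before the next call to Next.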

func (*Item) ValueSize ¶

func (item *Item) ValueSize() int64

ValueSize returns the exact size of the value.

This can be called to quickly estimate the size of a value without fetching it.

func (*Item) Version ¶

func (item *Item) Version() uint64

Version returns the commit timestamp of the item.

type Iterator ¶

type Iterator struct {
	// contains filtered or unexported fields
}

Iterator helps iterating over the KV pairs in a lexicographically sorted order.

func (*Iterator) Close ¶

func (it *Iterator) Close()

Close would close the iterator. It is important to call this when you're done with iteration.

func (*Iterator) Item ¶

func (it *Iterator) Item() *Item

Item returns pointer to the current key-value pair. This item is only valid until it.Next() gets called.

func (*Iterator) Next ¶

func (it *Iterator) Next()

Next would advance the iterator by one. Always check it.Valid() after a Next() to ensure you have access to a valid it.Item().

func (*Iterator) Rewind ¶

func (it *Iterator) Rewind()

Rewind would rewind the iterator cursor all the way to zero-th position, which would be the smallest key if iterating forward, and largest if iterating backward. It does not keep track of whether the cursor started with a Seek().

func (*Iterator) Seek ¶

func (it *Iterator) Seek(key []byte)

Seek would seek to the provided key if present. If absent, it would seek to the next smallest key greater than the provided key if iterating in the forward direction. Behavior would be reversed if iterating backwards.
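
The landing position described above can be modeled over a plain sorted slice. This is only a model of the documented behavior, not Badger's implementation; seekIndex is a hypothetical function and the keys are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// seekIndex models where Seek lands in a sorted key list. It returns the
// index of the key the iterator would point at, or -1 if there is none.
// Forward: the key itself, else the smallest key greater than it.
// Reverse: the key itself, else the largest key smaller than it.
func seekIndex(sorted []string, key string, reverse bool) int {
	// i is the first index whose key is >= key.
	i := sort.SearchStrings(sorted, key)
	if !reverse {
		if i == len(sorted) {
			return -1
		}
		return i
	}
	if i < len(sorted) && sorted[i] == key {
		return i
	}
	return i - 1 // becomes -1 when every key is greater than key
}

func main() {
	keys := []string{"a", "c", "e"}
	fmt.Println(seekIndex(keys, "c", false)) // key present
	fmt.Println(seekIndex(keys, "b", false)) // absent, forward
	fmt.Println(seekIndex(keys, "b", true))  // absent, reverse
}
```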

func (*Iterator) Valid ¶

func (it *Iterator) Valid() bool

Valid returns false when iteration is done.

func (*Iterator) ValidForPrefix ¶

func (it *Iterator) ValidForPrefix(prefix []byte) bool

ValidForPrefix returns false when iteration is done or when the current key is not prefixed by the specified prefix.

type IteratorOptions ¶

type IteratorOptions struct {
	// Indicates whether we should prefetch values during iteration and store them.
	PrefetchValues bool
	// How many KV pairs to prefetch while iterating. Valid only if PrefetchValues is true.
	PrefetchSize int
	Reverse      bool // Direction of iteration. False is forward, true is backward.
	AllVersions  bool // Fetch all valid versions of the same key.

	// The following option is used to narrow down the SSTables that iterator picks up. If
	// Prefix is specified, only tables which could have this prefix are picked based on their range
	// of keys.
	Prefix []byte // Only iterate over this given prefix.

	InternalAccess bool // Used to allow internal access to badger keys.
	// contains filtered or unexported fields
}

IteratorOptions is used to set options when iterating over Badger key-value stores.

This package provides DefaultIteratorOptions which contains options that should work for most applications. Consider using that as a starting point before customizing it for your own needs.

type KVList ¶

type KVList = pb.KVList

KVList contains a list of key-value pairs.

type KVLoader ¶

type KVLoader struct {
	// contains filtered or unexported fields
}

KVLoader is used to write KVList objects into Badger. It can be used to restore a backup.

func (*KVLoader) Finish ¶

func (l *KVLoader) Finish() error

Finish is meant to be called after all the key-value pairs have been loaded.

func (*KVLoader) Set ¶

func (l *KVLoader) Set(kv *pb.KV) error

Set writes the key-value pair to the database.

type Logger ¶

type Logger interface {
	Errorf(string, ...interface{})
	Warningf(string, ...interface{})
	Infof(string, ...interface{})
	Debugf(string, ...interface{})
}

Logger is implemented by any logging system that is used for standard logs.

type Manifest ¶

type Manifest struct {
	Levels []levelManifest
	Tables map[uint64]TableManifest

	// Contains total number of creation and deletion changes in the manifest -- used to compute
	// whether it'd be useful to rewrite the manifest.
	Creations int
	Deletions int
}

Manifest represents the contents of the MANIFEST file in a Badger store.

The MANIFEST file describes the startup state of the db -- all LSM files and what level they're at.

It consists of a sequence of ManifestChangeSet objects. Each of these is treated atomically, and contains a sequence of ManifestChange's (file creations/deletions) which we use to reconstruct the manifest at startup.

func ReplayManifestFile ¶

func ReplayManifestFile(fp *os.File) (Manifest, int64, error)

ReplayManifestFile reads the manifest file and constructs two manifest objects. (We need one immutable copy and one mutable copy of the manifest. Easiest way is to construct two of them.) Also, returns the last offset after a completely read manifest entry -- the file must be truncated at that point before further appends are made (if there is a partial entry after that). In normal conditions, truncOffset is the file size.

type MergeFunc ¶

type MergeFunc func(existingVal, newVal []byte) []byte

MergeFunc accepts two byte slices, one representing an existing value, and another representing a new value that needs to be ‘merged’ into it. MergeFunc contains the logic to perform the ‘merge’ and return an updated value. MergeFunc could perform operations like integer addition, list appends etc. Note that the ordering of the operands is maintained.

type MergeOperator ¶

type MergeOperator struct {
	sync.RWMutex
	// contains filtered or unexported fields
}

MergeOperator represents a Badger merge operator.

func (*MergeOperator) Add ¶

func (op *MergeOperator) Add(val []byte) error

Add records a value in Badger which will eventually be merged by a background routine into the values that were recorded by previous invocations to Add().

func (*MergeOperator) Get ¶

func (op *MergeOperator) Get() ([]byte, error)

Get returns the latest value for the merge operator, which is derived by applying the merge function to all the values added so far.

If Add has not been called even once, Get will return ErrKeyNotFound.

func (*MergeOperator) Stop ¶

func (op *MergeOperator) Stop()

Stop waits for any pending merge to complete and then stops the background goroutine.

type Options ¶

type Options struct {
	Dir      string
	ValueDir string

	SyncWrites          bool
	TableLoadingMode    options.FileLoadingMode
	ValueLogLoadingMode options.FileLoadingMode
	NumVersionsToKeep   int
	ReadOnly            bool
	Truncate            bool
	Logger              Logger
	EventLogging        bool

	MaxTableSize        int64
	LevelSizeMultiplier int
	MaxLevels           int
	ValueThreshold      int
	NumMemtables        int

	NumLevelZeroTables      int
	NumLevelZeroTablesStall int

	LevelOneSize       int64
	ValueLogFileSize   int64
	ValueLogMaxEntries uint32

	NumCompactors     int
	CompactL0OnClose  bool
	LogRotatesToFlush int32
	// When set, checksum will be validated for each entry read from the value log file.
	VerifyValueChecksum bool

	// BypassLockGuard will bypass the lock guard on badger. Bypassing lock
	// guard can cause data corruption if multiple badger instances are using
	// the same directory. Use this option with caution.
	BypassLockGuard bool
	// contains filtered or unexported fields
}

Options are the parameters for creating a DB object.

This package provides DefaultOptions which contains options that should work for most applications. Consider using that as a starting point before customizing it for your own needs.

Each option X is documented on the WithX method.

func DefaultOptions ¶

func DefaultOptions(path string) Options

DefaultOptions sets a list of recommended options for good performance. Feel free to modify these to suit your needs with the WithX methods.
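Each WithX method has a value receiver and returns a new Options value, so calls can be chained and the value they start from is left unchanged. The sketch below shows that pattern on a hypothetical two-field struct named config; it is a stand-in written for this example and is not Badger's Options type.

```go
package main

import "fmt"

// config is a hypothetical stand-in for Options with two of its fields.
type config struct {
	SyncWrites    bool
	NumCompactors int
}

// WithSyncWrites works on a copy (value receiver) and returns the copy,
// so the caller's original value is never modified.
func (c config) WithSyncWrites(val bool) config {
	c.SyncWrites = val
	return c
}

// WithNumCompactors follows the same copy-and-return pattern.
func (c config) WithNumCompactors(val int) config {
	c.NumCompactors = val
	return c
}

func main() {
	base := config{SyncWrites: true, NumCompactors: 2}
	tuned := base.WithSyncWrites(false).WithNumCompactors(4)

	fmt.Printf("%+v\n", base)  // {SyncWrites:true NumCompactors:2}
	fmt.Printf("%+v\n", tuned) // {SyncWrites:false NumCompactors:4}
}
```

With Badger itself the equivalent chain would start from DefaultOptions(path) and apply the WithX methods documented below.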

func LSMOnlyOptions ¶

func LSMOnlyOptions(path string) Options

LSMOnlyOptions starts from DefaultOptions but sets a higher ValueThreshold, so values are colocated with the LSM tree and the value log largely acts as a write-ahead log only. These options reduce the disk usage of the value log and make Badger act more like a typical LSM tree.

func (*Options) Debugf ¶

func (opt *Options) Debugf(format string, v ...interface{})

Debugf logs a DEBUG message to the logger specified in opts.

func (*Options) Errorf ¶

func (opt *Options) Errorf(format string, v ...interface{})

Errorf logs an ERROR log message to the logger specified in opts or to the global logger if no logger is specified in opts.

func (*Options) Infof ¶

func (opt *Options) Infof(format string, v ...interface{})

Infof logs an INFO message to the logger specified in opts.

func (*Options) Warningf ¶

func (opt *Options) Warningf(format string, v ...interface{})

Warningf logs a WARNING message to the logger specified in opts.

func (Options) WithBypassLockGuard ¶

func (opt Options) WithBypassLockGuard(b bool) Options

WithBypassLockGuard returns a new Options value with BypassLockGuard set to the given value.

When BypassLockGuard option is set, badger will not acquire a lock on the directory. This could lead to data corruption if multiple badger instances write to the same data directory. Use this option with caution.

The default value of BypassLockGuard is false.

func (Options) WithCompactL0OnClose ¶

func (opt Options) WithCompactL0OnClose(val bool) Options

WithCompactL0OnClose returns a new Options value with CompactL0OnClose set to the given value.

CompactL0OnClose determines whether Level 0 should be compacted before closing the DB. This ensures that both reads and writes are efficient when the DB is opened later.

The default value of CompactL0OnClose is true.

func (Options) WithDir ¶

func (opt Options) WithDir(val string) Options

WithDir returns a new Options value with Dir set to the given value.

Dir is the path of the directory where key data will be stored. If it doesn't exist, Badger will try to create it for you. This is set automatically to be the path given to `DefaultOptions`.

func (Options) WithEventLogging ¶

func (opt Options) WithEventLogging(enabled bool) Options

WithEventLogging returns a new Options value with EventLogging set to the given value.

EventLogging provides a way to enable or disable trace.EventLog logging.

The default value of EventLogging is true.

func (Options) WithLevelOneSize ¶

func (opt Options) WithLevelOneSize(val int64) Options

WithLevelOneSize returns a new Options value with LevelOneSize set to the given value.

LevelOneSize sets the maximum total size for Level 1.

The default value of LevelOneSize is 20MB.

func (Options) WithLevelSizeMultiplier ¶

func (opt Options) WithLevelSizeMultiplier(val int) Options

WithLevelSizeMultiplier returns a new Options value with LevelSizeMultiplier set to the given value.

LevelSizeMultiplier sets the ratio between the maximum sizes of contiguous levels in the LSM. Once a level grows larger than this ratio allows, the compaction process will be triggered.

The default value of LevelSizeMultiplier is 10.
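To make the arithmetic concrete, the sketch below computes per-level size budgets from LevelOneSize, LevelSizeMultiplier and MaxLevels. It assumes that level 1 is capped at LevelOneSize, that every deeper level is allowed LevelSizeMultiplier times the size of the level above it, and that MaxLevels counts level 0. The function levelMaxSizes is a hypothetical helper, not part of Badger, and the inputs are the defaults as documented on this page.

```go
package main

import "fmt"

// levelMaxSizes returns the assumed size budget in bytes for levels
// 1 through maxLevels-1. Level 0 is left out because it is limited by
// table count (NumLevelZeroTables) rather than by total bytes.
func levelMaxSizes(levelOneSize int64, multiplier, maxLevels int) []int64 {
	var sizes []int64
	size := levelOneSize
	for level := 1; level < maxLevels; level++ {
		sizes = append(sizes, size)
		size *= int64(multiplier) // each level is multiplier times the previous one
	}
	return sizes
}

func main() {
	const mb = 1 << 20
	// Documented defaults: LevelOneSize 20MB, multiplier 10, MaxLevels 7.
	for i, s := range levelMaxSizes(20*mb, 10, 7) {
		fmt.Printf("L%d: %d MB\n", i+1, s/mb)
	}
}
```

Under these assumptions the budgets grow geometrically from 20 MB at level 1 to 2,000,000 MB at level 6.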

func (Options) WithLogRotatesToFlush ¶

func (opt Options) WithLogRotatesToFlush(val int32) Options

WithLogRotatesToFlush returns a new Options value with LogRotatesToFlush set to the given value.

LogRotatesToFlush sets the number of value log file rotations after which the Memtables are flushed to disk. This is useful in write loads with fewer keys and larger values. Such a workload fills up the value logs quickly without filling up the Memtables. Thus, on a crash and restart, the value log head could cause the replay of a good number of value log files, which can slow down startup.

The default value of LogRotatesToFlush is 2.

func (Options) WithLogger ¶

func (opt Options) WithLogger(val Logger) Options

WithLogger returns a new Options value with Logger set to the given value.

Logger provides a way to configure what logger each value of badger.DB uses.

The default value of Logger writes to stderr using the log package from the Go standard library.

func (Options) WithMaxLevels ¶

func (opt Options) WithMaxLevels(val int) Options

WithMaxLevels returns a new Options value with MaxLevels set to the given value.

MaxLevels sets the maximum number of levels of compaction allowed in the LSM.

The default value of MaxLevels is 7.

func (Options) WithMaxTableSize ¶

func (opt Options) WithMaxTableSize(val int64) Options

WithMaxTableSize returns a new Options value with MaxTableSize set to the given value.

MaxTableSize sets the maximum size in bytes for each LSM table or file.

The default value of MaxTableSize is 64MB.

func (Options) WithNumCompactors ¶

func (opt Options) WithNumCompactors(val int) Options

WithNumCompactors returns a new Options value with NumCompactors set to the given value.

NumCompactors sets the number of compaction workers to run concurrently. Setting this to zero stops compactions, which could eventually cause writes to block forever.

The default value of NumCompactors is 2.

func (Options) WithNumLevelZeroTables ¶

func (opt Options) WithNumLevelZeroTables(val int) Options

WithNumLevelZeroTables returns a new Options value with NumLevelZeroTables set to the given value.

NumLevelZeroTables sets the maximum number of Level 0 tables before compaction starts.

The default value of NumLevelZeroTables is 5.

func (Options) WithNumLevelZeroTablesStall ¶

func (opt Options) WithNumLevelZeroTablesStall(val int) Options

WithNumLevelZeroTablesStall returns a new Options value with NumLevelZeroTablesStall set to the given value.

NumLevelZeroTablesStall sets the number of Level 0 tables that once reached causes the DB to stall until compaction succeeds.

The default value of NumLevelZeroTablesStall is 10.

func (Options) WithNumMemtables ¶

func (opt Options) WithNumMemtables(val int) Options

WithNumMemtables returns a new Options value with NumMemtables set to the given value.

NumMemtables sets the maximum number of tables to keep in memory before stalling.

The default value of NumMemtables is 5.

func (Options) WithNumVersionsToKeep ¶

func (opt Options) WithNumVersionsToKeep(val int) Options

WithNumVersionsToKeep returns a new Options value with NumVersionsToKeep set to the given value.

NumVersionsToKeep sets how many versions to keep per key at most.

The default value of NumVersionsToKeep is 1.

func (Options) WithReadOnly ¶

func (opt Options) WithReadOnly(val bool) Options

WithReadOnly returns a new Options value with ReadOnly set to the given value.

When ReadOnly is true the DB will be opened in read-only mode. Multiple processes can open the same Badger DB. Note: if the DB being opened had crashed before and has vlog data to be replayed, ReadOnly will cause Open to fail with an appropriate message.

The default value of ReadOnly is false.

func (Options) WithSyncWrites ¶

func (opt Options) WithSyncWrites(val bool) Options

WithSyncWrites returns a new Options value with SyncWrites set to the given value.

When SyncWrites is true all writes are synced to disk. Setting this to false would achieve better performance, but may cause data loss in case of crash.

The default value of SyncWrites is true.

func (Options) WithTableLoadingMode ¶

func (opt Options) WithTableLoadingMode(val options.FileLoadingMode) Options

WithTableLoadingMode returns a new Options value with TableLoadingMode set to the given value.

TableLoadingMode indicates which file loading mode should be used for the LSM tree data files.

The default value of TableLoadingMode is options.MemoryMap.

func (Options) WithTruncate ¶

func (opt Options) WithTruncate(val bool) Options

WithTruncate returns a new Options value with Truncate set to the given value.

Truncate indicates whether value log files should be truncated to delete corrupt data, if any. This option is ignored when ReadOnly is true.

The default value of Truncate is false.

func (Options) WithValueDir ¶

func (opt Options) WithValueDir(val string) Options

WithValueDir returns a new Options value with ValueDir set to the given value.

ValueDir is the path of the directory where value data will be stored. If it doesn't exist, Badger will try to create it for you. This is set automatically to be the path given to `DefaultOptions`.

func (Options) WithValueLogFileSize ¶

func (opt Options) WithValueLogFileSize(val int64) Options

WithValueLogFileSize returns a new Options value with ValueLogFileSize set to the given value.

ValueLogFileSize sets the maximum size of a single value log file.

The default value of ValueLogFileSize is 1GB.

func (Options) WithValueLogLoadingMode ¶

func (opt Options) WithValueLogLoadingMode(val options.FileLoadingMode) Options

WithValueLogLoadingMode returns a new Options value with ValueLogLoadingMode set to the given value.

ValueLogLoadingMode indicates which file loading mode should be used for the value log data files.

The default value of ValueLogLoadingMode is options.MemoryMap.

func (Options) WithValueLogMaxEntries ¶

func (opt Options) WithValueLogMaxEntries(val uint32) Options

WithValueLogMaxEntries returns a new Options value with ValueLogMaxEntries set to the given value.

ValueLogMaxEntries sets the approximate maximum number of entries a value log file can hold. The actual size limit of a value log file is whichever of ValueLogFileSize and ValueLogMaxEntries is reached first.

The default value of ValueLogMaxEntries is one million (1000000).

func (Options) WithValueThreshold ¶

func (opt Options) WithValueThreshold(val int) Options

WithValueThreshold returns a new Options value with ValueThreshold set to the given value.

ValueThreshold sets the threshold used to decide whether a value is stored directly in the LSM tree or separately in the value log files.

The default value of ValueThreshold is 32, but LSMOnlyOptions sets it to 65500.
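The placement rule can be sketched as a single comparison. The example assumes that a value goes into the LSM tree when its size is less than ValueThreshold, which is how the VerifyValueChecksum documentation on this page phrases it. The function storedInLSM is a hypothetical helper written for this example.

```go
package main

import "fmt"

// storedInLSM reports whether a value of this size would be kept in the
// LSM tree, assuming the rule "value size less than value threshold".
// Values at or above the threshold would go to the value log instead.
func storedInLSM(value []byte, valueThreshold int) bool {
	return len(value) < valueThreshold
}

func main() {
	small := make([]byte, 16)
	large := make([]byte, 1024)

	// Default threshold of 32 bytes.
	fmt.Println(storedInLSM(small, 32), storedInLSM(large, 32)) // true false

	// LSMOnlyOptions threshold of 65500 bytes.
	fmt.Println(storedInLSM(large, 65500)) // true
}
```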

func (Options) WithVerifyValueChecksum ¶

func (opt Options) WithVerifyValueChecksum(val bool) Options

WithVerifyValueChecksum returns a new Options value with VerifyValueChecksum set to the given value.

When VerifyValueChecksum is set to true, checksum will be verified for every entry read from the value log. If the value is stored in SST (value size less than value threshold) then the checksum validation will not be done.

The default value of VerifyValueChecksum is false.

type Sequence ¶

type Sequence struct {
	sync.Mutex
	// contains filtered or unexported fields
}

Sequence represents a Badger sequence.

func (*Sequence) Next ¶

func (seq *Sequence) Next() (uint64, error)

Next returns the next integer in the sequence, updating the lease by running a transaction if needed.

func (*Sequence) Release ¶

func (seq *Sequence) Release() error

Release releases the leased sequence to avoid wasting integers. This should be done right before closing the associated DB. However, it is valid to use the sequence after it has been released; doing so takes out a new lease with full bandwidth.
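The interplay between Next, the lease and Release can be modeled in memory. The sketch below is an assumption-laden illustration, not Badger's code: leasedSeq and its fields are hypothetical, and the persisted field stands in for the bound that a real sequence would store in the database through a transaction.

```go
package main

import "fmt"

// leasedSeq is a hypothetical in-memory model of a leased sequence.
type leasedSeq struct {
	next      uint64 // next integer to hand out
	leased    uint64 // integers below this bound are already reserved
	bandwidth uint64 // how many integers each lease reserves
	persisted uint64 // stand-in for the bound written to the database
	renewals  int    // number of leases taken, for illustration
}

// renew reserves the next bandwidth integers starting at the persisted bound.
func (s *leasedSeq) renew() {
	s.next = s.persisted
	s.leased = s.persisted + s.bandwidth
	s.persisted = s.leased
	s.renewals++
}

// Next hands out the next integer and only takes a new lease when the
// current one is used up.
func (s *leasedSeq) Next() uint64 {
	if s.next >= s.leased {
		s.renew()
	}
	v := s.next
	s.next++
	return v
}

// Release writes back the first unused integer, so the unused remainder of
// the lease is not wasted. The next call to Next takes a full new lease.
func (s *leasedSeq) Release() {
	s.persisted = s.next
	s.leased = s.next
}

func main() {
	seq := &leasedSeq{bandwidth: 100}
	fmt.Println(seq.Next(), seq.Next(), seq.Next()) // 0 1 2
	seq.Release()
	fmt.Println(seq.Next())   // 3
	fmt.Println(seq.renewals) // 2
}
```

Without the Release call in this model, a restart would resume from the persisted bound of 100 and the integers 3 through 99 would never be handed out.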

type Stream ¶

type Stream struct {
	// Prefix to only iterate over certain range of keys. If set to nil (default), Stream would
	// iterate over the entire DB.
	Prefix []byte

	// Number of goroutines to use for iterating over key ranges. Defaults to 16.
	NumGo int

	// Badger would produce log entries in Infof to indicate the progress of Stream. LogPrefix can
	// be used to help differentiate them from other activities. Default is "Badger.Stream".
	LogPrefix string

	// ChooseKey is invoked each time a new key is encountered. Note that this is not called
	// on every version of the value, only the first encountered version (i.e. the highest version
	// of the value a key has). ChooseKey can be left nil to select all keys.
	//
	// Note: Calls to ChooseKey are concurrent.
	ChooseKey func(item *Item) bool

	// KeyToList, similar to ChooseKey, is only invoked on the highest version of the value. It
	// is up to the caller to iterate over the versions and generate zero, one or more KVs. It
	// is expected that the user would advance the iterator to go through the versions of the
	// values. However, the user MUST immediately return from this function on the first encounter
	// with a mismatching key. See example usage in ToList function. Can be left nil to use ToList
	// function by default.
	//
	// Note: Calls to KeyToList are concurrent.
	KeyToList func(key []byte, itr *Iterator) (*pb.KVList, error)

	// This is the method where Stream sends the final output. All calls to Send are done by a
	// single goroutine, i.e. logic within Send method can expect single threaded execution.
	Send func(*pb.KVList) error
	// contains filtered or unexported fields
}

Stream provides a framework to concurrently iterate over a snapshot of Badger, pick up key-values, batch them up and call Send. Stream does concurrent iteration over many smaller key ranges. It does NOT send keys in lexicographical sorted order. To get keys in sorted order, use Iterator.

func (*Stream) Backup ¶

func (stream *Stream) Backup(w io.Writer, since uint64) (uint64, error)

Backup dumps a protobuf-encoded list of all entries in the database into the given writer, that are newer than the specified version. It returns a timestamp indicating when the entries were dumped which can be passed into a later invocation to generate an incremental dump, of entries that have been added/modified since the last invocation of Stream.Backup().

This can be used to backup the data in a database at a given point in time.

func (*Stream) Orchestrate ¶

func (st *Stream) Orchestrate(ctx context.Context) error

Orchestrate runs Stream. It picks up ranges from the SSTables, then runs NumGo goroutines to iterate over these ranges and batch up KVs in lists. It concurrently runs a single goroutine to pick up these lists, batch them up further and pass them to Send. Orchestrate also writes progress logs to Infof, using the provided LogPrefix. Note that all calls to Send are serial. If any of these steps encounters an error, Orchestrate stops execution and returns that error. Orchestrate can be called multiple times, but only serially.

func (*Stream) ToList ¶

func (st *Stream) ToList(key []byte, itr *Iterator) (*pb.KVList, error)

ToList is a default implementation of KeyToList. It picks up all valid versions of the key, skipping over deleted or expired keys.

type StreamWriter ¶

type StreamWriter struct {
	// contains filtered or unexported fields
}

StreamWriter is used to write data coming from multiple streams. The streams must not have any overlapping key ranges. Within each stream, the keys must be sorted. Badger Stream framework is capable of generating such an output. So, this StreamWriter can be used at the other end to build BadgerDB at a much faster pace by writing SSTables (and value logs) directly to LSM tree levels without causing any compactions at all. This is way faster than using batched writer or using transactions, but only applicable in situations where the keys are pre-sorted and the DB is being bootstrapped. Existing data would get deleted when using this writer. So, this is only useful when restoring from backup or replicating DB across servers.

StreamWriter should not be called on in-use DB instances. It is designed only to bootstrap new DBs.

func (*StreamWriter) Flush ¶

func (sw *StreamWriter) Flush() error

Flush is called once we are done writing all the entries. It syncs DB directories. It also updates Oracle with maxVersion found in all entries (if DB is not managed).

func (*StreamWriter) Prepare ¶

func (sw *StreamWriter) Prepare() error

Prepare should be called before writing any entry to StreamWriter. It deletes all data present in existing DB, stops compactions and any writes being done by other means. Be very careful when calling Prepare, because it could result in permanent data loss. Not calling Prepare would result in a corrupt Badger instance.

func (*StreamWriter) Write ¶

func (sw *StreamWriter) Write(kvs *pb.KVList) error

Write writes KVList to DB. Each KV within the list contains the stream id which StreamWriter would use to demux the writes. Write is thread safe and can be called concurrently by multiple goroutines.

type TableInfo ¶

type TableInfo struct {
	ID       uint64
	Level    int
	Left     []byte
	Right    []byte
	KeyCount uint64 // Number of keys in the table
}

TableInfo represents the information about a table.

type TableManifest ¶

type TableManifest struct {
	Level    uint8
	Checksum []byte
}

TableManifest contains information about a specific table in the LSM tree: the level it belongs to and its checksum.

type Txn ¶

type Txn struct {
	// contains filtered or unexported fields
}

Txn represents a Badger transaction.

func (*Txn) Commit ¶

func (txn *Txn) Commit() error

Commit commits the transaction, following these steps:

1. If there are no writes, return immediately.

2. Check if read rows were updated since txn started. If so, return ErrConflict.

3. If no conflict, generate a commit timestamp and update written rows' commit ts.

4. Batch up all writes, write them to value log and LSM tree.

5. If callback is provided, Badger will return immediately after checking for conflicts. Writes to the database will happen in the background. If there is a conflict, an error will be returned and the callback will not run. If there are no conflicts, the callback will be called in the background upon successful completion of writes or any error during write.

If error is nil, the transaction is successfully committed. In case of a non-nil error, the LSM tree won't be updated, so there's no need for any rollback.
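Steps 1 through 4 can be illustrated with a small in-memory model. This is a sketch for intuition only and not Badger's implementation: the store and txn types are hypothetical, errConflict stands in for ErrConflict, the model is single-goroutine, and reads see the latest data rather than a snapshot.

```go
package main

import (
	"errors"
	"fmt"
)

var errConflict = errors.New("transaction conflict") // stand-in for ErrConflict

// store remembers, per key, the commit timestamp of the last write.
type store struct {
	ts       uint64
	commitTs map[string]uint64
	data     map[string]string
}

// txn tracks the keys it read and buffers its writes until commit.
type txn struct {
	s      *store
	readTs uint64
	reads  []string
	writes map[string]string
}

func newStore() *store {
	return &store{commitTs: map[string]uint64{}, data: map[string]string{}}
}

func (s *store) begin() *txn {
	return &txn{s: s, readTs: s.ts, writes: map[string]string{}}
}

func (t *txn) get(key string) string {
	t.reads = append(t.reads, key)
	if v, ok := t.writes[key]; ok {
		return v
	}
	return t.s.data[key]
}

func (t *txn) set(key, val string) { t.writes[key] = val }

func (t *txn) commit() error {
	if len(t.writes) == 0 {
		return nil // step 1: nothing to write
	}
	for _, k := range t.reads { // step 2: was a read key updated since start?
		if t.s.commitTs[k] > t.readTs {
			return errConflict
		}
	}
	t.s.ts++                     // step 3: new commit timestamp
	for k, v := range t.writes { // step 4: apply the buffered writes
		t.s.data[k] = v
		t.s.commitTs[k] = t.s.ts
	}
	return nil
}

func main() {
	s := newStore()
	a, b := s.begin(), s.begin()
	a.get("x")
	b.get("x")
	a.set("x", "1")
	b.set("x", "2")
	fmt.Println(a.commit()) // <nil>
	fmt.Println(b.commit()) // transaction conflict
}
```

The second transaction fails because a key it read was committed by the first one after its read timestamp, and since nothing was applied, no rollback is needed.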

func (*Txn) CommitAt ¶

func (txn *Txn) CommitAt(commitTs uint64, callback func(error)) error

CommitAt commits the transaction, following the same logic as Commit(), but at the given commit timestamp. This will panic if not used with managed transactions.

This is only useful for databases built on top of Badger (like Dgraph), and can be ignored by most users.

func (*Txn) CommitWith ¶

func (txn *Txn) CommitWith(cb func(error))

CommitWith acts like Commit, but takes a callback, which gets run via a goroutine to avoid blocking this function. The callback is guaranteed to run, so it is safe to increment a sync.WaitGroup before calling CommitWith and decrement it in the callback, then wait on it to block until all callbacks have run.
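The WaitGroup pattern looks like this in outline. To keep the sketch runnable without a database, commitAsync is a hypothetical stand-in for CommitWith that always reports success from another goroutine; commitAll is likewise a name made up for this example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// commitAsync is a hypothetical stand-in for CommitWith: it returns at once
// and invokes the callback from a separate goroutine.
func commitAsync(cb func(error)) {
	go cb(nil)
}

// commitAll starts n asynchronous commits and blocks until every callback
// has run. It returns the number of commits that reported no error.
func commitAll(n int) int64 {
	var wg sync.WaitGroup
	var succeeded int64
	for i := 0; i < n; i++ {
		wg.Add(1) // increment before starting the commit
		commitAsync(func(err error) {
			defer wg.Done() // decrement inside the callback
			if err == nil {
				atomic.AddInt64(&succeeded, 1)
			}
		})
	}
	wg.Wait() // returns only after all callbacks have finished
	return atomic.LoadInt64(&succeeded)
}

func main() {
	fmt.Println(commitAll(5)) // 5
}
```

Calling wg.Add before the commit rather than inside the callback matters: otherwise wg.Wait could return before a callback has had the chance to register itself.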

func (*Txn) Delete ¶

func (txn *Txn) Delete(key []byte) error

Delete deletes a key.

This is done by adding a delete marker for the key at commit timestamp. Any reads happening before this timestamp would be unaffected. Any reads after this commit would see the deletion.

The current transaction keeps a reference to the key byte slice argument. Users must not modify the key until the end of the transaction.

func (*Txn) Discard ¶

func (txn *Txn) Discard()

Discard discards a created transaction. This method is very important and must be called. The Commit method calls this internally; calling it multiple times doesn't cause any issues, so it can safely be called via a defer right when the transaction is created.

NOTE: If any operations are run on a discarded transaction, ErrDiscardedTxn is returned.

func (*Txn) Get ¶

func (txn *Txn) Get(key []byte) (item *Item, rerr error)

Get looks for key and returns corresponding Item. If key is not found, ErrKeyNotFound is returned.

func (*Txn) NewIterator ¶

func (txn *Txn) NewIterator(opt IteratorOptions) *Iterator

NewIterator returns a new iterator. Depending upon the options, either only keys or both keys and values are fetched. The keys are returned in lexicographically sorted order. For performance, using prefetch is recommended if you're doing a long-running iteration.

Multiple Iterators: For a read-only txn, multiple iterators can be running simultaneously. However, for a read-write txn, only one can be running at one time to avoid race conditions, because Txn is thread-unsafe.

Example ¶

dir, err := ioutil.TempDir("", "badger-test")
if err != nil {
	panic(err)
}
defer removeDir(dir)

db, err := Open(DefaultOptions(dir))
if err != nil {
	panic(err)
}
defer db.Close()

bkey := func(i int) []byte {
	return []byte(fmt.Sprintf("%09d", i))
}
bval := func(i int) []byte {
	return []byte(fmt.Sprintf("%025d", i))
}

txn := db.NewTransaction(true)

// Fill in 1000 items
n := 1000
for i := 0; i < n; i++ {
	err := txn.SetEntry(NewEntry(bkey(i), bval(i)))
	if err != nil {
		panic(err)
	}
}

err = txn.Commit()
if err != nil {
	panic(err)
}

opt := DefaultIteratorOptions
opt.PrefetchSize = 10

// Iterate over 1000 items
var count int
err = db.View(func(txn *Txn) error {
	it := txn.NewIterator(opt)
	defer it.Close()
	for it.Rewind(); it.Valid(); it.Next() {
		count++
	}
	return nil
})
if err != nil {
	panic(err)
}
fmt.Printf("Counted %d elements", count)

Output:

Counted 1000 elements

func (*Txn) NewKeyIterator ¶

func (txn *Txn) NewKeyIterator(key []byte, opt IteratorOptions) *Iterator

NewKeyIterator is just like NewIterator, but allows the user to iterate over all versions of a single key. Internally, it sets the Prefix option in provided opt, and uses that prefix to additionally run bloom filter lookups before picking tables from the LSM tree.

func (*Txn) ReadTs ¶

func (txn *Txn) ReadTs() uint64

ReadTs returns the read timestamp of the transaction.

func (*Txn) Set ¶

func (txn *Txn) Set(key, val []byte) error

Set adds a key-value pair to the database. It will return ErrReadOnlyTxn if update flag was set to false when creating the transaction.

The current transaction keeps a reference to the key and val byte slice arguments. Users must not modify key and val until the end of the transaction.

func (*Txn) SetEntry ¶

func (txn *Txn) SetEntry(e *Entry) error

SetEntry takes an Entry struct and adds the key-value pair in the struct, along with other metadata to the database.

The current transaction keeps a reference to the entry passed in argument. Users must not modify the entry until the end of the transaction.

type WriteBatch ¶

type WriteBatch struct {
	sync.Mutex
	// contains filtered or unexported fields
}

WriteBatch holds the necessary info to perform batched writes.

func (*WriteBatch) Cancel ¶

func (wb *WriteBatch) Cancel()

Cancel must be called if there's a chance that Flush might not get called. If neither Flush nor Cancel is called, the transaction oracle would never get a chance to clear out the row commit timestamp map, causing unbounded memory consumption. Typically, you can call Cancel as a defer statement right after NewWriteBatch is called.

Note that any committed writes would still go through despite calling Cancel.

func (*WriteBatch) Delete ¶

func (wb *WriteBatch) Delete(k []byte) error

Delete is the equivalent of Txn.Delete.

func (*WriteBatch) Error ¶

func (wb *WriteBatch) Error() error

Error returns any errors encountered so far. No commits would be run once an error is detected.

func (*WriteBatch) Flush ¶

func (wb *WriteBatch) Flush() error

Flush must be called at the end to ensure that any pending writes get committed to Badger. Flush returns any error stored by WriteBatch.

func (*WriteBatch) Set ¶

func (wb *WriteBatch) Set(k, v []byte) error

Set is the equivalent of Txn.Set.

func (*WriteBatch) SetEntry ¶

func (wb *WriteBatch) SetEntry(e *Entry) error

SetEntry is the equivalent of Txn.SetEntry.

func (*WriteBatch) SetMaxPendingTxns ¶

func (wb *WriteBatch) SetMaxPendingTxns(max int)

SetMaxPendingTxns sets a limit on maximum number of pending transactions while writing batches. This function should be called before using WriteBatch. Default value of MaxPendingTxns is 16 to minimise memory usage.

Directories ¶

Path	Synopsis
badger
cmd
integration
testgc
options
pb
skl
table
trie
y
