Dealing With Jumbo Chunks in MongoDB
By Akira Kurogane | 11 Jun 2020 | Insight for DBAs, MongoDB
In this blog post, we will discuss how to deal with jumbo chunks in MongoDB.
Scenario: You are a MongoDB DBA, and your first task of the day is to remove a shard from your cluster. It sounds
scary at first, but you know it is pretty easy. You can do it with a simple command:
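As a minimal sketch, assuming the shard is named `server1_set6` (the name used in the catalog query further down), the command is sent to a mongos through `db.adminCommand()`. The helper name `removeShardCommand` is made up for this example.

```javascript
// Sketch: build the removeShard command document for a given shard name.
// "server1_set6" is an assumption taken from the catalog query in this post.
function removeShardCommand(shardName) {
  return { removeShard: shardName };
}

const cmd = removeShardCommand("server1_set6");

// Runs the command only inside a mongo shell connected to a mongos,
// where the global `db` exists; anywhere else it just prints the command.
if (typeof db !== "undefined") {
  printjson(db.adminCommand(cmd));
} else {
  console.log(JSON.stringify(cmd));
}
```

Running the same command again later reports the draining progress, which is how the status output in the next step is obtained.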
The next morning when you wake up, you check the status of that particular shard and you find the process is stuck:
Shell
"msg" : "draining ongoing",
"state" : "ongoing",
"remaining" : {
    "chunks" : NumberLong(3),
    "dbs" : NumberLong(0)
}
There are three chunks that for some reason haven’t been migrated, so the removeShard command is stalled! Now,
what do you do?
Find Chunks That Cannot Be Moved
We need to connect to mongos and check the catalog:
Shell
mongos> use config
switched to db config
mongos> db.chunks.find({shard:"server1_set6"})
The output will show three chunks, with minimum and maximum _id keys, along with the namespace where they
belong. But the last part of the output is what we really need to check:
Shell
{
    [...]
    "min" : {
        "_id" : "17zx3j9i60180"
    },
    "max" : {
        "_id" : "30td24p9sx9j0"
    },
    "shard" : "server1_set6",
    "jumbo" : true
}
So, the chunk is marked as “jumbo.” We have found the reason the balancer cannot move the chunk!
So, what is a “jumbo chunk”? It is a chunk whose size exceeds the maximum amount specified in the chunk size
configuration parameter (which has a default value of 64 MB). When a chunk’s size is greater than that limit, the
balancer won’t move it.
That’s just the concept though. As a concrete implementation, it is a chunk that has been flagged as being jumbo,
which will happen after a splitChunk command finds it cannot split a range of documents into segments smaller than
the settings-defined chunk size. splitChunk commands are typically executed by the balancer’s moveChunk commands
or the background auto-splitting process.
E.g. imagine a shard key index of {“surname”: 1, “given_name”: 1}. This is a non-unique tuple because humans,
regrettably, do not have a primary key. If you have 100,000 documents for {“surname”: “Smith”, “given_name”: “John”,
…} there is no opportunity to split them apart. The chunk will, therefore, be as big as those 100,000 documents.
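To make this concrete, here is a toy illustration (not MongoDB's actual split logic): a split can only fall between two documents whose shard key values differ, so a run of identical keys offers no split point. The function name `candidateSplitPoints` and the sample data are made up for this sketch.

```javascript
// Toy model: given shard key values already sorted in index order, return the
// positions where a chunk boundary could be placed, i.e. where the key changes.
function candidateSplitPoints(sortedKeys) {
  const points = [];
  for (let i = 1; i < sortedKeys.length; i++) {
    const prev = JSON.stringify(sortedKeys[i - 1]);
    const curr = JSON.stringify(sortedKeys[i]);
    if (prev !== curr) points.push(i);
  }
  return points;
}

// 100,000 documents that all share one shard key value.
const allJohnSmiths = Array.from({ length: 100000 }, () => ({
  surname: "Smith",
  given_name: "John",
}));

// A small range where the key does change twice.
const mixed = [
  { surname: "Smith", given_name: "Jane" },
  { surname: "Smith", given_name: "John" },
  { surname: "Smith", given_name: "John" },
  { surname: "Smythe", given_name: "Ann" },
];

console.log(candidateSplitPoints(allJohnSmiths).length); // 0
console.log(candidateSplitPoints(mixed)); // [ 1, 3 ]
```

With zero candidate positions, no split can bring the John Smith range under the chunk size, which is the situation that ends with the chunk being flagged as jumbo.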
In older versions, you clear the jumbo flag manually by removing the “jumbo” field from the documents in the config
db that define the chunk ranges the mongos nodes and shard nodes refer to.
JavaScript
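db.getSiblingDB("config").chunks.update(
    // Replace the placeholder string with your sharded namespace
    {"ns": "<your_sharded_user_db.coll>", "jumbo": true},
    {$unset: {"jumbo": ""}},
    {multi: true}  // clear the flag on every matching chunk, not only the first
);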
But in MongoDB 4.2 and earlier, a chunk can’t be moved if it exceeds the chunk size setting. In the can’t-drain
scenario described at the top of this post, you will have to do one of the following things to finish draining the
shard so you can run the final removeShard:
- Move the jumbo chunks after raising the chunk size setting
    - Iterate over all the jumbo chunks and use the dataSize command to find out what the largest size is.
    - Change the chunksize setting to be larger than that.
    - Clear the jumbo flag (see above).
    - Start draining again and wait for it to move the big chunks. Use the sh.moveChunk() command if you want
      to see them happen sooner rather than later.
    - Don’t forget to change the chunk size back after.
- Delete that data for a while. Reinsert a copy after the shard draining is complete.
    - You’ll still need to clear the jumbo flag (see above) before the now-empty chunk will be
      ‘moved’ to another shard.
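The first two steps of the first option (measuring the jumbo chunks and picking a chunk size) might look like the sketch below. It assumes the catalog layout shown earlier in this post, where config.chunks documents carry `ns`, `min`, `max`, and `jumbo`, and it assumes the shard key pattern can be read from the `key` field of config.collections. The helper names `dataSizeCommand` and `requiredChunkSizeMB` are made up for this example.

```javascript
// Build a dataSize command for one chunk document taken from config.chunks.
// keyPattern is the collection's shard key pattern, e.g. { _id: 1 }.
function dataSizeCommand(chunk, keyPattern) {
  return { dataSize: chunk.ns, keyPattern: keyPattern, min: chunk.min, max: chunk.max };
}

// Smallest whole number of megabytes strictly larger than the biggest chunk.
function requiredChunkSizeMB(sizesInBytes) {
  const largest = Math.max(0, ...sizesInBytes);
  return Math.floor(largest / (1024 * 1024)) + 1;
}

if (typeof db !== "undefined") {
  // Inside a mongo shell connected to a mongos: measure every jumbo chunk.
  const config = db.getSiblingDB("config");
  const sizes = [];
  config.chunks.find({ jumbo: true }).forEach(function (chunk) {
    const keyPattern = config.collections.findOne({ _id: chunk.ns }).key;
    const userDb = db.getSiblingDB(chunk.ns.split(".")[0]);
    sizes.push(userDb.runCommand(dataSizeCommand(chunk, keyPattern)).size);
  });
  const mb = requiredChunkSizeMB(sizes);
  print("Largest jumbo chunk needs a chunksize of at least " + mb + " MB");
  // Left commented out so the setting is only changed deliberately:
  // config.settings.updateOne({ _id: "chunksize" }, { $set: { value: mb } }, { upsert: true });
} else {
  // Outside the shell: demonstrate the arithmetic on made-up sizes.
  const MB = 1024 * 1024;
  console.log(requiredChunkSizeMB([70 * MB, 150 * MB + 5, 90 * MB])); // 151
}
```

Note the value you had before changing it, so the last step of that option (changing the chunk size back) is a matter of writing the old value into the same config.settings document.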
In theory, you don’t need to do any splits manually, but if you want to hurry up and get confirmation that the chunks
can be split into small enough sizes, see the documentation for the sh.splitAt() command.