Bit Manipulation: Peter Goldsborough
Oct 11, 2015
16 minute read
Very often when working with programming languages that are in any way higher level than
assembly, such as when building web services, desktop applications, mobile apps (you name it),
we may forget about the lower-level happenings in our systems: the bits and bytes and
nitty-gritty details. Often, we actually wish to forget about those happenings and will abstract,
encapsulate and wrap operations on those bits and bytes in classes and objects to make our lives
easier. However, just as it would be problematic to have to deal with bit-manipulation to achieve
even the simplest things and write even the simplest program, it is just as problematic to forget
about how to flip bits, form masks and use the binary system to our advantage.
This post will outline a few tips and practical concepts regarding bit-manipulation and show how
they can be used to solve actual problems (taken from Cracking the Coding Interview).
After reading this, you should be able to upgrade to this keyboard, for “real” programmers:
[Image: keyboard]
Table of Contents
Basic Concepts
Binary Operations
Bit-Manipulation
Tricks
Determining if a Value is a Power of Two
Masks
Finding the LSB
Problems
Determining the Cardinality
Bit-Merging
Floating-Point Representation
Bit-Twins
Edit-Distance for Bits
Even-Odd Bit-Swapping
Drawing a Line in a Monochrome Screen.
Basic Concepts
First, we should lay out the basics. I’m assuming you know what the binary system is, that it only
uses the digits 0 and 1 and that the n-th bit (counting from 0) corresponds to a value of $2^n$.
Binary Operations
Now follow basic binary concepts. You may very well be familiar with these operations, so feel free
to skip them.
Addition
To add two binary values in your head, you can either first convert them to decimal and then do the
addition there (and possibly re-convert the result to binary), or just do addition like you learnt it in
primary school.
1. Convert $1101_2$ to $13_{10}$ and $1010_2$ to $10_{10}$ and then easily do 13 + 10 = 23
2. Write the two numbers beneath each other and follow the rules:
0 + 0 give 0
0 + 1 give 1
1 + 1 give 0 with a carry of 1 (add that at the next digit)
1101
+
1010
_____
10111
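The column rules above can also be written down with the XOR and AND operators that are introduced further below. This is only a minimal sketch, and `add_binary` is a hypothetical name chosen for illustration: XOR adds each column without the carry, AND finds the columns that produce a carry, and the carry is moved one digit to the left until nothing is left to carry.

```cpp
#include <cstdint>

// Hypothetical helper: adds two unsigned values using only bitwise operations.
std::uint32_t add_binary(std::uint32_t a, std::uint32_t b)
{
    while (b != 0)
    {
        // Columns where both bits are 1 produce a carry
        const std::uint32_t carry = a & b;

        // XOR adds each column without the carry: 0+0=0, 0+1=1, 1+1=0
        a ^= b;

        // The carry is added at the next digit in the following round
        b = carry << 1;
    }

    return a;
}
```

Calling `add_binary(0b1101, 0b1010)` reproduces the example above and yields 0b10111, i.e. 23.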
Subtraction
2. Write the two numbers beneath each other and follow the rules:
0 - 0 give 0
1 - 0 give 1
1 - 1 give 0
10 - 1 give 01 (borrow) and generally a 1 followed by n 0s gives a 0 followed by n 1s
Multiplication
Multiplication is a bit more complicated, but basically involves shifting bits around:
1. $1101_2 \cdot 1010_2 = 13_{10} \cdot 10_{10} = 130_{10}$
2. For multiplication you have to work on a bit-by-bit basis: for each set bit B in the one value,
you shift the other value to the left by the position of B. You then have to add the results of
all shift operations. For example, with x = 4 = 0b0100 and y = 2 = 0b0010, to do x ⋅ y, you
have to shift x over by the position of every set bit in y. In x, here, only the 2nd bit
(0-indexed) is set and in y only the 1st. For the result, you would now have to shift x to the
left by the index of the bit in y, i.e. by one position. The result is then 0b1000 = 8. If there
were more set bits in y, you would repeat this for all other bits and add the results of each
shift for the final result.
1101
x
1010
________
10000010
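As a rough sketch of the shift-and-add procedure described above (the name `multiply_binary` is made up for this example and not part of the original post), one value is scanned bit by bit and the other is shifted by the position of every set bit:

```cpp
#include <cstdint>

// Hypothetical helper: multiplies by shifting and adding.
// The result wraps around if it does not fit into 32 bits.
std::uint32_t multiply_binary(std::uint32_t x, std::uint32_t y)
{
    std::uint32_t result = 0;

    for (unsigned position = 0; y != 0; ++position, y >>= 1)
    {
        // For every set bit in y, add x shifted by that bit's position
        if (y & 1) result += x << position;
    }

    return result;
}
```

With x = 0b1101 and y = 0b1010 the loop adds `x << 1` and `x << 3`, i.e. 26 + 104 = 130.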
OR
The OR operation is the first binary-only operation (can’t do it in decimal). In most programming
languages, it is performed with the | (bar) operator. To OR two binary values, you follow the rules
that, for every bit:
0 | 0 = 0
0 | 1 = 1
1 | 1 = 1
The basic idea is, the bit will be set if one or the other is, but you most likely know that from boolean
expressions in programming.
AND
For boolean AND, usually represented by the & (ampersand) character, a bit is only set if both the
bit in the first value and in the second value are set, such that:
0 & 0 = 0
0 & 1 = 0
1 & 1 = 1
XOR
XOR, short for “exclusive-OR”, sets a bit only if the bits in the two values differ, i.e. if
bit1 != bit2:
0 ^ 0 = 0
0 ^ 1 = 1
1 ^ 1 = 0
Complement
Complementing, also called twiddling, negating or simply flipping, changes all 1s to 0s and
vice-versa. Note that this is a unary operation, not a binary operation, meaning it is performed on
only one value and not on/between two values (as are AND, OR and XOR). Its character is the tilde: ~.
~10101101 = 01010010
~1111 = 0000
~0 = 1
Shifting
Shifting, to the left with << and to the right with >>, shifts a binary value to the left or right by a
certain number of bits. Note that Java additionally has the >>> operator (there is no <<<): >>> is a
logical right shift that also shifts the sign-bit and fills up with zeros from the left, while >> is
an arithmetic right shift that preserves the sign-bit.
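To make the operators of this section concrete, here is a small sketch that checks each of them at compile time on two arbitrary 4-bit patterns. The constant names `kA` and `kB` are assumptions made for this example only.

```cpp
#include <cstdint>

// Two arbitrary example patterns (hypothetical names)
constexpr std::uint8_t kA = 0b1100;
constexpr std::uint8_t kB = 0b1010;

static_assert((kA | kB) == 0b1110, "OR: set if either bit is set");
static_assert((kA & kB) == 0b1000, "AND: set only if both bits are set");
static_assert((kA ^ kB) == 0b0110, "XOR: set only if the bits differ");

// Operands are promoted to int first, so the complement
// has to be cast back to 8 bits before comparing
static_assert(static_cast<std::uint8_t>(~kA) == 0b11110011, "complement");

static_assert((kA << 1) == 0b11000, "left shift by one position");
static_assert((kA >> 2) == 0b0011, "right shift by two positions");
```

The cast after `~` is worth noting: in C and C++, small integer types are promoted to `int` before the operation, so the complement of an 8-bit value has more than 8 bits until it is converted back.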
Bit-Manipulation
While the above paragraphs show how to use the operators available for bit operations, they do not
yet answer questions such as how to set, clear or update bits. These manipulation-techniques are
described below.
Setting Bits
For setting a bit, the OR operation is ideal, as OR-ing a bit with 1 will always result in that bit being
1, whether it was 1 before or not. You may have first thought of XOR-ing an unset bit with 1, but that
would clear the bit if it was already set. That’s why the common method is to do something along
the lines of the following:
To set the n-th bit (starting at 0) of a value x: x |= (1 << n)
For example, when working with microcontrollers, I always have a macro like this (note that this is
for working with microcontrollers with a few KB of RAM where macros are often better than
function calls; in any higher-level language you should always prefer functions):
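A minimal sketch of what such a macro could look like, next to the function form that is preferable outside of tiny microcontrollers. The names `SET_BIT` and `set_bit` are hypothetical and not taken from the original post.

```cpp
#include <cstdint>

// Hypothetical macro: sets bit n of x in place
#define SET_BIT(x, n) ((x) |= (1u << (n)))

// Hypothetical function form: returns x with bit n set
constexpr std::uint32_t set_bit(std::uint32_t x, unsigned n)
{
    return x | (std::uint32_t{1} << n);
}
```

Setting a bit that is already set changes nothing, e.g. `set_bit(0b0101, 0)` is still 0b0101.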
Clearing Bits
To clear a bit, we use the AND (&) and NOT (~) operations. To clear/unset the n-th bit of a value, the
basic idea is to create a bit-mask with all bits set except for that n-th bit, and then to AND the value
with this mask. The n-th bit is cleared, because anything AND 0 is 0. All other bits that were
previously set will be left alone, because 1 & 1 = 1, and all bits that were unset will also remain
unset, because 0 & 1 = 0.
To clear the n-th bit of a value x: x &= ~(1 << n).
Macro:
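A possible shape for this macro, as a sketch only (`CLEAR_BIT` and `clear_bit` are hypothetical names):

```cpp
#include <cstdint>

// Hypothetical macro: clears bit n of x in place
#define CLEAR_BIT(x, n) ((x) &= ~(1u << (n)))

// Hypothetical function form: returns x with bit n cleared
constexpr std::uint32_t clear_bit(std::uint32_t x, unsigned n)
{
    return x & ~(std::uint32_t{1} << n);
}
```

Clearing a bit that is already unset leaves the value untouched, e.g. `clear_bit(0b0100, 0)` is still 0b0100.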
Toggling Bits
We use XOR with 1 to toggle a bit. This operation will always flip a 0 to a 1 and a 1 to a 0.
To toggle the n-th bit of a value x: x ^= (1 << n)
Macro:
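Sketched with hypothetical names (`TOGGLE_BIT`, `toggle_bit`), such a macro and its function counterpart might read:

```cpp
#include <cstdint>

// Hypothetical macro: flips bit n of x in place
#define TOGGLE_BIT(x, n) ((x) ^= (1u << (n)))

// Hypothetical function form: returns x with bit n flipped
constexpr std::uint32_t toggle_bit(std::uint32_t x, unsigned n)
{
    return x ^ (std::uint32_t{1} << n);
}
```

Toggling the same bit twice gives back the original value.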
Updating Bits
Sometimes we may want to update a bit to a specific value, stored in a variable. For this, we first
have to clear the bit, and then OR it with the value (true/false), shifted to the n-th position.
To update the n-th bit of a value x:
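Following the description above (clear first, then OR in the new value), an update could be sketched like this. The function name `update_bit` is an assumption for this example.

```cpp
#include <cstdint>

// Hypothetical helper: returns x with bit n set to `value`
constexpr std::uint32_t update_bit(std::uint32_t x, unsigned n, bool value)
{
    // Clear bit n, then OR in the new value shifted to position n
    return (x & ~(std::uint32_t{1} << n))
         | (static_cast<std::uint32_t>(value) << n);
}
```

Without the clearing step, OR-ing with `false` could never turn a set bit off.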
Checking Bits
Of course, we’d also like to know if bits are set or not. For this, we AND the value with a mask that
has only that bit set. If the bit was set, the result will be non-zero and else 0. This also evaluates
nicely to boolean true and false, thus you can use it in an if clause.
To check the n-th bit of a value x: x & (1 << n)
For example, you can check if a value is odd by AND-ing the first (i.e. 0th) bit:
void foo(int x)
{
    if (x & 1) /* Do odd things. */;

    else /* Do even things. */;
}
I actually much prefer this way of checking for evenness, but most people are more familiar with
the modulo method (even if x % 2 == 0) so I tend to stick with the modulo method for compliance.
Macro below. Note how here, after performing the AND operation, we shift the result back to the
right so that the set or cleared bit is at index 0. This is if we want to check the bit in a
higher-bitwidth value and then store the result in a lower-bitwidth value, e.g. check the 31st bit
of a 32-bit unsigned integer and store the result in an 8-bit unsigned char. Without the shift, the
bits more significant than the 7th bit would get lost during casting.
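A sketch of a checking macro with the shift back to index 0 described above, together with a function variant. Both names (`CHECK_BIT`, `check_bit`) are hypothetical.

```cpp
#include <cstdint>

// Hypothetical macro: yields 1 if bit n of x is set, else 0.
// The result is shifted back so that it fits into any integer type.
#define CHECK_BIT(x, n) (((x) & (1u << (n))) >> (n))

// Hypothetical function form
constexpr bool check_bit(std::uint32_t x, unsigned n)
{
    return (x >> n) & 1u;
}
```

Because of the shift, the result of `CHECK_BIT(x, 31)` can safely be stored in an 8-bit variable.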
Tricks
First, two simple tricks that can be useful for bit-manipulation.
Determining if a Value is a Power of Two
import math
log = math.log2(value)
if int(log) == log:
    ...
But there’s really a nice trick to doing this: A non-zero value x is a power of 2 if you can AND x
with x - 1 and have the result be 0.
Example: x = 16
x: 00010000
AND
x - 1: 00001111
_______________
00000000
Counterexample: x = 6
x: 00000110
AND
x - 1: 00000101
_______________
00000100
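In C++, the check from the two examples above might be sketched as follows. The name `is_power_of_two` is an assumption, and zero is excluded explicitly because `0 & (0 - 1)` is also 0.

```cpp
#include <cstdint>

// Hypothetical helper: true if exactly one bit of x is set
constexpr bool is_power_of_two(std::uint64_t x)
{
    // x - 1 flips the lowest set bit and all bits below it,
    // so the AND is 0 only if there was no other set bit
    return x != 0 && (x & (x - 1)) == 0;
}
```

This matches the example (16 passes) and the counterexample (6 fails) shown above.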
Masks
Of Exactly N Bits
To create a mask of N bits, shift 1 over by N positions to the left and subtract 1:
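As a sketch of this rule for 32-bit values (the function name `mask_of` is hypothetical): shifting by the full width of the type is undefined in C++, so that case is handled separately.

```cpp
#include <cstdint>

// Hypothetical helper: returns a mask with the lowest n bits set
constexpr std::uint32_t mask_of(unsigned n)
{
    // Shifting a 32-bit value by 32 or more positions is undefined,
    // so the full-width mask is returned directly
    return n >= 32 ? ~std::uint32_t{0} : (std::uint32_t{1} << n) - 1;
}
```

For N = 3 this gives `(1 << 3) - 1`, i.e. 0b1000 - 1 = 0b0111.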
A mask for all odd bits repeats 0xA for every 4 bits, e.g. 0xAA for 8 bits:
>>> bin(0xAA)
'0b10101010'
With the explanation that 0xA in hex is 10 in decimal which is 0b1010 in binary (where the 1st and
3rd bits are set).
The analog for all even bits is 0x5, as 0x5 is 0b0101 in binary, where all even bits are set.
>>> bin(0x55)
'0b1010101'
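Finding the LSB
#include <cstddef>
#include <stdexcept>

template<typename T>
std::size_t find_lsb(const T& value)
{
    if (value == 0) throw std::invalid_argument("No bit set at all!");

    std::size_t bit = 0;

    // Test each bit, starting at index 0, until a set one is found
    while (!(value & (T(1) << bit))) ++bit;

    return bit;
}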
While this works, its complexity is O(N) where N is the bitwidth of the data-type T used to represent
the value. Using some super-awesome tricks we can get this down to constant-time. First, subtract 1
from the value A to get B:
Example: A: 011010
B: 011010 - 1 = 011001
Then, OR A with B, such that all bits before and including the LSB are set in the result C:
A: 011010
OR
B: 011001
_________
C: 011011
The consequence is that when you now XOR B with C, the only position where bits differ is at the LSB
(because you also set the bits before the LSB and the bits after it are unaffected).
C: 011011
XOR
B: 011001
_________
D: 000010
If you now take the base-2 logarithm of this value you get the LSB position.
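#include <cmath>
#include <cstddef>
#include <stdexcept>

template<typename T>
std::size_t find_lsb(T value)
{
    // 0 has no LSB (and value - 1 would wrap around)
    if (value == 0) throw std::invalid_argument("No bit set at all!");

    T less = value - 1;

    // Isolate the LSB
    value = (less | value) ^ less;

    return static_cast<std::size_t>(std::log2(value));
}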
Problems
Here follow some problems regarding bit-manipulation, many taken from Cracking the Coding
Interview, to which all credit goes for them.
Determining the Cardinality
We’ll have to count the set bits, but optimize a bit by testing if the value is a power of 2 or one
value before a power of 2.
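#include <cmath>
#include <cstddef>

// Assumes T is an unsigned integer type
template<typename T>
std::size_t cardinality(T value)
{
    if (value == 0) return 0;

    // A power of 2 has exactly one bit set
    if (((value - 1) & value) == 0) return 1;

    // One value before a power of 2: all bits below that power are set.
    // The check is skipped if value + 1 wraps around to 0.
    const T above = value + 1;

    if (above != 0 && (above & value) == 0)
    {
        return static_cast<std::size_t>(std::log2(above));
    }

    std::size_t count = 0;

    for ( ; value; value >>= 1)
    {
        if (value & 1) ++count;
    }

    return count;
}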
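As an alternative that is not part of the solution above, the counting loop can skip all unset bits with the same `x & (x - 1)` trick used for the power-of-two check. This is commonly attributed to Brian Kernighan; the function name `cardinality_kernighan` is made up for this sketch.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: counts set bits, one loop iteration per set bit
std::size_t cardinality_kernighan(std::uint64_t value)
{
    std::size_t count = 0;

    // value & (value - 1) clears the lowest set bit
    for ( ; value != 0; value &= value - 1) ++count;

    return count;
}
```

The loop runs once per set bit rather than once per bit position, so sparse values are counted quickly without any special cases.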
Bit-Merging
You are given two 32-bit numbers, N and M, and two bit positions i and j with i being less significant
than j. Insert M into N at those positions.
Solution: Create a proper mask to unset the bits between i and j in N, then shift M over by i bits and
OR them.
1. Create a mask with the bits between i and j set, then invert it.
2. Binary AND N by the mask, to clear the relevant bits between i and j.
3. Simply “insert” M by ORing N by M, shifted left by i bits.
#include <cstddef>

template<typename T>
void insert_bits(T& target, T bits, std::size_t i, std::size_t j)
{
    // Nothing to insert for an empty range. A target or bits
    // value of 0 is valid input and must not be skipped.
    if (i >= j) return;

    // j - i set bits, built in T so that wide types are not truncated
    const T mask = (T(1) << (j - i)) - 1;

    // clear bits
    target &= ~(mask << i);

    // Make sure to get rid of excessive bits
    // (we only want j - i many, so mask off those)
    target |= (bits & mask) << i;
}
Floating-Point Representation
Given a real number between 0 and 1, print its binary representation if it can be represented with at most 32
characters, else print “Error”.
Note: Binary numbers are generally structured such that each bit signifies 0 or $1 \cdot 2^N$. This
is true for positive exponents, i.e. 101 means, from right to left,
$1 \cdot 2^0 + 0 \cdot 2^1 + 1 \cdot 2^2 = 5$. But it is also true for negative exponents, where
each digit to the right of the binary point stands for 0 or $1 \cdot 2^{-N}$ (note the minus), such
that 0.101 means $1 \cdot 2^{-1} = 1 \cdot \frac{1}{2}$ …
Solution: Start with a “significance” of 0.5 and see if we can subtract that significance from the
floating-point value. If so, we add a 1 to our representation and subtract the significance from the
value. If not, we add a 0 to the representation. Before each next iteration, we divide the significance
by 2 to get 0.5, 0.25, 0.125, …
#include <cstddef>
#include <iostream>
#include <string>

void print_binary_double(double value)
{
    static const std::size_t limit = 32;

    double significance = 0.5;

    std::string representation;

    for (std::size_t count = 0; count < limit; ++count)
    {
        if (value >= significance)
        {
            representation += "1";

            value -= significance;

            // Nothing left to represent: done
            if (value == 0)
            {
                std::cout << representation << std::endl;

                return;
            }
        }

        else representation += "0";

        significance /= 2;
    }

    std::cout << "Error" << std::endl;
}
Bit-Twins
Given an integer with N bits, find the next greater and smaller values with the same number of bits
set as that integer.
Solution 1: Brute force. Increment the value and compute its cardinality each time, until the
cardinality matches that of the original value. Same for decrementing. This algorithm’s complexity
would be O(N ⋅ B) where N is the number of values we must check and B the bitwidth of the
data-type.
#include <cmath>
#include <cstddef>
#include <limits>
#include <utility>

// Assumes T is an unsigned integer type
template<typename T>
std::size_t cardinality(const T& value)
{
    if (value == 0) return 0;

    if ((value & (value - 1)) == 0) return 1;

    // Skipped if value + 1 wraps around to 0
    const T above = value + 1;

    if (above != 0 && (value & above) == 0)
    {
        return static_cast<std::size_t>(std::log2(above));
    }

    const std::size_t bits = sizeof(value) * 8;

    std::size_t count = 0;

    for (std::size_t i = 0; i < bits; ++i)
    {
        if (value & (T(1) << i)) ++count;
    }

    return count;
}

template<typename T>
std::pair<T, T> twins(const T& value)
{
    static const T maximum = std::numeric_limits<T>::max();

    if (value == 0) return {0, maximum};

    const std::size_t bits = cardinality(value);

    // The searches stop at maximum / 0 if no twin exists
    T next = value + 1;

    while (next != maximum && cardinality(next) != bits) ++next;

    T previous = value - 1;

    while (previous != 0 && cardinality(previous) != bits) --previous;

    return {previous, next};
}
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>

// All functions assume T is an unsigned integer type.

template<typename T>
std::size_t find_lsb(T value)
{
    if (value == 0) throw std::invalid_argument("No bit set at all!");

    T less = value - 1;

    value = (less | value) ^ less;

    return static_cast<std::size_t>(std::log2(value));
}

template<typename T>
std::size_t find_msb(T value)
{
    if (value == 0) throw std::invalid_argument("No bit set at all!");

    std::size_t bit = sizeof(T) * 8 - 1;

    while (!(value & (T(1) << bit))) --bit;

    return bit;
}

// Returns {next smallest, next largest}
template<typename T>
std::pair<T, T> twins(const T& value)
{
    const std::size_t width = sizeof(T) * 8;

    // If there is no twin in one direction, the value
    // itself is returned for that direction.
    T next_smallest = value;
    T next_largest = value;

    // Nothing to do here.
    if (value == 0) return {next_smallest, next_largest};

    // Next-largest value: find the first gap (unset bit)
    // above the lowest cluster of set bits, move the highest
    // bit of that cluster into the gap and cram the rest of
    // the cluster all the way to the right (10110 -> 11001).

    const std::size_t lsb = find_lsb(value);

    // Remove all 0s before the LSB (11000 -> 00011)
    const T copy = value >> lsb;

    // The lowest unset bit of copy is the first gap
    const T inverted = static_cast<T>(~copy);

    if (inverted != 0)
    {
        // Number of bits in the lowest cluster
        const std::size_t cluster = find_lsb(inverted);

        // The gap bit
        const std::size_t gap = lsb + cluster;

        // No larger twin if the gap lies outside of T
        if (gap < width)
        {
            // Get the bits after the gap by unsetting the bits before it
            const T after_gap = value & ~((T(1) << gap) - 1);

            // Fill the gap
            next_largest = after_gap | (T(1) << gap);

            // Add the remaining cluster bits at the very right
            next_largest |= (T(1) << (cluster - 1)) - 1;
        }
    }

    // Next-smallest value: the mirror image. Skip the set bits
    // at the very right (if any), find the lowest set bit above
    // them, move it one to the right and cram the skipped bits
    // right next to it (10011 -> 01110).

    const T flipped = static_cast<T>(~value);

    // Number of set bits at the very right
    const std::size_t ones = (flipped == 0) ? width : find_lsb(flipped);

    // No smaller twin if all set bits are already at the very right
    if (ones < width && (value >> ones) != 0)
    {
        // The lowest set bit with an unset bit below it
        const std::size_t bit = ones + find_lsb(static_cast<T>(value >> ones));

        // Keep only the bits above that bit
        const T above = static_cast<T>(((value >> bit) >> 1) << bit << 1);

        // The moved bit plus the skipped bits
        const T mask = (T(1) << (ones + 1)) - 1;

        // Shift those right below the old position
        next_smallest = above | (mask << (bit - 1 - ones));
    }

    return {next_smallest, next_largest};
}
Edit-Distance for Bits
Definitely, one would want to XOR the two integers to determine which bits differ. Then, it depends
on how we optimize the counting of bits.
Solution 1: Count between the LSB and MSB of the XORed value:
Solution 2: As above, but move to the next LSB each time (can skip some bits with a constant-time
operation):
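A minimal sketch of the second solution, assuming the task is to count the bit positions in which two integers differ. The name `bit_distance` and the use of `x & (x - 1)` to jump to the next set bit are choices made for this example, not necessarily the original implementation.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: number of bit positions in which a and b differ
std::size_t bit_distance(std::uint32_t a, std::uint32_t b)
{
    // Only the differing bits are set after the XOR
    std::uint32_t difference = a ^ b;

    std::size_t count = 0;

    // Clear the lowest set bit each round, skipping all unset bits
    for ( ; difference != 0; difference &= difference - 1) ++count;

    return count;
}
```

For 0b1101 and 0b1010 the XOR is 0b0111, so three bits would have to be flipped.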
Even-Odd Bit-Swapping
Write a program to swap odd and even bits in an integer with as few instructions as possible.
#include <cstddef>

template<typename T>
void swap_bits(T& value, std::size_t i, std::size_t j)
{
    // Store bits
    const bool first = value & (T(1) << i);
    const bool second = value & (T(1) << j);

    // First unset old bit, then
    // set to bit of the other value
    value &= ~(T(1) << i);
    value |= (T(second) << i);

    // vice-versa
    value &= ~(T(1) << j);
    value |= (T(first) << j);
}

template<typename T>
T swap_even_odd(T value)
{
    // Nothing to swap (and find_msb from the
    // Bit-Twins solution throws for 0)
    if (value == 0) return value;

    auto msb = find_msb(value);

    for (std::size_t bit = 0; bit <= msb; bit += 2)
    {
        swap_bits(value, bit, bit + 1);
    }

    return value;
}
An appropriate mask for all odd bits is 0xAA (two As for every 8 bits / one for every 4). We first
unset the odd bits, then shift the value to the left by one position to put the even bits into the
odd positions. Then, we grab the odd bits, right shift those by one position to put them into the
even positions, and OR the two sides.
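// Assumes T is an unsigned integer type of at most 64 bits
template<typename T>
T swap_even_odd(const T& value)
{
    // 0xAA repeated for 64 bits, truncated to the width of T
    static const T mask = static_cast<T>(0xAAAAAAAAAAAAAAAA);

    return static_cast<T>(((value & ~mask) << 1) | ((value & mask) >> 1));
}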
Farewell
And that’s it for this article on bit-manipulation. Hope you learnt something!