Python’s bytearray is a mutable sequence of bytes that allows you to manipulate binary data efficiently. Unlike immutable bytes, bytearray can be modified in place, making it suitable for tasks requiring frequent updates to byte sequences.
You can create a bytearray using the bytearray() constructor with various arguments or from a string of hexadecimal digits using .fromhex(). This tutorial explores creating, modifying, and using bytearray objects in Python.
By the end of this tutorial, you’ll understand that:
- A bytearray in Python is a mutable sequence of bytes that allows in-place modifications, unlike the immutable bytes.
- You create a bytearray by using the bytearray() constructor with a non-negative integer, iterable of integers, bytes-like object, or a string with specified encoding.
- You can modify a bytearray in Python by appending, slicing, or changing individual bytes, thanks to its mutable nature.
- Common uses for bytearray include processing large binary files, working with network protocols, and tasks needing frequent updates to byte sequences.
You’ll dive deeper into each aspect of bytearray
, exploring its creation, manipulation, and practical applications in Python programming.
Understanding Python’s bytearray Type
Although Python remains a high-level programming language, it exposes a few specialized data types that let you manipulate binary data directly should you ever need to. These data types can be useful for tasks such as processing custom binary file formats, or working with low-level network protocols requiring precise control over the data.
The three closely related binary sequence types built into the language are:
bytes
bytearray
memoryview
While they’re all Python sequences optimized for performance when dealing with binary data, they each have slightly different strengths and use cases.
Note: You’ll take a deep dive into Python’s bytearray
in this tutorial. But, if you’d like to learn more about the companion bytes
data type, then check out Bytes Objects: Handling Binary Data in Python, which also covers binary data fundamentals.
As both names suggest, bytes
and bytearray
are sequences of individual byte values, letting you process binary data at the byte level. For example, you may use them to work with plain text data, which typically represents characters as unique byte values, depending on the given character encoding.
Python natively interprets bytes as 8-bit unsigned integers, each representing one of 256 possible values (2⁸) between 0 and 255. But sometimes, you may need to interpret the same bit pattern as a signed integer, for example, when handling digital audio samples that encode a sound wave’s amplitude levels. See the section on signedness in the Python bytes
tutorial for more details.
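As a quick illustration of signedness, the sketch below reads the same single byte first as an unsigned and then as a signed integer with int.from_bytes(). The variable names are made up for this example:

```python
# The same bit pattern, 0b11111111, interpreted in two different ways.
raw = bytearray([0xFF])

unsigned_value = int.from_bytes(raw, byteorder="big", signed=False)
signed_value = int.from_bytes(raw, byteorder="big", signed=True)

print(unsigned_value)  # 255
print(signed_value)    # -1
```

The stored byte never changes. Only the interpretation of its bits does.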
The choice between bytes
and bytearray
boils down to whether you want read-only access to the underlying bytes or not. Instances of the bytes
data type are immutable, meaning each one has a fixed value that you can’t change once the object is created. In contrast, bytearray
objects are mutable sequences, allowing you to modify their contents after creation.
While it may seem counterintuitive at first—since many newcomers to Python expect objects to be directly modifiable—immutable objects have several benefits over their mutable counterparts. That’s why types like strings, tuples, and others require reassignment in Python.
The advantages of immutable data types include better memory efficiency due to the ability to cache or reuse objects without unnecessary copying. In Python, immutable built-in types such as bytes and strings are hashable, so they can become dictionary keys or set elements. Additionally, relying on immutable objects gives you extra security, data integrity, and thread safety.
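To make the hashability point concrete, here is a small sketch. The signatures dictionary and its contents are invented for illustration:

```python
# bytes is immutable and hashable, so it can serve as a dictionary key.
signatures = {b"GIF89a": "GIF image", b"\x89PNG": "PNG image"}
print(signatures[b"GIF89a"])  # GIF image

# bytearray is mutable, so asking for its hash raises a TypeError.
try:
    hash(bytearray(b"GIF89a"))
    bytearray_is_hashable = True
except TypeError:
    bytearray_is_hashable = False

print(bytearray_is_hashable)  # False
```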
That said, if you need a binary sequence that allows for modification, then bytearray
is the way to go. Use it when you frequently perform in-place byte operations that involve changing the contents of the sequence, such as appending, inserting, extending, or modifying individual bytes. A scenario where bytearray
can be particularly useful includes processing large binary files in chunks or incrementally reading messages from a network buffer.
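The chunked-reading scenario might look roughly like the sketch below. It assumes an in-memory io.BytesIO object as a stand-in for a real file or socket, and the payload and chunk size are arbitrary:

```python
import io

# Stand-in for a binary file or a network connection.
stream = io.BytesIO(b"HEADER" + bytes(range(10)) + b"FOOTER")

message = bytearray()
while chunk := stream.read(8):  # Read at most 8 bytes at a time
    message += chunk            # Extend the same bytearray in place

print(len(message))  # 22
print(message[:6])   # bytearray(b'HEADER')
```

Because message is mutable, each chunk extends the existing object instead of building a new sequence on every iteration.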
The third binary sequence type in Python mentioned earlier, memoryview
, provides a zero-overhead view into the memory of certain objects. Unlike bytes
and bytearray
, whose mutability status is fixed, a memoryview
can be either mutable or immutable depending on the target object it references. Just like bytes
and bytearray
, a memoryview
may represent a series of single bytes, but at the same time, it can represent a sequence of multi-byte words.
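The following sketch shows both properties, using throwaway variable names. The .readonly attribute reports whether a view is writable, and .cast() reinterprets the same memory with a different item size:

```python
immutable_view = memoryview(b"\x01\x02\x03\x04")
mutable_view = memoryview(bytearray(b"\x01\x02\x03\x04"))

print(immutable_view.readonly)  # True
print(mutable_view.readonly)    # False

# Writing through a view changes the underlying bytearray.
target = bytearray(4)
view = memoryview(target)
view[0] = 255
print(target)  # bytearray(b'\xff\x00\x00\x00')

# Reinterpret the four single bytes as two unsigned 16-bit words.
words = view.cast("H")
print(len(view), len(words))  # 4 2
```

The numeric values of the 16-bit words depend on your platform’s byte order, so they aren’t shown here.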
Now that you have a basic understanding of Python’s binary sequence types and where bytearray
fits into them, you can explore ways to create and work with bytearray
objects in Python.
Creating bytearray Objects in Python
Unlike the immutable bytes
data type, whose literal form resembles a string literal prefixed with the letter b
—for example, b"GIF89a"
—the mutable bytearray
has no literal syntax in Python. This distinction is important despite many similarities between both byte-oriented sequences, which you’ll discover in the next section.
The primary way to create new bytearray
instances is by explicitly calling the type’s class constructor, sometimes informally known as the bytearray()
built-in function. Alternatively, you can create a bytearray
from a string of hexadecimal digits. You’ll learn about both methods next.
The bytearray() Constructor
Depending on the number and types of arguments passed to the bytearray()
constructor, you can create mutable byte sequences from various Python objects. Below are the possible signatures of the bytearray()
constructor, accepting different arguments:
# Argumentless:
bytearray()
# Single argument:
bytearray(length: int) # Non-negative integer
bytearray(data: Buffer) # Bytes-like or a buffer object
bytearray(values: Iterable[int]) # Iterable of integers between 0 and 255
# Two or three arguments:
bytearray(text: str, encoding: str)
bytearray(text: str, encoding: str, errors: str = "strict")
According to the information provided above, you can call bytearray()
without any arguments, which creates an empty byte array, or you can pass various values to initialize the array with specific content:
- Non-negative integer: Creates a zero-filled byte array of the specified length.
- Iterable of small integers: Creates a byte array from an iterable of integers in the range of 0 to 255, representing the subsequent byte values.
- Bytes-like object or buffer: Creates a mutable copy of the given bytes-like object or an object implementing the buffer protocol.
- String and character encoding: Encodes a string into a byte array using the specified character encoding. The optional error-handling strategy allows for graceful handling of characters that don’t have a representation in the given encoding.
To specify an empty array of bytes, you can leverage one of these equivalent techniques:
>>> bytearray()
bytearray(b'')
>>> bytearray(0)
bytearray(b'')
>>> bytearray([])
bytearray(b'')
>>> bytearray(b"")
bytearray(b'')
In each case, you get a new bytearray
object that initially contains no byte values but still allows you to add them later.
When you call bytearray()
with a positive integer as an argument, you create a zero-filled byte array of the specified size, initialized with null bytes (b"\x00"
). In other words, each element of the resulting array is a byte with a value of zero. You can take a peek at your array’s content by converting it to a Python list:
>>> bytearray(5)
bytearray(b'\x00\x00\x00\x00\x00')
>>> list(bytearray(5))
[0, 0, 0, 0, 0]
As you can see, calling bytearray(5)
produces an array of five zeros. Creating such an array of null bytes can be useful in scenarios when you need to initialize a data structure with a known size in advance to reduce the number of memory allocations and fragmentation.
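One possible use of such a preallocated array is as a reusable read buffer. The sketch below assumes an in-memory io.BytesIO stream in place of a real data source and relies on its .readinto() method, which fills an existing buffer instead of returning a new bytes object:

```python
import io

stream = io.BytesIO(b"0123456789ABCDEF")  # Hypothetical data source

buffer = bytearray(8)                # Allocated once, then reused
first_count = stream.readinto(buffer)
first_chunk = bytes(buffer)          # Snapshot before the buffer is overwritten

second_count = stream.readinto(buffer)

print(first_count, first_chunk)  # 8 b'01234567'
print(second_count, buffer)      # 8 bytearray(b'89ABCDEF')
```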
You can also pass an iterable of small integers into bytearray()
to treat them as standalone byte values. The iterable can be a lazily evaluated object like an iterator or generator, or it can be a sequence with a size known upfront:
>>> bytearray(range(65, 91))
bytearray(b'ABCDEFGHIJKLMNOPQRSTUVWXYZ')
>>> bytearray([82, 101, 97, 108, 32, 80, 121, 116, 104, 111, 110])
bytearray(b'Real Python')
In this case, the range()
function returns a range object, which you can iterate over, generating numbers on demand without storing them all in memory at once. Conversely, a list of numbers is an example of a random-access sequence that allows for direct retrieval of elements by index.
Watch out for iterables with incorrect data types or numeric values outside the expected range:
>>> bytearray([3.14, 2.72])
Traceback (most recent call last):
...
TypeError: 'float' object cannot be interpreted as an integer
>>> bytearray([-1])
Traceback (most recent call last):
...
ValueError: byte must be in range(0, 256)
>>> bytearray([256])
Traceback (most recent call last):
...
ValueError: byte must be in range(0, 256)
In the first case, you called bytearray()
with a list of floating-point numbers as an argument, and in the following two cases, you passed too-small and too-big integer values, respectively. Remember, bytearray()
requires an iterable of Python integers that must fall within the range of 0 to 255, which coincides with 8-bit unsigned bytes.
The last single-argument invocation of bytearray()
involves passing a bytes-like object or an object implementing the so-called buffer protocol as a parameter. It could be another bytearray
or bytes
object, for example:
>>> binary_data = b"This is a bytes literal"
>>> bytearray(binary_data)
bytearray(b'This is a bytes literal')
Even though it may look like the bytearray
is effectively wrapping your bytes
object, that isn’t the case. Instead, the code snippet above creates a mutable copy of the original binary sequence, allowing you to modify its contents without affecting the original bytes
data:
>>> mutable_copy = bytearray(binary_data)
>>> mutable_copy[14:] = b"array"
>>> mutable_copy
bytearray(b'This is a bytearray')
>>> binary_data
b'This is a bytes literal'
>>> binary_data[14:] = b"array"
Traceback (most recent call last):
...
TypeError: 'bytes' object does not support item assignment
After creating yet another bytearray
instance from the same bytes
object that you defined earlier, you assign it to a variable named mutable_copy
. Next, you use a slice assignment, which you’ll explore later, to replace a fragment of the resulting bytearray
with a different sequence of bytes. This modifies your copy without affecting the original binary_data
, demonstrating the mutable nature of bytearray
compared to the immutable bytes
object.
Another way to create a bytearray
object is by passing two arguments to the constructor, both of which must be Python strings. The first argument may represent arbitrary text, while the second argument must be the name of a valid character encoding registered with the codecs
module, such as UTF-8 or ISO 8859-1:
>>> bytearray("¿Habla español?", "utf-8")
bytearray(b'\xc2\xbfHabla espa\xc3\xb1ol?')
>>> bytearray("¿Habla español?", "iso-8859-1")
bytearray(b'\xbfHabla espa\xf1ol?')
What you get in return is a bytearray
instance containing the original text encoded into a sequence of bytes according to the chosen encoding.
Note: Although it’s generally considered Pythonic to encode strings into byte sequences using str.encode()
rather than passing equivalent arguments to bytearray()
, there are cases where the latter approach can be beneficial. Compare the following two invocations:
bytearray("¿Habla español?".encode("utf-8"))
bytearray("¿Habla español?", "utf-8")
Both techniques produce an identical result. However, the first one creates an intermediate bytes
object before making its mutable copy, whereas the second one avoids this extra step, making it slightly more memory efficient. The difference can become noticeable when you work with particularly long strings.
By default, characters that lack a meaningful representation in the specified character encoding will make Python raise an exception. However, you can override this behavior by providing an optional third string argument to bytearray()
with an alternative strategy for handling such encoding errors:
>>> bytearray("¿Habla español?", "ascii")
Traceback (most recent call last):
...
UnicodeEncodeError: 'ascii' codec can't encode character '\xbf' in position 0: ordinal not in range(128)
>>> bytearray("¿Habla español?", "ascii", errors="ignore")
bytearray(b'Habla espaol?')
While specifying errors="ignore"
allows you to sidestep the encoding error, it can still lead to data loss. You can see this in the example above where the non-ASCII characters ¿ and ñ are omitted from the resulting bytearray
. Other strategies include replacing the problematic characters with safe placeholders or rewriting them with the corresponding escape sequences:
>>> bytearray("¿Habla español?", "ascii", errors="replace")
bytearray(b'?Habla espa?ol?')
>>> bytearray("¿Habla español?", "ascii", errors="backslashreplace")
bytearray(b'\\xbfHabla espa\\xf1ol?')
This time, instead of skipping the two characters in the output, you either replace them with a question mark (?
) or escape using their hexadecimal codes. For more information about the available strategies, check out the error handlers.
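As one more illustration, the standard "xmlcharrefreplace" handler substitutes XML numeric character references for characters that can’t be encoded. The variable names below are arbitrary:

```python
text = "¿Habla español?"

# Characters outside ASCII become &#<code point>; references.
escaped = bytearray(text, "ascii", errors="xmlcharrefreplace")
print(escaped)  # bytearray(b'&#191;Habla espa&#241;ol?')
```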
Next up, you’ll learn about another method of creating bytearray
objects in Python.
The .fromhex() Class Method
There’s an alternative way to create a bytearray
object in Python, which you may sometimes prefer. You do this by calling bytearray.fromhex()
on a string of hexadecimal digits, like so:
>>> bytearray.fromhex("30 8C C9 FF")
bytearray(b'0\x8c\xc9\xff')
The signature and behavior of bytearray.fromhex()
is analogous to that of bytes.fromhex()
. Both are class methods, which you call on the type rather than a particular instance.
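A short sketch of the round trip between hexadecimal strings and byte arrays follows. It pairs .fromhex() with the complementary .hex() instance method. The separator argument to .hex() requires Python 3.8 or later:

```python
data = bytearray.fromhex("30 8C C9 FF")

# Whitespace between byte pairs is optional, and letter case doesn't matter.
same = data == bytearray.fromhex("308cc9ff")
print(same)  # True

# .hex() goes the other way and returns a lowercase string.
compact = data.hex()
spaced = data.hex(" ")
print(compact)  # 308cc9ff
print(spaced)   # 30 8c c9 ff
```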
Using the hexadecimal system to express byte values is pretty common, as it allows you to represent binary data more compactly than with the decimal or binary systems. Take a look at the following table to see the difference:
Binary | Decimal | Hexadecimal |
---|---|---|
00110000 | 48 | 30 |
10001100 | 140 | 8C |
11001001 | 201 | C9 |
11111111 | 255 | FF |
Binary numbers take a lot of space because they require more digits to represent the same value compared to other numeral systems. After all, they’re composed of only two digits: 0 and 1. In contrast, the decimal system provides ten decimal digits (0-9), making numbers quite a bit shorter.
However, switching to the hexadecimal system gives you an additional six letters of the alphabet (A-F) to represent the values ten through fifteen. This lets you conveniently express every 8-bit byte with no more than two hexadecimal digits.
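If you’d like to reproduce such a comparison yourself, format specifiers in f-strings can render each byte in all three numeral systems. This is only a sketch, and the rows list is a helper introduced here for illustration:

```python
rows = []
for value in bytearray.fromhex("30 8C C9 FF"):
    # 08b: zero-padded binary, >3: right-aligned decimal, 02X: uppercase hex
    rows.append(f"{value:08b} | {value:>3} | {value:02X}")

print("\n".join(rows))
# 00110000 |  48 | 30
# 10001100 | 140 | 8C
# 11001001 | 201 | C9
# 11111111 | 255 | FF
```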
Since bytearray
and bytes
share over eighty percent of their functionality and are sometimes interchangeable, you’ll first compare them before diving into manipulation techniques.
Comparing bytearray to bytes Objects
While the bytes
data type builds on Python strings, bytearray
extends the interface of bytes
even further by introducing mutable behavior. It does so through a few additional methods and operators that are missing from the other two data types.
Public Methods
The bytearray
type includes all the methods of bytes
but, being mutable, also provides eight additional methods designed for in-place modifications:
Method | Description |
---|---|
.append() | Append a single item to the end of the bytearray. |
.copy() | Return a copy of the bytearray. |
.remove() | Remove the first occurrence of a value in the bytearray. |
.reverse() | Reverse the order of the values in the bytearray in place. |
.pop() | Remove and return a single item from the bytearray at the given index. |
.insert() | Insert a single item into the bytearray before the given index. |
.extend() | Append all the items from the iterator or sequence to the end of the bytearray. |
.clear() | Remove all items from the bytearray. |
You’ll explore these methods, among others, in more detail later in this tutorial. For now, keep in mind that the public interface of bytearray
forms a superset of the methods and attributes of the bytes
data type.
If you’re wondering how to quickly identify the differences between bytearray
and bytes
data types, then you can use the following code snippet:
>>> def public_members(cls):
... return {name for name in dir(cls) if not name.startswith("_")}
...
>>> public_members(bytearray) - public_members(bytes)
{'append', 'copy', 'remove', 'reverse', 'pop', 'insert', 'extend', 'clear'}
>>> public_members(bytearray) >= public_members(bytes)
True
First, you define a function named public_members()
, which uses a set comprehension with a condition to filter out non-public methods and attributes of a class—those whose names start with an underscore (_
). Next, you call your function on bytearray
and bytes
, and then use the set difference operator (-
) to identify methods present in bytearray
but not in bytes
. Finally, you confirm that bytearray
includes all public members of bytes
.
If those eight methods look familiar to you, then that’s because you might have seen them in other mutable sequence types in Python, such as lists or deques. They let you manipulate the contents of a bytearray
in ways that aren’t possible with an immutable bytes
object. With them, you can add, remove, and rearrange bytes directly within a given bytearray
.
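Here’s a compact sketch that exercises all eight methods on a throwaway bytearray. The comments show the state of data after each call:

```python
data = bytearray(b"yth")

data.append(ord("o"))     # bytearray(b'ytho')
data.extend(b"n!")        # bytearray(b'ython!')
data.insert(0, ord("P"))  # bytearray(b'Python!')
last = data.pop()         # Removes and returns 33, the code of "!"
data.remove(ord("y"))     # bytearray(b'Pthon')
data.reverse()            # bytearray(b'nohtP')
snapshot = data.copy()    # Independent copy of the current contents
data.clear()              # bytearray(b'')

print(last, snapshot, data)  # 33 bytearray(b'nohtP') bytearray(b'')
```

Note that the single-item methods take integers, which is why the example uses ord() to obtain byte values from characters.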
But calling methods isn’t the only way to mutate an object. You can also change its state using operators, such as concatenation (+
), assignment (=
), or others supported by the object.
Supported Operators
In addition to supplementing the public interface of bytes
with a few extra methods, the bytearray
data type introduces some special methods corresponding to Python operators that allow for in-place byte manipulation:
Special Method | Operator | Description |
---|---|---|
.__setitem__() | = | Assign a new value to a byte or a slice at the given index of the bytearray. |
.__delitem__() | del | Delete a byte or a slice at the given index of the bytearray. |
.__iadd__() | += | Append the bytes of another iterable to the end of the bytearray. |
.__imul__() | *= | Repeat the bytearray a specified number of times and update it in place. |
Again, you’ll take a closer look at these operators later in the tutorial. In the meantime, here’s how you can find them in the bytearray
type’s definition:
>>> def magic_members(cls):
... return {name for name in dir(cls) if name.startswith("__")}
...
>>> magic_members(bytearray) - magic_members(bytes)
{
'__alloc__',
'__release_buffer__',
'__delitem__',
'__iadd__',
'__imul__',
'__setitem__'
}
The last four methods in this output implement deletion (del
), in-place concatenation (+=
), in-place repetition (*=
), and assignment (=
), respectively. This form of operator overloading allows you to modify a bytearray
instance directly instead of creating a new object. Consider this example:
>>> buffer = bytearray(b"Real")
>>> buffer += b" Python"
>>> buffer
bytearray(b'Real Python')
>>> buffer + b" is awesome"
bytearray(b'Real Python is awesome')
>>> buffer
bytearray(b'Real Python')
Notice the difference between the augmented assignment operator (+=
), which modifies your existing buffer
in place without returning anything, and the concatenation operator (+
), which creates a brand new bytearray
object. This behavior can be particularly useful when you work with large binary data, as it minimizes memory overhead and improves performance.
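You can check the in-place behavior yourself by comparing object identities. The sketch below uses id() and the is operator, with variable names chosen only for this example:

```python
buffer = bytearray(b"Real")
original_id = id(buffer)

buffer += b" Python"                # In place: still the same object
same_object = id(buffer) == original_id
print(same_object)                  # True

combined = buffer + b" is awesome"  # Concatenation builds a new object
print(combined is buffer)           # False
```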
Have you noticed how bytearray
interacts with bytes
in the expressions above? You’ll explore this behavior next.
Expressions With bytearray and bytes
Another point worth noting here is that you can freely mix bytearray
and bytes
objects in one expression, like before when you concatenated the two. That’s possible because both data types embody sequences of bytes, making them compatible despite one being mutable and the other one not. In particular, when you compare a bytearray
instance to a bytes
object or the other way around, Python considers them equal if their binary contents are the same:
>>> bytearray("Python", "utf-8") == bytes("Python", "utf-8")
True
>>> [1, 2, 3] == (1, 2, 3)
False
When it comes to binary sequences in Python, mutability doesn’t affect equality. This is different from, say, comparing lists and tuples, which don’t compare on their contents alone, because their types also matter.
Note that the type of the resulting value may sometimes differ depending on the operand order:
>>> bytearray([82, 101, 97, 108]) + b" Python"
bytearray(b'Real Python')
>>> b"Real" + bytearray([32, 80, 121, 116, 104, 111, 110])
b'Real Python'
When evaluating an expression that involves a binary operator, such as the plus operator (+
) in the example above, Python checks the left operand for the respective implementation. If the left operand is a bytearray
object, then Python invokes its .__add__()
special method, which returns a new bytearray
instance. Conversely, if the left operand is a bytes
object, its own version of .__add__()
kicks in, producing another bytes
object as a result.
Note: In custom classes, you can implement the right-hand side versions of special methods, such as .__radd__()
, so that your instances will correctly handle the operation even when they appear on the right side of a binary operator.
To obtain predictable output, you can always cast your operands to the desired types before performing an operation on them. Alternatively, you may convert the result afterward:
>>> bytes(bytearray([82, 101, 97, 108])) + b" Python"
b'Real Python'
>>> bytearray(b"Real") + bytearray([32, 80, 121, 116, 104, 111, 110])
bytearray(b'Real Python')
>>> bytearray(b"Real" + b" Python")
bytearray(b'Real Python')
In the first example, you convert a bytearray
on the left to bytes
before concatenating it with a bytes
literal on the right. In the second example, you perform the conversion in the opposite direction, while in the last example, you convert the concatenated result regardless of its type.
The ability to modify bytearray
objects is what sets them apart from bytes
, making them particularly useful in scenarios where performance and memory efficiency matter. Now that you understand how bytearray
extends bytes
with additional methods and operators, you’ll explore practical ways to manipulate bytearray
objects in Python.
Manipulating bytearray Objects in Python
At this point, you know that Python’s bytearray
and bytes
are binary sequences, which give you the ability to process individual byte values. While bytes
is an immutable type modeled after Python strings, bytearray
adds support for mutability through a few extra methods and operators, which you’ll take a closer look at in this section. After all, the key advantage of bytearray
over bytes
lies in its ability to be modified in place, saving memory.
String-Like Operations
Since bytearray
is a mutable extension of the bytes
data type, it also shares a strong connection with Python strings. Nearly forty string methods have been carried over from str
to bytes
and bytearray
, letting you treat those binary sequences much like text strings. However, there’s a catch.
Although you may expect bytearray
to modify its contents in place when you perform string-like operations like .replace()
, that’s not the case. Instead, these methods return a modified copy of the original data rather than altering it directly. This distinction can make it tricky to remember which bytearray
operations mutate the object in place and which produce a copy.
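The sketch below demonstrates the copy semantics of .replace() and one possible workaround if you need the original object updated: assigning the result back through a full slice. The workaround still creates a temporary copy, but it keeps the same bytearray object:

```python
data = bytearray(b"Real Python")

result = data.replace(b"Real", b"Monty")
print(result)          # bytearray(b'Monty Python')
print(data)            # bytearray(b'Real Python')
print(result is data)  # False

# Overwrite the contents of the existing object with a full-slice assignment.
original_id = id(data)
data[:] = data.replace(b"Real", b"Monty")
print(data)                     # bytearray(b'Monty Python')
print(id(data) == original_id)  # True
```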
To find the methods common to bytearray
and str
—and bytes
—you can use the set intersection operator (&
) as shown below:
>>> for i, name in enumerate(
... sorted(
... name for name in set(dir(bytearray)) & set(dir(str))
... if not name.startswith("_")
... ),
... start=1
... ):
... print(f"({i:>2}) .{name}()")
...
( 1) .capitalize()
( 2) .center()
( 3) .count()
( 4) .endswith()
⋮ ⋮
(38) .upper()
(39) .zfill()
Looking from the inside of the loop, you use a generator expression to iterate over the resulting set intersection, keeping only public method names. Next, you sort them alphabetically in ascending order, enumerate them—starting from one—and print the formatted method names.
These methods work in a similar fashion to their string counterparts. For example, you can obtain an uppercase copy of your bytearray
by calling its .upper()
method:
>>> binary_data = bytearray(b"Real Python")
>>> binary_data.upper()
bytearray(b'REAL PYTHON')
That doesn’t look surprising since your bytearray
contains only English alphabet letters. But what about when you throw non-letter characters—or even byte values outside the ASCII range—into the mix? You can find out by defining a slightly different sequence of bytes, such as this one:
>>> binary_data = bytearray([202, 254, 186, 190])
>>> binary_data
bytearray(b'\xca\xfe\xba\xbe')
>>> binary_data.upper()
bytearray(b'\xca\xfe\xba\xbe')
In this case, none of the bytes encode an ASCII character, so Python displays their numeric values using the hexadecimal notation. Calling .upper()
has no effect because these bytes lack uppercase equivalents. This behavior is similar to how strings preserve non-letter characters, such as digits:
>>> "42".upper()
'42'
Although digits are printable characters, they’re not letters, so the concepts of uppercase and lowercase don’t apply to them.
But, calling the same method on a bytearray
and a string can yield vastly different results, even when both represent the same piece of data. Check out this example:
>>> binary_data = bytearray("café", "utf-8")
>>> binary_data
bytearray(b'caf\xc3\xa9')
>>> str(binary_data.upper(), "utf-8")
'CAFé'
>>> str(binary_data, "utf-8").upper()
'CAFÉ'
You begin by creating a bytearray
from the string "café"
using the UTF-8 encoding. Within this character encoding, the Unicode letter é is represented by two separate bytes, whose decimal values are 195 and 169, neither of which is part of the ASCII character set.
When you call .upper()
on the resulting bytearray
and later convert it back into a string, you get "CAFé"
. This happens because your method call only affects ASCII characters in the bytearray
, leaving the non-ASCII byte sequence for the letter é unchanged. As a result, the é remains in its original form when decoded back into a string.
In contrast, when you convert the bytearray
before calling .upper()
on the resulting string, you get the expected "CAFÉ"
. To sum it up, string methods defined in bytearray
work on individual byte values, whereas their string prototypes operate on whole Unicode characters.
Immutable Sequence Operations
Because bytearray
objects fall under the category of sequence types in Python, they support all the common sequence operations you’d expect from an immutable tuple or string.
For example, a bytearray
object supports indexing and slicing through the square bracket syntax:
>>> binary_data = bytearray(b"Monty Python")
>>> binary_data[-1]
110
>>> binary_data[-1:]
bytearray(b'n')
>>> binary_data[6:]
bytearray(b'Python')
When you access a specific byte at the given index in the array, you get its integer value. For example, the last byte at index -1 has a value of 110, which corresponds to the ASCII character "n"
. At the same time, slicing a bytearray
always yields a new subarray, even if it only contains one element. The same is true of bytes
objects, but not strings, which return another string object regardless of whether you index or slice them.
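To see the contrast between the three types side by side, consider this small sketch with made-up variable names:

```python
text = "Monty Python"
immutable = bytes(text, "ascii")
mutable = bytearray(text, "ascii")

print(repr(text[-1]))  # 'n'   (indexing a string gives a string)
print(immutable[-1])   # 110   (indexing bytes gives an integer)
print(mutable[-1])     # 110   (indexing a bytearray gives an integer)
print(mutable[-1:])    # bytearray(b'n')
```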
The bytearray
has a known size, so you can measure its length at any time. It also supports iteration over the byte elements, and you can build a reversed copy from the iterator returned by the reversed()
built-in function:
>>> len(binary_data)
12
>>> for byte in binary_data:
... print(f"{byte:x}: {chr(byte)!r}")
...
4d: 'M'
6f: 'o'
6e: 'n'
74: 't'
79: 'y'
20: ' '
50: 'P'
79: 'y'
74: 't'
68: 'h'
6f: 'o'
6e: 'n'
>>> reversed(binary_data)
<reversed object at 0x7e28e3f607c0>
>>> bytearray(reversed(binary_data))
bytearray(b'nohtyP ytnoM')
In the code snippet above, you call the len()
function to get the number of bytes in your bytearray
. Next, you iterate over the array and use an f-string literal to format each byte as a hexadecimal number (x
), and then you call the chr()
function to reveal the corresponding character. A bit later, you create a new bytearray
object from an iterator of elements in reversed order.
You can find the starting index and the number of occurrences of a subsequence, as well as test for membership in a bytearray
:
>>> binary_data.index(b"Python")
6
>>> binary_data.count(b"Python")
1
>>> b"Python" in binary_data
True
>>> b"Python" not in binary_data
False
While you look for byte sequences expressed as bytes
literals in these examples, you could just as well provide bytearray
instances or other bytes-like objects as arguments instead.
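For instance, the sketch below passes a bytearray, a memoryview, and plain integers to the same operations. Integer arguments are treated as single byte values:

```python
binary_data = bytearray(b"Monty Python")

position = binary_data.index(bytearray(b"Python"))
occurrences = binary_data.count(memoryview(b"y"))
has_n = 110 in binary_data  # 110 is the byte value of "n"

print(position)     # 6
print(occurrences)  # 2
print(has_n)        # True
```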
Apart from in
and not in
, two other binary operators commonly used with Python sequences are the concatenation (+
) and repetition (*
) operators, which work equally well with bytearray
objects:
>>> binary_data + b"'s Flying Circus"
bytearray(b"Monty Python\'s Flying Circus")
>>> bytearray(b"spam ") * 3
bytearray(b'spam spam spam ')
In both cases, you end up creating new bytearray
objects while your original byte sequences remain unchanged.
This wraps up the overview of immutable sequence operations. Now, it’s time to move on and explore the mutable sequence operations that bytearray
objects also support.
Mutable Sequence Operations
Because bytearray
is a mutable sequence, it provides a few extra methods and supports operators that you saw earlier, allowing you to modify its contents without creating expensive copies. This makes bytearray
an excellent choice for tasks that require frequent updates or modifications to binary data.
For instance, you’ll often want to change the value of a byte positioned at a particular index in your bytearray
. Say you’re working with a sequence of pixel values representing a monochromatic image, and you want to create its negative by inverting all pixels using a loop:
>>> pixels = bytearray([48, 140, 201, 252, 186, 3, 37, 186, 52])
>>> for i in range(len(pixels)):
... pixels[i] = 255 - pixels[i]
...
Each byte represents a relative light intensity, with 0 denoting complete darkness and 255 indicating maximum brightness. Inverting a pixel involves subtracting its current value from 255, effectively reversing its brightness.
When performing a single item assignment, such as pixels[i] = 255 - pixels[i] in the loop above, you must always provide an integer value to store at the specified index. Because a bytearray
is a sequence of unsigned bytes, you can only provide integers between 0 and 255, or else Python will raise an error:
>>> pixels[0] = -1
Traceback (most recent call last):
...
ValueError: byte must be in range(0, 256)
>>> pixels[0] = 256
Traceback (most recent call last):
...
ValueError: byte must be in range(0, 256)
Neither -1 nor 256 represent valid byte values, so Python prevents their assignment to ensure data integrity. A similar problem may occur when you try to increase or decrease the brightness of pixels without accounting for possible overflow and underflow errors:
>>> for i in range(len(pixels)):
... pixels[i] *= 2
...
Traceback (most recent call last):
...
ValueError: byte must be in range(0, 256)
Here, you use one of the augmented assignment operators, which is a shorthand notation for pixels[i] = pixels[i] * 2
, to double the pixel intensity. However, this operation can result in values that exceed the valid range for a byte, causing a ValueError
.
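One way to sidestep the overflow is to clamp each result to the valid range before storing it. The sketch below assumes that saturating at 255 (and at 0) is the behavior you want. The scale_brightness() helper is hypothetical and introduced only for illustration.

```python
def scale_brightness(pixels: bytearray, factor: float) -> None:
    """Multiply every pixel by factor in place, clamping to 0..255."""
    for i, value in enumerate(pixels):
        scaled = round(value * factor)
        # Saturate instead of letting Python raise a ValueError.
        pixels[i] = max(0, min(255, scaled))


sample = bytearray([207, 115, 54, 3])
scale_brightness(sample, 2)
print(list(sample))  # [255, 230, 108, 6]
```

Clamping loses information for the brightest pixels, so whether it's acceptable depends on your use case.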
Note: Don’t confuse the augmented concatenation (+=
) and repetition (*=
) operators of a bytearray
with the augmented assignment operators of its individual elements. Here’s an example that demonstrates a subtle difference between them:
>>> binary_data = bytearray(b"Why?")
>>> binary_data *= 2
>>> binary_data
bytearray(b'Why?Why?')
>>> binary_data = bytearray(b"Why?")
>>> binary_data[-1] *= 2
>>> binary_data
bytearray(b'Why~')
When you apply the operator to a bytearray
, it repeats the entire byte sequence. However, when you apply it to an individual element of the same bytearray
, it multiplies the byte value itself, resulting in a different ASCII representation.
Given your past experience with Python strings—where each character is essentially a string of length one—you might be tempted to directly assign a bytes-like object to a bytearray
at a specific index, like this:
>>> pixels[0] = b"\xff"
Traceback (most recent call last):
...
TypeError: 'bytes' object cannot be interpreted as an integer
Unfortunately, even though the bytes
literal on the right contains only one element, Python won’t unpack it for you. That said, you can assign a bytes-like object, such as bytes
or bytearray
, or even any other sequence of small integers when you use the slice assignment:
>>> pixels[3:6] = (0, 0, 0)
>>> list(pixels)
[207, 115, 54, 0, 0, 0, 218, 69, 203]
With this approach, you replace a fragment of pixels
with another subsequence of the same length. This behavior is similar to how list slicing works in Python. If the length of the sequence you’re assigning doesn’t match the length of the slice being replaced, then Python will automatically shrink or expand your bytearray
to accommodate the new sequence.
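To see the resizing behavior in isolation, here's a short sketch that uses a throwaway data array rather than the pixels from the running example:

```python
data = bytearray(b"ABCDEF")

# Replace two bytes with four: the array grows from 6 to 8 bytes.
data[1:3] = b"xyz!"
print(data)  # bytearray(b'Axyz!DEF')

# Replace four bytes with one: the array shrinks from 8 to 5 bytes.
data[1:5] = [0]
print(data)  # bytearray(b'A\x00DEF')

# An empty slice inserts new bytes without replacing anything.
data[0:0] = b">>"
print(data)  # bytearray(b'>>A\x00DEF')
```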
Mutability isn’t only about changing existing values or adding new ones, but also about deletion. You can delete a single byte or a whole slice from a bytearray
with the help of Python’s del
statement:
>>> del pixels[3:6]
>>> list(pixels)
[207, 115, 54, 218, 69, 203]
>>> del pixels[3]
>>> list(pixels)
[207, 115, 54, 69, 203]
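Just be sure to provide a valid index when deleting a single byte, because an out-of-range index raises an IndexError
. Out-of-range slice bounds, on the other hand, are silently clamped to the length of the sequence.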
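The following sketch contrasts deleting by a single index with deleting by a slice when the position lies beyond the end of the array. The data variable is a throwaway example:

```python
data = bytearray(b"abc")

try:
    del data[10]  # No byte exists at index 10
except IndexError:
    print("single index out of range")

# A slice that overshoots the end is clamped rather than rejected.
del data[1:100]
print(data)  # bytearray(b'a')
```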
Note that del
is a statement, which causes a side effect, but it doesn’t evaluate to a value like an expression does. If you’d like to delete a bytearray
element while intercepting its value, then call the .pop()
method with an index as an argument:
>>> pixels.pop(3)
69
>>> pixels.pop()
203
>>> list(pixels)
[207, 115, 54]
By default, the index
parameter is equal to -1, indicating the last item in the array. So, when you call .pop()
without any argument, you’ll remove and return the rightmost byte.
On the other hand, if you don’t know the exact index of the byte to delete but know its value, you can remove its first occurrence by searching for it from the left:
>>> pixels.remove(115)
>>> list(pixels)
[207, 54]
If the sequence contains any duplicates that you wish to remove, then you’ll have to rinse and repeat as many times as needed. The lookup always starts from the left.
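As a rough sketch of that rinse-and-repeat approach, the hypothetical remove_all() helper below counts the occurrences first and then calls .remove() that many times. It also shows that .remove() raises a ValueError when the value is absent:

```python
def remove_all(data: bytearray, value: int) -> int:
    """Remove every occurrence of value in place and return the count."""
    count = data.count(value)
    for _ in range(count):
        data.remove(value)  # Always deletes the leftmost match
    return count


samples = bytearray([186, 3, 37, 186, 52])
removed = remove_all(samples, 186)
print(removed, list(samples))  # 2 [3, 37, 52]

try:
    samples.remove(186)
except ValueError:
    print("186 is no longer present")
```

Each .remove() call rescans the array from the left, so for large arrays with many duplicates, a single-pass rebuild may be a better fit.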
To delete everything in one go, you can call the .clear()
method:
>>> pixels.clear()
>>> pixels
bytearray(b'')
This method doesn’t return anything but irreversibly removes all the bytes from your bytearray
, leaving it empty.
Once you’ve cleared your array, you’ll want to populate it again with new data. To add a single byte into the bytearray
, you can either append one to the right end of the array or insert it before the specified index. Again, the new element has to be an integer within the valid range:
>>> pixels.append(65) # A
>>> pixels.append(67) # C
>>> pixels.insert(1, 66) # B
>>> pixels
bytearray(b'ABC')
The byte values 65, 66, and 67 correspond to ASCII letters A, B, and C, respectively. Notice how you first append A and C, only later inserting B between them.
If you have multiple bytes stored in an iterable, then you can append them all to the right end of your bytearray
by calling the .extend()
method:
>>> pixels.extend((1, 2, 3))
>>> pixels
bytearray(b'ABC\x01\x02\x03')
This method ensures that you append all bytes in a single step, which is more efficient and concise than looping through each byte and adding it individually.
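Here's a brief sketch showing that .extend() accepts different kinds of iterables and that, like the augmented concatenation operator, it grows the same object instead of creating a new one. The buffer and alias names are illustrative only:

```python
buffer = bytearray()
alias = buffer  # A second reference to the same object

buffer.extend(b"AB")      # A bytes-like object
buffer.extend(range(3))   # Any iterable of integers from 0 to 255
buffer += b"!"            # In-place concatenation with a bytes-like object

print(buffer)  # bytearray(b'AB\x00\x01\x02!')
print(alias is buffer)  # True
```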
The final two methods available in a mutable sequence enable an in-place reversal of its elements, as well as making a copy of the sequence:
>>> pixels.reverse()
>>> pixels
bytearray(b'\x03\x02\x01CBA')
>>> pixels.copy()
bytearray(b'\x03\x02\x01CBA')
>>> pixels is pixels.copy()
False
The .reverse()
method directly alters your bytearray
, changing the order of its elements without creating a new object. On the other hand, the .copy()
method creates a new sequence with the same bytes, allowing you to work with a duplicate. Note that the resulting copy has a unique identity, which is different from that of your original bytearray
.
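The difference between an alias and a copy becomes visible once you mutate the original. This short sketch uses illustrative variable names:

```python
original = bytearray(b"ABC")
alias = original              # Another name for the same object
duplicate = original.copy()   # A separate object with equal contents

original.reverse()

print(alias)      # bytearray(b'CBA')
print(duplicate)  # bytearray(b'ABC')
print(duplicate == original, duplicate is original)  # False False
```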
You’ve covered behaviors typical of strings and sequences in Python, but bytearray
objects are also designed to store binary data. Next, you’ll learn about two methods specific to this data type.
Byte-Specific Operations
Both bytearray
and bytes
expose an instance method called .hex()
, which is the opposite of .fromhex()
that you learned about earlier. While .fromhex()
allowed you to create a new bytearray
object from a string of hexadecimal digits, .hex()
converts an existing bytearray
into its hexadecimal representation:
>>> binary_data = bytearray([48, 140, 201, 0])
>>> binary_data.hex()
'308cc900'
Notice that each byte is always represented by exactly two hexadecimal digits, padded with a leading zero when a single digit would technically be sufficient. This ensures clear interpretation of such strings, eliminating any ambiguity in where the boundaries lie.
To make this distinction even clearer visually, you can provide one or two optional parameters:
>>> binary_data.hex(":")
'30:8c:c9:00'
>>> binary_data.hex(":", -3)
'308cc9:00'
>>> binary_data.hex(":", 3)
'30:8cc900'
The first parameter is the separator, which must be a single-character string that will be placed between each pair of hexadecimal digits in the output. The other parameter is the number of bytes to group together before applying the separator. When this number is negative, the grouping starts from the left. Otherwise, it starts from the right.
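As a quick sanity check, you can round-trip between the two methods. The sketch below relies on .fromhex() skipping ASCII whitespace between bytes. Other separators, such as a colon, need to be stripped first, which is done here with str.replace():

```python
data = bytearray([48, 140, 201, 0])

# Plain hexadecimal digits convert straight back.
assert bytearray.fromhex(data.hex()) == data

# Whitespace between bytes is ignored by .fromhex().
spaced = data.hex(" ")
print(spaced)  # 30 8c c9 00
assert bytearray.fromhex(spaced) == data

# Any other separator must be removed before parsing.
with_colons = data.hex(":")
restored = bytearray.fromhex(with_colons.replace(":", ""))
print(restored == data)  # True
```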
Another commonly used method that you’ll encounter in the wild is one for converting a bytearray
into a Python string:
>>> binary_data = bytearray(b"caf\xc3\xa9")
>>> binary_data.decode("utf-8")
'café'
The .decode()
method takes an optional character encoding, which defaults to UTF-8 when you omit it. As a rule of thumb, it’s best to always explicitly provide character encoding to make your intent clear and avoid surprises.
Just as the bytearray()
constructor allowed you to define an error-handling strategy for encoding a string into a mutable sequence of bytes, .decode()
lets you specify the same strategy when you want to interpret bytes as a string:
>>> binary_data.decode("ascii", errors="ignore")
'caf'
The errors
parameter accepts a few predefined values, which you can find in the official documentation.
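To get a feel for how a few of these strategies differ, here's a small sketch that decodes the same bytes with the default "strict" handler as well as with "replace" and "backslashreplace":

```python
binary_data = bytearray(b"caf\xc3\xa9")

try:
    binary_data.decode("ascii")  # errors="strict" is the default
except UnicodeDecodeError as error:
    print(f"strict: {error.reason}")

# Each undecodable byte becomes U+FFFD, the replacement character.
replaced = binary_data.decode("ascii", errors="replace")
print(replaced)

# Each undecodable byte becomes a backslash escape sequence.
escaped = binary_data.decode("ascii", errors="backslashreplace")
print(escaped)  # caf\xc3\xa9
```

Unlike "ignore", both of these handlers leave a visible trace of the bytes that couldn't be decoded.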
That concludes your overview of Python’s bytearray
. With this knowledge, you can efficiently manipulate binary data and perform operations that require mutable sequences of bytes.
Conclusion
You’ve explored Python’s bytearray
type, a mutable sequence of bytes that allows for efficient binary data manipulation. You learned how bytearray
compares to bytes
, how to create and modify bytearray
objects, and how to leverage its mutable behaviors for working with binary data. Along the way, you saw a few practical examples of using bytearray
in real-world scenarios, such as handling image pixels and performing character encoding conversions.
In this tutorial, you’ve learned:
- How to create bytearray instances from various sources
- The key differences between bytearray and bytes
- Ways to modify bytearray contents in place using methods and operators
- Practical use cases, such as handling image data and encoding text
With this knowledge, you can confidently use bytearray
in your projects when you need mutable binary sequences. To continue your exploration of Python’s binary data handling, check out memoryview
, which offers a zero-copy way to interact with binary buffers.
Frequently Asked Questions
Now that you have some experience with Python’s bytearray
data type, you can use the questions and answers below to check your understanding and recap what you’ve learned.
These FAQs are related to the most important concepts you’ve covered in this tutorial.
A bytearray
is one of three binary sequence types in Python. More specifically, it’s a mutable sequence of bytes, allowing you to modify its contents after creation.
The bytearray
data type is mutable, meaning you can change its contents, while the bytes
data type is immutable and can’t be modified once created.
You can create a bytearray
by calling the bytearray()
constructor with various types of arguments, such as a non-negative integer, an iterable of integers, a bytes-like object, or a string with a specified encoding. Alternatively, you can use bytearray.fromhex()
to interpret a string of hexadecimal digits as a byte sequence.
Yes, you can modify a bytearray
by changing, appending, or inserting the individual bytes, and using various methods to manipulate its contents.
Common uses for bytearray
include processing binary files or streams, particularly when handling large files in chunks, as well as working with low-level network protocols that require mutable sequences.