Python's Bytearray: A Mutable Sequence of Bytes

Python’s bytearray is a mutable sequence of bytes that allows you to manipulate binary data efficiently. Unlike immutable bytes, bytearray can be modified in place, making it suitable for tasks requiring frequent updates to byte sequences.

You can create a bytearray using the bytearray() constructor with various arguments or from a string of hexadecimal digits using .fromhex(). This tutorial explores creating, modifying, and using bytearray objects in Python.

By the end of this tutorial, you’ll understand that:

A bytearray in Python is a mutable sequence of bytes that allows in-place modifications, unlike the immutable bytes.
You create a bytearray by using the bytearray() constructor with a non-negative integer, iterable of integers, bytes-like object, or a string with specified encoding.
You can modify a bytearray in Python by appending, slicing, or changing individual bytes, thanks to its mutable nature.
Common uses for bytearray include processing large binary files, working with network protocols, and tasks needing frequent updates to byte sequences.

You’ll dive deeper into each aspect of bytearray, exploring its creation, manipulation, and practical applications in Python programming.

Understanding Python’s `bytearray` Type

Although Python remains a high-level programming language, it exposes a few specialized data types that let you manipulate binary data directly should you ever need to. These data types can be useful for tasks such as processing custom binary file formats, or working with low-level network protocols requiring precise control over the data.

The three closely related binary sequence types built into the language are:

bytes
bytearray
memoryview

While they’re all Python sequences optimized for performance when dealing with binary data, they each have slightly different strengths and use cases.

Note: You’ll take a deep dive into Python’s bytearray in this tutorial. But, if you’d like to learn more about the companion bytes data type, then check out Bytes Objects: Handling Binary Data in Python, which also covers binary data fundamentals.

As both names suggest, bytes and bytearray are sequences of individual byte values, letting you process binary data at the byte level. For example, you may use them to work with plain text data, which typically represents characters as unique byte values, depending on the given character encoding.

Python natively interprets bytes as 8-bit unsigned integers, each representing one of 256 possible values (2⁸) between 0 and 255. But sometimes, you may need to interpret the same bit pattern as a signed integer, for example, when handling digital audio samples that encode a sound wave’s amplitude levels. See the section on signedness in the Python bytes tutorial for more details.

The choice between bytes and bytearray boils down to whether you want read-only access to the underlying bytes or not. Instances of the bytes data type are immutable, meaning each one has a fixed value that you can’t change once the object is created. In contrast, bytearray objects are mutable sequences, allowing you to modify their contents after creation.

While it may seem counterintuitive at first—since many newcomers to Python expect objects to be directly modifiable—immutable objects have several benefits over their mutable counterparts. That’s why types like strings, tuples, and others require reassignment in Python.

The advantages of immutable data types include better memory efficiency due to the ability to cache or reuse objects without unnecessary copying. Python’s built-in immutable types, including bytes, are generally hashable, so their instances can become dictionary keys or set elements. Additionally, relying on immutable objects gives you extra security, data integrity, and thread safety.
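To illustrate the hashability point, here’s a minimal sketch that uses bytes objects as dictionary keys and shows that an equivalent bytearray is rejected. The file_signatures name and its contents are made up for this example:

```python
# Hypothetical lookup table keyed by immutable bytes objects
file_signatures = {b"GIF89a": "GIF image", b"\x89PNG": "PNG image"}

header = bytearray(b"GIF89a")

try:
    file_signatures[header]  # A mutable bytearray can't be hashed
except TypeError as error:
    print(f"Lookup failed: {error}")

# Converting to bytes first makes the lookup work
print(file_signatures[bytes(header)])
```

The conversion to bytes creates a copy, which is the price of getting a hashable snapshot of mutable data.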

That said, if you need a binary sequence that allows for modification, then bytearray is the way to go. Use it when you frequently perform in-place byte operations that involve changing the contents of the sequence, such as appending, inserting, extending, or modifying individual bytes. A scenario where bytearray can be particularly useful includes processing large binary files in chunks or incrementally reading messages from a network buffer.

The third binary sequence type in Python mentioned earlier, memoryview, provides a zero-overhead view into the memory of certain objects. Unlike bytes and bytearray, whose mutability status is fixed, a memoryview can be either mutable or immutable depending on the target object it references. Just like bytes and bytearray, a memoryview may represent a series of single bytes, but at the same time, it can represent a sequence of multi-byte words.
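The following sketch shows how a memoryview inherits its mutability from the object it wraps, and how the same memory can be reinterpreted as multi-byte words. It assumes that the "H" format code (unsigned short) takes two bytes, which holds on mainstream platforms:

```python
immutable = b"spam"
mutable = bytearray(b"spam")

read_only_view = memoryview(immutable)
writable_view = memoryview(mutable)

print(read_only_view.readonly)  # True
print(writable_view.readonly)   # False

# Writing through the view changes the underlying bytearray
writable_view[0] = ord("S")
print(mutable)  # bytearray(b'Spam')

# Reinterpret the same four bytes as 16-bit unsigned words
# (assumes a two-byte unsigned short)
words = writable_view.cast("H")
print(len(writable_view), len(words))  # 4 2

# Release the views so the bytearray can be resized again
words.release()
writable_view.release()
read_only_view.release()
```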

Now that you have a basic understanding of Python’s binary sequence types and where bytearray fits into them, you can explore ways to create and work with bytearray objects in Python.

Creating `bytearray` Objects in Python

Unlike the immutable bytes data type, whose literal form resembles a string literal prefixed with the letter b—for example, b"GIF89a"—the mutable bytearray has no literal syntax in Python. This distinction is important despite many similarities between both byte-oriented sequences, which you’ll discover in the next section.

The primary way to create new bytearray instances is by explicitly calling the type’s class constructor, sometimes informally known as the bytearray() built-in function. Alternatively, you can create a bytearray from a string of hexadecimal digits. You’ll learn about both methods next.

The `bytearray()` Constructor

Depending on the number and types of arguments passed to the bytearray() constructor, you can create mutable byte sequences from various Python objects. Below are the possible signatures of the bytearray() constructor, accepting different arguments:

Python Syntax
      
    
# Argumentless:
bytearray()

# Single argument:
bytearray(length: int)            # Non-negative integer
bytearray(data: Buffer)           # Bytes-like or a buffer object
bytearray(values: Iterable[int])  # Iterable of integers between 0 and 255

# Two or three arguments:
bytearray(text: str, encoding: str)
bytearray(text: str, encoding: str, errors: str = "strict")

According to the information provided above, you can call bytearray() without any arguments, which creates an empty byte array, or you can pass various values to initialize the array with specific content:

Non-negative integer: Creates a zero-filled byte array of the specified length.
Iterable of small integers: Creates a byte array from an iterable of integers in the range of 0 to 255, representing the subsequent byte values.
Bytes-like object or buffer: Creates a mutable copy of the given bytes-like object or an object implementing the buffer protocol.
String and character encoding: Encodes a string into a byte array using the specified character encoding. The optional error-handling strategy allows for graceful handling of characters that don’t have a representation in the given encoding.

To specify an empty array of bytes, you can leverage one of these equivalent techniques:

Python
      
>>> bytearray()
bytearray(b'')

>>> bytearray(0)
bytearray(b'')

>>> bytearray([])
bytearray(b'')

>>> bytearray(b"")
bytearray(b'')

In each case, you get a new bytearray object that initially contains no byte values but still allows you to add them later.

When you call bytearray() with a positive integer as an argument, you create a zero-filled byte array of the specified size, initialized with null bytes (b"\x00"). In other words, each element of the resulting array is a byte with a value of zero. You can take a peek at your array’s content by converting it to a Python list:

Python
      
>>> bytearray(5)
bytearray(b'\x00\x00\x00\x00\x00')

>>> list(bytearray(5))
[0, 0, 0, 0, 0]

As you can see, calling bytearray(5) produces an array of five zeros. Creating such an array of null bytes can be useful in scenarios when you need to initialize a data structure with a known size in advance to reduce the number of memory allocations and fragmentation.
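One place where a preallocated buffer may pay off is reading binary data in fixed-size chunks. The sketch below is only an illustration: io.BytesIO stands in for a real file or socket, and the four-byte chunk size is arbitrary. It relies on the stream’s .readinto() method, which fills an existing writable buffer and returns the number of bytes read:

```python
import io

# io.BytesIO stands in for a real binary file or socket here
stream = io.BytesIO(b"Real Python")

buffer = bytearray(4)  # Allocated once and reused for every chunk
chunks = []

while (count := stream.readinto(buffer)) > 0:
    # Only the first `count` bytes of the buffer are valid for this read
    chunks.append(bytes(buffer[:count]))

print(chunks)  # [b'Real', b' Pyt', b'hon']
```

Because the same bytearray is reused on every iteration, the loop doesn’t need to allocate a fresh buffer for each read.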

You can also pass an iterable of small integers into bytearray() to treat them as standalone byte values. The iterable can be a lazily evaluated object like an iterator or generator, or it can be a sequence with a size known upfront:

Python
      
>>> bytearray(range(65, 91))
bytearray(b'ABCDEFGHIJKLMNOPQRSTUVWXYZ')

>>> bytearray([82, 101, 97, 108, 32, 80, 121, 116, 104, 111, 110])
bytearray(b'Real Python')

In this case, the range() function returns a range object, which you can iterate over, generating numbers on demand without storing them all in memory at once. Conversely, a list of numbers is an example of a random-access sequence that allows for direct retrieval of elements by index.
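Generators work as well. Here’s a small sketch with a hypothetical even_bytes() generator function and a generator expression, both of which produce byte values lazily:

```python
def even_bytes(limit):
    """Yield even numbers below the limit, one at a time."""
    for number in range(0, limit, 2):
        yield number


from_generator = bytearray(even_bytes(10))
print(from_generator)  # bytearray(b'\x00\x02\x04\x06\x08')

from_expression = bytearray(ord(character) for character in "ABC")
print(from_expression)  # bytearray(b'ABC')
```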

Watch out for iterables with incorrect data types or numeric values outside the expected range:

Python
      
        
      
    
>>> bytearray([3.14, 2.72])
Traceback (most recent call last):
  ...
TypeError: 'float' object cannot be interpreted as an integer

>>> bytearray([-1])
Traceback (most recent call last):
  ...
ValueError: byte must be in range(0, 256)

>>> bytearray([256])
Traceback (most recent call last):
  ...
ValueError: byte must be in range(0, 256)

In the first case, you called bytearray() with a list of floating-point numbers as an argument, and in the following two cases, you passed too-small and too-big integer values, respectively. Remember, bytearray() requires an iterable of Python integers that must fall within the range of 0 to 255, which coincides with 8-bit unsigned bytes.

The last single-argument invocation of bytearray() involves passing a bytes-like object or an object implementing the so-called buffer protocol as a parameter. It could be another bytearray or bytes object, for example:

Python
      
>>> binary_data = b"This is a bytes literal"
>>> bytearray(binary_data)
bytearray(b'This is a bytes literal')

Even though it may look like the bytearray is effectively wrapping your bytes object, that isn’t the case. Instead, the code snippet above creates a mutable copy of the original binary sequence, allowing you to modify its contents without affecting the original bytes data:

Python
      
        
      
    
>>> mutable_copy = bytearray(binary_data)
>>> mutable_copy[14:] = b"array"

>>> mutable_copy
bytearray(b'This is a bytearray')

>>> binary_data
b'This is a bytes literal'

>>> binary_data[14:] = b"array"
Traceback (most recent call last):
  ...
TypeError: 'bytes' object does not support item assignment

After creating yet another bytearray instance from the same bytes object that you defined earlier, you assign it to a variable named mutable_copy. Next, you use a slice assignment, which you’ll explore later, to replace a fragment of the resulting bytearray with a different sequence of bytes. This modifies your copy without affecting the original binary_data, demonstrating the mutable nature of bytearray compared to the immutable bytes object.

Another way to create a bytearray object is by passing two arguments to the constructor, both of which must be Python strings. The first argument may represent arbitrary text, while the second argument must be the name of a valid character encoding registered with the codecs module, such as UTF-8 or ISO 8859-1:

Python
      
>>> bytearray("¿Habla español?", "utf-8")
bytearray(b'\xc2\xbfHabla espa\xc3\xb1ol?')

>>> bytearray("¿Habla español?", "iso-8859-1")
bytearray(b'\xbfHabla espa\xf1ol?')

What you get in return is a bytearray instance containing the original text encoded into a sequence of bytes according to the chosen encoding.

Note: Although it’s generally considered Pythonic to encode strings into byte sequences using str.encode() rather than passing equivalent arguments to bytearray(), there are cases where the latter approach can be beneficial. Compare the following two invocations:

bytearray("¿Habla español?".encode("utf-8"))
bytearray("¿Habla español?", "utf-8")

Both techniques produce an identical result. However, the first one creates an intermediate bytes object before making its mutable copy, whereas the second one avoids this extra step, making it slightly more memory efficient. The difference can become noticeable when you work with particularly long strings.

By default, characters that lack a meaningful representation in the specified character encoding will make Python raise an exception. However, you can override this behavior by providing an optional third string argument to bytearray() with an alternative strategy for handling such encoding errors:

Python
      
        
      
    
>>> bytearray("¿Habla español?", "ascii")
Traceback (most recent call last):
  ...
UnicodeEncodeError: 'ascii' codec can't encode character
⮑ '\xbf' in position 0: ordinal not in range(128)

>>> bytearray("¿Habla español?", "ascii", errors="ignore")
bytearray(b'Habla espaol?')

While specifying errors="ignore" allows you to sidestep the encoding error, it can still lead to data loss. You can see this in the example above where the non-ASCII characters ¿ and ñ are omitted from the resulting bytearray. Other strategies include replacing the problematic characters with safe placeholders or rewriting them with the corresponding escape sequences:

Python
      
>>> bytearray("¿Habla español?", "ascii", errors="replace")
bytearray(b'?Habla espa?ol?')

>>> bytearray("¿Habla español?", "ascii", errors="backslashreplace")
bytearray(b'\\xbfHabla espa\\xf1ol?')

This time, instead of skipping the two characters in the output, you either replace them with a question mark (?) or escape them using their hexadecimal codes. For more information about the available strategies, check out the error handlers section in the documentation of the codecs module.
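As one more illustration, the standard xmlcharrefreplace handler substitutes each unencodable character with a numeric XML character reference based on its decimal code point:

```python
text = "¿Habla español?"

encoded = bytearray(text, "ascii", errors="xmlcharrefreplace")
print(encoded)  # bytearray(b'&#191;Habla espa&#241;ol?')
```

Unlike "ignore" and "replace", this strategy keeps enough information to recover the original characters later, provided the consumer understands XML character references.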

Next up, you’ll learn about another method of creating bytearray objects in Python.

The `.fromhex()` Class Method

There’s an alternative way to create a bytearray object in Python, which you may sometimes prefer. You do this by calling bytearray.fromhex() on a string of hexadecimal digits, like so:

Python
      
>>> bytearray.fromhex("30 8C C9 FF")
bytearray(b'0\x8c\xc9\xff')

The signature and behavior of bytearray.fromhex() are analogous to those of bytes.fromhex(). Both are class methods, which you call on the type rather than a particular instance.

Using the hexadecimal system to express byte values is pretty common, as it allows you to represent binary data more compactly than with the decimal or binary systems. Take a look at the following table to see the difference:

Binary	Decimal	Hexadecimal
`00110000`	`48`	`30`
`10001100`	`140`	`8C`
`11001001`	`201`	`C9`
`11111111`	`255`	`FF`

Binary numbers take a lot of space because they require more digits to represent the same value compared to other numeral systems. After all, they’re composed of only two digits: 0 and 1. In contrast, the decimal system provides ten decimal digits (0-9), making numbers quite a bit shorter.

However, switching to the hexadecimal system gives you an additional six letters of the alphabet (A-F) to represent the values ten through fifteen. This lets you conveniently express every 8-bit byte with no more than two hexadecimal digits.
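You can reproduce the rows of the table above and go back from bytes to a hexadecimal string with the .hex() method. This is a short sketch that uses format specifiers to render each byte in the three numeral systems:

```python
data = bytearray.fromhex("30 8C C9 FF")

for byte in data:
    # Binary, decimal, and hexadecimal renderings of the same value
    print(f"{byte:08b}  {byte:>3}  {byte:02X}")

print(data.hex())     # 308cc9ff
print(data.hex(" "))  # 30 8c c9 ff (the separator needs Python 3.8+)

# The round trip through a hexadecimal string preserves the content
print(bytearray.fromhex(data.hex()) == data)  # True
```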

Since bytearray and bytes share over eighty percent of their functionality and are sometimes interchangeable, you’ll first compare them before diving into manipulation techniques.

Comparing `bytearray` to `bytes` Objects

While the bytes data type builds on Python strings, bytearray extends the interface of bytes even further by introducing mutable behavior. It does so through a few additional methods and operators that are missing from the other two data types.

Public Methods

The bytearray type includes all the methods of bytes but, being mutable, also provides eight additional methods designed for in-place modifications:

Method	Description
`.append()`	Append a single item to the end of the `bytearray`.
`.copy()`	Return a copy of the `bytearray`.
`.remove()`	Remove the first occurrence of a value in the `bytearray`.
`.reverse()`	Reverse the order of the values in the `bytearray` in place.
`.pop()`	Remove and return a single item from the `bytearray` at the given index.
`.insert()`	Insert a single item into the `bytearray` before the given index.
`.extend()`	Append all the items from the iterator or sequence to the end of the `bytearray`.
`.clear()`	Remove all items from the `bytearray`.

You’ll explore these methods, among others, in more detail later in this tutorial. For now, keep in mind that the public interface of bytearray forms a superset of the methods and attributes of the bytes data type.

If you’re wondering how to quickly identify the differences between bytearray and bytes data types, then you can use the following code snippet:

Python
      
>>> def public_members(cls):
...     return {name for name in dir(cls) if not name.startswith("_")}
...

>>> public_members(bytearray) - public_members(bytes)
{'append', 'copy', 'remove', 'reverse', 'pop', 'insert', 'extend', 'clear'}

>>> public_members(bytearray) >= public_members(bytes)
True

First, you define a function named public_members(), which uses a set comprehension with a condition to filter out non-public methods and attributes of a class—those whose names start with an underscore (_). Next, you call your function on bytearray and bytes, and then use the set difference operator (-) to identify methods present in bytearray but not in bytes. Finally, you confirm that bytearray includes all public members of bytes.

If those eight methods look familiar to you, then that’s because you might have seen them in other mutable sequence types in Python, such as lists or deques. They let you manipulate the contents of a bytearray in ways that aren’t possible with an immutable bytes object. With them, you can add, remove, and rearrange bytes directly within a given bytearray.
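Here’s a compact sketch that exercises all eight methods on a small bytearray. The comments show the state of data after each call:

```python
data = bytearray(b"ytho")

data.append(ord("n"))     # bytearray(b'ython')
data.insert(0, ord("P"))  # bytearray(b'Python')
data.extend(b"!?")        # bytearray(b'Python!?')

last = data.pop()         # Removes and returns 63, the code of "?"
data.remove(ord("!"))     # bytearray(b'Python')

snapshot = data.copy()    # Independent copy of the current content
data.reverse()            # bytearray(b'nohtyP')
reversed_snapshot = data.copy()
data.clear()              # bytearray(b'')

print(last, snapshot, reversed_snapshot, data)
```

Note that .append(), .insert(), and .remove() take integer byte values rather than bytes objects, which is why the example calls ord() on single characters.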

But calling methods isn’t the only way to mutate an object. You can also change its state using operators, such as concatenation (+), assignment (=), or others supported by the object.

Supported Operators

In addition to supplementing the public interface of bytes with a few extra methods, the bytearray data type introduces some special methods corresponding to Python operators that allow for in-place byte manipulation:

Special Method	Operator	Description
`.__setitem__()`	`=`	Assign a new value to a byte or a slice at the given index of the `bytearray`.
`.__delitem__()`	`del`	Delete a byte or a slice at the given index of the `bytearray`.
`.__iadd__()`	`+=`	Append the bytes of another iterable to the end of the `bytearray`.
`.__imul__()`	`*=`	Repeat the `bytearray` a specified number of times and update it in place.

Again, you’ll take a closer look at these operators later in the tutorial. In the meantime, here’s how you can find them in the bytearray type’s definition:

Python
      
        
      
    
>>> def magic_members(cls):
...     return {name for name in dir(cls) if name.startswith("__")}
...

>>> magic_members(bytearray) - magic_members(bytes)
{
    '__alloc__',
    '__release_buffer__',
    '__delitem__',
    '__iadd__',
    '__imul__',
    '__setitem__'
}

The last four methods in this output, .__delitem__(), .__iadd__(), .__imul__(), and .__setitem__(), implement deletion (del), in-place concatenation (+=), in-place repetition (*=), and assignment (=), respectively. This form of operator overloading allows you to modify a bytearray instance directly instead of creating a new object. Consider this example:

Python
      
        
      
    
>>> buffer = bytearray(b"Real")
>>> buffer += b" Python"
>>> buffer
bytearray(b'Real Python')

>>> buffer + b" is awesome"
bytearray(b'Real Python is awesome')
>>> buffer
bytearray(b'Real Python')

Notice the difference between the augmented assignment operator (+=), which modifies your existing buffer in place without returning anything, and the concatenation operator (+), which creates a brand new bytearray object. This behavior can be particularly useful when you work with large binary data, as it minimizes memory overhead and improves performance.
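One way to confirm which operators work in place is to keep a second reference to the same object and check whether it observes the change. The alias variable below exists only for that purpose:

```python
buffer = bytearray(b"Real")
alias = buffer  # Second name for the same object

buffer += b" Python"    # In place, so alias sees the change
print(alias)            # bytearray(b'Real Python')
print(alias is buffer)  # True

buffer[0] = ord("r")    # Item assignment
buffer[5:] = b"python"  # Slice assignment
del buffer[4]           # Delete the space
print(buffer)           # bytearray(b'realpython')

buffer *= 2             # In-place repetition
print(alias)            # bytearray(b'realpythonrealpython')

buffer = buffer + b"!"  # Creates a new object and rebinds the name
print(alias is buffer)  # False
```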

Have you noticed how bytearray interacts with bytes in the expressions above? You’ll explore this behavior next.

Expressions With `bytearray` and `bytes`

Another point worth noting here is that you can freely mix bytearray and bytes objects in one expression, like before when you concatenated the two. That’s possible because both data types embody sequences of bytes, making them compatible despite one being mutable and the other one not. In particular, when you compare a bytearray instance to a bytes object or the other way around, Python considers them equal if their binary contents are the same:

Python
      
>>> bytearray("Python", "utf-8") == bytes("Python", "utf-8")
True

>>> [1, 2, 3] == (1, 2, 3)
False

When it comes to binary sequences in Python, mutability doesn’t affect equality. This is different from, say, comparing lists and tuples, which don’t compare on their contents alone because their types also matter.

Note that the type of the resulting value may sometimes differ depending on the operand order:

Python
      
>>> bytearray([82, 101, 97, 108]) + b" Python"
bytearray(b'Real Python')

>>> b"Real" + bytearray([32, 80, 121, 116, 104, 111, 110])
b'Real Python'

When evaluating an expression that involves a binary operator, such as the plus operator (+) in the example above, Python checks the left operand for the respective implementation. If the left operand is a bytearray object, then Python invokes its .__add__() special method, which returns a new bytearray instance. Conversely, if the left operand is a bytes object, its own version of .__add__() kicks in, producing another bytes object as a result.

Note: In custom classes, you can implement the right-hand side versions of special methods, such as .__radd__(), so that your instances will correctly handle the operation even when they appear on the right side of a binary operator.

To obtain predictable output, you can always cast your operands to the desired types before performing an operation on them. Alternatively, you may convert the result afterward:

Python
      
>>> bytes(bytearray([82, 101, 97, 108])) + b" Python"
b'Real Python'

>>> bytearray(b"Real") + bytearray([32, 80, 121, 116, 104, 111, 110])
bytearray(b'Real Python')

>>> bytearray(b"Real" + b" Python")
bytearray(b'Real Python')

In the first example, you convert a bytearray on the left to bytes before concatenating it with a bytes literal on the right. In the second example, you perform the conversion in the opposite direction, while in the last example, you convert the concatenated result regardless of its type.

The ability to modify bytearray objects is what sets them apart from bytes, making them particularly useful in scenarios where performance and memory efficiency matter. Now that you understand how bytearray extends bytes with additional methods and operators, you’ll explore practical ways to manipulate bytearray objects in Python.

Manipulating `bytearray` Objects in Python

At this point, you know that Python’s bytearray and bytes are binary sequences, which give you the ability to process individual byte values. While bytes is an immutable type modeled after Python strings, bytearray adds support for mutability through a few extra methods and operators, which you’ll take a closer look at in this section. After all, the key advantage of bytearray over bytes lies in its ability to be modified in place, saving memory.

String-Like Operations

Since bytearray is a mutable extension of the bytes data type, it also shares a strong connection with Python strings. Nearly forty string methods have been carried over from str to bytes and bytearray, letting you treat those binary sequences much like text strings. However, there’s a catch.

Although you may expect bytearray to modify its contents in place when you perform string-like operations like .replace(), that’s not the case. Instead, these methods return a modified copy of the original data rather than altering it directly. This distinction can make it tricky to remember which bytearray operations mutate the object in place and which produce a copy.
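The sketch below demonstrates the copy-returning behavior of .replace() and one possible way to store the result back in the same object using a full-slice assignment:

```python
data = bytearray(b"Real Python")
original_id = id(data)

result = data.replace(b"Real", b"Monty")
print(result)  # bytearray(b'Monty Python')
print(data)    # bytearray(b'Real Python'), unchanged

# Full-slice assignment writes the result back into the same object
data[:] = data.replace(b"Real", b"Monty")
print(data)                     # bytearray(b'Monty Python')
print(id(data) == original_id)  # True
```

Keep in mind that this still builds a temporary copy before overwriting the original content, so it preserves object identity rather than saving memory.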

To find the methods common to bytearray and str—and bytes—you can use the set intersection operator (&) as shown below:

Python
      
        
      
    
>>> for i, name in enumerate(
...     sorted(
...         name for name in set(dir(bytearray)) & set(dir(str))
...         if not name.startswith("_")
...     ),
...     start=1
... ):
...     print(f"({i:>2}) .{name}()")
...
( 1) .capitalize()
( 2) .center()
( 3) .count()
( 4) .endswith()
  ⋮      ⋮
(38) .upper()
(39) .zfill()

Reading the loop header from the inside out, you use a generator expression to iterate over the resulting set intersection, keeping only public method names. Next, you sort them alphabetically in ascending order, enumerate them starting from one, and print the formatted method names.

These methods work in a similar fashion to their string counterparts. For example, you can obtain an uppercase copy of your bytearray by calling its .upper() method:

Python
      
>>> binary_data = bytearray(b"Real Python")
>>> binary_data.upper()
bytearray(b'REAL PYTHON')

That doesn’t look surprising since your bytearray contains only English alphabet letters. But what about when you throw non-letter characters—or even byte values outside the ASCII range—into the mix? You can find out by defining a slightly different sequence of bytes, such as this one:

Python
      
>>> binary_data = bytearray([202, 254, 186, 190])

>>> binary_data
bytearray(b'\xca\xfe\xba\xbe')

>>> binary_data.upper()
bytearray(b'\xca\xfe\xba\xbe')

In this case, none of the bytes encode an ASCII character, so Python displays their numeric values using the hexadecimal notation. Calling .upper() has no effect because these bytes lack uppercase equivalents. This behavior is similar to how strings preserve non-letter characters, such as digits:

Python
      
>>> "42".upper()
'42'

Although digits are printable characters, they’re not letters, so the concepts of uppercase and lowercase don’t apply to them.

But, calling the same method on a bytearray and a string can yield vastly different results, even when both represent the same piece of data. Check out this example:

Python
      
        
      
    
>>> binary_data = bytearray("café", "utf-8")
>>> binary_data
bytearray(b'caf\xc3\xa9')
>>> str(binary_data.upper(), "utf-8")
'CAFé'
>>> str(binary_data, "utf-8").upper()
'CAFÉ'

You begin by creating a bytearray from the string "café" using the UTF-8 encoding. Within this character encoding, the Unicode letter é is represented by two separate bytes, whose decimal values are 195 and 169, neither of which is part of the ASCII character set.

When you call .upper() on the resulting bytearray and later convert it back into a string, you get "CAFé". This happens because your method call only affects ASCII characters in the bytearray, leaving the non-ASCII byte sequence for the letter é unchanged. As a result, the é remains in its original form when decoded back into a string.

In contrast, when you convert the bytearray before calling .upper() on the resulting string, you get the expected "CAFÉ". To sum it up, string methods defined in bytearray work on individual byte values, whereas their string prototypes operate on whole Unicode characters.
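The same byte-versus-character distinction shows up in other operations, such as measuring length or classifying content. Here’s a brief sketch, followed by one way to uppercase the text correctly by decoding it first:

```python
text = "café"
data = bytearray(text, "utf-8")

# Four Unicode characters occupy five bytes in UTF-8
print(len(text), len(data))  # 4 5

# Only ASCII letters count as alphabetic in binary sequences
print("é".isalpha())                      # True
print(bytearray("é", "utf-8").isalpha())  # False

# Decode, transform the text, and encode it again
shouted = bytearray(data.decode("utf-8").upper(), "utf-8")
print(shouted)  # bytearray(b'CAF\xc3\x89')
```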

Immutable Sequence Operations

Because bytearray objects fall under the category of sequence types in Python, they support all the common sequence operations you’d expect from an immutable tuple or string.

For example, a bytearray object supports indexing and slicing through the square bracket syntax:

Python
      
>>> binary_data = bytearray(b"Monty Python")

>>> binary_data[-1]
110

>>> binary_data[-1:]
bytearray(b'n')

>>> binary_data[6:]
bytearray(b'Python')

When you access a specific byte at the given index in the array, you get its integer value. For example, the last byte at index -1 has a value of 110, which corresponds to the ASCII character "n". At the same time, slicing a bytearray always yields a new subarray, even if it only contains one element. The same is true of bytes objects, but not strings, which return another string object regardless of whether you index or slice them.

The bytearray has a known size, so you can measure its length at any time. It also supports iteration over the byte elements, and it lets you create a reversed copy by passing the iterator returned by the reversed() built-in function to the bytearray() constructor:

Python
      
        
      
    
>>> len(binary_data)
12

>>> for byte in binary_data:
...     print(f"{byte:x}: {chr(byte)!r}")
...
4d: 'M'
6f: 'o'
6e: 'n'
74: 't'
79: 'y'
20: ' '
50: 'P'
79: 'y'
74: 't'
68: 'h'
6f: 'o'
6e: 'n'

>>> reversed(binary_data)
<reversed object at 0x7e28e3f607c0>

>>> bytearray(reversed(binary_data))
bytearray(b'nohtyP ytnoM')

In the code snippet above, you call the len() function to get the number of bytes in your bytearray. Next, you iterate over the array and use an f-string literal to format each byte as a hexadecimal number (x), and then you call the chr() function to reveal the corresponding character. A bit later, you create a new bytearray object from an iterator of elements in reversed order.

You can find the starting index and the number of occurrences of a subsequence, as well as test for membership in a bytearray:

Python
      
>>> binary_data.index(b"Python")
6

>>> binary_data.count(b"Python")
1

>>> b"Python" in binary_data
True

>>> b"Python" not in binary_data
False

While you look for byte sequences expressed as bytes literals in these examples, you could just as well provide bytearray instances or other bytes-like objects as arguments instead.
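These operations also accept a single integer between 0 and 255 in place of a subsequence. The sketch below shows that, along with the difference between .find(), which returns -1 for a missing subsequence, and .index(), which raises an exception:

```python
binary_data = bytearray(b"Monty Python")

print(ord("n") in binary_data)      # True
print(binary_data.count(ord("y")))  # 2
print(binary_data.index(ord("P")))  # 6

print(binary_data.find(b"Spam"))    # -1

try:
    binary_data.index(b"Spam")
except ValueError as error:
    print(f"Not found: {error}")
```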

Apart from in and not in, two other binary operators commonly used with Python sequences are the concatenation (+) and repetition (*) operators, which work equally well with bytearray objects:

Python
      
>>> binary_data + b"'s Flying Circus"
bytearray(b"Monty Python\'s Flying Circus")

>>> bytearray(b"spam ") * 3
bytearray(b'spam spam spam ')

In both cases, you end up creating new bytearray objects while your original byte sequences remain unchanged.

This wraps up the overview of immutable sequence operations. Now, it’s time to move on and explore the mutable sequence operations that bytearray objects also support.

Mutable Sequence Operations

Because bytearray is a mutable sequence, it provides a few extra methods and supports operators that you saw earlier, allowing you to modify its contents without creating expensive copies. This makes bytearray an excellent choice for tasks that require frequent updates or modifications to binary data.

For instance, you’ll often want to change the value of a byte positioned at a particular index in your bytearray. Say you’re working with a sequence of pixel values representing a monochromatic image, and you want to create its negative by inverting all pixels using a loop:

Python
      
>>> pixels = bytearray([48, 140, 201, 252, 186, 3, 37, 186, 52])
>>> for i in range(len(pixels)):
...     pixels[i] = 255 - pixels[i]
...

Each byte represents a relative light intensity, with 0 denoting complete darkness and 255 indicating maximum brightness. Inverting a pixel involves subtracting its current value from 255, effectively reversing its brightness.
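As an alternative to the explicit loop, you could build a translation table and let the .translate() method do the mapping. This is just a sketch of another approach, and the inversion_table name is made up for this example. Note that .translate() returns a new bytearray, so a full-slice assignment is used here to store the result in the existing object:

```python
pixels = bytearray([48, 140, 201, 252, 186, 3, 37, 186, 52])

# Translation table that maps every byte value v to 255 - v
inversion_table = bytes(range(255, -1, -1))

negative = pixels.translate(inversion_table)  # Returns a new bytearray
print(list(negative))
# [207, 115, 54, 3, 69, 252, 218, 69, 203]

# Slice assignment stores the result in the existing object
pixels[:] = negative
```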

When performing a single item assignment, as in pixels[i] = 255 - pixels[i], you must always provide an integer value to store at the specified index. Because a bytearray is a sequence of unsigned bytes, you can only provide integers between 0 and 255, or else Python will raise an error:

Python
      
        
      
    
>>> pixels[0] = -1
Traceback (most recent call last):
  ...
ValueError: byte must be in range(0, 256)

>>> pixels[0] = 256
Traceback (most recent call last):
  ...
ValueError: byte must be in range(0, 256)

Neither -1 nor 256 represents a valid byte value, so Python prevents their assignment to ensure data integrity. A similar problem may occur when you try to increase or decrease the brightness of pixels without accounting for possible overflow and underflow errors:

Python
      
        
      
    
>>> for i in range(len(pixels)):
...     pixels[i] *= 2
...
Traceback (most recent call last):
  ...
ValueError: byte must be in range(0, 256)

Here, you use one of the augmented assignment operators, which is a shorthand notation for pixels[i] = pixels[i] * 2, to double the pixel intensity. However, the result can exceed the valid range for a byte. In this case, doubling the very first pixel, 207, yields 414, which causes a ValueError and stops the loop before any pixel changes.
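One common way to sidestep the error is to clamp each result to the valid range before storing it. Here’s a minimal sketch, assuming that saturating at 0 and 255 is acceptable for your image. The brighten() helper is hypothetical and not part of bytearray:

```python
def brighten(pixels: bytearray, factor: float) -> None:
    """Scale every pixel in place, saturating at the byte limits."""
    for i, value in enumerate(pixels):
        # Clamp to 0..255 so the assignment can never raise ValueError.
        pixels[i] = max(0, min(255, round(value * factor)))


pixels = bytearray([207, 115, 54, 3, 69, 252, 218, 69, 203])
brighten(pixels, 2)
print(list(pixels))  # [255, 230, 108, 6, 138, 255, 255, 138, 255]
```

Clamping loses detail in the brightest areas, so whether it’s appropriate depends on your use case.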

Note: Don’t confuse the augmented concatenation (+=) and repetition (*=) operators of a bytearray with the augmented assignment operators of its individual elements. Here’s an example that demonstrates a subtle difference between them:

Python
      
        
      
    
>>> binary_data = bytearray(b"Why?")
>>> binary_data *= 2
>>> binary_data
bytearray(b'Why?Why?')

>>> binary_data = bytearray(b"Why?")
>>> binary_data[-1] *= 2
>>> binary_data
bytearray(b'Why~')

When you apply the operator to a bytearray, it repeats the entire byte sequence. However, when you apply it to an individual element of the same bytearray, it multiplies the byte value itself, resulting in a different ASCII representation.

Given your past experience with Python strings—where each character is essentially a string of length one—you might be tempted to directly assign a bytes-like object to a bytearray at a specific index, like this:

Python
      
>>> pixels[0] = b"\xff"
Traceback (most recent call last):
  ...
TypeError: 'bytes' object cannot be interpreted as an integer

Unfortunately, even though the bytes literal on the right contains only one element, Python won’t unpack it for you. That said, you can assign a bytes-like object, such as bytes or bytearray, or even any other sequence of small integers when you use the slice assignment:

Python
      
>>> pixels[3:6] = (0, 0, 0)
>>> list(pixels)
[207, 115, 54, 0, 0, 0, 218, 69, 203]

With this approach, you replace a fragment of pixels with another subsequence of the same length. This behavior is similar to how list slicing works in Python. If the length of the sequence you’re assigning doesn’t match the length of the slice being replaced, then Python will automatically shrink or expand your bytearray to accommodate the new sequence.
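The following sketch shows both the shrinking and the expanding case, along with two ways to store a one-element bytes object at a given position. The values are arbitrary and chosen only for illustration:

```python
data = bytearray(b"ABCDEF")

# Replacing three bytes with one shrinks the bytearray.
data[1:4] = b"-"
print(data)  # bytearray(b'A-EF')

# Replacing one byte with three expands it.
data[1:2] = b"BCD"
print(data)  # bytearray(b'ABCDEF')

# To store a single-element bytes object at an index, either index
# into it to get an integer or assign it to a one-element slice.
data[0] = b"\xff"[0]
data[5:6] = b"\x00"
print(data)  # bytearray(b'\xffBCDE\x00')
```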

Mutability isn’t only about changing existing values or adding new ones, but also about deletion. You can delete a single byte or a whole slice from a bytearray with the help of Python’s del statement:

Python
      
>>> del pixels[3:6]
>>> list(pixels)
[207, 115, 54, 218, 69, 203]

>>> del pixels[3]
>>> list(pixels)
[207, 115, 54, 69, 203]

Just be sure to provide the correct byte indices when deleting content to avoid an IndexError.
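The distinction between single indices and slices matters here. In the sketch below, which uses made-up values, an out-of-range index raises an error, while an out-of-range slice is silently clipped to the bytes that exist:

```python
data = bytearray([1, 2, 3])

try:
    del data[10]  # A single out-of-range index isn't allowed.
except IndexError as error:
    print(f"IndexError: {error}")

# A slice reaching past the end deletes only the bytes that exist.
del data[1:10]
print(list(data))  # [1]
```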

Note that del is a statement, which causes a side effect, but it doesn’t evaluate to a value like an expression does. If you’d like to delete a bytearray element while intercepting its value, then call the .pop() method with an index as an argument:

Python
      
>>> pixels.pop(3)
69

>>> pixels.pop()
203

>>> list(pixels)
[207, 115, 54]

By default, the index parameter is equal to -1, indicating the last item in the array. So, when you call .pop() without providing any argument, you’ll remove and return the rightmost byte.
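Because .pop() defaults to the last item, a bytearray can double as a simple stack of bytes. The sketch below uses arbitrary values and also shows that popping from an empty bytearray raises an IndexError:

```python
stack = bytearray()
stack.append(10)
stack.append(20)

print(stack.pop())    # 20, the most recently appended byte
print(stack.pop(-1))  # 10, using the default index explicitly

try:
    stack.pop()  # Nothing left to remove.
except IndexError as error:
    print(f"IndexError: {error}")
```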

On the other hand, if you don’t know the exact index of the byte to delete but know its value, you can remove its first occurrence by searching for it from the left:

Python
      
>>> pixels.remove(115)
>>> list(pixels)
[207, 54]

If the sequence contains any duplicates that you wish to remove, then you’ll have to rinse and repeat as many times as needed. The lookup always starts from the left.
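One way to do so is sketched below. The remove_all() helper is hypothetical and not a bytearray method:

```python
def remove_all(data: bytearray, value: int) -> int:
    """Remove every occurrence of value in place and return the count."""
    removed = 0
    while value in data:
        data.remove(value)  # Always removes the leftmost occurrence.
        removed += 1
    return removed


pixels = bytearray([207, 54, 207, 0, 207])
print(remove_all(pixels, 207))  # 3
print(list(pixels))  # [54, 0]
```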

To delete everything in one go, you can call the .clear() method:

Python
      
>>> pixels.clear()
>>> pixels
bytearray(b'')

This method doesn’t return anything but irreversibly removes all the bytes from your bytearray, leaving it empty.
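Clearing in place isn’t the same as rebinding a name to a fresh, empty bytearray, because every other reference to the same object sees the change. Here’s a minimal sketch with illustrative names:

```python
buffer = bytearray(b"payload")
alias = buffer  # Second name bound to the same object.

buffer.clear()
print(alias)  # bytearray(b'')

buffer = bytearray(b"payload")
alias = buffer
buffer = bytearray()  # Rebinding leaves the original object intact.
print(alias)  # bytearray(b'payload')
```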

Once you’ve cleared your array, you’ll want to populate it again with new data. To add a single byte into the bytearray, you can either append one to the right end of the array or insert it before the specified index. Again, the new element has to be an integer within the valid range:

Python
      
        
      
    
>>> pixels.append(65)  # A
>>> pixels.append(67)  # C
>>> pixels.insert(1, 66)  # B
>>> pixels
bytearray(b'ABC')

The byte values 65, 66, and 67 correspond to ASCII letters A, B, and C, respectively. Notice how you first append A and C, only later inserting B between them.
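Like list.insert(), the method also accepts negative and out-of-range indices. The following sketch, with arbitrary values, shows how a few of them behave:

```python
letters = bytearray(b"BD")

letters.insert(0, 65)    # Before index 0: A is prepended.
letters.insert(-1, 67)   # Before the last byte: C lands ahead of D.
letters.insert(100, 69)  # Past the end: E is appended.

print(letters)  # bytearray(b'ABCDE')
```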

If you have multiple bytes stored in an iterable, then you can append them all to the right end of your bytearray by calling the .extend() method:

Python
      
>>> pixels.extend((1, 2, 3))
>>> pixels
bytearray(b'ABC\x01\x02\x03')

This method ensures that you append all bytes in a single step, which is more efficient and concise than looping through each byte and adding it individually.
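The method isn’t limited to tuples. As the sketch below suggests, any iterable that yields integers between 0 and 255 works, including bytes-like objects, ranges, and generator expressions:

```python
data = bytearray()

data.extend(b"AB")                 # A bytes-like object
data.extend(range(67, 69))         # A range of integers: C and D
data.extend(ord(c) for c in "EF")  # A generator expression

print(data)  # bytearray(b'ABCDEF')
```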

The final two methods available in a mutable sequence enable an in-place reversal of its elements, as well as making a copy of the sequence:

Python
      
>>> pixels.reverse()
>>> pixels
bytearray(b'\x03\x02\x01CBA')

>>> pixels.copy()
bytearray(b'\x03\x02\x01CBA')

>>> pixels is pixels.copy()
False

The .reverse() method directly alters your bytearray, changing the order of its elements without creating a new object. On the other hand, the .copy() method creates a new sequence with the same bytes, allowing you to work with a duplicate. Note that the resulting copy has a unique identity, which is different from that of your original bytearray.
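Because the copy is a separate object, changing it leaves the original untouched. If you want a reversed copy without altering the original, then slicing with a negative step is one option. Here’s a minimal sketch with illustrative names:

```python
original = bytearray(b"ABC")

duplicate = original.copy()
duplicate[0] = ord("X")  # Only the copy changes.

flipped = original[::-1]  # Reversed copy; original keeps its order.

print(original)   # bytearray(b'ABC')
print(duplicate)  # bytearray(b'XBC')
print(flipped)    # bytearray(b'CBA')
```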

You’ve covered behaviors typical of strings and sequences in Python, but bytearray objects are also designed to store binary data. Next, you’ll learn about two methods specific to this data type.

Byte-Specific Operations

Both bytearray and bytes expose an instance method called .hex(), which is the opposite of .fromhex() that you learned about earlier. While .fromhex() allowed you to create a new bytearray object from a string of hexadecimal digits, .hex() converts an existing bytearray into its hexadecimal representation:

Python
      
>>> binary_data = bytearray([48, 140, 201, 0])
>>> binary_data.hex()
'308cc900'

Notice that each byte is always represented by exactly two hexadecimal digits, padded with a leading zero when a single digit would technically be sufficient. This makes such strings unambiguous, because every byte boundary falls after an even number of digits.

To make this distinction even clearer visually, you can provide one or two optional parameters:

Python
      
>>> binary_data.hex(":")
'30:8c:c9:00'

>>> binary_data.hex(":", -3)
'308cc9:00'

>>> binary_data.hex(":", 3)
'30:8cc900'

The first parameter is the separator, which must be a single-character string that will be placed between each pair of hexadecimal digits in the output. The other parameter is the number of bytes to group together before applying the separator. When this number is negative, the grouping starts from the left. Otherwise, it starts from the right.
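Keep in mind that .fromhex() skips whitespace but not other separators, so converting a delimited string back into bytes takes an extra step. Here’s a sketch that assumes a colon-separated format, such as the one commonly used for MAC addresses. The address itself is made up:

```python
mac = bytearray.fromhex("001a2b3c4d5e")

text = mac.hex(":")
print(text)  # 00:1a:2b:3c:4d:5e

# Strip the separators before parsing the digits again.
restored = bytearray.fromhex(text.replace(":", ""))
print(restored == mac)  # True

# Spaces need no special handling because .fromhex() ignores them.
print(bytearray.fromhex(mac.hex(" ", 2)) == mac)  # True
```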

Another commonly used method that you’ll encounter in the wild is one for converting a bytearray into a Python string:

Python
      
>>> binary_data = bytearray(b"caf\xc3\xa9")
>>> binary_data.decode("utf-8")
'café'

The .decode() method takes an optional character encoding, which defaults to UTF-8 when you omit it. As a rule of thumb, it’s still best to provide the character encoding explicitly to make your intent clear and avoid surprises.
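To see why the choice of encoding matters, you can decode the same bytes with two different codecs. This sketch reuses the bytes from the example above:

```python
binary_data = bytearray(b"caf\xc3\xa9")

# UTF-8 reads the last two bytes as a single character.
print(binary_data.decode("utf-8"))    # café

# Latin-1 maps every byte to its own character instead.
print(binary_data.decode("latin-1"))  # cafÃ©
```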

Just as the bytearray() constructor allowed you to define an error-handling strategy for encoding a string into a mutable sequence of bytes, .decode() lets you specify the same strategy when you want to interpret bytes as a string:

Python
      
>>> binary_data.decode("ascii", errors="ignore")
'caf'

The errors parameter accepts a few predefined values, which you can find in the official documentation.
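For a quick comparison, the sketch below decodes the same bytes as ASCII with a few of the standard handlers, namely "strict", which is the default, "replace", and "backslashreplace":

```python
binary_data = bytearray(b"caf\xc3\xa9")

try:
    binary_data.decode("ascii")  # errors="strict" is the default.
except UnicodeDecodeError as error:
    print(f"UnicodeDecodeError: {error.reason}")

# Each undecodable byte becomes U+FFFD, the replacement character.
print(binary_data.decode("ascii", errors="replace"))

# Each undecodable byte becomes a backslash escape sequence.
print(binary_data.decode("ascii", errors="backslashreplace"))  # caf\xc3\xa9
```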

That concludes your overview of Python’s bytearray. With this knowledge, you can efficiently manipulate binary data and perform operations that require mutable sequences of bytes.

Conclusion

You’ve explored Python’s bytearray type, a mutable sequence of bytes that allows for efficient binary data manipulation. You learned how bytearray compares to bytes, how to create and modify bytearray objects, and how to leverage its mutable behaviors for working with binary data. Along the way, you saw a few practical examples of using bytearray in real-world scenarios, such as handling image pixels and performing character encoding conversions.

In this tutorial, you’ve learned:

How to create bytearray instances from various sources
The key differences between bytearray and bytes
Ways to modify bytearray contents in place using methods and operators
Practical use cases, such as handling image data and encoding text

With this knowledge, you can confidently use bytearray in your projects when you need mutable binary sequences. To continue your exploration of Python’s binary data handling, check out memoryview, which offers a zero-copy way to interact with binary buffers.

Frequently Asked Questions

Now that you have some experience with Python’s bytearray data type, you can use the questions and answers below to check your understanding and recap what you’ve learned.

These FAQs are related to the most important concepts you’ve covered in this tutorial.

A bytearray is one of three binary sequence types in Python. More specifically, it’s a mutable sequence of bytes, allowing you to modify its contents after creation.

The bytearray data type is mutable, meaning you can change its contents, while the bytes data type is immutable and can’t be modified once created.

You can create a bytearray by calling the bytearray() constructor with various types of arguments, such as a non-negative integer, an iterable of integers, a bytes-like object, or a string with a specified encoding. Alternatively, you can use bytearray.fromhex() to interpret a string of hexadecimal digits as a byte sequence.

Yes, you can modify a bytearray by changing, appending, or inserting the individual bytes, and using various methods to manipulate its contents.

Common uses for bytearray include processing binary files or streams, particularly when handling large files in chunks, as well as working with low-level network protocols that require mutable sequences.
