Convert Unicode String to a Byte String in Python
Last Updated: 30 Jan, 2024
Python is a versatile programming language known for its simplicity and readability. Unicode support is a crucial aspect of Python, allowing developers to handle characters from various scripts and languages. In Python 3, every str object is a Unicode string. However, there are instances where you might need to convert a Unicode string to a byte string, for example before writing it to a binary file or sending it over a network. In this article, we will explore four ways to achieve this in Python.
Convert A Unicode String to a Byte String In Python
Below are the ways to convert a Unicode string to a byte string in Python.
- Using encode() with UTF-8
- Using encode() with a Different Encoding
- Using bytes() Constructor
- Using str.encode() Method
Convert A Unicode String to a Byte String Using encode() with UTF-8
In this example, a Unicode string containing English and Chinese characters is encoded to a byte string using UTF-8 encoding. The resulting `bytes_representation` is printed, demonstrating the transformation of the mixed-language Unicode string into its byte representation suitable for storage or transmission in a UTF-8 encoded format.
Python3
unicode_string = "Hello, 你好"
bytes_representation = unicode_string.encode('utf-8')
print(bytes_representation)
Output
b'Hello, \xe4\xbd\xa0\xe5\xa5\xbd'
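As a small sketch of what the UTF-8 result means in practice, the snippet below compares the character count with the byte count and decodes the bytes back to the original string. The variable names are illustrative and not part of the example above.

```python
text = "Hello, 你好"
encoded = text.encode("utf-8")

# In UTF-8 each ASCII character takes 1 byte and each of these
# two Chinese characters takes 3 bytes.
char_count = len(text)
byte_count = len(encoded)
print(char_count)  # 9
print(byte_count)  # 13

# Decoding with the same encoding restores the original string.
decoded = encoded.decode("utf-8")
print(decoded == text)  # True
```

The byte count is larger than the character count because UTF-8 is a variable-length encoding.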
Unicode String To Byte String Using encode() with a Different Encoding
In this example, the Unicode string is encoded into a byte string using UTF-16 encoding. The output begins with the two-byte byte order mark (BOM) \xff\xfe, which here indicates little-endian byte order, and each character of this string then occupies two bytes. The byte string is printed to show the UTF-16 encoded representation of the Unicode characters.
Python3
unicode_string = "Hello, 你好"
byte_string_utf16 = unicode_string.encode('utf-16')
# Displaying the byte string
print(byte_string_utf16)
Output
b'\xff\xfeH\x00e\x00l\x00l\x00o\x00,\x00 \x00`O}Y'
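The plain 'utf-16' codec prepends a byte order mark, while the 'utf-16-le' and 'utf-16-be' codecs fix the byte order and omit it. The following sketch, with illustrative variable names, shows the difference.

```python
import codecs

text = "Hello, 你好"

with_bom = text.encode("utf-16")   # BOM followed by the encoded text
le = text.encode("utf-16-le")      # little-endian, no BOM
be = text.encode("utf-16-be")      # big-endian, no BOM

# The first two bytes of the 'utf-16' result are one of the two BOMs.
has_bom = with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
print(has_bom)                     # True

# The BOM accounts for the two extra bytes.
print(len(with_bom) - len(le))     # 2

# The same character 'H' in the two byte orders.
print(le[:2])                      # b'H\x00'
print(be[:2])                      # b'\x00H'

# Decoding with 'utf-16' reads the BOM and restores the string.
print(with_bom.decode("utf-16") == text)  # True
```

Choosing an explicit byte order can be useful when the receiving side expects a specific order and no BOM.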
Convert A Unicode String To Byte String Using bytes() Constructor
In this example, the Unicode string is converted to a byte string using the bytes() constructor with UTF-8 encoding. The resulting `byte_string_bytes` represents the UTF-8 encoded byte sequence of the mixed-language Unicode string.
Python3
unicode_string = "Hello, 你好"
byte_string_bytes = bytes(unicode_string, 'utf-8')
# Displaying the byte string
print(byte_string_bytes)
Output
b'Hello, \xe4\xbd\xa0\xe5\xa5\xbd'
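When bytes() is given a string, the encoding argument is required. The sketch below, using illustrative names, checks that bytes() and encode() give the same result and shows the TypeError raised when the encoding is left out.

```python
text = "Hello, 你好"

# bytes(text, encoding) and text.encode(encoding) produce the same bytes.
same_result = bytes(text, "utf-8") == text.encode("utf-8")
print(same_result)  # True

# Passing a str without an encoding raises TypeError.
try:
    bytes(text)
    raised_type_error = False
except TypeError:
    raised_type_error = True
print(raised_type_error)  # True
```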
Python Unicode String To Byte String Using str.encode() Method
In this example, the Unicode string is transformed into a byte string by calling the encode() method through the str class and passing the string as the first argument. This is equivalent to calling unicode_string.encode('utf-8') directly. The resulting `byte_string_str_encode` represents the UTF-8 encoded byte sequence of the mixed-language Unicode string.
Python3
unicode_string = "Hello, 你好"
byte_string_str_encode = str.encode(unicode_string, 'utf-8')
# Displaying the byte string
print(byte_string_str_encode)
Output
b'Hello, \xe4\xbd\xa0\xe5\xa5\xbd'
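The encodings used in the examples above can represent every character in the string. With a narrower encoding such as ASCII, encode() raises UnicodeEncodeError unless the optional errors argument is set. The following is a minimal sketch of three standard error handlers, with illustrative variable names.

```python
text = "Hello, 你好"

# By default (errors="strict") unencodable characters raise an exception.
try:
    text.encode("ascii")
    raised_encode_error = False
except UnicodeEncodeError:
    raised_encode_error = True
print(raised_encode_error)  # True

# "replace" substitutes a question mark for each unencodable character.
replaced = text.encode("ascii", errors="replace")
print(replaced)  # b'Hello, ??'

# "ignore" drops unencodable characters.
ignored = text.encode("ascii", errors="ignore")
print(ignored)   # b'Hello, '

# "backslashreplace" writes an escape sequence for each such character.
escaped = text.encode("ascii", errors="backslashreplace")
print(escaped)   # b'Hello, \\u4f60\\u597d'
```

The "replace" and "ignore" handlers lose information, so the original string cannot be recovered from their output.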