Super Kai (Kazuya Ito)

Byte string in Python (2)

bytes() can create an immutable byte string whose type is bytes from a bytes-like object, an int or an iterable of ints, or can encode a string, as shown below:

*Memos:

  • The 1st argument is source(Optional-Default:b''-Type:bytes-like object/int/iterable(int) or Required-Type:str): *Memos:
    • It's optional, defaults to b'' and takes a bytes-like object, an int or an iterable of ints if encoding isn't set. *An int n gives n null bytes(\x00), not a single byte whose value is n.
    • It's required and must be a str if encoding, or encoding and errors, is/are set. The str is encoded in the same way as str.encode().
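  • The 2nd argument is encoding(Optional-Type:str): *Memos:
    • It must be set if source is a str because bytes() has no default encoding, unlike str.encode() whose default is 'utf-8'. *A str without encoding raises TypeError.
    • 'utf-8', 'utf-7', 'utf-16', 'big5', 'ascii', etc. can be set to it.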
    • You can see Standard Encodings for more possible values.
  • The 3rd argument is errors(Optional-Default:'strict'): *Memos:
    • It controls how encoding errors are handled with the error handlers 'strict', 'ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace', etc. *It can be set only together with encoding.
    • 'strict' raises UnicodeEncodeError(a subclass of UnicodeError) if a character which cannot be encoded exists.
    • 'ignore' drops the characters which cannot be encoded.
    • 'replace' replaces each character which cannot be encoded with ?.
    • 'xmlcharrefreplace' replaces each character which cannot be encoded with an XML character reference e.g. &#1105;.
    • 'backslashreplace' replaces each character which cannot be encoded with a backslashed escape sequence(\xhh, \uxxxx or \Uxxxxxxxx) e.g. \u0451, which is shown as \\u0451 in the printed result.
    • You can see Error Handlers in the codecs documentation for more error handlers.
    • You can create your own error handler with codecs.register_error().
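
As a minimal sketch of how source, encoding and errors interact, the snippet below tries a few argument combinations. The helper name try_bytes is hypothetical and exists only for this illustration; it returns either the created byte string or the name of the exception raised.

```python
def try_bytes(*args, **kwargs):
    """Hypothetical helper: return bytes(...) or the raised exception's name."""
    try:
        return bytes(*args, **kwargs)
    except (TypeError, UnicodeError) as e:
        return type(e).__name__

# A str needs encoding because bytes() has no default encoding.
no_encoding = try_bytes('abc')

# errors alone isn't enough for a str.
errors_only = try_bytes('abc', errors='strict')

# encoding is rejected if source isn't a str.
encoding_without_str = try_bytes(b'abc', encoding='utf-8')

# With encoding, bytes() gives the same result as str.encode().
same_as_encode = bytes('abc', encoding='utf-8') == 'abc'.encode('utf-8')

print(no_encoding, errors_only, encoding_without_str, same_as_encode)
# TypeError TypeError TypeError True
```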
v = bytes()                       # Empty byte string
v = bytes(source=b'12')           # Byte String
v = bytes(source=12)              # Integer
v = bytes(source=[1, 2, 3])       # List(int)
v = bytes(source=(1, 2, 3))       # Tuple(int)
v = bytes(source={1, 2, 3})       # Set(int)
v = bytes(source={ord('a'):'A'})  # Dictionary(int:str)
v = bytes(source=iter([1, 2, 3])) # Iterator(int)
# No error

print(type(bytes(source=b'12')))
# <class 'bytes'>

v = bytes(source=1.2)             # Floating-point number
v = bytes(source=1.2+3.4j)        # Complex number
v = bytes(source=lambda: 10)      # Function
# TypeError
v = bytes() # Empty byte string

print(v)
# b''
v = bytes(source=b'12') # Byte string

print(v, v[0], v[1])
# b'12' 49 50
v = bytes(source=12) # Integer

print(v, v[0], v[1])
# b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' 0 0
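
An int source sets the length of the byte string, not its content. The following is a rough sketch of how that differs from an iterable of ints, and of the 0 to 255 range each item of an iterable must fall in. The variable names are chosen only for this illustration.

```python
zeros = bytes(source=3)       # Three null bytes
one_byte = bytes(source=[3])  # One byte whose value is 3

print(zeros, len(zeros))
# b'\x00\x00\x00' 3

print(one_byte, len(one_byte))
# b'\x03' 1

# Each item of an iterable must be in range(256).
try:
    bytes(source=[256])
except ValueError as e:
    out_of_range = type(e).__name__

# A negative length is rejected as well.
try:
    bytes(source=-1)
except ValueError as e:
    negative = type(e).__name__

print(out_of_range, negative)
# ValueError ValueError
```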
v = bytes(source=[1, 2, 3]) # List

print(v, v[0], v[1], v[2])
# b'\x01\x02\x03' 1 2 3
v = bytes(source=(1, 2, 3)) # Tuple

print(v, v[0], v[1], v[2])
# b'\x01\x02\x03' 1 2 3
v = bytes(source={1, 2, 3}) # Set

print(v, v[0], v[1], v[2])
# b'\x01\x02\x03' 1 2 3
v = bytes(source={ord('a'):'A'})  # Dictionary: only the keys are used

print(v, v[0])
# b'a' 97
v = bytes(source=iter([1, 2, 3])) # Iterator

print(v, v[0], v[1], v[2])
# b'\x01\x02\x03' 1 2 3
v = "Lёт's gφ!" # Let's go!

print(bytes(source=v, encoding='utf-8', errors='strict'))
print(bytes(source=v, encoding='utf-8'))
# b"L\xd1\x91\xd1\x82's g\xcf\x86!"
v = "Lёт's gφ!" # Let's go!

print(bytes(source=v, encoding='utf-7'))
# b"L+BFEEQg's g+A8Y!"
v = "Lёт's gφ!" # Let's go!

print(bytes(source=v, encoding='utf-16'))
# b"\xff\xfeL\x00Q\x04B\x04'\x00s\x00 \x00g\x00\xc6\x03!\x00"
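
The leading \xff\xfe in the 'utf-16' result is a byte order mark (BOM), written in the byte order of the machine that ran the code (little-endian in the output above). As a small sketch, 'utf-16-le' and 'utf-16-be' fix the byte order explicitly and write no BOM. The variable names are chosen only for this illustration.

```python
import codecs

v = "Lёт's gφ!" # Let's go!

with_bom = bytes(source=v, encoding='utf-16')   # BOM + native byte order
little = bytes(source=v, encoding='utf-16-le')  # Little-endian, no BOM
big = bytes(source=v, encoding='utf-16-be')     # Big-endian, no BOM

print(with_bom.startswith(codecs.BOM_UTF16))
# True

print(len(with_bom) - len(little))
# 2

print(little[:2], big[:2])
# b'L\x00' b'\x00L'
```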
v = "Lёт's gφ!" # Let's go!

print(bytes(source=v, encoding='big5'))
# b"L\xc7\xce\xc7\xdb's g\xa3p!"
import codecs

def hashreplace_handler(s):
    return ((s.end - s.start) * '#', s.end)

codecs.register_error('hashreplace', hashreplace_handler)

v = "Lёт's gφ!" # Let's go!

print(bytes(source=v, encoding='ascii', errors='ignore'))
# b"L's g!"

print(bytes(source=v, encoding='ascii', errors='replace'))
# b"L??'s g?!"

print(bytes(source=v, encoding='ascii', errors='hashreplace'))
# b"L##'s g#!"

print(bytes(source=v, encoding='ascii'))
print(bytes(source=v, encoding='ascii', errors='strict'))
# UnicodeEncodeError: 'ascii' codec can't encode characters
# in position 1-2: ordinal not in range(128)
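
A handler registered with codecs.register_error() receives the UnicodeEncodeError itself and returns a tuple of the replacement string and the position at which encoding resumes. As a sketch of using the exception's object, start and end attributes, the handler below writes each unencodable character as its code point in angle brackets. The name 'hexreplace' and the output format are assumptions made only for this illustration.

```python
import codecs

def hexreplace_handler(exc):
    # Only encoding errors are handled; anything else is re-raised.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]  # The characters which cannot be encoded
    replacement = ''.join(f'<{ord(c):04x}>' for c in bad)
    return (replacement, exc.end)        # Resume encoding at exc.end

codecs.register_error('hexreplace', hexreplace_handler)

v = "Lёт's gφ!" # Let's go!

result = bytes(source=v, encoding='ascii', errors='hexreplace')

print(result)
# b"L<0451><0442>'s g<03c6>!"
```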
v = "Lёт's gφ!" # Let's go!

print(bytes(source=v, encoding='ascii', errors='xmlcharrefreplace'))
# b"L&#1105;&#1090;'s g&#966;!"
v = "Lёт's gφ!" # Let's go!

print(bytes(source=v, encoding='ascii', errors='backslashreplace'))
# b"L\\u0451\\u0442's g\\u03c6!"
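
One more built-in error handler for encoding is 'namereplace' (Python 3.5 or later), which replaces each character that cannot be encoded with a \N{...} escape containing its Unicode name. A minimal sketch with the same string:

```python
v = "Lёт's gφ!" # Let's go!

named = bytes(source=v, encoding='ascii', errors='namereplace')

print(named)
# b"L\\N{CYRILLIC SMALL LETTER IO}\\N{CYRILLIC SMALL LETTER TE}'s g\\N{GREEK SMALL LETTER PHI}!"
```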
