
What is the difference between a string and a byte string in Python?


A string (str in Python 3) is a sequence of characters. Characters are an abstract concept and cannot be stored on disk directly. A byte string (bytes in Python 3) is a sequence of bytes, which can be stored on disk. The mapping between the two is an encoding. Many encodings exist (and infinitely many are possible), so you need to know which one applies in a particular case before you can convert, because the same bytes can map to different strings under different encodings.

For example, the bytes below decode to one string as UTF-16 and to a different one as UTF-8 (the UTF-16 result shown assumes a little-endian machine, since the bytes carry no byte-order mark):

>>> b'\xcf\x84o\xcf\x81\xce\xbdo\xcf\x82'.decode('utf-16')
'蓏콯캁澽苏'
>>> b'\xcf\x84o\xcf\x81\xce\xbdo\xcf\x82'.decode('utf-8')
'τoρνoς'
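
The difference in kind between the two types can also be seen by inspecting them directly. The following is a minimal sketch, assuming Python 3; the variable names data and text are only illustrative.

```python
# Illustrative names; assumes Python 3.
data = b'\xcf\x84o\xcf\x81\xce\xbdo\xcf\x82'  # byte string
text = data.decode('utf-8')                    # character string

print(len(data))  # number of bytes: 10
print(len(text))  # number of characters: 6
print(data[0])    # indexing a byte string yields an integer: 207
print(text[0])    # indexing a string yields a one-character string: τ
```

The lengths differ because UTF-8 uses two bytes for each Greek letter here and one byte for each Latin "o".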

Once you know which encoding applies, call the byte string's .decode() method with that encoding to get the corresponding character string. The .encode() method of a character string goes the opposite way and produces a byte string in the encoding you name.
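
A round trip through both methods might look like the sketch below. It assumes Python 3 and uses 'utf-16-le' rather than 'utf-16' so that the byte order is explicit and no byte-order mark is added; the variable names are illustrative.

```python
# Illustrative names; assumes Python 3.
text = 'τoρνoς'

# Encode the same string with two different encodings.
utf8_bytes = text.encode('utf-8')
utf16_bytes = text.encode('utf-16-le')  # explicit byte order, no BOM

print(utf8_bytes)                         # b'\xcf\x84o\xcf\x81\xce\xbdo\xcf\x82'
print(len(utf8_bytes), len(utf16_bytes))  # 10 12

# Decoding with the matching encoding recovers the original string.
print(utf8_bytes.decode('utf-8') == text)       # True
print(utf16_bytes.decode('utf-16-le') == text)  # True
```

The two byte strings differ in both content and length, yet each decodes back to the same characters when the matching encoding is used.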