Data Structure - String
In the previous videos, we encountered data types like int, float, and boolean. These are
essentially single values.
We also learned that "Hello world!" is an example of a string in Python.
Our next topic covers some basic ways in which Python handles a collection of values.
We will talk about:
strings
lists
tuples
dictionaries
As mentioned before, strings represent plain text.
We can also think of a string as a sequence of characters.
For our example "Hello world!", the first character is "H" and the last character is "!".
Any sequence of characters enclosed in quotation marks will be recognized by Python as a string.
Note that even a single character, so long as it is enclosed in quotation marks, will be
recognized as a string.
In [1]:
type('I')
Out[1]: str
In [2]:
type(" ")
Out[2]: str
In [3]:
type("2")
Out[3]: str
In [4]:
type(2)
Out[4]: int
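As such, the value 2 is different from "2". While Python can process the operation 2 - 2,
it cannot process "2" - 2, because subtraction is not defined between a string and an integer.
Similarly, it cannot understand 2 "-" 2, since "-" in quotation marks is a string and not an
operator.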
In [5]:
2 - 2
Out[5]: 0
In [6]:
"2" - 2
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-6-6970f12f915a> in <module>
----> 1 "2" - 2
TypeError: unsupported operand type(s) for -: 'str' and 'int'
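If the text really represents a number, one way around this error is to convert it with int() before doing arithmetic. The following is a minimal sketch; the variable names text_value and number_value are made up for illustration.

```python
# Convert the string "2" to an integer before subtracting.
text_value = "2"                 # a string, not a number
number_value = int(text_value)   # now an int

difference = number_value - 2
print(difference)                # 0
print(type(number_value))        # <class 'int'>
```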
Find length
One piece of information we often want from a string is the number of characters it
contains. This number is called the length of the string. For this, Python has the built-in
function len().
If we put a string inside len(), it will give us the number of characters. For example,
In [7]:
len("Hey there!")
Out[7]: 10
The value 10 indicates that there are 10 characters in the string "Hey there!" . Note that
this includes the space " " and the exclamation mark "!" .
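A few more cases may help make the counting rule concrete. This is only an illustrative sketch; the variable name greeting is an assumption and not part of the original notebook.

```python
greeting = "Hey there!"

print(len(greeting))   # 10
print(len(" "))        # 1, a single space still counts as a character
print(len(""))         # 0, the empty string has no characters
print(len("2"))        # 1, a digit in quotation marks is one character
```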
Index
Another common operation on strings is accessing individual characters.
In Python, the location of each character in a string is indicated by its so-called
index. Indexing starts at 0, so the first character has index 0.
To access a character given its index, we type in the name of the string and then the index,
enclosed in square brackets.
For example, to access the character "H" from "Hey there!" :
In [8]:
a_string = "Hey there!"
a_string[0] # access the character at index 0
Out[8]: 'H'
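Because counting starts at 0, the last valid index is one less than the length of the string. The sketch below illustrates this and shows what happens when an index is out of range; the names first_char, last_char, and out_of_range are illustrative only.

```python
a_string = "Hey there!"

first_char = a_string[0]                  # 'H'
last_char = a_string[len(a_string) - 1]   # '!', at index 9

# Asking for an index past the end raises an IndexError.
try:
    a_string[len(a_string)]
    out_of_range = False
except IndexError:
    out_of_range = True

print(first_char, last_char, out_of_range)  # H ! True
```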
Negative Index
We can also use a negative index. This means we count starting from the end of the string:
-1 means the last character, -2 means the second-to-last character, and so on.
In [9]:
a_string = "Hey there!"
a_string[-1] # access the character at index -1
Out[9]: '!'
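One way to read a negative index is as an offset from the length of the string: index -k refers to the same character as index len(a_string) - k. The following sketch checks this for every position; the names n and matches are chosen here for illustration.

```python
a_string = "Hey there!"
n = len(a_string)  # 10

# For k = 1, 2, ..., n the index -k points at the same character as n - k.
matches = all(a_string[-k] == a_string[n - k] for k in range(1, n + 1))
print(matches)           # True
print(a_string[-2])      # e
print(a_string[n - 2])   # e
```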
String Slicing
Using indices enclosed in square brackets, we can also access a part of the string.
This is called string slicing.
We simply give the range of indices to access, separated by a colon, as follows:
In [10]:
a_string[4:9] # access the character starting from index 4 until index 8
Out[10]: 'there'
Notice that the end index we specified is 9, but the output only goes up to index 8.
This means that the range specified in square brackets excludes the upper bound.
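A practical consequence of excluding the upper bound is that the length of a slice equals end minus start, and that two slices sharing a boundary index fit together without overlap. A small sketch, with illustrative variable names:

```python
a_string = "Hey there!"

word = a_string[4:9]
print(word)        # there
print(len(word))   # 5, which is 9 - 4

# Index 4 belongs only to the second slice, so nothing is duplicated or lost.
left = a_string[0:4]
right = a_string[4:10]
print(left + right == a_string)  # True
```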
The syntax for slicing in Python is [start:end:step]. The third argument, step, is
optional. If you omit the step argument, it defaults to 1.
In [11]:
a_string[4:9:2] # access every second character from index 4 until index 8
Out[11]: 'tee'
In [12]:
a_string[4:] # access the characters starting from index 4 until the end of the string
Out[12]: 'there!'
In [13]:
a_string[:4] # access the characters starting from the beginning until index 3
Out[13]: 'Hey '
In [14]:
a_string[::2] # access every second character from the beginning until the end
Out[14]: 'Hytee'
In [15]:
a_string[4:-1] # access the characters starting from index 4 up to, but excluding, the last character
Out[15]: 'there'
In [16]:
a_string[::-1] # output the string in reverse order
Out[16]: '!ereht yeH'
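The step argument can be combined with start and end in other ways as well. The sketch below tries a larger step, a negative step over part of the string, and uses the reversal idiom to test whether a word reads the same backwards. The word "level" and all variable names are chosen here only for illustration.

```python
a_string = "Hey there!"

every_third = a_string[::3]      # indices 0, 3, 6, 9
print(every_third)               # H e!

# With a negative step, start must be larger than end; end is still excluded.
reversed_word = a_string[8:3:-1] # indices 8, 7, 6, 5, 4
print(reversed_word)             # ereht

# A word is a palindrome if it equals its own reverse.
word = "level"
is_palindrome = word == word[::-1]
print(is_palindrome)             # True
```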
String Concatenation
Another common operation on strings is concatenation, or joining strings together.
This is done with the + operator applied to strings. See the example code below:
In [17]:
a_string = "I love you"
another_string = "3000"
print(a_string + " " + another_string) # print the result of the concatenation
I love you 3000
Note that we get a different result when we remove the quotes in "3000".
In [18]:
a_string = "I love you"
another_string = 3000
print(a_string + " " + another_string)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-18-ed2ea8daebbc> in <module>
1 a_string = "I love you"
2 another_string = 3000
----> 3 print(a_string + " " + another_string)
TypeError: can only concatenate str (not "int") to str
What happened in the code above?
Since we removed the quotes, the type of data stored in the variable another_string
became an integer (int) instead of a string.
As indicated in the TypeError output, we cannot concatenate a string with an integer.
Just like int() and float() , we can use str() to convert from a different data type
to a string.
In [19]:
print("I love you " + str(3000))
I love you 3000
In the code above, we used the built-in function str() to convert the integer
3000 to a string.
In [20]:
type(str(3000))
Out[20]: str
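The same conversion works for other data types, such as float and boolean values. The following is a minimal sketch with made-up variable names (count, ratio, flag); it also shows that int() can turn the text back into a number.

```python
count = 3000
ratio = 2.5
flag = True

# Each value is converted to a string before it is joined with +.
message = "count=" + str(count) + ", ratio=" + str(ratio) + ", flag=" + str(flag)
print(message)  # count=3000, ratio=2.5, flag=True

# Converting back: int() reverses str() for integer text.
round_trip = int(str(count))
print(round_trip == count)  # True
```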
String methods
In Python, a string is an object that has various methods that can be used to manipulate it
(this is a concept from object-oriented programming, which we will tackle later).
The full list of string methods is documented here:
https://docs.python.org/3/library/stdtypes.html#string-methods
In Jupyter, typing a string followed by a dot and then pressing TAB shows the methods that can be used.
To get an idea of how a certain string method works, we can again use the built-in function
help().
In [39]:
help("Hey there!".capitalize)
Help on built-in function capitalize:
capitalize() method of builtins.str instance
Return a capitalized version of the string.
More specifically, make the first character have upper case and the rest lower
case.
Let's look at how some of these methods are accessed and used:
In [21]:
msg = "valar morghulis"
msg.upper() # get msg where all letters are capitalized
Out[21]: 'VALAR MORGHULIS'
In [22]:
msg.count("l") # count the number of occurrence of "l" in msg
Out[22]: 2
In [23]:
msg.replace("morghulis","dohaeris") # replace "morghulis" with "dohaeris" in msg
Out[23]: 'valar dohaeris'
Note that we are not changing the value of variable msg . We are simply asking Python to
give us the resulting string when we replace "morghulis" with "dohaeris" . This is also
true for the method upper() . If we check the value of msg :
In [24]:
msg
Out[24]: 'valar morghulis'
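If we do want to keep the result, one option is to assign it to a variable, possibly the same one. The sketch below illustrates this; the variable name shouted is an assumption made for the example.

```python
msg = "valar morghulis"

# Calling a method returns a new string; msg itself is untouched.
shouted = msg.upper()
print(msg)       # valar morghulis
print(shouted)   # VALAR MORGHULIS

# To keep the change, assign the result back to the variable.
msg = msg.replace("morghulis", "dohaeris")
print(msg)       # valar dohaeris

# Methods can be chained because each one returns a string.
print(msg.upper().count("A"))  # 3
```

Chaining works from left to right: upper() produces 'VALAR DOHAERIS', and count("A") is then applied to that result.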
Main Reference
1. Kong, Siauw, Bayen - Python Programming and Numerical Methods. A Guide for
Engineers and Scientists (2021)