
CSE-326-16-Chapter String

A string is a sequence of characters that can be manipulated and formatted in various ways in Python. There are several string formatting methods in Python, including using format specifiers with the % operator, the str.format() method, and f-strings. Format specifiers allow controlling details like field width, precision, and padding and can format numeric values as integers, floats, hexadecimal, etc. F-strings provide an easy way to inject variable values into strings and are generally the preferred string formatting method in modern Python.

Uploaded by

Yogi

Chapter String

A string is a sequence of Unicode code points. To store a string, these code points are converted into a sequence of bytes. This process is called character encoding. There are many encodings, such as UTF-8, UTF-16, and ASCII, and Python uses UTF-8 encoding by default.
The Story of Unicode
UTF-8 is the most widely used character encoding. UTF stands for Unicode Transformation Format, and the '8' means that the encoding is built from 8-bit units (a character takes between one and four of them). It has largely replaced ASCII (American Standard Code for Information Interchange) because it covers far more characters and can represent languages from around the world, whereas ASCII is limited to 128 characters: essentially unaccented English letters, digits, and basic punctuation.
Unicode story - Unicode In Python - The unicodedata Module Explained - AskPython
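The difference between code points and encoded bytes can be seen in a short sketch. The sample text is an assumption chosen only for illustration:

```python
text = "h\u00e9llo"              # "héllo": 5 code points
encoded = text.encode("utf-8")   # convert the code points to bytes

print(len(text))       # 5
print(len(encoded))    # 6, because "é" takes two bytes in UTF-8
print(encoded.decode("utf-8") == text)  # True
```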
String Literals
Single and doubled quoted string literals
String literals in Python are surrounded by either single quotation marks (' ') or double quotation marks (" "). In Python, single-quoted and double-quoted strings are the same and return the same type of object. Here are some examples of string literals:

Python also supports triple-quoted string literals. A line break inside a triple-quoted literal becomes part of the string, so such literals are known as multiline strings.
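A minimal sketch of the three kinds of literal described above. The variable names and sample text are assumptions for illustration:

```python
single = 'Hello'
double = "Hello"
print(single == double)              # True
print(type(single) is type(double))  # True, both are str

multiline = """first line
second line"""
print(multiline.splitlines())  # ['first line', 'second line']
print("\n" in multiline)       # True, the line break is part of the string
```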
Raw String
A raw string is a string literal in which a backslash is not treated as the start of an escape sequence. A raw string is formed by prefixing the literal with 'r' or 'R'. A character following a backslash is included in the string without change, and all backslashes are left in the string.

For example, the string literal r"\n" consists of two characters: a backslash and a lowercase 'n'.

String quotes can be escaped with a backslash, but the backslash remains in the string; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote.

A raw string r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes).

Specifically, a raw string cannot end in a single backslash (since the backslash would
escape the following quote character). Note also that a single backslash followed by a
newline is interpreted as those two characters as part of the string, not as a line
continuation. Check the code snippet given below:
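The rules for raw strings can be checked with a short sketch. The Windows-style path is a hypothetical value used only to show why raw strings are convenient:

```python
raw = r"\n"
print(len(raw))        # 2: a backslash and the letter n
print(raw == "\\n")    # True

quote = r"\""
print(len(quote))      # 2: a backslash and a double quote

path = r"C:\new\table"  # hypothetical path; \n and \t are not escapes here
print(path)             # C:\new\table

# A literal such as r"\" cannot be written: it is a SyntaxError,
# because the backslash escapes the closing quote.
```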

Escape Characters
Escape characters solve the problem of using special characters inside a string literal. An escape sequence directs the interpreter to take the action mapped to that character instead of treating it literally.
We denote an escape sequence with a backslash (\) at the beginning. For example, to put a new line in the string we can add a linefeed (\n). The table below lists escape sequences supported by Python.
Escape Character    Meaning
\'                  Represents a single quote
\n                  Newline (linefeed)
\r                  Carriage return
\t                  Horizontal tab
\b                  Backspace
\f                  Form feed
\ooo                Character with the given octal value
\xhh                Character with the given hex value
\\                  Backslash
\uxxxx              Character with the given 16-bit hex value
\Uxxxxxxxx          Character with the given 32-bit hex value

Illustrations of the escape characters above are given below:

\t adds a tab space between two words.

A string containing \n shifts the characters that follow it to the next line.

To form a string that contains a backslash, two backslashes need to be written, as follows:

A string can also be formed using the octal code point value of each character. The illustration given below shows an example of forming the string literal Python using octal values:
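The escape sequences from the table can be tried out as follows. This is a small sketch; the sample strings are assumptions:

```python
print("Hello\tWorld")   # a tab between the two words
print("Hello\nWorld")   # World is printed on the next line
print("C:\\temp")       # C:\temp
print('It\'s')          # It's

# The word Python spelled with the octal value of each character
octal_text = "\120\171\164\150\157\156"
print(octal_text)       # Python

# The letter A written as a hex, 16-bit and 32-bit escape
print("\x41", "\u0041", "\U00000041")  # A A A
```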

String Formatting
As of Python 3.6, f-strings [1] are a new way to format strings. They are more readable, more concise, and less prone to error than the other ways of formatting, and they are also faster.
Formatting a string is the process of inserting values into a string dynamically and presenting the result. There are three different ways to perform string formatting in Python:

● Formatting with the % operator.
● Formatting with the format() string method.
● Formatting with string literals, called f-strings.

Formatting a String
To format a string with the % operator, we insert a format specifier into the string, chosen according to the data type of the value, as listed below:
Formatting Type    Format Specifier    Description
Integer            %d                  Signed decimal integer
Character          %c                  Converts an integer to a character
Floating-point     %f                  Decimal floating-point number
Exponential        %e or %E            Scientific notation
String             %s                  String
Hexadecimal        %x or %X            Hexadecimal integer
Octal              %o                  Octal integer

[1] Python 3's f-Strings: An Improved String Formatting Syntax (Guide) – Real Python

String objects have a built-in operation using the % operator, which you can use to format a string. Here’s what that looks like in practice:

The above code will print Hello, Goutam. In a similar way, we can also print multiple values using %.

The above code highlights two points about string formatting. First, the formatted string is created before the message is printed: the complete string is saved in the message variable, with the values of the name and age variables substituted into it. Second, in the previous example only name was passed, so parentheses were not needed, but when passing multiple values the variables must be placed inside parentheses. Even then, only one % operator is used. The output of the above code is Hello, Goutam you are 37 years old. Here %d is used to print the value of age because age is of type int.
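A minimal sketch of the two cases discussed above, reconstructed from the outputs quoted in the text (the exact original code is not available, so the wording of the format strings is an assumption):

```python
name = "Goutam"
print("Hello, %s" % name)   # Hello, Goutam

age = 37
# Multiple values are passed as a tuple, with a single % operator
message = "Hello, %s you are %d years old." % (name, age)
print(message)              # Hello, Goutam you are 37 years old.
```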
The next example shows how to format a string that includes numbers in octal and hexadecimal format.

The next example shows how to use format specifiers to print a floating-point value in decimal and exponential form.
In the exponential form, a negative (-) or positive (+) sign follows the e. A plus (+) sign means the exponent is positive, which is typical of large values, and a minus (-) sign means the exponent is negative, which is typical of very small values.
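The specifiers for octal, hexadecimal, decimal and exponential output can be sketched as follows. The numeric values are assumptions chosen for illustration:

```python
number = 255
print("%o" % number)   # 377
print("%x" % number)   # ff
print("%X" % number)   # FF

large = 12345.678
small = 0.00012345
print("%f" % large)    # 12345.678000
print("%e" % large)    # 1.234568e+04  (plus sign: positive exponent)
print("%E" % small)    # 1.234500E-04  (minus sign: negative exponent)
```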
Formatting output string with Field Width and Precision

Field width is the minimum total number of characters used to print the value, including the decimal point, and precision is the number of digits printed to the right of the decimal point. Both can be changed as required. The default precision for floating-point numbers is 6. Check the example given below:

Here in the output, we can see that the value of a is printed with 6 decimal places. In the next example, we set the precision so that the value is printed with 3 decimal places.

In that example, a value (3) is placed after the decimal point (.) in the format specifier, giving %.3f. It controls the precision. That’s why we get the output a = 2.340 instead of a = 2.340000. The precision is placed between % and f, after the decimal point, and must be a non-negative integer.
If the field width is larger than the number of characters needed, the value is right-aligned within the field and padded on the left with spaces. Check the next example given below:
Whether padding appears depends on how the field width compares with the length of the formatted value. With plain %f, no field width is given, so the output starts immediately and looks left-aligned. With %5.3f, the field width is 5 and the precision is 3, so the value is printed with 3 decimal places and its total length is exactly 5. No padding is needed, so it also looks left-aligned.
With %5.9f, the field width is 5 and the precision is 9. The formatted value is longer than the field width, so no padding is added, but the number of decimal places increases from 6 to 9. With %6.3f, the field width is 6 and the precision is 3, and the output is shifted right by a single space after the equal sign, because field width (6) - precision (3) - length of the integer part (1) - 1 (decimal point) = 1. With %10.9f, field width (10) - precision (9) - length of the integer part (1) - 1 (decimal point) = -1, so there is no room for padding and the output looks left-aligned.
We can also pad with zeros instead of spaces by adding a 0 at the start of the field width.

Here we can see that the value 2.34 is padded with 6 zeros (0), because 10−2−1−1=6. For proper alignment, a space can be placed after the % sign so that a blank is printed where the sign would go for a positive number; negative numbers then line up with positive ones.

Here we can see that the first output is not properly aligned, but the other three are aligned to the right.
Just as a minus (-) sign is printed for a negative value, we can also print a plus (+) sign for positive values by inserting a plus (+) sign after the % sign.
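The effect of field width, precision, zero padding and sign flags can be checked with a short sketch. The value 2.34 comes from the text; the exact format strings of the original examples are assumptions:

```python
a = 2.34
print("a = %f" % a)       # a = 2.340000     default precision of 6
print("a = %.3f" % a)     # a = 2.340        precision 3
print("a = %6.3f" % a)    # a =  2.340       one padding space
print("a = %5.9f" % a)    # a = 2.340000000  longer than the field, no padding
print("a = %010.2f" % a)  # a = 0000002.34   six padding zeros
print("a = % .2f" % a)    # a =  2.34        blank in place of the sign
print("a = % .2f" % -a)   # a = -2.34
print("a = %+.2f" % a)    # a = +2.34
```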
Why %-formatting is not great?

The code examples that you just saw are readable enough. However, once you start using several parameters and longer strings, your code quickly becomes much harder to read and starts to look messy. Check the code snippet given below:
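As a rough illustration of how %-formatting gets harder to follow with more values, consider the sketch below. The variable names and the sentence are assumptions, not the original snippet:

```python
name = "Goutam"
age = 37
language = "Python"
school = "Computer Science & Engineering"

# Each %s or %d has to be matched by position with the tuple on the right
message = "Hello, %s. You are %d years old. You study %s at the school of %s." % (
    name, age, language, school
)
print(message)
```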

String formatting using str.format()

str.format() is an improvement on %-formatting. It uses normal function call syntax.

With str.format(), the replacement fields are marked by curly braces. See the code given below:

You can reference the arguments in any order by referring to their index:

Using dictionaries with formatted string

We can also pass values to format() from a dictionary. See the code given below:
You can also use ** to unpack a dictionary into keyword arguments. In this case the order of the dictionary items is not important. Check the code below:

In the person dictionary we have changed the order of the dictionary items, and in the message we use name first and then age. This shows that with ** each replacement field is looked up by using its name as the key.
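The variants of str.format() described above can be sketched as follows. The person dictionary and the wording of the message are assumptions for illustration:

```python
name, age = "Goutam", 37

by_position = "Hello, {}. You are {}.".format(name, age)
by_index = "Hello, {1}. You are {0}.".format(age, name)

# The dictionary items are deliberately in a different order than the message
person = {"age": 37, "name": "Goutam"}
by_key = "Hello, {name}. You are {age}.".format(name=person["name"], age=person["age"])
unpacked = "Hello, {name}. You are {age}.".format(**person)

print(by_position)  # Hello, Goutam. You are 37.
print(by_index == by_key == unpacked == by_position)  # True
```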
String formatting with f-string

Strings in Python are usually enclosed within double quotes (“”) or single quotes (‘’). To create an f-string, you only need to add an f or an F before the opening quote of your string. For example, "This" is a string whereas f"This" is an f-string.
When using f-strings to display variables, you only need to specify the names of the variables inside a set of curly braces {}. At runtime, all variable names are replaced with their respective values. If you have multiple variables in the string, you need to enclose each variable name in its own set of curly braces.
In the example, two variables, language and school, are enclosed in curly braces inside the f-string.

Notice how the variables language and school have been replaced with Python and Computer Science & Engineering, respectively.
Because f-strings are evaluated at runtime, you can also evaluate valid Python expressions on the fly. In the example below, num1 and num2 are two variables. To calculate their product, you may insert the expression num1 * num2 inside a set of curly braces.
Notice how num1 * num2 is replaced by the product of num1 and num2 in the output. I hope you're now able to see the pattern.
In any f-string, {var_name} and {expression} serve as placeholders for variables and expressions, and are replaced with the corresponding values at runtime.
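A minimal sketch of both uses of f-strings. The values of language and school come from the text; the sentence wording and the values of num1 and num2 are assumptions:

```python
language = "Python"
school = "Computer Science & Engineering"
print(f"I am learning {language} at the school of {school}.")

num1, num2 = 83, 9
result = f"The product of {num1} and {num2} is {num1 * num2}."
print(result)  # The product of 83 and 9 is 747.
```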
String in Action
Operations on String
This section highlights the basic operations that can be performed on a string object, starting with modifying a string object.

Modifying a String
In a nutshell, you can’t. Strings are one of the data types Python considers immutable,
meaning not able to be changed. A statement like this will cause an error:

In truth, there really isn’t much need to modify strings. You can usually accomplish what you want by generating a copy of the original string that has the desired change in place. There are many ways to do this in Python. Here is one possibility:

There is also a built-in string method to accomplish this:
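The immutability of strings and the two workarounds mentioned above can be sketched as follows. The sample string is an assumption:

```python
s = "foobar"

try:
    s[3] = "x"          # item assignment is not allowed on a str
except TypeError as error:
    print(error)        # 'str' object does not support item assignment

# Build a modified copy by slicing and concatenating
modified = s[:3] + "x" + s[4:]
print(modified)         # fooxar

# The built-in replace() method also returns a new string
replaced = s.replace("b", "x")
print(replaced)         # fooxar
print(s)                # foobar, the original is unchanged
```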

Operators with String


You have already seen the operators + and * applied to numeric operands in Chapter 4. These two operators can be applied to strings as well.
The plus (+) operator concatenates two strings and returns the result as a new string. Consider the example given below:

Two or more string literals can also be concatenated without the + operator by writing them next to each other, as follows:

The * operator creates multiple copies of a string. If S is a string and n is an integer, either of the following expressions returns a string consisting of n concatenated copies of S:

From the output we can see that 'ABA' is repeated 3 times, because we have multiplied the string 'ABA' by 3, and we get the same output with the operands in the opposite order. The multiplier operand n must be an integer; if the multiplier is zero or negative, the result is an empty string:

The membership operators are used with strings to check whether the first operand is contained within the second one.
The in operator checks whether a substring of any length is part of another string. It returns True if the given substring is present and False otherwise. The not in operator does the opposite: it returns True if the substring is not found and False otherwise. Consider the example given below:
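The +, * and membership operators can be tried together in a short sketch. The sample strings other than 'ABA' are assumptions:

```python
print("Py" + "thon")     # Python
print("Py" "thon")       # Python, adjacent literals are joined

S, n = "ABA", 3
print(S * n)             # ABAABAABA
print(n * S)             # ABAABAABA
print(S * 0 == "")       # True
print(S * -2 == "")      # True

print("thon" in "Python")       # True
print("java" not in "Python")   # True
```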

String Comparison
Strings can be compared with the equality operator (==) and with the other comparison operators (<, >, <=, >=, !=). Strings are compared character by character: at the first position where the two strings differ, the string whose character has the lower Unicode code point is the smaller one. If no position differs, the strings are equal, or the shorter string is the smaller one when it is a prefix of the longer.
Python compares the Unicode code points of characters; for ASCII characters these are the same as their ASCII values. Python may also reuse a single object in memory for equal string values, which makes comparing such objects faster. Two basic ways to compare are the equal to (==) operator and the is operator. Before looking at examples of each, keep the following points in mind:
★ Comparisons are case sensitive. 'G' is not the same as 'g'.
★ Each character in a string has an ASCII value (American Standard Code for Information Interchange), and it is this value that the operators compare, not the appearance of the character. For example, 'G' has an ASCII value of 71 while 'g' has a value of 103. When compared, 'g' is greater than 'G'.

Compare strings with equal (==) operator

The == operator returns True if two strings are equal and False otherwise. Consider the code snippet:

The Boolean value True is returned because both operands of == are the same. Let's change the first character to lowercase and then compare:

The second operand starts with a lowercase 'p', so the first characters of the two strings differ and we get False as the outcome.
Note: Remember that using = would make the interpreter assume you want to assign one value to another. So make sure you use the == operator for comparison.

Comparing string with not equal (!=) operator

The not equal (!=) operator is the opposite of the equal (==) operator and checks whether two string objects are unequal:

The not equal (!=) operator returns True when the two string objects are not equal and False when they are the same.

Comparing strings using less than (<) operator

The less than (<) operator compares strings based on the ASCII value of each character. It requires two operands, and the syntax str1 < str2 is used for that. Consider the example given below:
We can see from the output that string1 is less than string2. This comparison is performed on the ASCII value of each character. To show this, the ord() function is used after the if-else statement, and we can see that the ASCII value of H (72) is less than the ASCII value of h (104). Let's take another example:

In this program we first print the ASCII values of the characters of both strings. All the characters are the same except the last one. The ASCII value of o (111) is greater than that of uppercase O (79), and because of this we get the output string2 is less than string1.

Comparing string with other comparison operators


In this example we have used three other comparison operators; for the same pair of strings the <= operator returns True and the other two operators, > and >=, return False.
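The comparison operators discussed in this section can be sketched together. The strings are assumptions chosen to match the character values quoted in the text (H = 72, h = 104, O = 79, o = 111):

```python
string1 = "Hello"
string2 = "hello"

print(string1 == string2)   # False, comparison is case sensitive
print(string1 != string2)   # True
print(ord("H"), ord("h"))   # 72 104
print(string1 < string2)    # True, because 72 < 104
print(string1 <= string2)   # True
print(string1 > string2)    # False
print(string1 >= string2)   # False

# Strings that differ only in the last character
print(ord("O"), ord("o"))   # 79 111
print("hellO" < "hello")    # True, because 79 < 111
```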
Comparing string using identity operator is and is not
The == operator compares the values of both operands and checks for value equality, whereas the is operator checks whether both operands refer to the same object. The same distinction applies to != and is not. Let us understand with an example:

The object IDs of the strings may vary between machines and between runs. The object IDs of str1, str2 and str3 were the same, therefore the result is True in all the cases. After modifying str1, its object ID changes, so the result of str1 is str2 will be False. Even after creating str4 with the same contents as the new str1, the answer will be False because their object IDs are different. Continuing the previous example, we have simply changed line number 17 in the code and replaced the is operator with is not.

As we already know, the IDs of str1 and str4 are different, so the is not operator returns True.
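The difference between value equality and identity can be sketched as follows. This is not the original 17-line example; the names follow the text, and the way str4 is built is an assumption. Whether two equal strings share one object is an implementation detail, so the identity result for str1 and str4 is described as typical rather than guaranteed:

```python
str1 = "Python"
str2 = str1                        # a second name for the same object
print(str1 == str2, str1 is str2)  # True True

str1 = str1 + "!"                  # str1 now refers to a new object
print(str1 is str2)                # False
print(str1 is not str2)            # True

str4 = "".join(["Python", "!"])    # same contents, built separately
print(str1 == str4)                # True, the values are equal
print(id(str1) == id(str4))        # typically False in CPython
```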

String Indexing and Slicing

String Indexing
We can consider a string object as an ordered sequence of items, each of which can be accessed using an index. Indexing is the process of accessing an individual item using a number known as its location or index.
In Python, strings are ordered sequences of character data and therefore support indexing. Individual characters in a string can be accessed by specifying the string name followed by a number in square brackets ([]).
Python strings are zero-indexed, which means the first character is accessed by passing the number 0 inside the square brackets ([]). Let’s take the string object 'python'; the figure below shows its individual characters with their respective index values:

0 1 2 3 4 5

p y t h o n

The individual characters of the given string can be accessed using the index of each
character as follows:

The index of the last character (in this case 'n') is one less than the length of the string. len(s) returns the number of characters, and subtracting 1 from it gives the last valid index, as follows:
Attempting to index beyond the end of the string results in an error: the Python interpreter raises an IndexError (string index out of range), as follows:

The string object 'Python' has 6 characters, and 5 is the highest index, that of the character 'n'. So specifying a number greater than 5 inside the square brackets raises an IndexError.
String indices can also be specified with negative numbers, in which case indexing occurs from the end of the string backward: -1 refers to the last character, -2 is the second-to-last character, and so on. Here is the same diagram showing both the positive and negative indices for the string 'Python':

0 1 2 3 4 5

P y t h o n

-6 -5 -4 -3 -2 -1

Here are some examples showing the use of negative index:

As with a positive index, the interpreter raises an IndexError if we pass a negative index that is out of range. For the string 'Python', -6 is the lowest index, that of the first character 'P'. So passing a negative index less than -6 raises an IndexError, as follows:
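Positive and negative indexing, and the error raised for an out-of-range index, can be sketched as follows:

```python
s = "Python"

print(s[0], s[5])        # P n
print(s[len(s) - 1])     # n, the last valid positive index is len(s) - 1
print(s[-1], s[-6])      # n P

# Indexes outside the range -6..5 raise an IndexError
for bad_index in (6, -7):
    try:
        s[bad_index]
    except IndexError as error:
        print(bad_index, error)   # string index out of range
```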
String Slicing
Slicing is an indexing technique used to extract a substring from a string object. If s is a string, an expression of the form s[start:end] returns the portion of s starting at position start and up to but not including position end.

Note: String indices are zero-based. The first character in a string has an index of 0. This
applies to both standard indexing and slicing.

The above slicing returns the substring ' is py' (from the space at index 4 to index 9, i.e. y). The character t is at index 10, which can be checked using str1[10]. The slice does not include the character at index 10: the slicing [4:10] returns 10−4=6 characters, not including the end index.

If we omit the start index, the Python interpreter uses 0 as the start index by default. It means str1[0:end] is equivalent to the slicing str1[:end], as follows:

Similarly, if we omit the end index, the length of the string is used as the end index. It means msg[0:len(msg)] is equivalent to msg[0:].
For a given index n (0 ≤ n ≤ len(str1)), the expression str1[:n] + str1[n:] returns a string equal to str1. The first operand (str1[:n]) returns the substring from index 0 to (n−1), and the second operand (str1[n:]) picks the characters from index n to the end of the string.

For this example, the value of n is taken as 5, so the string str1 is sliced into 'This ' and 'is python language'. The plus (+) operator then concatenates these two slices.

Omitting both indexes returns the original string in its entirety. Literally. It’s not a copy, it’s a reference to the original string:

From the output, it is clear that both names, str1 and str2, refer to the same string object. During slicing, if the start index is equal to or greater than the end index, Python returns an empty string.
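The slicing rules above can be checked with a short sketch. The string is inferred from the indices and substrings quoted in the text, so treat it as an assumption:

```python
str1 = "This is python language"

print(repr(str1[4:10]))          # ' is py'  (6 characters, index 10 excluded)
print(str1[10])                  # t
print(str1[:4] == str1[0:4])     # True
print(str1[8:] == str1[8:len(str1)])   # True

n = 5
print(repr(str1[:n]), repr(str1[n:]))  # 'This ' 'is python language'
print(str1[:n] + str1[n:] == str1)     # True

str2 = str1[:]
print(str2 == str1)              # True
print(str2 is str1)              # True in CPython: the same object is returned
print(repr(str1[6:2]))           # '' because start >= end
```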

Negative indices can be used with slicing as well. -1 refers to the last character, -2 to the second-to-last, and so on, just as with simple indexing. The table below shows the indexes in positive and negative form; after it, the respective slicing techniques are shown for slicing the substring 'thon' from the string 'python' using these indices:
0 1 2 3 4 5

p y t h o n

-6 -5 -4 -3 -2 -1

While slicing a string, a third value representing a step count or stride can be used to get a substring. The syntax is as follows:
str[start:end:step]

To add the step count, an extra colon (:) is added after the stop index. The step value is the number of positions to move forward between two consecutive picked characters of the string. Check the examples given below:

The character 'p' is at the start index 0. After picking it, the slice skips the next character and picks 't' at index 2. Then the character 'h' at index 3 is skipped and the character 'o' at index 4 is taken. In other words, a step value of 2 means that every second character is picked, beginning at the start index.
While slicing a string we can also omit the start and end indexes and pass only the step value. In that case, the default values for the start and end index are used. Consider the example given below:

The step value is 2 and the other indexes are omitted, so the default values for start (i.e. 0) and stop (i.e. len(str1)) are used. The step value 2 means that after the first character 'p', the third ('t') and fifth ('o') characters are picked.
We can also use a negative step value, in which case Python steps backward through the string. In that case, the start index should be greater than the end index. Let’s consider the examples given below:

Using a negative stride value means the sliced string is in reverse order, starting from the last character of the string object. In the above two examples, the start value is 5 and the end values are 1 and 0 respectively.
The index of the last character (i.e. 'n') is 5. With a stride value of -1, the slicing starts from the back and goes down to index 1 (not included) for the first example, str1[5:1:-1]. In the second slicing, -2 is used as the stride value, so after the first character from the back (i.e. 'n'), every second character is picked down to index 0 (not included).
Another use of a negative stride value is to reverse a given string. A string can be reversed by using any of these three techniques:
★ Using the last index as the start index with -1 as the stride value
str1[5::-1]
★ Providing only the stride value -1 while omitting the other indexes
str1[::-1]
★ Providing -1 as both the start index and the stride value while omitting the end index
str1[-1::-1]
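The slices with negative indices and stride values described above can be sketched on the string 'python'. The end index in the first stride example is an assumption, since the original code is not available:

```python
str1 = "python"

# Four ways to slice 'thon'
print(str1[2:6], str1[-4:], str1[2:], str1[-4:6])   # thon thon thon thon

# Positive stride: every second character
print(str1[0:5:2])    # pto
print(str1[::2])      # pto

# Negative stride: step backward through the string
print(str1[5:1:-1])   # noht
print(str1[5:0:-2])   # nhy

# Three ways to reverse the string
print(str1[5::-1], str1[::-1], str1[-1::-1])   # nohtyp nohtyp nohtyp
```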

Built-in String Functions

Python supports many built-in functions that can be used with a string object. The table below lists some of them with a description, and a sample illustration for each function is given afterwards.
Function Description

chr() Converts an integer to a character

ord() Converts a character to an integer

len() Returns the length of the string

str() Returns a string representation of an object


ord(c)

This function returns the integer value (Unicode code point) associated with the given character.

ord() accepts a single-character string. If we pass a string of multiple characters, it raises a TypeError.

TypeError: ord() expected a character, but string of length 2 found

chr(n)

Returns the character for the given integer. chr() does the reverse of ord(). Given a numeric value n, chr(n) returns a string representing the character that corresponds to n:

chr() also handles Unicode characters beyond the ASCII range.

len(s)

Returns the length of a string. With len(), you can check Python string length. len(s) returns the number of characters in s:

str(obj)

Virtually any object in Python can be rendered as a string. str(obj) returns the string representation of object obj:
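The four built-in functions can be sketched as follows. The sample values are assumptions:

```python
print(ord("a"))        # 97
print(chr(97))         # a
print(chr(8364))       # €, chr() works beyond the ASCII range
print(len("Python"))   # 6

print(str(49.2))       # 49.2
print(str(3 + 4j))     # (3+4j)
print(str([1, 2]))     # [1, 2]

# ord() needs exactly one character
try:
    ord("ab")
except TypeError as error:
    print("TypeError:", error)
```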

Built-in String Methods


Methods are similar to functions. A method is a specialized type of callable procedure that is tightly associated with an object. Like a function, a method is called to perform a distinct task, but it is invoked on a specific object and has knowledge of its target object during execution.
The syntax for invoking a method on an object is as follows:
object.method_name(arguments)

Here object represents any string object and method_name() is any of the methods discussed in this section. Some of them require one or more arguments, and others require none. The methods are categorised as follows:
Methods Related to Case Conversion
Methods in this group perform case conversion on the target string.

str.capitalize() Capitalizes the string object

It returns a copy of the string object with the first character converted to uppercase and all other characters converted to lowercase; non-alphabetic characters remain unchanged.

str.lower() Converts a string to lowercase

It returns a copy of s with all alphabetic characters converted to lowercase:

str.upper() Converts the alphabetic characters to uppercase

This method is the opposite of str.lower() and returns a copy of s with all alphabetic characters converted to uppercase.

str.swapcase() Swaps the case of alphabetic characters

It returns a copy of s with uppercase alphabetic characters converted to lowercase and vice versa:

str.title() Converts the string to title case

It returns a copy of s in which the first letter of each word is converted to uppercase and the remaining letters are lowercase:

This method uses a simple algorithm that cannot distinguish between important and unimportant words, and it does not handle apostrophes, possessives, or acronyms gracefully:
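The case conversion methods can be compared on a single string. The sample strings are assumptions:

```python
s = "hELLO wORLD 123"

print(s.capitalize())   # Hello world 123
print(s.lower())        # hello world 123
print(s.upper())        # HELLO WORLD 123
print(s.swapcase())     # Hello World 123
print(s.title())        # Hello World 123

# title() treats the letter after an apostrophe as a new word
# and lowercases the rest of an acronym
awkward = "what's happened to ted's IBM stock?".title()
print(awkward)          # What'S Happened To Ted'S Ibm Stock?
```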

Searching and Replacing a Substring within a String


This set of methods provides various ways of searching for a substring within a given string. Each method in this group supports optional start and end arguments. These are interpreted as for string slicing: the action of the method is restricted to the portion of the target string starting at character position start and proceeding up to but not including character position end. If start is specified but end is not, the method applies to the portion of the target string from start to the end of the string.
str.count(substring, start, end) Returns the number of occurrences

The start and end arguments are optional. The method returns an integer: the number of non-overlapping occurrences of the given substring within the string str.

In this example the start and end indexes are not passed, so occurrences of the substring are counted within the full string, including the last character.

In this example the search space for the given substring is restricted to the part of the string up to the 10th character.

When the end index is not given, the search extends to the end of the string.
str.startswith(prefix, start, end) Checks whether the string starts with prefix
str.endswith(suffix, start, end) Checks whether the string ends with suffix

The method startswith() returns True if the given string str starts with prefix, and endswith() checks whether str ends with suffix. The start and end parameters are optional; if they are not given, the full str is considered.

The comparison is restricted to the substring indicated by start and end, if they are specified:
The search can be restricted by passing the start and end indexes. The suffix is searched for only up to the end index (not included), so the given sample code returns False.

For the above example the searched string is limited to Pytho. The end index is given as 5, which is the index of n, but matching stops one position before the end value. So the string Pytho does not end with thon, and the method returns False.
str.find(substring, start, end) Searches for the first occurrence of a given string
str.rfind(substring, start, end) Searches for the last occurrence of a given string

The find() and rfind() methods are used to check whether a string contains a particular substring. If multiple occurrences are present, find() and rfind() return the lowest and highest index respectively.

The search is restricted to the substring indicated by start and end, if they are specified:

Both methods return -1 if the given string is not found.

str.index(substring, start, end) Searches for the target string in a given string
str.rindex(substring, start, end) Finds the last occurrence of a given string

The index() and rindex() methods are similar to find() and rfind() respectively. The only difference is that index() and rindex() raise a ValueError exception instead of returning -1 if the substring is not found in the target string str.
When choosing between the find() and index() methods, keep the following points in mind:
★ If you are sure that the substring is present in the given string, you can use either method.
★ If you are not sure that the substring is present, use find().
★ If you use the index() method to find the index of a substring, use exception handling to avoid runtime crashes. The exception mechanism is discussed later.
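The searching methods of this section can be sketched together. The sentence stored in s is an assumption; the endswith() call on 'Python' follows the example described in the text:

```python
s = "Python is fun, Python is popular"

print(s.count("Python"))          # 2
print(s.count("Python", 0, 10))   # 1, only the first 10 characters are searched

print("Python".startswith("Py"))        # True
print("Python".endswith("thon"))        # True
print("Python".endswith("thon", 0, 5))  # False, only 'Pytho' is examined

print(s.find("Python"))    # 0, the lowest index
print(s.rfind("Python"))   # 15, the highest index
print(s.find("Java"))      # -1, not found

# index() raises an exception instead of returning -1
try:
    s.index("Java")
except ValueError as error:
    print(error)           # substring not found
```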

Methods related to Character Classification


Function Name          Description
str.isalnum()          Returns True if all the characters of str are alphanumeric.
str.isalpha()          Returns True if all the characters of str are alphabetic.
str.isdigit()          Returns True if all the characters of str are digits (for example 0-9).
str.isidentifier()     Returns True if str is a valid Python identifier.
str.islower()          Returns True if all the cased characters of str are lowercase.
str.isupper()          Returns True if all the cased characters of str are uppercase.
str.isprintable()      Returns True if all the characters of str are printable, or if str is empty.
str.isspace()          Returns True if str contains only whitespace characters.
str.istitle()          Returns True if str is title-cased, i.e. every word starts with an uppercase character followed by lowercase characters.
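A short sketch of the classification methods follows. The sample strings and the results dictionary are assumptions used only to collect the outcomes in one place.

```python
results = {
    "isalnum": "abc123".isalnum(),           # letters and digits only
    "isalnum_space": "abc 123".isalnum(),    # the space is not alphanumeric
    "isalpha": "Python".isalpha(),
    "isdigit": "2024".isdigit(),
    "islower": "python3".islower(),          # digits are not cased characters
    "isupper": "PYTHON".isupper(),
    "isprintable": "Hello\n".isprintable(),  # newline is not printable
    "isspace": " \t\n".isspace(),
    "istitle": "Hello World".istitle(),
}

for name, value in results.items():
    print(name, value)
```

Only isalnum_space and isprintable are False here; every other check is True.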

isidentifier() returns True for a string that matches a Python keyword, even though a
keyword is not actually a valid identifier. To ensure that a string can serve as a valid
Python identifier, check that str.isidentifier() is True and that keyword.iskeyword(), from
the keyword module, is False.
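The combined check can be sketched as below. The helper name is_usable_identifier is hypothetical and is introduced here only for illustration.

```python
import keyword

def is_usable_identifier(name):
    """Hypothetical helper: True only if name is an identifier and not a keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

print("for".isidentifier())              # True, although 'for' is a keyword
print(is_usable_identifier("for"))       # False
print(is_usable_identifier("total_1"))   # True
print(is_usable_identifier("1total"))    # False, it starts with a digit
```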

string module
The Python string module provides many string constants that can be used to verify whether
a string contains only letters, digits, punctuation, whitespace, etc. The table below lists
these constants with their purpose.
String Constants     Description
ascii_letters        A string containing all ASCII lowercase and uppercase letters
ascii_lowercase      A string containing all ASCII lowercase characters
ascii_uppercase      A string containing all ASCII uppercase characters
digits               A string containing all the digits 0-9
hexdigits            A string containing all the hex digits 0-9, a-f and A-F
octdigits            A string containing all the octal digits 0-7
punctuation          A string of ASCII characters which are considered punctuation characters
printable            A string of ASCII characters which are considered printable, combining digits, ascii_letters, punctuation and whitespace
whitespace           A string containing all ASCII characters that are considered whitespace. This includes the characters space, tab, linefeed, return, formfeed, and vertical tab.

The import string statement lets us use any of these constants, for example string.digits.
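A minimal sketch of using the constants is given below. The helper only_hex is hypothetical and simply shows one way a constant can be used to verify the content of a string.

```python
import string

print(string.ascii_lowercase)   # abcdefghijklmnopqrstuvwxyz
print(string.digits)            # 0123456789
print(string.hexdigits)         # 0123456789abcdefABCDEF
print(string.octdigits)         # 01234567

def only_hex(s):
    """Hypothetical helper: True if s is non-empty and has only hex digits."""
    return s != "" and all(ch in string.hexdigits for ch in s)

print(only_hex("1aF"))   # True
print(only_hex("1aG"))   # False
```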

Formatting String
Formatting using str.format()
A string can be formatted using the str.format() method. A format string contains
“replacement fields” surrounded by curly braces {}. Anything that is not contained in
braces is considered literal text, which is copied unchanged to the output. The grammar
for a replacement field contains a field name, a conversion character and a format specifier,
as follows:
replacement_field = "{" [field_name] ["!" conversion] [":" format_spec] "}"

In less formal terms, the replacement field can start with a field_name that specifies the object
whose value is to be formatted and inserted into the output instead of the replacement
field. The field_name is optionally followed by a conversion field, which is preceded by an
exclamation point '!', and a format_spec, which is preceded by a colon ':'. These specify a
non-default format for the replacement value.

The field_name itself begins with an arg_name that is either a number or a keyword. If it’s a
number, it refers to a positional argument, and if it’s a keyword, it refers to a named
keyword argument. Consider the examples given below:
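A minimal sketch with keyword arguments is shown here. The names s, lan and ver follow the text, while the values passed to them are assumptions.

```python
# Each {name} is replaced by the keyword argument with the same name.
s = "{lan} version {ver}".format(lan="Python", ver="3.11")
print(s)   # Python version 3.11

# The order of the keyword arguments does not matter.
t = "{lan} version {ver}".format(ver="3.11", lan="Python")
print(s == t)   # True
```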

Here s is formatted using arg_name keywords only: inside the braces, lan and ver are the
names of the keyword arguments. Instead of an arg_name keyword, the position of an
argument can be given inside the {}, and an empty {} is filled by position automatically.
Consider the example given below:
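Positional formatting can be sketched as follows. The argument values are assumptions chosen for illustration.

```python
# Empty braces are numbered automatically: {} {} means {0} {1}.
auto = "{} version {}".format("Python", "3.11")

# Explicit positions let the arguments be used in any order.
manual = "{1} version {0}".format("3.11", "Python")

# The same position may be used more than once.
repeated = "{0}-{0}-{1}".format("a", "b")

print(auto)       # Python version 3.11
print(manual)     # Python version 3.11
print(repeated)   # a-a-b
```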

Position indexes start from 0, and by using positions the arguments can be referenced in any
order. If the numerical arg_names in a format string are 0, 1, 2, … in sequence, they can all
be omitted (not just some) and the numbers 0, 1, 2, … will be automatically inserted in that
order. Mixing automatic and manual numbering raises a ValueError, as follows:
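The error can be reproduced with a sketch like the one below. The variable name mixed_error is an assumption used to keep the exception for inspection.

```python
mixed_error = None

try:
    # {} uses automatic numbering while {1} uses manual numbering.
    "{} and {1}".format("a", "b")
except ValueError as exc:
    mixed_error = exc

print(type(mixed_error).__name__)   # ValueError
print(mixed_error)                  # message explaining the numbering conflict
```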

Furthermore, dictionary keys can be used in the field name to select the value used for
formatting a string. See the example given below:
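Three common ways of formatting from a dictionary are sketched below. The dictionary info and its contents are assumptions made for illustration.

```python
info = {"lan": "Python", "ver": "3.11"}

# Key lookup on a positional argument; the key is written without quotes.
by_index = "{0[lan]} version {0[ver]}".format(info)

# Unpack the dictionary into keyword arguments.
by_unpack = "{lan} version {ver}".format(**info)

# format_map() takes the mapping directly.
by_map = "{lan} version {ver}".format_map(info)

print(by_index)   # Python version 3.11
```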

The conversion field causes type coercion before formatting. Normally, the job of
formatting a value is done by the __format__() method of the value itself. However, in some
cases it is desirable to force a type to be formatted as a string, overriding its own
definition of formatting. By converting the value to a string before calling __format__(), the
normal formatting logic is bypassed. The conversion flags are listed in the table below:
Conversion Flags   Description
!s                 Calls str() on the value
!r                 Calls repr() on the value, which returns a printable representation of the object
!a                 Calls ascii() on the value, which returns a printable representation of the object in which non-ASCII characters are escaped using \x, \u, \U

Using these flags, we can further check how a string looks in its printable form. Consider
the examples listed below:
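The three flags can be compared with a sketch such as the following. The sample text, which contains a non-ASCII character and a newline, is an assumption.

```python
text = "caf\u00e9\n"   # 'café' followed by a newline

as_str = "{!s}".format(text)     # same as str(text)
as_repr = "{!r}".format(text)    # same as repr(text): quotes added, newline shown as \n
as_ascii = "{!a}".format(text)   # same as ascii(text): é is escaped as \xe9

print(as_repr)    # 'café\n'
print(as_ascii)   # 'caf\xe9\n'
```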

The format_spec field contains a specification of how the value should be presented,
including such details as field width, alignment, padding, decimal precision and so on.
Each value type can define its own “formatting mini-language” or interpretation of the
format_spec.

A format_spec field can also include nested replacement fields within it. These nested
replacement fields may contain a field name, conversion flag, and format specification,
but deeper nesting is not allowed. The replacement fields within the format_spec are
substituted before the format_spec string is interpreted. This allows the formatting of a value
to be dynamically specified. The general form of a format specifier is given below, and the
table that follows describes the fields of a standard format specifier.

format_spec = [[fill]align][sign]["z"]["#"]["0"][width][grouping_option]["." precision][type]

Field Value        Description
fill               Any character used to pad the string, valid only together with an align value
align              Explained in the alignment options table
sign               '+', '-' or space, with numeric values only
width              A positive integer controlling the minimum length of the formatted string
grouping_option    '_' or ','
precision          A decimal integer indicating the number of digits after the decimal point
type               Determines how the data should be represented
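The nested replacement fields described above can be sketched as follows: the width, precision, fill and alignment are taken from arguments instead of being written in the format string. All values used here are assumptions.

```python
value = 3.14159265
width, precision = 10, 3

# {1} and {2} are substituted first, giving the format_spec "10.3f".
nested = "{0:{1}.{2}f}".format(value, width, precision)

# Fill, align and width supplied by keyword arguments, giving "*^6".
named = "{v:{fill}{align}{w}}".format(v="py", fill="*", align="^", w=6)

print(repr(nested))   # '     3.142'
print(named)          # **py**
```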

If a valid align value is specified, it can be preceded by a fill character that can be any
character and defaults to a space if omitted. It is not possible to use a literal curly brace
("{" or "}") as the fill character in a formatted string literal or when using the str.format()
method. The alignment options are listed in the table below:
Alignment Options   Description
'<'                 Forces the field to be left-aligned within the available space (this is the default for most objects).
'>'                 Forces the field to be right-aligned within the available space (this is the default for numbers).
'='                 Forces the padding to be placed after the sign (if any) but before the digits. This is used for printing fields in the form '+000000120'. This alignment option is only valid for numeric types. It becomes the default for numbers when '0' immediately precedes the field width.
'^'                 Forces the field to be centered within the available space.

Some examples of string formatting based on the align value, with a chosen fill character
and width value (the minimum length of the result), are illustrated next, using keyword
arguments:
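The alignment options can be sketched as below. The keyword names text and num, the fill characters and the width values are assumptions.

```python
left = "{text:*<10}".format(text="Python")     # fill '*', left-aligned, width 10
right = "{text:*>10}".format(text="Python")    # right-aligned
center = "{text:*^10}".format(text="Python")   # centered
padded = "{num:0=+8}".format(num=120)          # zeros placed between sign and digits

print(left)     # Python****
print(right)    # ****Python
print(center)   # **Python**
print(padded)   # +0000120
```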

Note that unless a minimum field width is defined, the field width will always be the
same size as the data to fill it, so that the alignment option has no meaning in this case.

A sign option can be placed just before the width to control how the sign of a number is
displayed. The possible sign options and their meanings are listed in the table below.
Option   Meaning
'+'      Indicates that a sign should be used for both positive as well as negative numbers.
'-'      Indicates that a sign should be used only for negative numbers (this is the default behavior).
space    Indicates that a leading space should be used on positive numbers, and a minus sign on negative numbers.

Consider the illustrations related to the sign field given below:
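The three sign options can be compared with a sketch like this one. The numbers 5 and -5 are assumptions chosen for illustration.

```python
plus = "{:+d} {:+d}".format(5, -5)    # sign shown for both numbers
minus = "{:-d} {:-d}".format(5, -5)   # sign shown only for the negative number
space = "{: d} {: d}".format(5, -5)   # leading space for the positive number

print(repr(plus))    # '+5 -5'
print(repr(minus))   # '5 -5'
print(repr(space))   # ' 5 -5'
```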

The optional literal z forces negative zero floating-point values to positive zero after
rounding to the format precision. This formatting option was added in Python 3.11 [1]. The
PEP [1] proposes an extension to the string format specification allowing negative zero
to be normalized to positive zero.

In many fields of mathematics negative zero is surprising or unwanted, especially in the
context of displaying an (often rounded) numerical result. Using the optional z with the
floating-point presentation types (f, g, etc., as defined by the format specification
documentation) normalizes a negative zero to positive zero.
Like the floating-point type, the Decimal type can also represent a signed zero, and with the
optional z a Decimal negative zero is likewise shown as positive zero. See the illustrations
given below:
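The effect of z can be sketched as follows. Because the option exists only from Python 3.11, this sketch checks the interpreter version first. The sample value is an assumption.

```python
import sys
from decimal import Decimal

value = -0.0001
plain = format(value, ".2f")   # rounds to negative zero: '-0.00'

if sys.version_info >= (3, 11):
    normalized = format(value, "z.2f")                    # '0.00'
    dec_normalized = format(Decimal("-0.0001"), "z.2f")   # '0.00'
else:
    # The 'z' option is not available before Python 3.11.
    normalized = dec_normalized = None

print(plain, normalized, dec_normalized)
```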

The '#' option causes the “alternate form” to be used for the conversion. The alternate
form is defined differently for different types. This option is only valid for integer, float
and complex types. For integers, when binary, octal, or hexadecimal output is used, this
option adds the respective prefix '0b', '0o', '0x', or '0X' to the output value. See the
examples given below:
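A minimal sketch of the alternate form for integers is given here. The numbers 10 and 255 are assumptions.

```python
n = 10

binary = "{:b}".format(n)         # no prefix
binary_alt = "{:#b}".format(n)    # '#' adds the 0b prefix
octal_alt = "{:#o}".format(n)     # 0o prefix
hex_alt = "{:#x}".format(255)     # 0x prefix
hex_upper = "{:#X}".format(255)   # prefix and digits in upper case

print(binary, binary_alt, octal_alt, hex_alt, hex_upper)
# 1010 0b1010 0o12 0xff 0XFF
```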

In this illustration the type 'b' is used for a binary number, and placing # just before it
adds the prefix '0b' to the number. If the type is not preceded by #, the conversion is
returned without any prefix.

The comma (',') option signals the use of a comma for a thousands separator.
The '_' option signals the use of an underscore for a thousands separator for floating-point
presentation types and for integer presentation type 'd'. For integer presentation types 'b',
'o', 'x', and 'X', underscores will be inserted every 4 digits. For other presentation types,
specifying this option is an error.
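The grouping options can be sketched as below. All the numbers used are assumptions chosen for illustration.

```python
comma = "{:,}".format(1234567)                # comma every three digits
underscore = "{:_}".format(1234567)           # underscore every three digits
float_comma = "{:,.2f}".format(1234567.891)   # grouping combined with precision
hex_groups = "{:_x}".format(0xDEADBEEF)       # underscore every four hex digits
bin_groups = "{:_b}".format(255)              # underscore every four binary digits

print(comma)         # 1,234,567
print(underscore)    # 1_234_567
print(float_comma)   # 1,234,567.89
print(hex_groups)    # dead_beef
print(bin_groups)    # 1111_1111
```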

The precision is a decimal integer indicating how many digits should be displayed after
the decimal point for presentation types 'f' and 'F', or how many significant digits should
be displayed in total, before and after the decimal point, for presentation types 'g' or 'G'.

For the string presentation type the precision field indicates the maximum field size, in
other words, how many characters will be used from the field content. The precision is not
allowed for integer presentation types.
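The three cases of precision can be sketched as follows. The values and the flag name int_precision_allowed are assumptions made for illustration.

```python
fixed = "{:.2f}".format(3.14159)       # two digits after the decimal point
general = "{:.3g}".format(3.14159)     # three significant digits in total
truncated = "{:.3}".format("Python")   # at most three characters of the string

# A precision with an integer presentation type raises ValueError.
try:
    "{:.2d}".format(5)
    int_precision_allowed = True
except ValueError:
    int_precision_allowed = False

print(fixed, general, truncated, int_precision_allowed)
# 3.14 3.14 Pyt False
```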

Finally, the type determines how the data should be presented. The available presentation
types for strings and integers are listed in the table below:
Type    Meaning
's'     String format. This is the default type for strings and may be omitted.
None    For strings, the same as 's'.
'b'     Binary format. Outputs the number in base 2.
'c'     Character. Converts the integer to the corresponding Unicode character before printing.
'd'     Decimal Integer. Outputs the number in base 10.
'o'     Octal format. Outputs the number in base 8.
'x'     Hex format. Outputs the number in base 16, using lower-case letters for the digits above 9.
'X'     Hex format. Outputs the number in base 16, using upper-case letters for the digits above 9. In case '#' is specified, the prefix '0x' will be upper-cased to '0X' as well.
'n'     Number. This is the same as 'd', except that it uses the current locale setting to insert the appropriate number separator characters.
None    For integers, the same as 'd'.

Some examples of string and integer formatting using a presentation type are listed next:
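The string and integer presentation types can be sketched as below. The sample values 65 and 255 are assumptions.

```python
as_string = "{:s}".format("Python")   # 's' is the default for strings
as_binary = "{:b}".format(65)         # base 2
as_char = "{:c}".format(65)           # Unicode character with code point 65
as_decimal = "{:d}".format(65)        # base 10
as_octal = "{:o}".format(65)          # base 8
as_hex = "{:x}".format(255)           # base 16, lower case
as_hex_upper = "{:X}".format(255)     # base 16, upper case

print(as_string, as_binary, as_char, as_decimal, as_octal, as_hex, as_hex_upper)
# Python 1000001 A 65 101 ff FF
```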

In addition to the above presentation types, integers can be formatted with the floating-point
presentation types listed in the table below (except 'n' and None). When doing so, float() is
used to convert the integer to a floating-point number before formatting.
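Type    Meaning
'e'     Scientific notation. Uses the letter 'e' to separate the coefficient from the exponent.
'E'     Scientific notation. Same as 'e' except it uses an upper case 'E' as the separator character.
'f'     Fixed-point notation. Displays the number with a fixed number of digits after the decimal point.
'F'     Fixed-point notation. Same as 'f', but converts nan to NAN and inf to INF.
'g'     General format. Rounds the number to the given number of significant digits and then uses fixed-point or scientific notation depending on its magnitude.
'G'     General format. Same as 'g' except it switches to 'E' if the number gets too large.
'n'     Number. Same as 'g', except that it uses the current locale setting to insert the appropriate number separator characters.
'%'     Percentage. Multiplies the number by 100 and displays in fixed ('f') format, followed by a percent sign.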
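The floating-point presentation types can be sketched as below. The sample values are assumptions chosen for illustration.

```python
x = 1234.5678

sci = "{:e}".format(x)             # scientific notation, default precision 6
sci_upper = "{:.2E}".format(x)     # upper-case exponent marker
fixed = "{:.1f}".format(x)         # fixed-point, one digit after the point
general = "{:g}".format(x)         # general format, six significant digits
percent = "{:.1%}".format(0.256)   # multiplied by 100, followed by %
from_int = "{:.2f}".format(7)      # the integer is converted with float() first

print(sci, sci_upper, fixed, general, percent, from_int)
# 1.234568e+03 1.23E+03 1234.6 1234.57 25.6% 7.00
```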

For the remaining string formatting options, see the string module page ("Common string operations") of the Python 3.11.0 documentation.

Built-in format() function

In addition to the string’s format method, a single object may also be formatted with the
format(object, format_spec) built-in function (which the method uses internally).
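A minimal sketch of the built-in function follows. It takes the object and the format_spec without the surrounding braces and colon. The sample values are assumptions.

```python
hex_value = format(255, "#x")          # same spec as "{:#x}"
rounded = format(3.14159, ".2f")
centered = format("Python", "*^10")
grouped = format(1234567, ",")

# The built-in gives the same result as the str.format() method.
same = format(255, "#x") == "{:#x}".format(255)

print(hex_value, rounded, centered, grouped, same)
# 0xff 3.14 **Python** 1,234,567 True
```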
Practice Questions
★ Take the string How are you? and find the following substrings using slicing: are, you,
uoy (reverse of you), and you using negative indexes.
★ Take a string as input from the console. Write a program to get the following
substrings:
input: Python
output: Pyon, the first and last two characters only
output: ytho, remove the first and last character
output: nythoP, interchange the first and last character
★ Take two strings as input and get the new string as follows:
str1: Hello
str2: All
output: HelloAllAllHello
★ Take a string and a number n as input and remove the nth character from the string
and print it.
str: Python Programming
n: 8
output: Python Prgramming
