
An Interviewer's Favorite Question - "How Are Python Strings Stored in Internal Memory" - by Shubh Patni - Better Programmin

Uploaded by

Ammar Ahmad Khan
Copyright
© All Rights Reserved
An Interviewer’s Favorite Question: “How Are Python Strings Stored in Internal Memory”
Understand the internal implementation of Python strings

Shubh Patni

Published in Better Programming
5 min read · Aug 11, 2021

Photo by Ryan Wallace on Unsplash


This article was co-authored with Muhammad Abutahir. You can find him on
LinkedIn and Instagram.

Strings! They are a favorite topic of programming interviewers and are loved
by nearly everyone who starts programming, whatever language they choose.
Playing with strings is interesting, but do you know how Python stores them
internally?

What if I asked you, “Are duplicates allowed in strings?” Most of you would
say yes and give an example like “Mommy,” where the character ‘m’ repeats.
But is that really the case?

In this article, I will give you a very clear picture of how strings are stored
internally inside memory, and I promise your perspective will change
completely regarding strings.

One important piece of advice that I would like to give to the readers is that
understanding a programming language from a memory perspective is the
most efficient way of learning a programming language! I bet you’ll hardly
forget the core concepts of programming once you try this out.

With that said, let’s move on to the actual topic.

A string in Python is just a sequence of Unicode characters enclosed within
quotes. Remember that in Python there can be single quotes, double quotes, or
even triple single or triple double quotes.
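
Here is a minimal sketch of the four quoting styles. The variable names are only for illustration; the point is that every style produces an ordinary str with the same value.

```python
# Four ways to write the same string literal.
single = 'Hello world'
double = "Hello world"
triple_single = '''Hello world'''
triple_double = """Hello world"""

# Whatever quotes are used, the result is a str with the same contents.
all_equal = single == double == triple_single == triple_double
print(all_equal)              # True
print(type(single).__name__)  # str
```

Triple quotes are mostly useful when the text spans several lines or contains both kinds of quote characters.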

When it comes to Python, strings are extremely efficient in terms of memory
cost. So let’s understand the reason!

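If you dig deeper, it turns out that Python (specifically CPython, the
reference implementation) keeps an ‘interned dictionary.’ Conceptually, it’s a
simple dictionary that stores a string (in the examples here, a single
character) as the key and the address of the one shared object for it as the
value. Let’s understand this with the help of an example: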

s = "Hello world"
In the above line, I created a string Hello world and stored it in a variable
called s. Abstractly, we can visualize how it is represented in memory as
shown below.

Image created by author ‘Muhammad Abutahir’: Representation of the object in memory
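
If you want to poke at interning yourself, the standard library exposes it through sys.intern. The sketch below is an illustration, not a picture of everything Python does internally. The strings are built at runtime with str.join (my choice, so that each one starts as its own object), and the names a, b, ia, and ib are hypothetical.

```python
import sys

# Build two equal strings at runtime so each starts life as its own object.
a = "".join(["Hello", " ", "world"])
b = "".join(["Hello", " ", "world"])

equal_values = a == b   # the contents match

# sys.intern looks the string up in the table of interned strings and
# returns the one stored copy, adding it first if it is missing.
ia = sys.intern(a)
ib = sys.intern(b)
same_object = ia is ib  # both names now refer to the one interned object

print(equal_values, same_object)  # True True
```

Whether a and b themselves start out as the same object is an implementation detail, so the sketch only relies on what sys.intern documents: equal strings come back as one object.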

Now let’s see what actually happens internally and how an interned
dictionary works. Let me give you an example by creating a single character
string s1 and assigning it to a new variable s2 .

s1 = 'A'  # single-character string
s2 = s1
print(s1)  # A
print(s2)  # A
print(id(s1))  # 123 (assumed id; the real value varies)
print(id(s2))  # 123 (same id as s1)
Image created by author ‘Muhammad Abutahir’: Showing string interning

Okay! Let’s break down the above image. When we create the first string s1 , a
string object is created in memory, and then the process of string interning
starts. Python first looks in the interned dictionary to check whether the
character ‘A’ exists. Because the dictionary is initially empty, a new
key-value pair is created: the character ‘A’ is set as the key, and the
location of the object in which it resides, 123 , is set as the value.

Note: 123 is just an assumed id.

In the next step, when we assign the string s1 to s2 , the address held by
s1 is copied to s2 , so s2 points to the same object and no new string is
created! We call this a reference-type assignment.
Image created by author ‘Muhammad Abutahir’: the reference type assignment and string interning
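
The reference-type assignment can be checked with the is operator, which compares object identity rather than value. This is a small sketch using the same names as the example above; the rebinding to 'B' at the end is my own addition, to show that s1 is unaffected.

```python
s1 = 'A'
s2 = s1            # copies the reference, not the string

same_object = s1 is s2
same_id = id(s1) == id(s2)

s2 = 'B'           # rebinding s2 points it at another object; s1 is untouched
print(same_object, same_id, s1, s2)  # True True A B
```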

Okay, now that’s clear! But why did I tell you that strings are extremely
memory efficient? Here’s why.

Why Strings Are Extremely Memory Efficient


As we saw earlier, strings use an interned dictionary, and it’s very similar to a
normal dictionary. We know dictionaries don’t allow duplicate keys, so a key
should be unique! Now how does that concept apply here in strings?

Let’s understand it clearly with the help of a ‘multi-character’ string, shown
below:

s1 = 'Hello'  # multi-character string
s2 = 'World'
print(s1)  # Hello
print(s2)  # World
print(id(s1))  # 123123 (assumed id)
print(id(s2))  # 454545 (assumed id)

The above concept is very clear, but what would happen if I printed the ids of
a character that is common to both strings?
# Printing the ids of the character 'o'
print(id(s1[4]))  # 1004 (assumed id)
print(id(s2[1]))  # 1004 (same id)

Wow! That’s unusual, right? How can a character in different objects with
unique addresses have the same id?! Let’s understand it with the help of the
below figure:

Image created by author ‘Muhammad Abutahir’: The complete string interning process
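
One caution when reproducing this: comparing id() values of temporary objects can mislead, because an id may be reused once an object is freed. The sketch below keeps both one-character strings alive in the hypothetical names c1 and c2 and compares them with is. The shared result is what CPython gives for characters such as ‘o’; I am treating it as an implementation detail, not a language guarantee.

```python
s1 = 'Hello'
s2 = 'World'

# Keep both one-character strings alive so their ids cannot be recycled.
c1 = s1[4]
c2 = s2[1]

distinct_strings = s1 is not s2  # two separate string objects
equal_chars = c1 == c2           # both are 'o'
shared_char = c1 is c2           # one shared 'o' object on CPython

print(distinct_strings, equal_chars, shared_char)
```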

I started by creating the string s1 . It’s important to understand that the
process of string interning starts as the objects are created. A
multi-character string is a complete object, but as the figure above shows,
the individual characters are also objects with their own unique ids. (In
CPython, these one-character objects are what you get back when you index the
string, as in s1[4] ; the string itself keeps its characters in a compact
internal buffer.)

In the process of string interning, Python handles the individual characters
one at a time. It looks in the interned dictionary to see whether each
character is already present. If it is not, an object is created, and the
character (as the key) and its address (as the value) are stored in the
interned dictionary.

In the above image, our string starts with H, so Python looks in the
dictionary. Because it is empty, Python stores H as the key and its address as
the value. The same thing repeats for the following two letters, e and l.

The next letter is l again, so Python looks in the dictionary. As it is
already present, Python does not create a new object. Instead, it reuses the
address of the previous l for that index location, and the process continues.

The most interesting part is that this does not apply only within an
individual string! There is only one common interned dictionary, used by the
whole Python program. Thus, even if the strings are stored in different
variables, they all share the same addresses for the characters present in the
interned dictionary, which is what saves memory. As for duplicates: from a
memory perspective, a repeated character is not stored again as a separate
object.
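
The lookup described above can be modeled with an ordinary dictionary. This is a toy model of the figure, not CPython’s actual implementation: the function intern_characters and the dictionary named table are hypothetical names of my own, and the table stores each character object itself in place of a numeric address.

```python
def intern_characters(text, table):
    """Return one shared one-character string per position in text."""
    slots = []
    for ch in text:
        # setdefault stores ch the first time it is seen and otherwise
        # returns the object that is already in the table.
        slots.append(table.setdefault(ch, ch))
    return slots


table = {}  # one table shared by every string in the program
hello = intern_characters("Hello", table)
world = intern_characters("World", table)

print(len(table))              # 7
print(sorted(table))           # ['H', 'W', 'd', 'e', 'l', 'o', 'r']
print(hello[2] is hello[3])    # True
print(hello[4] is world[1])    # True
```

Ten characters went in, but only seven entries exist, because both l’s in Hello share one entry and World reuses the existing l and o.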

Summary
In this article, I discussed the internal implementation of strings and the
process of string interning in Python. As I have mentioned before,
understanding a programming language from its memory perspective is the
secret of mastering the fundamental concepts of that language.

String interning is a process of ensuring that only a single memory location is
allocated for a single unique character, and in the future, if the same
character occurs, then it will return the previously stored address.
