2.4. An Anagram Detection Example - Problem Solving With Algorithms and Data Structures

This document discusses four algorithms for detecting if two strings are anagrams (rearrangements of the same letters) with different computational complexities: 1. Checking each character takes O(n^2) time as each of n characters is checked against up to n characters. 2. Sorting then comparing takes time equal to the sorting algorithm, which is O(n log n) or O(n^2). 3. Generating all possible strings takes n! time which grows extremely quickly, making it impractical. 4. Counting character frequencies takes O(n) time, providing a linear-time solution.

Uploaded by

Err33

2.4. An Anagram Detection Example Problem Solving with Algori...

6/11/17, 2:56 PM

2.4. An Anagram Detection Example


A good example problem for showing algorithms with different orders of magnitude is the classic
anagram detection problem for strings. One string is an anagram of another if the second is simply a
rearrangement of the first. For example, 'heart' and 'earth' are anagrams. The strings 'python'
and 'typhon' are anagrams as well. For the sake of simplicity, we will assume that the two strings in
question are of equal length and that they are made up of symbols from the set of 26 lowercase
alphabetic characters. Our goal is to write a boolean function that will take two strings and return
whether they are anagrams.

2.4.1. Solution 1: Checking Off


Our first solution to the anagram problem will check to see that each character in the first string actually
occurs in the second. If it is possible to check off each character, then the two strings must be
anagrams. Checking off a character will be accomplished by replacing it with the special Python value
None. However, since strings in Python are immutable, the first step in the process will be to convert
the second string to a list. Each character from the first string can be checked against the characters in
the list and, if found, checked off by replacement. ActiveCode 1 shows this function.


def anagramSolution1(s1,s2):
    # Strings are immutable, so work on a list copy of s2
    alist = list(s2)

    pos1 = 0
    stillOK = True

    while pos1 < len(s1) and stillOK:
        # Scan the list for the current character of s1
        pos2 = 0
        found = False
        while pos2 < len(alist) and not found:
            if s1[pos1] == alist[pos2]:
                found = True
            else:
                pos2 = pos2 + 1

        if found:
            # Check the character off so it cannot be matched again
            alist[pos2] = None
        else:
            stillOK = False

        pos1 = pos1 + 1

    return stillOK

ActiveCode: 1 Checking Off (active5)


To analyze this algorithm, we need to note that each of the n characters in s1 will cause an iteration
through up to n characters in the list from s2. Each of the n positions in the list will be visited once to
match a character from s1. The number of visits then becomes the sum of the integers from 1 to n. We
stated earlier that this can be written as

$$\sum_{i=1}^{n} i = \frac{n(n+1)}{2} = \frac{1}{2}n^2 + \frac{1}{2}n$$

As n gets large, the $n^2$ term will dominate the $n$ term and the $\frac{1}{2}$ can be ignored. Therefore, this solution
is $O(n^2)$.

2.4.2. Solution 2: Sort and Compare


Another solution to the anagram problem will make use of the fact that even though s1 and s2 are
different, they are anagrams only if they consist of exactly the same characters. So, if we begin by
sorting each string alphabetically, from a to z, we will end up with the same string if the original two
strings are anagrams. ActiveCode 2 shows this solution. Again, in Python we can use the built-in sort
method on lists by simply converting each string to a list at the start.


def anagramSolution2(s1,s2):
    # Convert to lists so the built-in sort method can be used
    alist1 = list(s1)
    alist2 = list(s2)

    alist1.sort()
    alist2.sort()

    # Compare the sorted lists position by position
    pos = 0
    matches = True

    while pos < len(s1) and matches:
        if alist1[pos]==alist2[pos]:
            pos = pos + 1
        else:
            matches = False

    return matches

print(anagramSolution2('abcde','edcba'))

ActiveCode: 2 Sort and Compare (active6)

At first glance you may be tempted to think that this algorithm is $O(n)$, since there is one simple iteration
to compare the n characters after the sorting process. However, the two calls to the Python sort
method are not without their own cost. As we will see in a later chapter, sorting is typically either
$O(n^2)$ or $O(n \log n)$, so the sorting operations dominate the iteration. In the end, this algorithm will
have the same order of magnitude as that of the sorting process.
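
For comparison, the same idea can be written more compactly. This is a sketch under the assumption that the built-in sorted function is acceptable in place of the explicit loop; the name anagram_by_sorting is hypothetical. Since list equality already compares element by element, the sort remains the dominant cost.

```python
def anagram_by_sorting(s1, s2):
    """Sort-and-compare anagram test (hypothetical compact variant).

    sorted() returns a new sorted list of characters; list equality then
    performs the position-by-position comparison.
    """
    return sorted(s1) == sorted(s2)


print(anagram_by_sorting('abcde', 'edcba'))
print(anagram_by_sorting('heart', 'earth'))
```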

2.4.3. Solution 3: Brute Force


A brute force technique for solving a problem typically tries to exhaust all possibilities. For the
anagram detection problem, we can simply generate a list of all possible strings using the characters
from s1 and then see if s2 occurs. However, there is a difficulty with this approach. When generating
all possible strings from s1, there are $n$ possible first characters, $n-1$ possible characters for the
second position, $n-2$ for the third, and so on. The total number of candidate strings is
$n \times (n-1) \times (n-2) \times \cdots \times 3 \times 2 \times 1$, which is $n!$. Although some of the strings may be duplicates, the
program cannot know this ahead of time and so it will still generate $n!$ different strings.

It turns out that $n!$ grows even faster than $2^n$ as n gets large. In fact, if s1 were 20 characters long,
there would be 20! = 2,432,902,008,176,640,000 possible candidate strings. If we processed one
possibility every second, it would still take us 77,146,816,596 years to go through the entire list. This is
probably not going to be a good solution.
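
The text gives no listing for this solution. A minimal sketch might look like the following, assuming the standard library's itertools.permutations is used to generate the candidates; the function name anagram_brute_force is hypothetical. It is only practical for very short strings. The last lines recompute the figures quoted above, assuming a 365-day year.

```python
from itertools import permutations
from math import factorial


def anagram_brute_force(s1, s2):
    """Brute-force anagram test: try every ordering of the characters of s1.

    Illustrative only; up to n! candidates are generated, duplicates included.
    """
    for candidate in permutations(s1):
        if ''.join(candidate) == s2:
            return True
    return False


print(anagram_brute_force('heart', 'earth'))

# Number of candidate strings for a 20-character s1, and the years needed
# at one candidate per second (assuming 365-day years).
candidates = factorial(20)
seconds_per_year = 365 * 24 * 60 * 60
print(candidates, candidates // seconds_per_year)
```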

2.4.4. Solution 4: Count and Compare


Our final solution to the anagram problem takes advantage of the fact that any two anagrams will have
the same number of a's, the same number of b's, the same number of c's, and so on. In order to decide
whether two strings are anagrams, we will first count the number of times each character occurs.
Since there are 26 possible characters, we can use a list of 26 counters, one for each possible
character. Each time we see a particular character, we will increment the counter at that position. In the
end, if the two lists of counters are identical, the strings must be anagrams. ActiveCode 3 shows this
solution.


def anagramSolution4(s1,s2):
    # One counter per lowercase letter for each string
    c1 = [0]*26
    c2 = [0]*26

    # Count the characters of s1
    for i in range(len(s1)):
        pos = ord(s1[i])-ord('a')
        c1[pos] = c1[pos] + 1

    # Count the characters of s2
    for i in range(len(s2)):
        pos = ord(s2[i])-ord('a')
        c2[pos] = c2[pos] + 1

    # Compare the two lists of counts
    j = 0
    stillOK = True
    while j<26 and stillOK:
        if c1[j]==c2[j]:
            j = j + 1
        else:
            stillOK = False

    return stillOK

print(anagramSolution4('apple','pleap'))

ActiveCode: 3 Count and Compare (active7)

Again, the solution has a number of iterations. However, unlike the first solution, none of them are
nested. The first two iterations used to count the characters are both based on n. The third iteration,
comparing the two lists of counts, takes at most 26 steps since there are 26 possible characters in the
strings. Adding it all up gives us at most $T(n) = 2n + 26$ steps. That is $O(n)$. We have found a linear order of
magnitude algorithm for solving this problem.
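
The step count can be checked with an instrumented variant. This is a sketch only: the name count_and_compare_steps and the choice of what counts as one step (one loop iteration) are assumptions for illustration. When the strings are anagrams, all 26 counters are compared and the total is exactly 2n + 26. A mismatch ends the comparison early and gives a smaller total.

```python
def count_and_compare_steps(s1, s2):
    """Count-and-compare anagram test that also tallies loop iterations.

    Hypothetical helper; returns (is_anagram, steps). Assumes lowercase a-z.
    """
    c1 = [0] * 26
    c2 = [0] * 26
    steps = 0

    for ch in s1:                     # n steps
        c1[ord(ch) - ord('a')] += 1
        steps += 1

    for ch in s2:                     # n steps
        c2[ord(ch) - ord('a')] += 1
        steps += 1

    is_anagram = True
    for j in range(26):               # at most 26 steps
        steps += 1
        if c1[j] != c2[j]:
            is_anagram = False
            break

    return is_anagram, steps


n = len('apple')
print(count_and_compare_steps('apple', 'pleap'), 2 * n + 26)
```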

Before leaving this example, we need to say something about space requirements. Although the last
solution was able to run in linear time, it could only do so by using additional storage to keep the two
lists of character counts. In other words, this algorithm sacrificed space in order to gain time.

This is a common occurrence. On many occasions you will need to make decisions between time and
space trade-offs. In this case, the amount of extra space is not significant. However, if the underlying
alphabet had millions of characters, there would be more concern. As a computer scientist, when given
a choice of algorithms, it will be up to you to determine the best use of computing resources given a
particular problem.
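
One way to limit the extra space when the alphabet is very large is to store counts only for characters that actually occur. The following is a sketch of that alternative, not something taken from the text: the function name anagram_sparse_counts is hypothetical, and a dictionary replaces the fixed list of 26 counters. The extra storage then grows with the number of distinct characters in s1 rather than with the size of the alphabet.

```python
def anagram_sparse_counts(s1, s2):
    """Count-and-compare using a single dictionary of counts.

    Hypothetical variant for large alphabets: counts go up for s1 and
    down for s2, and only characters seen in s1 are ever stored.
    """
    counts = {}
    for ch in s1:
        counts[ch] = counts.get(ch, 0) + 1

    for ch in s2:
        remaining = counts.get(ch, 0)
        if remaining == 0:
            return False          # s2 has a character s1 cannot supply
        counts[ch] = remaining - 1

    # Equal lengths guarantee that every count has returned to zero
    return len(s1) == len(s2)


print(anagram_sparse_counts('apple', 'pleap'))
```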

Self Check

Q-1: Given the following code fragment, what is its Big-O running time?

test = 0
for i in range(n):
    for j in range(n):
        test = test + i * j

(A) O(n)

(B) O(n^2)

(C) O(log n)

(D) O(n^3)

Q-2: Given the following code fragment, what is its Big-O running time?

test = 0
for i in range(n):
    test = test + 1

for j in range(n):
    test = test - 1

(A) O(n)

(B) O(n^2)

(C) O(log n)

(D) O(n^3)

Q-3: Given the following code fragment, what is its Big-O running time?

i = n
while i > 0:
    k = 2 + 2
    i = i // 2

(A) O(n)

(B) O(n^2)

(C) O(log n)

(D) O(n^3)