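You can use a defaultdict to tally every substring that starts at each position in the input string. The getsubs function is a generator: for a given start position it first yields the whole remainder of the string, then a progressively shorter substring on each iteration until nothing is left. Any substring tallied at least twice is a repetition, and the longest of those is the result.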
Example
from collections import defaultdict

def getsubs(loc, s):
    # Yield every substring of s that starts at index loc, longest first
    substr = s[loc:]
    i = -1
    while substr:
        yield substr
        substr = s[loc:i]
        i -= 1

def longestRepetitiveSubstring(r):
    occ = defaultdict(int)
    # tally all occurrences of all substrings
    for i in range(len(r)):
        for sub in getsubs(i, r):
            occ[sub] += 1
    # filter out all substrings with fewer than 2 occurrences
    filtered = [k for k, v in occ.items() if v >= 2]
    if filtered:
        maxkey = max(filtered, key=len)  # find the longest string
        return maxkey
    else:
        raise ValueError("no repetitions of any substring of '%s' with 2 or more occurrences" % (r))

longestRepetitiveSubstring("hellopeople18654randomtexthellopeoplefromallaroundthe world")
Output
In an interactive session, the call returns:
'hellopeople'
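To see what the generator and the tally are doing, it can help to run them on a much shorter input. The sketch below repeats getsubs so that it runs on its own; the input "banana" and the names occ and repeated are chosen here only for illustration. It also shows that overlapping occurrences are counted.

```python
from collections import defaultdict

def getsubs(loc, s):
    # Same generator as in the example: every substring starting at loc,
    # from the longest down to a single character
    substr = s[loc:]
    i = -1
    while substr:
        yield substr
        substr = s[loc:i]
        i -= 1

# Substrings of "banana" that start at index 2
print(list(getsubs(2, "banana")))  # ['nana', 'nan', 'na', 'n']

# Tally every substring of the illustrative input
occ = defaultdict(int)
for start in range(len("banana")):
    for sub in getsubs(start, "banana"):
        occ[sub] += 1

# Keep only substrings seen at least twice, then take the longest
repeated = {k: v for k, v in occ.items() if v >= 2}
print(max(repeated, key=len))  # ana
```

Here "ana" occurs at index 1 and at index 3, and those two occurrences share a character, so this approach treats overlapping matches as repetitions. Note also that every substring is stored as a dictionary key, so memory use grows quickly with the length of the input.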