Python - Group list by first character of string
Last Updated : 23 Mar, 2023
A common task is grouping strings by some property, such as their first letter. Problems of this kind resemble database GROUP BY queries and often come up in web development. This article focuses on grouping the strings of a list by their first character. Let's discuss several ways in which this can be done.
Method #1: Using next() + lambda + loop
This naive method combines next(), a small comparison helper, and a loop. The helper (written with def here, though a lambda would work equally well) checks whether two strings share the same initial character, and next() returns the first existing group that matches, or a new empty list when no group matches.
Python3
# Python3 code to demonstrate
# Initial Character Case Categorization
# using next() + lambda + loop
# initializing list
test_list = ['an', 'a', 'geek', 'for', 'g', 'free']
# printing original list
print("The original list : " + str(test_list))
# using next() + lambda + loop
# Initial Character Case Categorization
def util_func(x, y): return x[0] == y[0]
res = []
for sub in test_list:
    # first existing group whose first word shares sub's initial, else a new list
    ele = next((x for x in res if util_func(sub, x[0])), [])
    if ele == []:
        res.append(ele)
    ele.append(sub)
# print result
print("The list after Categorization : " + str(res))
Output : The original list : ['an', 'a', 'geek', 'for', 'g', 'free']
The list after Categorization : [['an', 'a'], ['geek', 'g'], ['for', 'free']]
Time Complexity: O(n^2), where n is the number of elements in the list.
Auxiliary Space: O(n), where n is the number of elements in the list.
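The role of next() in this method is easy to miss: called with a second argument, it returns the first item produced by the generator, or the default when the generator yields nothing. A minimal sketch of that behavior follows; the helper name find_group is hypothetical and is not part of the code above.

```python
def find_group(groups, word):
    """Return the first group whose first word shares word's initial, else None."""
    return next((g for g in groups if g[0][0] == word[0]), None)

groups = [['an', 'a'], ['geek', 'g']]

match = find_group(groups, 'go')     # a group starting with 'g' exists
missing = find_group(groups, 'for')  # no group starts with 'f', so the default is returned

print(match)
print(missing)
```

Because every lookup scans the groups built so far, the cost grows with the number of distinct first characters, which is where the quadratic worst case comes from.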
Method #2: Using sorted() + groupby()
This task can also be solved with itertools.groupby(), which groups consecutive elements that share the same key. Because only adjacent elements are grouped, the list is first sorted by initial character with sorted() and then fed to groupby().
Python3
# Python3 code to demonstrate
# Initial Character Case Categorization
# using sorted() + groupby()
from itertools import groupby
# initializing list
test_list = ['an', 'a', 'geek', 'for', 'g', 'free']
# printing original list
print("The original list : " + str(test_list))
# using sorted() + groupby()
# Initial Character Case Categorization
def util_func(x): return x[0]
temp = sorted(test_list, key=util_func)
res = [list(ele) for i, ele in groupby(temp, util_func)]
# print result
print("The list after Categorization : " + str(res))
Output : The original list : ['an', 'a', 'geek', 'for', 'g', 'free']
The list after Categorization : [['an', 'a'], ['for', 'free'], ['geek', 'g']]
Time complexity: O(nlogn), where n is the length of the input list.
Auxiliary space: O(n), where n is the length of the input list.
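To see why the sorting step matters, here is a small sketch comparing groupby() on the unsorted and the sorted list. The helper name first_char is an assumption made for illustration; it plays the same role as util_func above.

```python
from itertools import groupby

words = ['an', 'a', 'geek', 'for', 'g', 'free']

def first_char(word):
    return word[0]

# groupby() only merges adjacent items, so an unsorted list yields fragmented groups
unsorted_groups = [list(g) for _, g in groupby(words, key=first_char)]

# sorting by the same key first brings equal initials next to each other
sorted_groups = [list(g) for _, g in groupby(sorted(words, key=first_char), key=first_char)]

print(unsorted_groups)
print(sorted_groups)
```

Since sorted() is stable, words with the same initial keep their original relative order inside each group.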
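Method #3: Using for loop
This approach uses only loops. The first loop collects the distinct first characters in order of first appearance, and the second builds one group per character by scanning the list again.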
Python3
# Python3 code to demonstrate
# Initial Character Case Categorization
# initializing list
test_list = ['an', 'a', 'geek', 'for', 'g', 'free']
# printing original list
print("The original list : " + str(test_list))
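res = []
x = []
# collect the distinct first characters in order of first appearance
for i in test_list:
    if i[0] not in x:
        x.append(i[0])
# build one group per first character
for i in x:
    p = []
    for j in test_list:
        if j[0] == i:
            p.append(j)
    res.append(p)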
# print result
print("The list after Categorization : " + str(res))
Output : The original list : ['an', 'a', 'geek', 'for', 'g', 'free']
The list after Categorization : [['an', 'a'], ['geek', 'g'], ['for', 'free']]
Time complexity: O(n^2), where n is the length of the input list.
Auxiliary space: O(n), where n is the length of the input list.
Method #4: Using defaultdict
Python3
from collections import defaultdict
#Initializing list
test_list = ['an', 'a', 'geek', 'for', 'g', 'free']
#printing original list
print("The original list : " + str(test_list))
#Using defaultdict
res = defaultdict(list)
for i in test_list:
    res[i[0]].append(i)
#print result
print("The list after Categorization : " + str(list(res.values())))
#This code is contributed by Edula Vinay Kumar Reddy
Output : The original list : ['an', 'a', 'geek', 'for', 'g', 'free']
The list after Categorization : [['an', 'a'], ['geek', 'g'], ['for', 'free']]
Time Complexity: O(n) where n is the number of elements in test_list
Auxiliary Space: O(n) as we use a defaultdict to store the result
Explanation:
We use a defaultdict to store the result where the key is the first character of each string in the list and value is the list of all strings with the same first character.
The defaultdict automatically initializes the value as an empty list if the key is not present. So, we don't have to check if the key is present or not.
We iterate through the test_list and add the elements to the corresponding key in the defaultdict.
Finally, we convert the defaultdict values to a list to get the final result.
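The same idea can be wrapped in a reusable function. The sketch below is one possible variant, not the article's original code: the function name group_by_first_char and the ignore_case parameter are assumptions added for illustration. It also skips empty strings, since indexing an empty string with [0] raises an IndexError.

```python
from collections import defaultdict

def group_by_first_char(words, ignore_case=False):
    """Group words by their first character (hypothetical helper)."""
    groups = defaultdict(list)
    for word in words:
        if not word:
            continue  # an empty string has no first character
        key = word[0].lower() if ignore_case else word[0]
        groups[key].append(word)
    return dict(groups)  # plain dict, keys in order of first appearance

result = group_by_first_char(['an', 'Apple', '', 'geek', 'a'], ignore_case=True)
print(result)
```

Returning a plain dict keeps the first characters available as keys; list(result.values()) gives the same list-of-lists shape as the other methods.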
Method #5: Using dictionary comprehension
Use dictionary comprehension to categorize the words based on their first character.
- First, we create a set of unique first characters by passing a list comprehension to set(): set([word[0] for word in test_list]). Because a set is unordered, the order of the keys in the result is not guaranteed and may differ between runs.
- Next, we use a dictionary comprehension where the keys are the first characters and the values are the lists of words starting with that character: {char: [word for word in test_list if word.startswith(char)] for char in set([word[0] for word in test_list])}.
- Finally, we print the result.
Python3
# initializing list
test_list = ['an', 'a', 'geek', 'for', 'g', 'free']
# printing original list
print("The original list : " + str(test_list))
# using dictionary comprehension
# Initial Character Case Categorization
res = {char: [word for word in test_list if word.startswith(char)] for char in set([word[0] for word in test_list])}
# print result
print("The list after Categorization : " + str(res))
Output : The original list : ['an', 'a', 'geek', 'for', 'g', 'free']
The list after Categorization : {'g': ['geek', 'g'], 'f': ['for', 'free'], 'a': ['an', 'a']}
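Time complexity: O(n * k), where n is the length of the input list and k is the number of distinct first characters, because the list is scanned once per character. In the worst case, where every string starts with a different character, this becomes O(n^2). Each startswith() call with a single character takes constant time.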
Auxiliary space: O(n), where n is the length of the input list. This is because we create a dictionary where each key has a list of words starting with that character.
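If a stable key order is wanted, one option is to collect the first characters with dict.fromkeys() instead of set(). This is a sketch of an alternative, not the method described above; it relies on dictionaries preserving insertion order (Python 3.7 and later), and the name first_chars is introduced here for illustration.

```python
test_list = ['an', 'a', 'geek', 'for', 'g', 'free']

# dict.fromkeys() removes duplicates while keeping the order of first appearance
first_chars = dict.fromkeys(word[0] for word in test_list)

res = {char: [word for word in test_list if word[0] == char] for char in first_chars}
print(res)
```

The time complexity is unchanged, since the list is still scanned once per distinct first character.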
Method #6: Using itertools.groupby() with sorted()
Use the itertools.groupby() function in combination with sorted() to group the words based on their first character.
Step-by-step approach:
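- First, sort the input list using sorted(). Unlike Method #2, no key is passed, so the strings are sorted in full and the words inside each group also end up in alphabetical order.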
- Then, use itertools.groupby() to group the words based on their first character. groupby() returns an iterator of (key, group) pairs, where key is the first character and group is an iterator of words starting with that character.
- Iterate over the (key, group) pairs, convert the group iterator to a list using list(), and append it to the result list res.
- Finally, print the result.
Below is the implementation of the above approach:
Python3
import itertools
# initializing list
test_list = ['an', 'a', 'geek', 'for', 'g', 'free']
# printing original list
print("The original list : " + str(test_list))
# using itertools.groupby() with sorted()
# Initial Character Case Categorization
res = []
for k, g in itertools.groupby(sorted(test_list), key=lambda x: x[0]):
    res.append(list(g))
# print result
print("The list after Categorization : " + str(res))
OutputThe original list : ['an', 'a', 'geek', 'for', 'g', 'free']
The list after Categorization : [['a', 'an'], ['for', 'free'], ['g', 'geek']]
Time complexity: O(n log n), where n is the length of the input list. This is because we use sorted() which has a time complexity of O(n log n).
Auxiliary space: O(n), where n is the length of the input list. This is because we create a list of lists, where each inner list contains the words starting with the same character.
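The key returned by groupby() is discarded in the loop above. As a hedged sketch, it can be kept by building a dictionary instead, which also makes the effect of the sort key visible. The names by_key and full_sort are assumptions introduced for this example.

```python
from itertools import groupby

test_list = ['an', 'a', 'geek', 'for', 'g', 'free']

def initial(word):
    return word[0]

# sorting by first character only keeps the original order within each group
by_key = {k: list(g) for k, g in groupby(sorted(test_list, key=initial), key=initial)}

# sorting the full strings also orders the words inside each group
full_sort = {k: list(g) for k, g in groupby(sorted(test_list), key=initial)}

print(by_key)
print(full_sort)
```

Which variant fits depends on whether the original order of the words within a group needs to be preserved.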