String manipulation is one of the critical components of text data analysis. While analyzing text data, we might need to count the frequency of characters in the text. In this article, we will discuss different approaches to count occurrences of each character in a string in Python.
- Using For Loop and set() Function to Count Occurrences of Each Character in a String
- Count Occurrences of Each Character in a String Using the count() Method in Python
- Using A Python Dictionary to Count Occurrences of Each Character in a String
- Count Occurrences of Each Character in a String Using collections.Counter() Function
- Using collections.defaultdict()
- Conclusion
Using For Loop and set() Function to Count Occurrences of Each Character in a String
We can use a for loop and the set() function to count the occurrences of each character in a string in Python. For this, we will use the following steps.
- First, we will create a set of all the characters present in the input string. For this, we will use the set() function. The set() function takes an iterable object as its input argument and returns a set of all the elements of the iterable object.
- After creating the set of characters, we will use nested for loops to count the occurrences of each character in the string.
- In the outer for loop, we will iterate through the elements of the set. Inside the for loop, we will define a variable countOfChar and initialize it to 0.
- Then, we will iterate over the input string using another for loop.
- Inside the inner for loop, if we find the current character of the set in the string, we will increment countOfChar by 1. Otherwise, we will move to the next character in the string.
- After execution of the inner for loop, we will get the count of a single character. We will print it using a print statement. Then, we will move to the next character in the set using the outer for loop.
After execution of the for loops, the number of occurrences of each character in the string will be printed. You can observe this in the following example.
input_string = "Pythonforbeginners is a great source to get started with Python."
print("The input string is:", input_string)
# Collect the distinct characters of the string.
mySet = set(input_string)
for element in mySet:
    countOfChar = 0
    # Scan the whole string once for each distinct character.
    for character in input_string:
        if character == element:
            countOfChar += 1
    print("Count of character '{}' is {}".format(element, countOfChar))
Output:
The input string is: Pythonforbeginners is a great source to get started with Python.
Count of character 'o' is 5
Count of character 'a' is 3
Count of character 'c' is 1
Count of character 'e' is 6
Count of character 'd' is 1
Count of character 't' is 8
Count of character 'r' is 5
Count of character 'y' is 2
Count of character 'n' is 4
Count of character 'u' is 1
Count of character 's' is 4
Count of character 'g' is 3
Count of character 'w' is 1
Count of character '.' is 1
Count of character 'h' is 3
Count of character ' ' is 9
Count of character 'P' is 2
Count of character 'b' is 1
Count of character 'i' is 3
Count of character 'f' is 1
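The order of the lines in the output above can change from one run to another, because a set does not keep its elements in any particular order. If you need a stable order, one option is to sort the distinct characters before looping. The following is a minimal sketch of that idea; the function name count_in_sorted_order is hypothetical and not part of the original example.

```python
input_string = "Pythonforbeginners is a great source to get started with Python."


def count_in_sorted_order(text):
    """Return (character, count) pairs ordered by character (hypothetical helper)."""
    pairs = []
    # sorted() turns the unordered set into a list with a stable order.
    for element in sorted(set(text)):
        count_of_char = 0
        for character in text:
            if character == element:
                count_of_char += 1
        pairs.append((element, count_of_char))
    return pairs


for element, count_of_char in count_in_sorted_order(input_string):
    print("Count of character '{}' is {}".format(element, count_of_char))
```

With this change, the space character is printed first and 'y' is printed last on every run, since characters are sorted by their code points.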
If you want to store the frequencies of the characters, you can use a Python dictionary. For storing the frequencies, we will first create an empty dictionary named countOfChars.
After calculating the count of a character, we will add the character as the key and the count as the value to the dictionary. You can observe this in the following code.
input_string = "Pythonforbeginners is a great source to get started with Python."
print("The input string is:", input_string)
mySet = set(input_string)
# Dictionary that maps each character to its count.
countOfChars = dict()
for element in mySet:
    countOfChar = 0
    for character in input_string:
        if character == element:
            countOfChar += 1
    # Store the finished count for this character.
    countOfChars[element] = countOfChar
print("Count of characters is:")
print(countOfChars)
Output:
The input string is: Pythonforbeginners is a great source to get started with Python.
Count of characters is:
{'s': 4, 'P': 2, 'b': 1, '.': 1, 'd': 1, 'c': 1, 'g': 3, 'r': 5, 'i': 3, 'o': 5, 'u': 1, 'a': 3, 'f': 1, 'e': 6, 'n': 4, 'y': 2, ' ': 9, 'w': 1, 't': 8, 'h': 3}
Count Occurrences of Each Character in a String Using the count() Method in Python
The count() method of a string is used to count the frequency of a character in the string. When invoked on a string, the count() method takes a character (or, more generally, any substring) as its input argument. After execution, it returns the number of non-overlapping occurrences of the input argument in the string.
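As a quick illustration of the count() method on its own, the following sketch calls it with a single character, with a longer substring, and with characters that do not occur in the string. The variable names are chosen only for this example.

```python
input_string = "Pythonforbeginners is a great source to get started with Python."

# Count a single character.
count_of_t = input_string.count("t")

# count() also accepts a longer substring.
count_of_python = input_string.count("Python")

# The comparison is case-sensitive, and a missing character gives 0.
count_of_lowercase_p = input_string.count("p")
count_of_z = input_string.count("z")

print(count_of_t, count_of_python, count_of_lowercase_p, count_of_z)
```

Note that count() never raises an error for a character that is absent. It simply returns 0.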
To count the occurrences of each character in a string using the count() method, we will use the following steps.
- First, we will create a set of characters in the input string using the set() function.
- After that, we will iterate through the elements of the set using a for loop.
- Inside the for loop, we will invoke the count() method on the input string with the current element of the set as its input argument. After execution, the count() method will return the number of occurrences of the current element of the set. We will print the value using the print statement.
After execution of the for loop, the frequency of all the characters will be printed. You can observe this in the following example.
input_string = "Pythonforbeginners is a great source to get started with Python."
print("The input string is:", input_string)
mySet = set(input_string)
for element in mySet:
    # count() scans the string and returns the frequency of element.
    countOfChar = input_string.count(element)
    print("Count of character '{}' is {}".format(element, countOfChar))
Output:
The input string is: Pythonforbeginners is a great source to get started with Python.
Count of character 'e' is 6
Count of character 'n' is 4
Count of character 'a' is 3
Count of character '.' is 1
Count of character 'h' is 3
Count of character 'r' is 5
Count of character 'f' is 1
Count of character 'y' is 2
Count of character 's' is 4
Count of character 't' is 8
Count of character 'w' is 1
Count of character 'i' is 3
Count of character 'd' is 1
Count of character 'g' is 3
Count of character 'u' is 1
Count of character 'c' is 1
Count of character 'o' is 5
Count of character 'P' is 2
Count of character 'b' is 1
Count of character ' ' is 9
You can also store the frequency of the characters in a dictionary as shown below.
input_string = "Pythonforbeginners is a great source to get started with Python."
print("The input string is:", input_string)
mySet = set(input_string)
countOfChars = dict()
for element in mySet:
    countOfChar = input_string.count(element)
    # Store the count with the character as the key.
    countOfChars[element] = countOfChar
print("Count of characters is:")
print(countOfChars)
Output:
The input string is: Pythonforbeginners is a great source to get started with Python.
Count of characters is:
{'t': 8, 'o': 5, 'P': 2, 'n': 4, 'f': 1, 'e': 6, 'g': 3, 'c': 1, '.': 1, 's': 4, 'w': 1, 'y': 2, ' ': 9, 'u': 1, 'i': 3, 'd': 1, 'a': 3, 'h': 3, 'r': 5, 'b': 1}
The above approaches have high time complexity. If there are N distinct characters in the string and the string length is M, the time complexity of the execution will be of the order of M*N. Therefore, using these approaches is not advised if you have to analyze strings of thousands of characters. For that, we can use other approaches discussed in the following sections.
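To make the M*N estimate concrete, the sketch below counts how many character comparisons the nested-loop approach performs and how many loop iterations a single pass over the string needs. The counters comparisons and updates are added only for illustration and are not part of the original examples.

```python
input_string = "Pythonforbeginners is a great source to get started with Python."

M = len(input_string)        # length of the string
N = len(set(input_string))   # number of distinct characters

# Nested-loop approach: one full scan of the string per distinct character.
comparisons = 0
nested_counts = {}
for element in set(input_string):
    count = 0
    for character in input_string:
        comparisons += 1
        if character == element:
            count += 1
    nested_counts[element] = count

# Single-pass approach: each character of the string is visited once.
updates = 0
single_pass_counts = {}
for character in input_string:
    updates += 1
    if character in single_pass_counts:
        single_pass_counts[character] += 1
    else:
        single_pass_counts[character] = 1

print("M =", M, "N =", N)
print("Comparisons in nested loops:", comparisons)
print("Iterations in a single pass:", updates)
```

For the example string, M is 64 and N is 20, so the nested loops perform 64 * 20 = 1280 comparisons, while the single pass needs only 64 iterations. Both approaches produce the same counts.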
Using A Python Dictionary to Count Occurrences of Each Character in a String
A dictionary in Python stores key-value pairs. To count occurrences of each character in a string in Python using a dictionary, we will use the following approach.
- First, we will create an empty dictionary named countOfChars to store the characters and their frequency.
- Now, we will iterate over the input string using a for loop.
- During iteration, we will check if the current character is present in the dictionary using the membership operator.
- If the character is present in the dictionary, we will increment the value associated with the character by 1. Otherwise, we will add the character as a key in the dictionary with 1 as its associated value.
After execution of the for loop, we will get the count of each character in the countOfChars dictionary. You can observe this in the following example.
input_string = "Pythonforbeginners is a great source to get started with Python."
print("The input string is:", input_string)
countOfChars = dict()
for character in input_string:
    if character in countOfChars:
        # Character seen before: increment its count.
        countOfChars[character] += 1
    else:
        # First occurrence: start the count at 1.
        countOfChars[character] = 1
print("The count of characters in the string is:")
print(countOfChars)
Output:
The input string is: Pythonforbeginners is a great source to get started with Python.
The count of characters in the string is:
{'P': 2, 'y': 2, 't': 8, 'h': 3, 'o': 5, 'n': 4, 'f': 1, 'r': 5, 'b': 1, 'e': 6, 'g': 3, 'i': 3, 's': 4, ' ': 9, 'a': 3, 'u': 1, 'c': 1, 'd': 1, 'w': 1, '.': 1}
Instead of using the if-else statement, you can use Python try-except blocks to count the occurrences of the characters in the string.
- Inside the for loop, we will increment the value associated with the current character in the dictionary by 1 in the try block. If the character doesn’t exist in the dictionary, the program will raise a KeyError exception.
- In the except block, we will catch the KeyError exception. Here, we will assign the character to the dictionary as a key with 1 as its associated value.
After execution of the for loop, we will get the count of each character in the countOfChars dictionary. You can observe this in the following example.
input_string = "Pythonforbeginners is a great source to get started with Python."
print("The input string is:", input_string)
countOfChars = dict()
for character in input_string:
    try:
        # Raises KeyError the first time a character is seen.
        countOfChars[character] += 1
    except KeyError:
        countOfChars[character] = 1
print("The count of characters in the string is:")
print(countOfChars)
Output:
The input string is: Pythonforbeginners is a great source to get started with Python.
The count of characters in the string is:
{'P': 2, 'y': 2, 't': 8, 'h': 3, 'o': 5, 'n': 4, 'f': 1, 'r': 5, 'b': 1, 'e': 6, 'g': 3, 'i': 3, 's': 4, ' ': 9, 'a': 3, 'u': 1, 'c': 1, 'd': 1, 'w': 1, '.': 1}
The approach using the try-except block works best if we have a large input string with very few distinct characters compared to the length of the string. In that case, the KeyError exception is raised only once for each distinct character, so it occurs rarely. If the input string is small, or its length is not much larger than the number of distinct characters, the approach will be slower, because handling exceptions is a costly operation.
If the program raises the KeyError exception very frequently, it will degrade the performance of the program. So, you should choose between using the if-else statement or the try-except blocks according to the input string length and the number of distinct characters in the string.
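If you want to check this trade-off on your own data, you can time both variants with the timeit module. The following is a rough sketch and not a rigorous benchmark. The function names, the repeated test string, and the number of repetitions are all assumptions made for this illustration, and the measured times will differ from machine to machine.

```python
import timeit

input_string = "Pythonforbeginners is a great source to get started with Python."


def count_with_if_else(text):
    counts = {}
    for character in text:
        if character in counts:
            counts[character] += 1
        else:
            counts[character] = 1
    return counts


def count_with_try_except(text):
    counts = {}
    key_errors = 0  # how many times the except block ran
    for character in text:
        try:
            counts[character] += 1
        except KeyError:
            counts[character] = 1
            key_errors += 1
    return counts, key_errors


# A long string with few distinct characters relative to its length.
long_text = input_string * 200
_, key_errors = count_with_try_except(long_text)
print("Length:", len(long_text), "- KeyError raised", key_errors, "times")

if_else_seconds = timeit.timeit(lambda: count_with_if_else(long_text), number=50)
try_except_seconds = timeit.timeit(lambda: count_with_try_except(long_text), number=50)
print("if-else:    {:.4f} s".format(if_else_seconds))
print("try-except: {:.4f} s".format(try_except_seconds))
```

Here the except block runs only 20 times for a string of 12800 characters, which is the situation in which the try-except approach is expected to do well.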
You can also avoid using both the if else statements and the try-except blocks. For this, we need to use the following approach.
- First, we will create a set of characters in the original string using the set() function.
- Then, we will initialize the dictionary countOfChars using the elements of the set as the keys and 0 as the associated values.
- Now, we will iterate through the characters of the input string using a for loop. During iteration, we will increment the value associated with the current character in countOfChars by 1.
After executing the for loop, we will get the count of occurrences of each character in the string. You can observe this in the following example.
input_string = "Pythonforbeginners is a great source to get started with Python."
print("The input string is:", input_string)
mySet = set(input_string)
countOfChars = dict()
# Initialize the count of every distinct character to 0.
for element in mySet:
    countOfChars[element] = 0
# Every key already exists, so no membership check is needed.
for character in input_string:
    countOfChars[character] += 1
print("The count of characters in the string is:")
print(countOfChars)
Output:
The input string is: Pythonforbeginners is a great source to get started with Python.
The count of characters in the string is:
{'d': 1, 'r': 5, 'y': 2, 'a': 3, 'P': 2, 'i': 3, 's': 4, ' ': 9, 'f': 1, '.': 1, 'h': 3, 't': 8, 'g': 3, 'c': 1, 'u': 1, 'e': 6, 'n': 4, 'w': 1, 'o': 5, 'b': 1}
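Two built-in dictionary methods offer shorter ways to write the same idea. The sketch below is an alternative to the code above, not the approach the example uses: dict.fromkeys() creates all the keys with the value 0 in one call, and dict.get() returns a default value when a key is missing. The variable names are chosen only for this illustration.

```python
input_string = "Pythonforbeginners is a great source to get started with Python."

# Variant 1: dict.fromkeys() builds the zero-initialized dictionary directly.
# Duplicate characters in the string collapse into a single key.
counts_from_keys = dict.fromkeys(input_string, 0)
for character in input_string:
    counts_from_keys[character] += 1

# Variant 2: dict.get() supplies 0 when the character is not yet a key,
# so no initialization step is needed at all.
counts_with_get = {}
for character in input_string:
    counts_with_get[character] = counts_with_get.get(character, 0) + 1

print(counts_from_keys)
print(counts_with_get)
```

Both dictionaries hold the same counts, with the keys in the order in which the characters first appear in the string.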
Count Occurrences of Each Character in a String Using collections.Counter() Function
The collections module provides us with specialized container types that complement built-in collections like list, dict, and set. Counter is one of them. It is used to count the frequency of elements in a collection object.
Counter() takes an iterable object, such as a string, as its input argument. After execution, it returns a collections.Counter object. The Counter object is a subclass of the dictionary that contains all the characters as keys and their frequencies as the associated values.
To count the occurrences of each character in a string in Python, you can simply pass it to the Counter() function and print the output as shown in the following example.
from collections import Counter
input_string = "Pythonforbeginners is a great source to get started with Python."
print("The input string is:", input_string)
countOfChars = Counter(input_string)
print("The count of characters in the string is:")
print(countOfChars)
Output:
The input string is: Pythonforbeginners is a great source to get started with Python.
The count of characters in the string is:
Counter({' ': 9, 't': 8, 'e': 6, 'o': 5, 'r': 5, 'n': 4, 's': 4, 'h': 3, 'g': 3, 'i': 3, 'a': 3, 'P': 2, 'y': 2, 'f': 1, 'b': 1, 'u': 1, 'c': 1, 'd': 1, 'w': 1, '.': 1})
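Since a Counter object behaves like a dictionary, you can look up individual characters in it, and it also offers a few extra conveniences. The following short sketch shows some of them. The variable names are chosen only for this example.

```python
from collections import Counter

input_string = "Pythonforbeginners is a great source to get started with Python."
count_of_chars = Counter(input_string)

# Look up a single character like in a normal dictionary.
count_of_t = count_of_chars["t"]

# A missing character gives 0 instead of raising KeyError.
count_of_z = count_of_chars["z"]

# most_common(n) returns the n most frequent elements with their counts.
top_three = count_of_chars.most_common(3)

# Convert to a plain dictionary if you don't need the Counter features.
plain_dict = dict(count_of_chars)

print(count_of_t, count_of_z)
print(top_three)
```

For the example string, the three most frequent characters are the space, 't', and 'e', in that order.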
Using collections.defaultdict()
The collections module provides us with a dictionary object with enhanced features. This is called defaultdict. The defaultdict objects don’t raise the KeyError exception if you try to modify the value associated with a key that doesn’t exist in the dictionary. Instead, the defaultdict object creates the key by itself and then proceeds with the execution of the statement.
For instance, if there is no key named "Aditya" in a simple dictionary and we do the operation myDict["Aditya"] += 1, the program will run into a KeyError exception. On the other hand, a defaultdict object will first create the key "Aditya" in the dictionary and will successfully execute the above statement. However, we need to help the defaultdict object to create the default value for the key.
The defaultdict() function takes another function, say fun1, as its input argument. Whenever the defaultdict object needs to create a new key with a default value, it executes fun1 without any arguments and uses the value returned by fun1 as the associated value for the new key. In our case, we need the default value 0 for the count of a character. Calling int() without any arguments returns 0, so we will pass the int function as the input argument to the defaultdict() function.
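The difference between a simple dictionary and a defaultdict can be seen in a few lines. The following is a minimal sketch of the behavior described above. The flag key_error_raised and the second defaultdict with a lambda function are additions made only for illustration.

```python
from collections import defaultdict

# A simple dictionary raises KeyError for a missing key.
myDict = {}
try:
    myDict["Aditya"] += 1
    key_error_raised = False
except KeyError:
    key_error_raised = True

# A defaultdict calls int() to create the missing value, which gives 0,
# and then performs the increment.
myDefaultDict = defaultdict(int)
myDefaultDict["Aditya"] += 1

# Any function that takes no arguments can provide the default value.
startAtTen = defaultdict(lambda: 10)
startAtTen["Aditya"] += 1

print(key_error_raised)
print(myDefaultDict["Aditya"], startAtTen["Aditya"])
```

Keep in mind that merely reading a missing key from a defaultdict, for example with myDefaultDict["Chris"], also creates that key with the default value.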
To count the occurrences of each character in a string using the defaultdict object in Python, we will use the following steps.
- First, we will create a defaultdict object using the collections.defaultdict() function. Here, we will pass the int function as the input argument to the defaultdict() function.
- Then, we will iterate through the characters of the input string using a for loop.
- During iteration, we will keep incrementing the value associated with each character in the defaultdict object.
After execution of the for loop, we will get the count of each character in the defaultdict object. You can observe this in the following example.
from collections import defaultdict
input_string = "Pythonforbeginners is a great source to get started with Python."
print("The input string is:", input_string)
# Missing keys start at int(), which is 0.
countOfChars = defaultdict(int)
for character in input_string:
    countOfChars[character] += 1
print("The count of characters in the string is:")
print(countOfChars)
Output:
The input string is: Pythonforbeginners is a great source to get started with Python.
The count of characters in the string is:
defaultdict(<class 'int'>, {'P': 2, 'y': 2, 't': 8, 'h': 3, 'o': 5, 'n': 4, 'f': 1, 'r': 5, 'b': 1, 'e': 6, 'g': 3, 'i': 3, 's': 4, ' ': 9, 'a': 3, 'u': 1, 'c': 1, 'd': 1, 'w': 1, '.': 1})
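If you would rather not see the defaultdict wrapper in the printed output, you can convert the result to a simple dictionary with dict(). The sketch below does that and also checks that the result agrees with the one produced by Counter. The variable names are chosen only for this example.

```python
from collections import Counter, defaultdict

input_string = "Pythonforbeginners is a great source to get started with Python."

count_of_chars = defaultdict(int)
for character in input_string:
    count_of_chars[character] += 1

# dict() copies the keys and values into a simple dictionary.
plain_counts = dict(count_of_chars)
print(plain_counts)

# The counts are the same as those produced by Counter.
same_as_counter = plain_counts == dict(Counter(input_string))
print("Same result as Counter:", same_as_counter)
```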
Conclusion
In this article, we have discussed different ways to count occurrences of each character in a string in Python. Out of all these approaches, I would suggest you use the approaches using the collections module as these are the most efficient approaches.
I hope you enjoyed reading this article. To learn more about Python programming, you can read this article on dictionary comprehension in Python. You might also like this article on regression in machine learning. You can also have a look at this article on data analyst vs data scientist.
Stay tuned for more informative articles. Happy Learning!
The post Count Occurrences of Each Character in a String in Python appeared first on PythonForBeginners.com.