Tokenizing a String in C++



Tokenization refers to the process of breaking a string into smaller pieces (tokens) based on delimiters such as spaces, commas, semicolons, or any other character. Whitespace is the most common delimiter. In this article, we will discuss several ways to tokenize a string in C++.

For example, if we have a string like "Hello, World! Welcome to Tutorialspoint.", we can tokenize it like this:

string str = "Hello, World! Welcome to Tutorialspoint.";
tokens = {"Hello", "World", "Welcome", "to", "Tutorialspoint"};

Tokenizing a String in C++

In C++, we can tokenize a string using various methods. This article covers three common approaches: stringstream, strtok(), and iterators.

Using stringstream to Tokenize a String

The stringstream class in C++ is used to perform input and output operations on strings. To tokenize a string with it, pass the stringstream object, a destination string, and a delimiter character to the getline() function. Each call reads characters from the stream until it reaches the delimiter, which is then discarded. Calling getline() in a loop until the stream is exhausted extracts all tokens. Because only one delimiter character is used, any other punctuation stays attached to the tokens.

Example

In the code below, we use the getline() function in a while loop to read tokens from a stringstream object. The tokens are stored in a vector.

#include <iostream>
#include <sstream>
#include <vector>
using namespace std;

int main() {
    string str = "Hello, World! Welcome to C++.";
    stringstream ss(str);
    vector<string> tokens;
    string token;

    // using space as delimiter
    while (getline(ss, token, ' ')) {
        tokens.push_back(token);
    }

    for (const auto &word : tokens) {
        cout << word << endl;
    }
    return 0;
}

The output of the above code will be:

Hello,
World!
Welcome
to
C++.
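
The same getline() loop can be wrapped in a reusable function. The sketch below is illustrative only: the helper names split and splitWords are hypothetical and do not come from the article. It shows two behaviors worth knowing: getline() with a single delimiter keeps the empty tokens that appear between consecutive delimiters, whereas the stream extraction operator >> skips any run of whitespace.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` on one delimiter character.
// Consecutive delimiters produce empty tokens.
std::vector<std::string> split(const std::string &text, char delimiter) {
    std::vector<std::string> tokens;
    std::istringstream stream(text);
    std::string token;
    // std::getline reads up to the next delimiter and discards it.
    while (std::getline(stream, token, delimiter)) {
        tokens.push_back(token);
    }
    return tokens;
}

// Hypothetical helper: splits `text` on whitespace.
// operator>> skips leading whitespace, so no empty tokens are produced.
std::vector<std::string> splitWords(const std::string &text) {
    std::vector<std::string> tokens;
    std::istringstream stream(text);
    std::string token;
    while (stream >> token) {
        tokens.push_back(token);
    }
    return tokens;
}
```

For instance, split("a,b,,c", ',') returns four tokens, the third of which is empty, while splitWords("  Hello   World ") returns only "Hello" and "World".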

Using strtok to Tokenize a String

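The strtok() function is a C-style string tokenizer declared in <cstring>. Unlike getline(), it accepts a set of delimiter characters, and it modifies the original string by replacing delimiters with null characters, so it must be given a writable character array rather than a string literal. It also keeps internal state between calls, so two strings cannot be tokenized at the same time. Let's see how to use it in C++ with an example.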

Example

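In the code below, we use the strtok() function to tokenize a character array. The first call receives the array, and each later call passes nullptr to continue from where the previous call stopped. The tokens are stored in a vector.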

#include <iostream>
#include <cstring>
#include <string>
#include <vector>
using namespace std;

int main() {
    char str[] = "Hello, World! Welcome to C++.";
    vector<string> tokens;

    // using space, comma, period and exclamation as delimiters
    char *token = strtok(str, " ,.!");
    while (token != nullptr) {
        tokens.push_back(token);
        token = strtok(nullptr, " ,.!"); // continue tokenizing
    }

    for (const auto &word : tokens) {
        cout << word << endl;
    }
    return 0;
}

The output of the above code will be:

Hello
World
Welcome
to
C++
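
Because strtok() overwrites its input, one way to tokenize a std::string without damaging it is to run strtok() on a temporary copy. The following is a minimal sketch under that assumption. The helper name tokenizeWithStrtok is hypothetical, and the function inherits strtok()'s limitation of not being safe to use from several threads or in nested tokenizing loops.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: tokenizes a copy of `text`, leaving the caller's string intact.
std::vector<std::string> tokenizeWithStrtok(const std::string &text,
                                            const char *delimiters) {
    // strtok needs a writable, null-terminated buffer, so copy the characters.
    std::vector<char> buffer(text.begin(), text.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    for (char *token = std::strtok(buffer.data(), delimiters);
         token != nullptr;
         token = std::strtok(nullptr, delimiters)) {
        tokens.push_back(token); // copies the token out of the buffer
    }
    return tokens;
}
```

Calling tokenizeWithStrtok(str, " ,.!") on a std::string holding "Hello, World! Welcome to C++." yields the same five tokens as the example above, and str itself is left unchanged.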

Using Iterators to Tokenize a String

In C++, you can also use string iterators to walk over a string character by character and split it into tokens at the delimiters. This method is more flexible because you can treat any set of characters as delimiters.

Example

In the code below, we have used iterators to traverse through the string and extract tokens based on specified delimiters.

#include <iostream>
#include <string>
#include <vector>
#include <iterator>
using namespace std;

int main() {
    string str = "Hello, World! Welcome to C++.";
    vector<string> tokens;
    string::iterator it = str.begin();
    string::iterator end = str.end();
    string token;

    while (it != end) {
        if (*it == ' ' || *it == ',' || *it == '.' || *it == '!') {
            if (!token.empty()) {
                tokens.push_back(token);
                token.clear();
            }
        } else {
            token += *it;
        }
        ++it;
    }
    if (!token.empty()) { // add the last token
        tokens.push_back(token);
    }

    for (const auto &word : tokens) {
        cout << word << endl;
    }
    return 0;
}

The output of the above code will be:

Hello
World
Welcome
to
C++
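
To make the delimiter set configurable instead of hard-coding it in the if condition, the iterator loop can take the delimiters as a second string. This is a sketch rather than a definitive implementation. The helper name tokenize and its parameters are hypothetical and not part of the example above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: walks `text` with iterators and treats every
// character found in `delimiters` as a separator.
std::vector<std::string> tokenize(const std::string &text,
                                  const std::string &delimiters) {
    std::vector<std::string> tokens;
    std::string token;
    for (auto it = text.begin(); it != text.end(); ++it) {
        if (delimiters.find(*it) != std::string::npos) {
            // Delimiter reached: store the token collected so far, if any.
            if (!token.empty()) {
                tokens.push_back(token);
                token.clear();
            }
        } else {
            token += *it;
        }
    }
    if (!token.empty()) { // add the last token
        tokens.push_back(token);
    }
    return tokens;
}
```

With this version, tokenize("key=value;other=42", "=;") returns "key", "value", "other", and "42", and runs of consecutive delimiters never produce empty tokens.
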
Farhan Muhamed

No Code Developer, Vibe Coder

Updated on: 2025-06-13T18:51:05+05:30
