
Tokenizing a String in C++
Tokenization refers to the process of breaking a string into smaller pieces (tokens) at certain delimiters such as spaces, commas, semicolons, or any other character. White space is the most common delimiter. In this article, we will discuss several ways to tokenize a string in C++.
For example, if we have a string like "Hello, World! Welcome to Tutorialspoint.", we can tokenize it like this:
    string str = "Hello, World! Welcome to Tutorialspoint.";
    tokens = {"Hello", "World", "Welcome", "to", "Tutorialspoint"};
In C++, we can tokenize a string using various methods. Here are the common approaches we will be discussing:
- Using stringstream
- Using strtok
- Using a token iterator
Using stringstream to Tokenize a String
The stringstream class in C++ is used to perform input and output operations on strings. To tokenize a string with it, pass the stringstream object, a string to receive the token, and a delimiter character to the getline() function. getline() reads characters from the stream until it reaches the delimiter, which it discards. So, if you call it in a loop until the end of the stringstream, you extract all the tokens.
Example
In the code below, we use the getline() function in a while loop to read tokens from a stringstream object. The tokens are stored in a vector.
    #include <iostream>
    #include <sstream>
    #include <string>
    #include <vector>
    using namespace std;

    int main() {
        string str = "Hello, World! Welcome to C++.";
        stringstream ss(str);
        vector<string> tokens;
        string token;

        // Read up to the next space on each iteration
        while (getline(ss, token, ' ')) {
            tokens.push_back(token);
        }

        for (const auto &word : tokens) {
            cout << word << endl;
        }
        return 0;
    }
The output of the above code will be:
    Hello,
    World!
    Welcome
    to
    C++.
Using strtok to Tokenize a String
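The strtok() function, declared in <cstring>, is a C-style string tokenizer. Unlike stringstream, it modifies the original string by overwriting each delimiter it finds with a null character, so it must be given a writable character array rather than a string literal. The first call takes the string to tokenize; each later call passes nullptr to continue from where the previous call stopped. Let's see how to use it in C++ with an example.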
Example
In the code below, we use the strtok() function to tokenize a character array. The tokens are stored in a vector.
    #include <cstring>
    #include <iostream>
    #include <string>
    #include <vector>
    using namespace std;

    int main() {
        char str[] = "Hello, World! Welcome to C++.";
        vector<string> tokens;

        // Using space, comma, period and exclamation mark as delimiters
        char *token = strtok(str, " ,.!");
        while (token != nullptr) {
            tokens.push_back(token);
            token = strtok(nullptr, " ,.!");  // continue tokenizing the same string
        }

        for (const auto &word : tokens) {
            cout << word << endl;
        }
        return 0;
    }
The output of the above code will be:
    Hello
    World
    Welcome
    to
    C++
Using Token Iterator to Tokenize a String
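In C++, you can also walk through the string with an iterator and build each token character by character, closing the current token whenever a delimiter is found. This method is more flexible, because any set of characters can act as delimiters, and it leaves the original string unmodified.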
Example
In the code below, we have used iterators to traverse through the string and extract tokens based on specified delimiters.
    #include <iostream>
    #include <string>
    #include <vector>
    #include <iterator>
    using namespace std;

    int main() {
        string str = "Hello, World! Welcome to C++.";
        vector<string> tokens;
        string::iterator it = str.begin();
        string::iterator end = str.end();
        string token;

        while (it != end) {
            if (*it == ' ' || *it == ',' || *it == '.' || *it == '!') {
                // A delimiter ends the current token, if there is one
                if (!token.empty()) {
                    tokens.push_back(token);
                    token.clear();
                }
            } else {
                token += *it;
            }
            ++it;
        }
        if (!token.empty()) {  // add the last token
            tokens.push_back(token);
        }

        for (const auto &word : tokens) {
            cout << word << endl;
        }
        return 0;
    }
The output of the above code will be:
    Hello
    World
    Welcome
    to
    C++