Python NLTK | tokenize.regexp()

Last Updated : 07 Jun, 2019
Author : Jitender_1998

The nltk.tokenize.regexp module provides the RegexpTokenizer class, which extracts tokens from a string using a regular expression. By default the pattern describes the tokens themselves; with gaps=True the pattern describes the separators between tokens instead.

Syntax : tokenize.RegexpTokenizer(pattern, gaps=False)
Return : The tokenize() method of the resulting object returns a list of tokens.

Example #1 : In this example we create a RegexpTokenizer with the pattern \s+ and gaps=True, so the string is split at every run of whitespace.

Python3

    # Import the RegexpTokenizer class from nltk
    from nltk.tokenize import RegexpTokenizer

    # Create a tokenizer; with gaps=True the pattern matches the separators.
    # A raw string keeps \s from being read as a string escape.
    tk = RegexpTokenizer(r'\s+', gaps=True)

    # Create a string input
    gfg = "I love Python"

    # Split the string into tokens
    geek = tk.tokenize(gfg)

    print(geek)

Output :

    ['I', 'love', 'Python']

Example #2 : The same tokenizer applied to a different string.

Python3

    # Import the RegexpTokenizer class from nltk
    from nltk.tokenize import RegexpTokenizer

    # Create a tokenizer; with gaps=True the pattern matches the separators
    tk = RegexpTokenizer(r'\s+', gaps=True)

    # Create a string input
    gfg = "Geeks for Geeks"

    # Split the string into tokens
    geek = tk.tokenize(gfg)

    print(geek)

Output :

    ['Geeks', 'for', 'Geeks']
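Both examples above pass gaps = True, so the pattern describes the whitespace between tokens, not the tokens themselves. To make that distinction concrete, the sketch below mimics the two modes using only Python's standard re module. It is an approximation for illustration, not NLTK's own code: the helper name sketch_tokenize is hypothetical, and the \w+ pattern is just one possible way to describe a token.

```python
import re


def sketch_tokenize(text, pattern, gaps=False):
    """Hypothetical helper approximating the two RegexpTokenizer modes."""
    regexp = re.compile(pattern)
    if gaps:
        # The pattern describes separators: split on it and drop empty strings.
        return [tok for tok in regexp.split(text) if tok]
    # The pattern describes the tokens themselves: collect every match.
    return regexp.findall(text)


text = "I love Python"

# Separator mode, as in the examples above
print(sketch_tokenize(text, r'\s+', gaps=True))  # ['I', 'love', 'Python']

# Token mode: the pattern matches runs of word characters
print(sketch_tokenize(text, r'\w+'))             # ['I', 'love', 'Python']
```

For plain space-separated words the two modes give the same result. They diverge once punctuation appears: a \w+ token pattern drops punctuation, while splitting on \s+ leaves it attached to the neighbouring word. Patterns containing capturing groups also behave differently under split and findall, so this sketch assumes patterns without them.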