
Use TensorFlow to Work with Character Substring in Python
Character substrings can be extracted in TensorFlow with the ‘substr’ function in the ‘tf.strings’ module. The resulting tensor is then converted into a NumPy value with the ‘numpy()’ method and displayed.
We will see how to represent Unicode strings in Python and manipulate them using the Unicode equivalents of standard string ops. These ops can also be used to separate Unicode strings into tokens based on script detection.
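As a minimal sketch of what "representing a Unicode string" means, the plain-Python snippet below (no TensorFlow required) shows the same text as a str, as UTF-8 bytes, and as a list of code points. The sample text and the variable names are assumptions chosen for illustration and do not come from the original article.

```python
# Sample text is an illustrative assumption: "Thanks " followed by one emoji.
text = "Thanks \U0001F60A"

# 1. As a Python str: a sequence of Unicode characters.
num_chars = len(text)

# 2. As UTF-8 encoded bytes: the emoji alone occupies 4 bytes.
utf8_bytes = text.encode("utf-8")
num_bytes = len(utf8_bytes)

# 3. As a list of Unicode code points (one integer per character).
code_points = [ord(ch) for ch in text]

print("characters:", num_chars)
print("bytes:", num_bytes)
print("last code point:", hex(code_points[-1]))
```

The character count and the byte count differ because characters outside the ASCII range need more than one byte in UTF-8, which is why a substring operation has to know which unit its offsets use.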
We are using Google Colaboratory to run the code below. Google Colab (Colaboratory) runs Python code in the browser, requires zero configuration, and provides free access to GPUs (Graphics Processing Units). Colaboratory is built on top of Jupyter Notebook.
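import tensorflow as tf

# Assumed definition of `thanks`, following the credited TensorFlow tutorial
thanks = u'Thanks 😊'.encode('UTF-8')

print("The default unit is byte")
print("When len is 1, a single byte is returned")
# This result is computed but not printed
tf.strings.substr(thanks, pos=7, len=1).numpy()
print("The unit is specified as UTF8_CHAR")
print("It takes up 4 bytes")
print(tf.strings.substr(thanks, pos=7, len=1, unit='UTF8_CHAR').numpy())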
Code credit: https://www.tensorflow.org/tutorials/load_data/unicode
Output
The default unit is byte
When len is 1, a single byte is returned
The unit is specified as UTF8_CHAR
It takes up 4 bytes
b'\xf0\x9f\x98\x8a'
Explanation
- The tf.strings.substr operation takes a "unit" parameter, which is either 'BYTE' (the default) or 'UTF8_CHAR'.
- The unit determines how the offsets in the "pos" and "len" parameters are interpreted: as a count of bytes or as a count of UTF-8 characters.
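The effect of the "unit" parameter can be mimicked in plain Python, without TensorFlow. The helper below, named substr here purely for illustration, is a hypothetical stand-in and not the TensorFlow implementation. It assumes valid UTF-8 input and does not replicate TensorFlow's handling of negative or out-of-range positions.

```python
def substr(data: bytes, pos: int, length: int, unit: str = "BYTE") -> bytes:
    """Hypothetical pure-Python analogue of tf.strings.substr for a single string."""
    if unit == "BYTE":
        # Offsets count raw bytes of the encoded string.
        return data[pos:pos + length]
    if unit == "UTF8_CHAR":
        # Offsets count decoded characters; re-encode the selected slice.
        chars = data.decode("utf-8")
        return chars[pos:pos + length].encode("utf-8")
    raise ValueError(f"unsupported unit: {unit!r}")


# Sample input is an assumption: "Thanks " followed by one 4-byte emoji.
thanks = "Thanks \U0001F60A".encode("utf-8")

byte_result = substr(thanks, pos=7, length=1)                    # first byte of the emoji
char_result = substr(thanks, pos=7, length=1, unit="UTF8_CHAR")  # the whole emoji

print(byte_result)
print(char_result)
```

With byte offsets, position 7 lands on the first byte of the emoji, so a length of 1 cuts the character apart. With character offsets the same position and length return all four bytes of the emoji.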