Tamil Inscription in Python
Tamil Inscription in Python
PYTHON
ABSTARCT
Recognition of ancient Tamil characters has always been a challenge for Epigraphers.
This is primarily because the language has evolved over several centuries and the character set
has over this time both expanded and diversified.
This proposed work focuses on improving optical character recognition techniques for ancient
Tamil script which was in use between the 7th and 12th centuries.
While comprehensively curating a functional data set for ancient Tamil characters is an
arduous task, in this work, a data set has been curated using cropped images of characters
found on certain temple inscriptions, specific to this time period as a case study.
After using Otsu thresholding method for binarization of the image, a two-dimensional
convolution neural network is defined and used to train, classify, and recognize the ancient
Tamil characters.
INTRODUCTION
Inscriptions and manuscripts are important sources of information to understand the history
and culture of ancient civilizations.
They can be found on everything - from rocks, slabs, and pillars, to the walls of the temples.
Most of these inscriptions convey vast and useful monarchical information about proceedings
of administrative and religious processes.
These inscriptions are valuable documented proof to understand better the quality of life
during the specified era.
Tamil Nadu tops the list in the Survey list of Survey of Indian Epigraphy (1996). This implies
that Tamil Nadu has a significantly large number of inscriptions. Tamil is an ancient language
of the world, and amongst the earliest languages in the Indian subcontinent.
EXISTING SYSTEM
Many organizations and institutions are involved in digitizing and documenting Tamil
inscriptions.
This often includes creating digital copies of inscriptions, recording metadata, and organizing
databases for easy access.
This helps in preserving the inscriptions and facilitates further study and analysis.
DISADVANTAGES
Existing systems may be built on outdated technology, making it challenging to keep up with
modern standards and security requirements.
Legacy systems may face difficulties integrating with newer technologies, hindering
interoperability and efficiency.
Older systems may lack proper documentation, and support may be limited or unavailable,
making it challenging to troubleshoot issues.
PROPOSED SYSTEM
Include metadata such as location, date of discovery, historical context, and associated
artifacts.
Integrate linguistic analysis tools to study the language used in inscriptions and identify
linguistic patterns and changes over time.
Support collaboration with linguists and epigraphists for in-depth language research.
ADVANTAGES
The system facilitates the digital preservation of Tamil inscriptions, ensuring that high-
resolution images and detailed documentation are securely stored for future generations.
Regular updates keep the system aligned with the latest technologies, security standards, and
improvements in linguistic analysis tools, ensuring its long-term relevance and effectiveness.
SYSTEM REQUIREMENTS
HARDWARE REQUIREMENT
SOFTWARE REQUIREMENT
HARDWARE REQUIREMENT
Ram : 4 GB
SOFTWARE REQUIREMENT
IDE : Pycharm
ARCHITECTURE
ALGORITHM
H5 MODEL ALGORITHM
Keras saves models in this format as it can easily store the weights and model configuration in
a single file.
The H5 Model addresses the mental and physical health issues related to trauma suffered by
refugees, the relationship between mental and physical health problems prevalent among
refugee populations, the potential for trauma to persist in refugee camps, and the need for a
new, more comprehensive model of refugee care.
Groups can contain datasets or other groups info, forming a hierarchical structure.
MODULE DESCRIPTION
"Inscription dataset creation module" in the field of machine learning or data science.
If this term is specific to a particular software module or tool developed after my last update,
I recommend checking the latest documentation, resources, or online materials associated with
that specific tool for an accurate and up-to-date definition.
In the context of machine learning, training is the process of teaching a model to make
accurate predictions by learning patterns and relationships from a labeled dataset.
IMAGE PROCESSING MODULE
Image processing modules are commonly used in computer vision, computer graphics, and
other fields to enhance, analyze, or extract information from images.
This type of module may be part of a broader system dedicated to cultural heritage
preservation, historical research, or archaeological analysis.
CONCLUSION
By using a CNN and Image Recognition techniques, an operable system was designed for
modern and ancient Tamil.
The difference in the style of the ancient Tamil scripts from the modern Tamil script posed as
a challenge to execute the task efficiently.
Multiple samples of ancient inscriptions from some historical temples were taken as case
studies to implement the developed methodology.
The outputs obtained could not be digitally segmented due to lack of availability of any
language parser for ancient Tamil scripts. OCR techniques for ancient Tamilscripts is a rich
research topic.
code