AI Language Models and Sanskrit LLM
Use/feedback
Retraining (with additional data if needed)
4/4/2025 "AI and Language models: towards a robust Sanskrit LLM", 5
Hansraj College (online) 4 Apr 25
Central problem in AI
IndoAryan - 76.87%
Dravidian - 20.82%
Austro Asiatic - 1.11%
Tibeto Burman - 1%
Andamanese* - 0%
AI and Linguistics
-sutra vrtti, vyakhya, bhashya, shastra
-tantra-yukti
compose good texts by removing tantra-doshas
obtain correct unambiguous meaning of a text
connect sentences for clarification of meaning
Vakya Yojana, Artha yojana
-shaabda-bodha
akanksha, yogyata, asatti, tatparya
Methods of argumentation
-purva paksha
know the opponent's argument and find its flaws
-uttara paksha
propose the new (supposedly flawless) argument
-siddhanta
established theory, truth, vaada
-nigraha sthana
points of defeat in the debate/argumentation
-faster/real-time response
-better capacity in original writing
-better reasoning capacity like humans
-creativity
-lower development cost
-lower access cost
-lower power consumption
-easy deployability/accessibility
Text Processing and Information Retrieval: efficient keyword-based searches and information
retrieval from large Sanskrit text corpora, aiding researchers in locating relevant passages
Topic Modeling: topic modeling techniques helping researchers to identify prevalent themes and
topics in Sanskrit texts.
Sentiment Analysis: sentiment analysis to determine the emotional tone or attitude expressed in
Sanskrit writings.
Digital Humanities and Cultural Studies (e.g., Textual Criticism): identify textual variants, manuscript
differences, and the evolution of words and phrases over time in Sanskrit manuscripts.
Text Digitization: digitize ancient Sanskrit manuscripts, preserving and making them accessible in
digital formats.
Educational Tools: used in language learning apps and tools to provide learners with accurate
segmentation of words and sentences, aiding in pronunciation and comprehension.
Corpus Linguistics: statistical analysis of language usage patterns, historical shifts, and linguistic
phenomena in Sanskrit over time.
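The keyword-search and corpus-statistics uses listed above can be sketched in a few lines. This is only a minimal illustration, assuming a tiny in-memory corpus (the `CORPUS` dictionary, the document ids, and the helper names `tokenize`, `build_index`, `search`, and `token_frequencies` are all hypothetical). It splits on whitespace and dandas only; real Sanskrit text would first need sandhi splitting and compound segmentation, which this sketch does not attempt.

```python
import re
import unicodedata
from collections import Counter, defaultdict

# Hypothetical toy corpus: document id -> Devanagari text.
CORPUS = {
    "doc1": "धर्मक्षेत्रे कुरुक्षेत्रे समवेता युयुत्सवः ।",
    "doc2": "सत्यमेव जयते नानृतम् ।",
    "doc3": "धर्मो रक्षति रक्षितः ॥",
}

def tokenize(text):
    """Normalize to NFC and split on whitespace, dandas, and basic punctuation."""
    text = unicodedata.normalize("NFC", text)
    return [tok for tok in re.split(r"[\s।॥,.;:!?]+", text) if tok]

def build_index(corpus):
    """Build an inverted index: token -> set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in corpus.items():
        for tok in tokenize(text):
            index[tok].add(doc_id)
    return index

def search(index, keyword):
    """Return the sorted ids of documents containing the exact keyword."""
    key = unicodedata.normalize("NFC", keyword)
    return sorted(index.get(key, set()))

def token_frequencies(corpus):
    """Count surface-form token frequencies across the whole corpus."""
    counts = Counter()
    for text in corpus.values():
        counts.update(tokenize(text))
    return counts

if __name__ == "__main__":
    idx = build_index(CORPUS)
    print(search(idx, "जयते"))
    print(token_frequencies(CORPUS).most_common(3))
```

Exact-match lookup on surface forms is the simplest possible design. Because Sanskrit words fuse at boundaries and inflect heavily, a practical system would index segmented, lemmatized forms instead.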
Microsoft Research
Translation
Speech transcription
Medical data intelligence
Voice/Text chatbots
Bangalore HQ in India
Legal domain
Health
Education
Google
Sanitizer
Lexicographer
But we have our own unique challenges
Diversity
Language variation and mixing
Paucity of Standards
Funding
Casual approach towards our languages
Teamwork
Lack of competition
Complexity of natural languages, more so in multilingual societies like India
And there are challenges in using AI too
Constantly evolving techniques in AI complicate the problem further
AI in 90s vs AI now vs AI tomorrow
Thanks!
[email protected]
91-11-26741308