Arwi: Case Study of Arabic, Syriac, and Diacritical Unicode Characters
Seyed Buhari
King Abdulaziz University
All content following this page was uploaded by Seyed Buhari on 04 January 2015.
Arwi – An Introduction
Tamil Script
• Historical information: 100 BC
  ◦ http://www.xs4all.nl/~wjsn/tekst/taalschriften.htm#QXQ
• Left-to-right writing
• Uses 65 characters and variants
  ◦ Also uses combinations of them
Arabic Script
• Right-to-left writing
  ◦ Like Persian, Urdu
Arwi Script
• Right-to-left writing
• Used to write a variety of Islamic books
  ◦ Belief, Law, Sufism, Medicine, etc.
• http://en.wikipedia.org/wiki/Arwi_language
Arwi Script
• Arwi script – A sample page from the book titled “Sumthu Subyan”
Arwi Script
Arwi script has helped the Muslim community learn to write Arabic, the language of the Holy Quran, faster.
Status of Arwi Script
• Arwi is still used in certain Islamic schools (madrasas) in Tamil Nadu
• Some famous books are preserved in libraries
• Lack of printing facilities has limited further use of this script
  ◦ A few books written in Arwi have been translated into Tamil script
  ◦ This shows the importance of those texts to the public
• To the best of the author's knowledge, no Arwi font exists
Status of Arwi Script - Wikipedia
Ref.: http://en.wikipedia.org/wiki/Arwi_language
Famous books by great scholars like Imaam Shaafi (Radiallahu Anhu - May Allah be pleased with him) and Imaam Abu Hanifa (May Allah be pleased with him) have been translated into Arabic-Tamil. The authors also indicate that the decline of Arwi caused a steady decline in the education of women in the latter half of the 20th century. Characters listed there as work in progress are handled in our work.
Related Works
1. Tschacher, T., "How to die before dying? Sharia and Sufism in a 19th century Arabic-Tamil Poem", Panel 38, 18th European Conference on Modern South Asian Studies, Lund, Sweden, 6 – 9 July 2004. http://www.sasnet.lu.se/panelabstracts/38.html
2. Shuayb Alim, "Arabic, Arwi and Persian in Sarandib and Tamil Nadu", Madras, 1993.
Related Works
3. http://www.armu.com/armu/works/archives/12dec1998/amc1.html [Accessed on: 22nd April 2008]
4. Nuhman M.A., "Sri Lankan Muslims: Ethnic Identity within Cultural Diversity", International Centre For Ethnic Studies, Colombo, Sri Lanka, 2007.
Nuhman [4] discusses the bill on whether to make Sinhala the only official language. In that debate, some speakers argued that Arwi, used as a writing script by Muslims, is the Tamil language; they were stressing the importance of making Tamil one of the official languages. Nuhman [4] also refers to the difficulty of understanding Arwi for people who know only Arabic or only Tamil. Those who know Arabic can read Arwi but cannot understand it, while those who know Tamil but not Arabic cannot read it, yet can understand it when someone reads it to them. The author also describes the use of Arabic script for languages like Malayalam and Bengali.
Related Works
• Nuhman [4] indicates:
  ◦ Arabic: 28 characters and 6 vowels handled
• Mohan [5] notes the presence of many literary works in Arwi
Nuhman [4] indicates that there were 200 published and around 2000 unpublished literary works written in Arwi. Thus, two drastically different languages were combined to form the Arwi script instead of developing a brand new language. The author concludes that any writing script can be used, with some modifications, to write any other language, except for languages like Chinese that use ideographs.
5. Mohan V., "Muslims of Sri Lanka", Aalekh Publishers, Jaipur, India, 1985.
Arwi Books
• Religious Rules
  ◦ Maani (The Treasure) - Maapillai Lebbe Alim
• Poems:
  ◦ Adabumalai (About Morale and Discipline)
Note: Shamu Sihabudeen Appa has written many poems (including Adabumalai and Thakkasuruth) in Arwi. Mapillai Lebbe Alim has written many books in Arwi.
Tamil and Arwi Alphabets – A Comparison
Arwi Script – Available Unicode Equivalents
Arwi Script – Available Unicode Equivalents
• A dot below 0643 (Kaf) is also needed
• Number representation in Arabic: (U+0661 – U+0669)
Arabic numerals are not followed exactly in Arwi. The numerals 4, 5 and 6 differ slightly, and sometimes even the numeral 7 is written slightly differently (something like the English character ‘L’).
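As an illustration of the digit range mentioned above, here is a minimal JavaScript sketch that maps Western digits to the Arabic-Indic digit code points. The function name toArabicIndicDigits is hypothetical and is not part of the original webpage; the sketch also covers zero (U+0660), which the slide's range does not list.

```javascript
// Hypothetical helper (an assumption, not taken from the original webpage):
// replace each Western digit 0-9 with the Arabic-Indic digit U+0660..U+0669.
function toArabicIndicDigits(text) {
  return text.replace(/[0-9]/g, function (digit) {
    // U+0660 is ARABIC-INDIC DIGIT ZERO; the digits follow in order.
    return String.fromCharCode(0x0660 + Number(digit));
  });
}

console.log(toArabicIndicDigits("2008"));
```

This only selects code points. The Arwi-specific shapes of 4, 5, 6 and 7 described above would still have to come from the glyphs in the font.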
Arwi Script – Available Unicode Equivalents
Font Development
• Issues to be considered:
  ◦ Needs to consider cursive nature, joining,
Editing Software
• Rendering issues of different editing tools
Font Development
• Two Approaches:
  ◦ Development of a web page where people can type in
JavaScript based Arwi Typing Webpage
http://en.wikipedia.org/wiki/Help:Multilingual_support_(Indic)
To type in Arabic or Arwi, the "Install files for complex script and right-to-left languages (including Thai)" option must first be enabled on the user's PC. This is done using Regional and Language Settings in Control Panel.
JavaScript based Arwi Typing Webpage
Virtual Keypad
Expected Webpage
This software is written in JavaScript and thus does not require any server-side support. The whole code can simply be copied and run on any machine, on both Windows and Linux. The software lets users mix Tamil and Arwi scripts, even though that is not the normal method of writing in Arwi script. When the user types in Arwi, the character alignment becomes right-to-left.
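A minimal sketch of how the direction switch described above could be decided, assuming it is derived from the most recently typed base character. The function names and the block ranges chosen here (Arabic U+0600–U+06FF, Syriac U+0700–U+074F, Combining Diacritical Marks U+0300–U+036F) are illustrative assumptions and are not taken from the original webpage code.

```javascript
// Assumption: Arwi text is made of Arabic and Syriac block characters,
// optionally followed by marks from the Combining Diacritical Marks block.
function isArabicOrSyriac(cp) {
  return (cp >= 0x0600 && cp <= 0x06ff) || (cp >= 0x0700 && cp <= 0x074f);
}

function isCombiningDiacritic(cp) {
  return cp >= 0x0300 && cp <= 0x036f;
}

// Returns "rtl" when the last base character is Arabic or Syriac,
// otherwise "ltr" (for example for Tamil or Latin text).
function directionOf(text) {
  const codePoints = Array.from(text, function (ch) {
    return ch.codePointAt(0);
  });
  for (let i = codePoints.length - 1; i >= 0; i--) {
    if (isCombiningDiacritic(codePoints[i])) {
      continue; // skip marks such as U+0328 and look at the base letter
    }
    return isArabicOrSyriac(codePoints[i]) ? "rtl" : "ltr";
  }
  return "ltr"; // empty input: keep the default direction
}

console.log(directionOf("\u0643\u0328")); // Kaf followed by Combining Ogonek
```

In a browser, the returned value could be assigned to the dir attribute of the text area; that wiring is left out so the sketch runs without a DOM.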
JavaScript based Arwi Typing Webpage
• Tamil keymaps vary depending on whether we use:
  ◦ Certain fonts like Bamini or Sarukesi
  ◦ Unicode fonts like the Latha font
• The JavaScript-based webpage supports both keymap options, selected using a radio button
• Users who are not familiar with Tamil typing can use the virtual keypad provided
• Arwi can also be typed using the virtual keypad. No Arwi keymap setup is needed on the PC
JavaScript based Arwi Typing Webpage
• Writing direction changes from left-to-right to right-to-left once the user switches to Arwi typing
• Issues faced:
  ◦ Windows Vista: Webpage works as expected on
    • Internet Explorer (6.0.2900.2180.xpsp_sp2_rtm.040803-2158)
    • Problem displaying: 0656 (Arabic Subscript Alef), 0657 (Arabic Inverted Damma), 065C (Arabic Vowel Sign Dot Below), 0328 (Combining Ogonek)
JavaScript based Arwi Typing Webpage
• Windows XP:
  ◦ Upgraded Internet Explorer to version 7.0.5730.13:
  ◦ Firefox 2.0.0.13
After finding that a few characters did not appear properly on Windows XP, we checked for the presence of those Unicode characters in Windows XP and compared the result with Windows Vista. This was done using Charmap in Advanced view. We selected Unicode subrange in the "Group by" option and then selected Combining Diacritical Marks and Arabic to check for the presence of the Unicode characters. We concluded that a few characters were not present in Windows XP and thus could not be displayed.
JavaScript based Arwi Typing Webpage
• Joining of the Syriac character was done using Unicode characters 0640 (Arabic Tatweel), 200D (Zero Width Joiner) and 070F (Syriac Abbreviation Mark):
// Virtual keypad button: clicking it appends the joining prefix followed by U+0746.
document.write('<INPUT type="button" style="font-size: 30px; font-weight: bold" name="\u0746" value=" \u0746 " onclick=\'AppendCharacter("\u0640\u200d\u070f\u0746")\'>');
Internet Explorer – Win XP; Firefox 2.0.0.16 – Win XP
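The joining workaround above can be isolated in a small helper, sketched here under the assumption that the page decides per browser whether the prefix is needed (later slides report that Firefox 3 joins the character without it). The names JOIN_PREFIX and keyOutput are hypothetical and do not come from the original webpage.

```javascript
// Prefix used on the slide: Arabic Tatweel, Zero Width Joiner,
// Syriac Abbreviation Mark.
const JOIN_PREFIX = "\u0640\u200d\u070f";

// Hypothetical helper: returns the string a virtual keypad button should
// append for a given mark. needsJoinPrefix is an assumption standing in
// for whatever browser check the page would perform.
function keyOutput(mark, needsJoinPrefix) {
  return needsJoinPrefix ? JOIN_PREFIX + mark : mark;
}

// List the code points that would be appended for U+0746 with the prefix.
const appended = keyOutput("\u0746", true);
console.log(
  Array.from(appended, function (ch) {
    return "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
  }).join(" ")
);
```

Keeping the prefix in one place means it can be dropped for browsers that join the character correctly without touching each button definition.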
JavaScript based Arwi Typing Webpage
• In Mozilla Firefox 3, the Arabic HTML appears better, except for the U+0657 character. Also, Syriac characters need not include \u0640\u200d\u070f to join properly
JavaScript based Arwi Typing Webpage
• SuSe 10.2 (Firefox 2.0.0.6):
  ◦ Appearance issues
JavaScript based Arwi Typing Webpage
• In SuSe 10.2, with Firefox 3.0.1, the webpage worked fine. There is no need for U+0640, U+200D and U+070F
• Further Analysis:
  ◦ Windows Vista has more Unicode characters compared to Windows XP
    • Can be verified using Charmap
JavaScript based Arwi Typing Webpage
• In Safari Version 3.1.2 (525.21) on Windows XP:
  ◦ All the characters seem to work fine both on
Tamil Keyboard - Comparison
Tamil – Latha – Unicode
Differences between the keypads for Unicode and non-Unicode Tamil fonts have been a concern for those who wish to type in Tamil. Those who learnt Tamil typing on a typewriter find it difficult to move on to Unicode-based Tamil fonts. Software exists that can convert text from one font to another; just make sure that the rendering works correctly afterwards.
Arabic Keyboard - Proposed Arwi
Arabic Keyboard
Arwi Keyboard
Font Rendering – Editor Tools
• Rendering of Arabic, Tamil and Arwi characters varies between Notepad, WordPad, Microsoft Word, OpenOffice.org tools, etc.
• The table shows characters typed with the webpage and then copied and pasted into different editors
Font Rendering – Editor Tools
• Editor Tool Issues:
  ◦ Authors in [6] describe rendering problems for
Font Rendering – Issues
7. http://acharya.iitm.ac.in
Font Rendering – Issues
8. http://zeyarath.blogspot.com
Font Development - fontforge
Users can make HTML pages using the ARWI font by including the tag <font face="arwi">
fontforge – Characters Added
• We have added the new characters needed for the Arwi font to the DejaVuSans Unicode font
• For each character, we need to have medi, init and final forms
The DejaVuSans.ttf font present in the /usr/share/fonts/truetype folder was selected as the base font. Installation of the fontforge software on the SuSe machine was straightforward with rpm (rpm -ivf fontforge-i386.rpm). When we open the DejaVuSans font in fontforge, each character is shown as two cells. The cell on the top indicates the character and the bottom cell shows the drawing or representation of the character.
To add new glyphs to the given font, we need to add a slot, enter the glyph, and then link this glyph to the base character. This is done as follows: Encoding → Add Encoding Slots (indicate the number of glyphs you want to add)
Select each one of the newly added slots and do the following:
Element → Glyph Info: For a glyph (Unicode Name: uni06A1.init; Unicode Value: -1)
For a base character (Unicode Name: uni06A1; Unicode Value: U+06a1)
Then click 'Set From Name' (in Glyph Info) and click 'Ok'.
For each and every glyph created, we need to click File → Generate Fonts → Save as TrueType. Note: User rights are to be considered when saving the fonts in the respective folders.
Make sure that the glyph is added to the substitutions option in the main or base Unicode font.
After doing the above, to see the effect in OpenOffice.org Writer, we need to close OpenOffice.org Writer and re-open it. Also, make sure the keymap entry is removed and added again (if the keymap needs to change).
fontforge – Characters Added
Example of Initial, Middle, and Final Forms
The name of the font can be changed using Font Info under the Element menu in fontforge. To generate the font, use the Generate Fonts option under the File menu and then select the TTF type. We noted that fontforge needs to be closed before the font becomes available for typing in any editor software.
For each character, we have substitutions for the initial, middle and final forms. We need to link each substitution glyph with the original base character. This is done using Element → Glyph Info → Substitutions. Select the appropriate substitution ('init', 'medi' or 'fina') and link it to the newly created glyph.
'medi': Medial forms in Arabic Lookup 8 subtable
'fina': Terminal forms in Arabic Lookup 9 subtable
'init': Initial forms in Arabic Lookup 10 subtable
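The glyph naming used above (uni06A1 for the base character and uni06A1.init for a positional form) can be generated mechanically. The following is a minimal sketch, assuming the same "uni" + four hex digits + "." + form suffix pattern holds for every added character; the function name glyphNames is hypothetical and fontforge itself is not invoked.

```javascript
// The three OpenType feature tags listed above.
const POSITIONAL_FORMS = ["init", "medi", "fina"];

// Hypothetical helper: returns the base glyph name followed by the names
// of its positional variants, following the uni06A1 / uni06A1.init pattern.
function glyphNames(codePoint) {
  const base = "uni" + codePoint.toString(16).toUpperCase().padStart(4, "0");
  return [base].concat(
    POSITIONAL_FORMS.map(function (form) {
      return base + "." + form;
    })
  );
}

console.log(glyphNames(0x06a1).join(", "));
```

Such a list can serve as a checklist while adding encoding slots, so that no positional form of a character is forgotten.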
Keyboard Layout Setup - Windows
From the glyph, we can see which base character it is linked to using Element → Show Dependent → Substitutions.
Keyboard Layout Setup - Windows
Using the Keyboard Layout Manager software, we provide users with a keyboard layout file named ArabuTamil.klm2000. To install the given ArabuTamil keypad on any Windows machine, users need the Keyboard Layout Manager software. Install the software and open it. Then click New under Keyboards, type "ArabuTamil" in the layout field, and select any language that you are not using and do not plan to use (we selected Arabic (Yemen)). Then click Create. Once the option is created, select the ArabuTamil option and click Edit. Click Import and select the given ArabuTamil.klm2000 file. Then click Open, followed by OK twice. Finally, confirm the changes. The language bar will then show the ArabuTamil option (as Arabic (Yemen)). You can now type ArabuTamil in any editor software by selecting ArabuTamil in the language bar.
Keyboard Layout – Windows - Issues
Keyboard Layout – Linux
  ◦ export LANG=ar_SA.UTF-8
Keyboard Layout – vim - Linux
was made
  ◦ New file is named as: arwi_utf-8.vim
Keyboard Layout – vim - Linux
let b:keymap_name = "arwi"
loadkeymap
q <char-0x0636> " (1590) - DAD
w <char-0x0635> " (1589) - SAD
• The character 'q' (lowercase) is mapped to Arabic Dad, which is represented in hexadecimal and decimal as 0x0636 and 1590 respectively
• Once the keymap is ready, it can be used in vim by pressing Esc, typing : and entering the command set keymap=arwi
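The keymap entries shown above follow a fixed pattern: the key, the code point in hexadecimal, and a comment holding the decimal value and a name. A minimal sketch that produces such lines, and makes the hexadecimal/decimal correspondence explicit, is given below. The function name keymapLine is hypothetical; only the line format is taken from the slide.

```javascript
// Hypothetical generator for one line of the vim keymap file.
// key: the key typed on the keyboard, codePoint: the Unicode code point
// it should produce, name: a short label placed in the trailing comment.
function keymapLine(key, codePoint, name) {
  const hex = codePoint.toString(16).toUpperCase().padStart(4, "0");
  return key + " <char-0x" + hex + '> " (' + codePoint + ") - " + name;
}

// The two entries from the slide: 0x0636 is 1590 and 0x0635 is 1589.
console.log(keymapLine("q", 0x0636, "DAD"));
console.log(keymapLine("w", 0x0635, "SAD"));
```

Generating the lines from a table of key and code point pairs avoids mismatches between the hexadecimal value and the decimal value in the comment.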
Keyboard Layout – Input Locale - Linux
• Input locales:
  ◦ Personal Settings (Configure Desktop) → Regional &
Add
• Creating new Arwi Keymap
  ◦ Go to the xkb folder in either /usr/share/X11 or /etc/X11 or /usr/X11R6/lib/X11
  ◦ Add an entry (anywhere) under the ! layout section of
Keyboard Layout – Input Locale - Linux
Keyboard Layout – Input Locale - Linux
• The keymap needs to be added to the /usr/share/X11/xkb/symbols folder
• Copy the existing ara (Arabic) to arwi (Arwi)
• AE stands for the 1234 row of the keyboard
• AD stands for the QWERT row of the keyboard
• AC stands for the ASDF row of the keyboard
• AB stands for the ZXCV row of the keyboard
  ◦ AB01 stands for the ‘z’ character
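The row naming above can be expressed as a small lookup, sketched here for a US QWERTY layout. The table XKB_ROWS and the function xkbKeyToChar are hypothetical names, and the sketch is a simplification that covers only the digit and letter keys of each row, not the punctuation keys that follow them.

```javascript
// Assumption: US QWERTY layout; only digits and letters are listed.
const XKB_ROWS = {
  AE: "1234567890", // number row
  AD: "qwertyuiop", // top letter row
  AC: "asdfghjkl",  // home row
  AB: "zxcvbnm"     // bottom letter row
};

// Hypothetical helper: maps an xkb key name such as "AB01" to the
// character printed on that key, or null when it is outside the table.
function xkbKeyToChar(keyName) {
  const match = /^(AE|AD|AC|AB)(\d\d)$/.exec(keyName);
  if (!match) {
    return null;
  }
  const row = XKB_ROWS[match[1]];
  const index = Number(match[2]) - 1; // positions are counted from 01
  return index >= 0 && index < row.length ? row[index] : null;
}

console.log(xkbKeyToChar("AB01"), xkbKeyToChar("AD01"));
```

When editing the copied arwi symbols file, this kind of lookup helps to check which physical key an entry refers to before changing the character assigned to it.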
Keyboard Layout – Linux
• After changing the keymap file (in /usr/share/X11/xkb/symbols):
  ◦ Remove keymap from the layout
Keyboard Layout – Linux
• The keymap can be temporarily enabled using any one of the following commands (example: arwi keymap):
  ◦ setxkbmap -symbols 'arwi'
  ◦ setxkbmap -symbols arwi
  ◦ setxkbmap -layout 'arwi'
  ◦ setxkbmap -layout arwi
Issues Faced – Rendering Differences
Issues Faced
Issues Faced
• A simple test was done by copying the existing diacritical character to a different position and testing with different tools
Tools shown: PDF by XeTeX & ConTeXt; Specimen Font Previewer
Issues Faced
• Testing on Ubuntu 8.04
Characters that need to be added
• Make sure that the initial, final and middle glyphs of these characters are present:
  ◦ 0686, 068A, 068D, 0693, 0694, 06A3, 06B9, 06BA, 0657
• Character that needs to be added:
  ◦ 0643 WITH A DOT BELOW
Further References
http://www.arabeyes.org/download/download/3rd/arabic.xkb
http://www.vim.org/htmldoc/arabic.html
http://countrystudies.us/sri-lanka/38.htm [Accessed on: 22nd April 2008]
http://www.armu.com/armu/works/archives/12dec1998/amc1.html [Accessed on: 22nd April 2008]
Tschacher, Arwi (Arabic-Tamil) – An Introduction [Accessed on: 23rd April 2008], http://web.archive.org/web/20040822180630/www.fas.nus.edu.sg/journal/kolam/vols/kolam5&6/1AOldLit/Arwi.htm
http://www.klm32.com/ [Accessed on: 22nd April 2008]
http://acharya.iitm.ac.in/multi_sys/unicode/render/ren_07.php [Accessed on: 23rd April 2008]
Conclusion
Arwi: Case study of Arabic, Syriac and
Diacritical Unicode characters
Thank You
Questions are most welcome