Modulo:mrooteo
MODULE
Self-test is available on the page Ŝablono:kat-elemento-eo.
- depends on {{radikoj}}
- this module does the work of the template {{kat-elemento-eo}}
- additionally called by ((mfarado))
--[===[
MODULE "MROOTEO" (root eo ie Esperanto root)
"eo.wiktionary.org/wiki/Modulo:mrooteo" <!--2024-Sep-13-->
Purpose: describes and categorizes an Esperanto morpheme according to a
list stored in a separate template, to be used on category pages
Utilo: priskribas kaj enkategoriigas esperantan morfemon laux
listo konservita en aparta sxablono, uzinda sur kategoriaj pagxoj
Manfaat: ...
Syfte: beskriver och kategoriserar morfem paa esperanto ...
Used by templates / Uzata far sxablonoj:
* "kat-elemento-eo"
Used by modules:
* "mfarado"
Required submodules / Bezonataj submoduloj:
* none / neniuj
Required templates:
* "SXablono:radikoj"
Incoming: * named obligatory "in=" one of 3 types:
* bare root (I: -il-, M: nul, N: mov, P: fi-, U: -j) only lowercase !!!FIXME!!! NOT yet
ASCII plus 6 lowercase -eo- letters, identified by NOT
containing any space
* new-style pagename "Vorto -eo- enhavanta morfemon N (kapt)"
identified by containing at least one space and last character
being a bracket ")"
* legacy-style pagename, identified by containing at least one
space and last character NOT being a bracket ")"
colons ":" and underscores "_" prohibited in order to prevent
things like "Kategorio:Radiko arb'" (fullpagename) or
"Radiko_arb'" (URL-style)
* named optional "vs=" word class "o" "a" "i" one or two letters !!!FIXME!!! NOT yet
* 1 special parameter
"givetable=" value "true" if called from a module
* 2 hidden parameters
* "nocat=" no error possible
* "detrc=" no error possible
Pagename is never accessed, since this module is intended to be
called from another module at the end, rather than from a template.
For the same reason there is no point in peeking at the caller's frame.
No anonymous parameters are tolerated. !!!FIXME!!! not yet checked
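Example calls (a sketch only, NOT verified against the live wiki): the
exported function is "ek", so a call from wikitext could look like
  {{#invoke:mrooteo|ek|in=mov}}
  {{#invoke:mrooteo|ek|in=Vorto -eo- enhavanta morfemon N (kapt)|detrc=true}}
and a call from another module, ASSuming that a plain table carrying
"args" is accepted in place of a real frame, could look like
  local mrooteo = require ("Modulo:mrooteo")
  local varresult = mrooteo.ek ({ args = { ['in'] = 'mov', ['givetable'] = 'true' } })
The shape of the value returned for "givetable=true" is not described
in this section, so "varresult" should be treated as opaque until checked.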
Reads template "SXablono:radikoj":
* empty lines skipped
* length of a nonempty line 2 ... 100'000 octet:s !!!FIXME!!! NOT yet checked
* nonempty lines must alternate level-2 wiki headings with root lists
* leading spaces, trailing spaces and multiple spaces are prohibited
* headings
* 7 ... 200 octet:s !!!FIXME!!! NOT yet checked
* headings must be formed like "== " ... " ==" with exactly ONE
separation space at every side
* legal char:s in headings: !!!FIXME!!! NOT yet checked
* dash "-"
* 10 ASCII numbers
* 26 ASCII UPPERCase
* 26 ASCII lowercase
* 6 -eo- UPPERCase
* 6 -eo- lowercase
* sorting of the headings is not required but dupes are prohibited
* root lists
* 2 ... 100'000 octet:s (checked already before)
* root lists contain a list of roots, all in one line, separated by spaces
* legal char:s in root lists: !!!FIXME!!! NOT yet checked
* dash "-" earliest or last in a root only
* 26 ASCII lowercase
* 6 -eo- lowercase
* sorting of the roots is not required but dupes are prohibited
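Example of a template text obeying the rules above (a sketch: the
headings match keys of the constant table "contabcats", the roots are
sample roots mentioned elsewhere in this comment and are NOT claimed
to belong to those lists in the real template):
  == OA0 ==
  -il- nul mov fi- -j
  == OA1 ==
  kapt arb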
Strategy:
* identify the type of incoming root from "in=" by dashes and a constant
table with nonstandalone roots !!!FIXME!!! NOT yet
* read the template and convert the raw text block into single non-empty
lines (at least 2 char:s, prohibit leading and trailing spaces) in a table
* we have 3 more tables (besides the one for lines): one for all headings,
one for hits by headings, and one for roots collected from a line
* every new heading is added into the table for all headings with
the stripped heading being the key/index, and value always "true"
avoiding dupes that way
* every new heading is also temporarily stored in a string
* walk through the roots putting them all into a table (emptied at the
beginning of the line) with the root being the key/index, and value
always "true" avoiding dupes that way; if a root from the list equals
the incoming root, then store the heading into the table for hits by
headings, but do NOT abort the search at a hit
* for categories translate the found headings into names by means of
a constant table, pass unchanged if no translation found
* add an extra category based on the number of hits ZERO or non-ZERO
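-- The strategy above can be sketched compactly as follows. This is an
-- illustration relying on "string.gmatch" and NOT the code used below,
-- the names "scanroots" "tablines" "strwanted" are hypothetical, and the
-- lines are ASSumed to be validated and alternating already.
local function scanroots (tablines, strwanted)
  local tabseenhead = {} -- key is stripped heading, value always "true"
  local tabhits = {} -- headings under which the wanted root was found
  local strhead = ''
  for numi, strline in ipairs (tablines) do
    if ((numi % 2)==1) then -- odd lines are headings
      strhead = string.sub (strline,4,-4) -- strip "== " and " =="
      if (tabseenhead[strhead]) then
        return nil, 20 -- #E20 dupe heading
      end--if
      tabseenhead[strhead] = true
    else -- even lines are root lists
      local tabseenroot = {} -- emptied for every root list
      for strroot in string.gmatch (strline,'%S+') do
        if (tabseenroot[strroot]) then
          return nil, 24 -- #E24 dupe root in one line
        end--if
        tabseenroot[strroot] = true
        if (strroot==strwanted) then
          tabhits[#tabhits+1] = strhead -- do NOT abort search at a hit
        end--if
      end--for
    end--if
  end--for
  return tabhits
end--function scanroots
-- Usage: scanroots ({'== OA0 ==','-il- mov','== OA1 ==','mov kapt'},'mov')
-- returns a table holding "OA0" and "OA1" in this order.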
Error codes:
* #E01 internal
* #E02 "in=" bad (wrong length, or contains colon or underscore)
* #E03 "in=" bad (later type identification failed)
* #E04 template obviously bad (not found or empty or quasi-empty)
* #E05 empty or bad line (too short (<2) or leading or trailing space
or double space)
* #E06 number of lines bad (must be even and at least 2)
* #E07 overall pattern bad (alternating lines required)
* #E19 heading bad (too short or bad equal signs or bad char)
* #E20 dupe heading
* #E21 root list bad char (other than eo lowercase)
* #E22 root list illegal use of "-" (root equals "-", two consecutive
dashes, dash in other position than earliest and last)
* #E24 dupe root in one line (2 details follow)
]===]
local exporttable = {}
------------------------------------------------------------------------
---- CONSTANTS [O] ----
------------------------------------------------------------------------
local constrtemplate = string.char(0xC5,0x9C) .. "ablono:radikoj"
local contabvisi = {}
contabvisi [0] = 'La elemento "'
contabvisi [1] = '" (tipo ' -- needs terminating ") " from elsewhere
contabvisi [2] = 'troveblas en '
contabvisi [3] = 'ne troveblas en iu listo kaj do estas neoficiala' -- NO dot at end
local contabcats = {}
contabcats [0] = "Ne-AV-elemento" -- ZERO hits
contabcats [1] = "AV-elemento" -- non-ZERO hits
contabcats [ 'OA0'] = "Fundamenta elemento"
contabcats [ 'OA1'] = "Elemento de la 1-a Oficiala Aldono"
contabcats [ 'OA2'] = "Elemento de la 2-a Oficiala Aldono"
contabcats [ 'OA3'] = "Elemento de la 3-a Oficiala Aldono"
contabcats [ 'OA4'] = "Elemento de la 4-a Oficiala Aldono"
contabcats [ 'OA5'] = "Elemento de la 5-a Oficiala Aldono"
contabcats [ 'OA6'] = "Elemento de la 6-a Oficiala Aldono"
contabcats [ 'OA7'] = "Elemento de la 7-a Oficiala Aldono"
contabcats [ 'OA8'] = "Elemento de la 8-a Oficiala Aldono"
contabcats [ 'OA9'] = "Elemento de la 9-a Oficiala Aldono"
contabcats ['OA10'] = "Elemento de la 10-a Oficiala Aldono"
------------------------------------------------------------------------
---- SPECIAL STUFF OUTSIDE MAIN [B] ----
------------------------------------------------------------------------
-- SPECIAL VAR:S
local qboodetrc = true -- from "detrc=true" but default is "true" !!!
local qstrtrace = '<br>' -- for main & sub:s, debug report request by "detrc="
------------------------------------------------------------------------
---- DEBUG FUNCTIONS [D] ----
------------------------------------------------------------------------
-- Local function LFDTRACEMSG
-- Enhance upvalue "qstrtrace" with fixed text.
-- for variables the other sub "lfdshowvar" is preferable but in exceptional
-- cases it can be justified to send text with values of variables to this sub
-- no size limit
-- upvalue "qstrtrace" must NOT be type "nil" on entry (is inited to "<br>")
-- uses upvalue "qboodetrc"
local function lfdtracemsg (strshortline)
if (qboodetrc and (type(strshortline)=='string')) then
qstrtrace = qstrtrace .. strshortline .. '.<br>' -- dot added !!!
end--if
end--function lfdtracemsg
------------------------------------------------------------------------
-- Local function LFDMINISANI
-- Input : * strdangerous -- must be type "string", empty legal
-- * numlimitdivthree
-- Output : * strsanitized -- can happen to be quasi-empty with <<"">>
-- To be called from "lfdshowvcore" <- "lfdshowvar" only.
-- * we absolutely must disallow: cross "#" 35 | apo "'" 39 |
-- star "*" 42 | dash 45 | colon 58 | "<" 60 | ">" 62 | "[" 91 | "]" 93
-- * spaces are shown as "{32}" if repeated or at the beginning or at the end
local function lfdminisani (strdangerous, numlimitdivthree)
local strsanitized = '"' -- begin quot
local num38len = 0
local num38index = 1 -- ONE-based
local num38signo = 0
local num38prev = 0
local boohtmlenc = false
local boovisienc = false
num38len = string.len (strdangerous)
while true do
boohtmlenc = false -- % reset on
boovisienc = false -- % every iteration
if (num38index>num38len) then -- ONE-based
break -- done string char after char
end--if
num38signo = string.byte (strdangerous,num38index,num38index)
if ((num38signo<43) or (num38signo==45) or (num38signo==58) or (num38signo==60) or (num38signo==62) or (num38signo==91) or (num38signo==93) or (num38signo>122)) then
boohtmlenc = true
end--if
if ((num38signo<32) or (num38signo>126)) then
boovisienc = true -- overrides "boohtmlenc"
end--if
if ((num38signo==32) and ((num38prev==32) or (num38index==1) or (num38index==num38len))) then
boovisienc = true -- overrides "boohtmlenc"
end--if
if (boovisienc) then
strsanitized = strsanitized .. '{' .. tostring (num38signo) .. '}'
else
if (boohtmlenc) then
strsanitized = strsanitized .. '&#' .. tostring (num38signo) .. ';'
else
strsanitized = strsanitized .. string.char (num38signo)
end--if
end--if
if ((num38len>(numlimitdivthree*3)) and (num38index==numlimitdivthree)) then
num38index = num38len - numlimitdivthree -- jump forwards
strsanitized = strsanitized .. '" ... "'
else
num38index = num38index + 1 -- ONE-based
end--if
num38prev = num38signo
end--while
strsanitized = strsanitized .. '"' -- don't forget final quot
return strsanitized
end--function lfdminisani
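-- Example (traced by hand through the code above with "numlimitdivthree"
-- equal 30, a usage sketch only):
--   lfdminisani (' a-b ',30)
-- gives the 17 char:s "{32}a&#45;b{32}" including both quot:s, since the
-- leading and the trailing space are showed as "{32}" and the dash is
-- HTML-encoded as "&#45;"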
------------------------------------------------------------------------
-- Local function LFDSHOWVCORE
-- Prebrew report about content of a variable including optional full
-- listing of a table with numerical and string indexes. !!!FIXME!!!
-- Input : * vardubious -- content (any type including "nil" is acceptable)
-- * str77name -- name of the variable (string)
-- * vardescri -- optional comment, default empty, begin with "@" to
-- place it before name of the variable, else after
-- * vartablim -- optional limit, default ZERO, limits both string
-- keys and numeric keys
-- Depends on functions :
-- [D] lfdminisani
local function lfdshowvcore (vardubious, str77name, vardescri, vartablim)
local taballkeystring = {}
local strtype = ''
local strreport = ''
local numindax = 0
local numlencx = 0
local numkeynumber = 0
local numkeystring = 0
local numkeycetera = 0
local numkey77min = 999999
local numkey77max = -999999
local boobe77fore = false
if (type(str77name)~='string') then
str77name = '??' -- bite the bullet
else
str77name = '"' .. str77name .. '"'
end--if
if (type(vardescri)~='string') then
vardescri = '' -- omit comment
end--if
if (string.len(vardescri)>=2) then
boobe77fore = (string.byte(vardescri,1,1)==64) -- prefix "@"
if (boobe77fore) then
vardescri = string.sub(vardescri,2,-1) -- CANNOT become empty
end--if
end--if
if (type(vartablim)~='number') then
vartablim = 0 -- deactivate listing of a table
end--if
if ((vardescri~='') and (not boobe77fore)) then
str77name = str77name .. ' (' .. vardescri .. ')' -- now a combo
end--if
strtype = type(vardubious)
if (strtype=='table') then
for k,v in pairs(vardubious) do
if (type(k)=='number') then
numkey77min = math.min (numkey77min,k)
numkey77max = math.max (numkey77max,k)
numkeynumber = numkeynumber + 1
else
if (type(k)=='string') then
taballkeystring [numkeystring] = k
numkeystring = numkeystring + 1
else
numkeycetera = numkeycetera + 1
end--if
end--if
end--for
strreport = 'Table ' .. str77name
if ((numkeynumber==0) and (numkeystring==0) and (numkeycetera==0)) then
strreport = strreport .. ' is empty'
else
strreport = strreport .. ' contains '
if (numkeynumber==0) then
strreport = strreport .. 'NO numeric keys'
end--if
if (numkeynumber==1) then
strreport = strreport .. 'a single numeric key equal ' .. tostring (numkey77min)
end--if
if (numkeynumber>=2) then
strreport = strreport .. tostring (numkeynumber) .. ' numeric keys ranging from ' .. tostring (numkey77min) .. ' to ' .. tostring (numkey77max)
end--if
strreport = strreport .. ' and ' .. tostring (numkeystring) .. ' string keys and ' .. tostring (numkeycetera) .. ' other keys'
end--if
if ((numkeynumber~=0) and (vartablim~=0)) then -- !!!FIXME!!!
strreport = strreport .. ' ### content num keys :'
numindax = numkey77min
while true do
if ((numindax>vartablim) or (numindax>numkey77max)) then
break -- done table
end--if
strreport = strreport .. ' ' .. tostring(numindax) .. ' -> ' .. lfdminisani(tostring(vardubious[numindax]),30)
numindax = numindax + 1
end--while
end--if
if ((numkeystring~=0) and (vartablim~=0)) then -- !!!FIXME!!!
strreport = strreport .. ' ### content string keys :'
end--if
else
strreport = 'Variable ' .. str77name .. ' has type "' .. strtype .. '"'
if (strtype=='string') then
numlencx = string.len (vardubious)
strreport = strreport .. ' and length ' .. tostring (numlencx)
if (numlencx~=0) then
strreport = strreport .. ' and content ' .. lfdminisani (vardubious,30)
end--if
else
if (strtype~='nil') then
strreport = strreport .. ' and content "' .. tostring (vardubious) .. '"'
end--if
end--if (strtype=='string') else
end--if (strtype=='table') else
if ((vardescri~='') and boobe77fore) then
strreport = vardescri .. ' : ' .. strreport -- very last step
end--if
return strreport
end--function lfdshowvcore
------------------------------------------------------------------------
-- Local function LFDSHOWVAR
-- Enhance upvalue "qstrtrace" with report about content of a
-- variable including optional full listing of a table with numerical
-- and string indexes. !!!FIXME!!!
-- Depends on functions :
-- [D] lfdminisani lfdshowvcore
-- upvalue "qstrtrace" must NOT be type "nil" on entry (is inited to "<br>")
-- uses upvalue "qboodetrc"
local function lfdshowvar (varduubious, strnaame, vardeskkri, vartabljjm)
if (qboodetrc) then
qstrtrace = qstrtrace .. lfdshowvcore (varduubious, strnaame, vardeskkri, vartabljjm) .. '.<br>' -- dot added !!!
end--if
end--function lfdshowvar
------------------------------------------------------------------------
---- MATH FUNCTIONS [E] ----
------------------------------------------------------------------------
local function mathmod (xdividendo, xdivisoro)
local resultmod = 0 -- the MOD operator is "%", a bitwise AND operator is lacking
resultmod = xdividendo % xdivisoro
return resultmod
end--function mathmod
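------------------------------------------------------------------------
-- Local functions MATHBITWRIT LFGTESTUC LFGTESTLC
-- "lfutristletr" below calls these 3 helpers but no definition of them
-- is visible in this file. What follows is a minimal sketch under
-- assumptions and NOT necessarily the original code:
-- * "mathbitwrit" is assumed to write ONE bit (index ZERO is the lowest)
--   of a non-negative integer and to return the patched number
-- * "lfgtestuc" and "lfgtestlc" are assumed to test ONE octet for
--   ASCII uppercase and ASCII lowercase respectively
local function mathbitwrit (numinkoming, numbityndex, boowrite)
  local numweight = 2 ^ numbityndex -- weight of the bit to write
  local boobitset = ((math.floor(numinkoming/numweight) % 2)==1)
  if (boowrite and (not boobitset)) then
    numinkoming = numinkoming + numweight -- set the bit
  end--if
  if ((not boowrite) and boobitset) then
    numinkoming = numinkoming - numweight -- clear the bit
  end--if
  return numinkoming
end--function mathbitwrit
local function lfgtestuc (numkode)
  return ((numkode>=65) and (numkode<=90)) -- "A"..."Z"
end--function lfgtestuc
local function lfgtestlc (numkode)
  return ((numkode>=97) and (numkode<=122)) -- "a"..."z"
end--function lfgtestlc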
------------------------------------------------------------------------
---- UTF8 FUNCTIONS [U] ----
------------------------------------------------------------------------
-- Local function LFULNUTF8CHAR
-- Evaluate length of a single UTF8 char in octet:s.
-- Input : * numbgoctet -- beginning octet of a UTF8 char
-- Output : * numlen1234x -- unit octet, number 1...4, or ZERO if invalid
-- Does NOT thoroughly check the validity, looks at ONE octet only.
local function lfulnutf8char (numbgoctet)
local numlen1234x = 0
if (numbgoctet<128) then
numlen1234x = 1 -- $00...$7F -- ANSI/ASCII
end--if
if ((numbgoctet>=194) and (numbgoctet<=223)) then
numlen1234x = 2 -- $C2 to $DF
end--if
if ((numbgoctet>=224) and (numbgoctet<=239)) then
numlen1234x = 3 -- $E0 to $EF
end--if
if ((numbgoctet>=240) and (numbgoctet<=244)) then
numlen1234x = 4 -- $F0 to $F4
end--if
return numlen1234x
end--function lfulnutf8char
------------------------------------------------------------------------
-- Local function LFUTRISTLETR
-- Evaluate char (from ASCII + selectable extra set from UTF8) to
-- tristate result (no letter vs uppercase letter vs lowercase letter).
-- Input : * strin5trist : single unicode char (1 or 2 octet:s) or
-- longer string
-- * strsel5set : "ASCII" (default, empty string or type "nil"
-- will do too) "eo" "sv" (value "GENE" NOT here)
-- Output : * numtype5x : 0 no letter or invalid UTF8 -- 1 upper -- 2 lower
-- Depends on functions : (this is LFUTRISTLETR)
-- [U] lfulnutf8char
-- [G] lfgtestuc lfgtestlc
-- [E] mathmod mathbitwrit
-- Possible further char:s or fragments of such are disregarded, the
-- question answered is "Is there one uppercase or lowercase letter
-- available at begin?".
-- Defined sets:
-- "eo" 2 x 6 uppercase and lowercase (CX GX HX JX SX UX cx gx hx jx sx ux)
-- upper CX $0108 GX $011C HX $0124 JX $0134 SX $015C UX $016C lower +1
-- "sv" 2 x 4 uppercase and lowercase (AE AA EE OE ae aa ee oe)
-- upper AE $00C4 AA $00C5 EE $00C9 OE $00D6 lower +$20
local function lfutristletr (strin5trist, strsel5set)
local numtype5x = 0 -- preASSume invalid
local numlong5den = 0 -- actual length of input string
local numlong5bor = 0 -- expected length of single char
local numcha5r = 0 -- UINT8 beginning char
local numcha5s = 0 -- UINT8 later char (BIG ENDIAN, lower value here above)
local numcxa5rel = 0 -- UINT8 code relative to beginning of block $00...$FF
local numtem5p = 0
local boois5uppr = false
local boois5lowr = false
while true do -- fake loop -- this is LFUTRISTLETR
numlong5den = string.len (strin5trist)
if (numlong5den==0) then
break -- bad string length
end--if
numcha5r = string.byte (strin5trist,1,1)
numlong5bor = lfulnutf8char(numcha5r)
if ((numlong5bor==0) or (numlong5den<numlong5bor)) then
break -- truncated char or invalid
end--if
if (numlong5bor==1) then
boois5uppr = lfgtestuc(numcha5r)
boois5lowr = lfgtestlc(numcha5r)
break -- success with ASCII, almost done
end--if
numcha5s = string.byte (strin5trist,2,2) -- only $80 to $BF
numcxa5rel = (mathmod(numcha5r,4)*64) + (numcha5s-128) -- 4 times 64
if ((strsel5set=='eo') and ((numcha5r==196) or (numcha5r==197))) then
numtem5p = mathbitwrit (numcxa5rel,0,false) -- bad way to do AND $FE
if ((numtem5p==8) or (numtem5p==28) or (numtem5p==36) or (numtem5p==52) or (numtem5p==92) or (numtem5p==108)) then
boois5uppr = (numtem5p==numcxa5rel) -- UC below, block of 1
boois5lowr = not boois5uppr
break -- success with -eo-, almost done
end--if
end--if ((strsel5set=='eo') and ...
if ((strsel5set=='sv') and (numcha5r==195)) then
numtem5p = mathbitwrit (numcxa5rel,5,false) -- bad way to do AND $DF
if ((numtem5p==196) or (numtem5p==197) or (numtem5p==201) or (numtem5p==214)) then
boois5uppr = (numtem5p==numcxa5rel) -- UC below, block of 32
boois5lowr = not boois5uppr
break -- success with -sv-, almost done
end--if
end--if ((strsel5set=='sv') and ...
break -- finally to join mark -- unknown non-ASCII char is a fact :-(
end--while -- fake loop -- join mark
if (boois5uppr) then
numtype5x = 1
end--if
if (boois5lowr) then
numtype5x = 2
end--if
return numtype5x
end--function lfutristletr
------------------------------------------------------------------------
---- HIGH LEVEL STRING FUNCTIONS [I] ----
------------------------------------------------------------------------
-- Local function LFIDEBRACKET
-- Separate bracketed part of a string and return the inner or outer
-- part. On failure the string is returned complete and unchanged.
-- There must be exactly ONE "(" and exactly ONE ")" in correct order.
-- Input : * strde31br, boooutside
-- * numxminlencz -- minimal length of inner part, must be >= 1 !!!
-- Note that for length of hit ZERO ie "()" we have "begg" + 1 = "endd"
-- and for length of hit ONE ie "(x)" we have "begg" + 2 = "endd".
-- Example: "crap (NO)" -> len = 9
-- 123456789
-- "begg" = 6 and "endd" = 9
-- Expected result: "NO" or "crap " (note the trailing space)
-- Example: "(XX) YES" -> len = 8
-- 12345678
-- "begg" = 1 and "endd" = 4
-- Expected result: "XX" or " YES" (note the leading space)
local function lfidebracket (strde31br, boooutside, numxminlencz)
local numindoux = 1 -- ONE-based
local numdlong = 0
local num31wesel = 0
local numbegg = 0 -- ONE-based, ZERO invalid
local numendd = 0 -- ONE-based, ZERO invalid
numdlong = string.len (strde31br)
while true do
if (numindoux>numdlong) then
break -- ONE-based -- done, a hit only if both "numbegg" and "numendd" are non-ZERO
end--if
num31wesel = string.byte(strde31br,numindoux,numindoux)
if (num31wesel==40) then -- "("
if (numbegg==0) then
numbegg = numindoux -- pos of "("
else
numbegg = 0
break -- damn: more than 1 "(" present
end--if
end--if
if (num31wesel==41) then -- ")"
if ((numendd==0) and (numbegg~=0) and ((numbegg+numxminlencz)<numindoux)) then
numendd = numindoux -- pos of ")"
else
numendd = 0
break -- damn: more than 1 ")" present or ")" precedes "(" or inner part too short
end--if
end--if
numindoux = numindoux + 1
end--while
if ((numbegg~=0) and (numendd~=0)) then
if (boooutside) then
strde31br = string.sub(strde31br,1,(numbegg-1)) .. string.sub(strde31br,(numendd+1),numdlong)
else
strde31br = string.sub(strde31br,(numbegg+1),(numendd-1)) -- separate substring
end--if
end--if
return strde31br -- same string variable
end--function lfidebracket
------------------------------------------------------------------------
-- Local function LFIKATALDIGU
-- Brew cat insertion (no extra colon ":") or link to
-- appendix from 3 elements.
local function lfikataldigu (strprefixx, strkataldnomo, strhintvisi)
local strrbkma = ''
if (type(strhintvisi)=='string') then
strrbkma = '[[' .. strprefixx .. ':' .. strkataldnomo .. '|' .. strhintvisi .. ']]'
else
strrbkma = '[[' .. strprefixx .. ':' .. strkataldnomo .. ']]'
end--if
return strrbkma
end--function lfikataldigu
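-- Example (prefix and names picked for illustration only):
--   lfikataldigu ('Kategorio','AV-elemento','mov')
-- gives "[[Kategorio:AV-elemento|mov]]" whereas omitting the hint
--   lfikataldigu ('Kategorio','AV-elemento')
-- gives "[[Kategorio:AV-elemento]]"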
------------------------------------------------------------------------
---- VARIABLES [R] ----
------------------------------------------------------------------------
function exporttable.ek (arxframent)
-- general unknown type
local vartymp = 0 -- temp variable without type
local varret = 0 -- final result string or table
-- special type "args" AKA "arx"
local arxourown = 0 -- metaized "args" from our own "frame"
local arxexxtra = 0 -- for methods via mw.getCurrentFrame()
-- general table
local tabinput = {} -- all non-empty lines from template
local taballheadingsnodupe = {}
local tabheadingshit = {}
local tablistwithrootsnodupe = {}
-- general str
local strinrootin = '' -- from named 'in='
local strtext = '' -- huge string from the source template
local strtypofrut = '' -- from "numtyperoot" converted to string
local strdupesexx = ''
local strduperoot = ''
local strtmp = ''
local strvisgud = '' -- visible good output
local strvisred = '' -- reduced visible good output for output table
local strinvkat = '' -- invisible category part
local strviserr = '' -- visible error message
local strtrakat = '' -- invisible tracking categories
-- general num
local numlong = 0 -- length of parameter
local numerr = 0 -- 0 OK | 1 internal | 2 "in=" bad | 3 "in=" type bad | 4 template bad ...
local numdcba = 0
local numtypeinpu = 0 -- 0 raw | 1 new | 2 leg
local numtyperoot = 0 -- 0 unknown | (67 C) 73 I 77 M 78 N 80 P 85 U
local numlines3mi = 0 -- number of lines from template
local numhitshits = 0 -- number of hits
-- general boo
local boonocat = false -- from "nocat=true"
local boogivet = false -- from "givetable=true"
---- GET THE ARX (OUR OWN) ----
-- must be seized independently of "numerr" even if we already suck
arxourown = arxframent.args -- "args" from our own "frame"
if (type(arxourown)~='table') then
arxourown = {} -- guard against indexing error
numerr = 1 -- #E01 internal
end--if
---- SEIZE ONE NAMED AND OBLIGATORY PARAM ----
strinrootin = ''
if (numerr==0) then
vartymp = arxourown['in']
if (type(vartymp)=='string') then
numlong = string.len (vartymp)
if ((numlong>0) and (numlong<200)) then
strinrootin = vartymp
end--if
end--if
if (strinrootin=='') then
numerr = 2 -- #E02 missing or bad length
else
if ((string.find(strinrootin,":",1,true)) or (string.find(strinrootin,"_",1,true))) then
numerr = 2 -- #E02 colon or underscore
end--if
end--if (strinrootin=='') else
end--if
lfdtracemsg ('This is "mrooteo", requested "detrc" report')
lfdshowvar (numerr,'numerr','after seizure of named param "in="')
lfdshowvar (constrtemplate,'constrtemplate')
---- PROCESS 1 SPECIAL AND 2 HIDDEN NAMED PARAMS ----
-- "detrc=" and "nocat=" must be seized independently of "numerr"
-- even if we already suck, but type "table" must be ensured above !!!
boogivet = (arxourown['givetable']=='true')
boonocat = (arxourown['nocat']=='true')
if (arxourown["detrc"]=="true") then
lfdtracemsg ('Param "detrc=true" seized')
else
qboodetrc = false -- was preassigned to "true"
qstrtrace = '' -- shut up now
end--if
lfdshowvar (numerr,'numerr','done with special&hidden params')
lfdshowvar (boogivet,'boogivet')
lfdshowvar (boonocat,'boonocat')
---- FIND OUT THE TYPE OF INPUT AND ROOT ----
-- "Vorto -eo- enhavanta morfemon N (kapt)"
-- "Postfiksajxo ar'" -- "Radiko ide'" -- "Finajxo as"
-- "Memstara elemento da" -- "Liternomo co" -- "Antauxfiksajxo fi'"
-- note that minimal length of "Postfiksajxo" is 1 due to "-i-" and "-t-"
-- apo:s in the legacy patterns are troublesome in many ways, stupid
-- {{PAGENAME}} encodes apo to "&#39;" and we must use "mw.text.decode"
numtypeinpu = 0 -- 0 raw | 1 new | 2 leg
numtyperoot = 0 -- 0 unknown | (67 C) 73 I 77 M 78 N 80 P 85 U
if (numerr==0) then
vartymp = string.find(strinrootin," ",1,true) -- space means more than bare
if (type(vartymp)=="number") then
if (string.byte(strinrootin,-1,-1)==41) then -- ")"
numtypeinpu = 1 -- new
else
numtypeinpu = 2 -- leg
end--if
end--if
lfdshowvar (numtypeinpu,'numtypeinpu')
end--if
if ((numerr==0) and (numtypeinpu==1)) then -- new
strtmp = lfidebracket(strinrootin,true,2) -- extract outer packaging
while true do -- fake loop
if (string.len(strtmp)~=32) then
numerr = 3 -- #E03
break -- to join mark
end--if
if (string.sub(strtmp,1,30)~="Vorto -eo- enhavanta morfemon ") then
numerr = 3 -- #E03
break -- to join mark
end--if
if (string.byte(strtmp,32,32)~=32) then
numerr = 3 -- #E03
break -- to join mark
end--if
numtyperoot = string.byte(strtmp,31,31)
if ((numtyperoot~=73) and (numtyperoot~=77) and (numtyperoot~=78) and (numtyperoot~=80) and (numtyperoot~=85)) then
numerr = 3 -- #E03
break -- to join mark
end--if
strinrootin = lfidebracket(strinrootin,false,2) -- all OK -> extract root
break -- finally to join mark
end--while -- fake loop -- join mark
end--if ((numerr==0) and (numtypeinpu==1)) then
if ((numerr==0) and (numtypeinpu==2)) then -- leg
strinrootin = mw.text.decode(strinrootin) -- fix possible apo
while true do -- fake loop
numlong = string.len(strinrootin)
if (numlong>=15) then
if (string.sub(strinrootin,1,9)=="Postfiksa") then
strinrootin = string.sub(strinrootin,14,-2) -- cut off apo too
strinrootin = '-' .. strinrootin .. '-'
numtyperoot = 73 -- "I" minimal length 1 (dubious "-i-" "-t-")
break -- to join mark -- success
end--if
end--if
if (numlong>=10) then
if (string.sub(strinrootin,1,7)=="Radiko ") then
strinrootin = string.sub(strinrootin,8,-2) -- cut off apo too
numtyperoot = 78 -- "N" minimal length 2
break -- to join mark -- success
end--if
end--if
if (numlong>=9) then
if (string.sub(strinrootin,1,4)=="Fina") then
strinrootin = string.sub(strinrootin,9,-1) -- no apo here
numtyperoot = 85 -- "U" minimal length 1
strinrootin = '-' .. strinrootin
break -- to join mark -- success
end--if
end--if
if (numlong>=20) then
if (string.sub(strinrootin,1,18)=="Memstara elemento ") then
strinrootin = string.sub(strinrootin,19,-1) -- no apo here
numtyperoot = 77 -- "M" minimal length 2
break -- to join mark -- success
end--if
end--if
if (numlong>=11) then
if (string.sub(strinrootin,1,10)=="Liternomo ") then
strinrootin = string.sub(strinrootin,11,-1) -- no apo here
numtyperoot = 77 -- "M" minimal length 1
break -- to join mark -- success
end--if
end--if
if (numlong>=18) then
if (string.sub(strinrootin,1,4)=="Anta") then
strinrootin = string.sub(strinrootin,16,-2) -- cut off apo too
strinrootin = strinrootin .. '-'
numtyperoot = 80 -- "P" minimal length 2, no like "abiotic" in -eo-
break -- to join mark -- success
end--if
end--if
numerr = 3 -- #E03 -- nothing found, invalid string legacy type fed in
break -- finally to join mark
end--while -- fake loop -- join mark
end--if ((numerr==0) and (numtypeinpu==2)) then
if (numerr==0) then
if (numtyperoot==0) then
strtypofrut = '??' -- unknown
else
strtypofrut = string.char(numtyperoot) -- 73 I 77 M 78 N 80 P 85 U
end--if
lfdshowvar (numerr,'numerr','done with root analysis')
lfdshowvar (numtyperoot,'numtyperoot')
lfdshowvar (strtypofrut,'strtypofrut')
lfdshowvar (strinrootin,'strinrootin','the isolated root')
end--if
---- CHECK WHETHER THE POINTED TEMPLATE EXISTS AT ALL AND EXPAND IT ----
-- we expect "constrtemplate" as the FULLPAGENAME
-- note that "mw.text.unstrip" is required due to wiki headings in the text
-- note that we need a separate "frame" if called from a module, pick
-- it using "mw.getCurrentFrame()"
strtext = ''
if (numerr==0) then
arxexxtra = mw.getCurrentFrame()
vartymp = arxexxtra:callParserFunction ('#ifexist:'..constrtemplate,'1','0')
if (vartymp=='1') then
vartymp = arxexxtra:expandTemplate { title = constrtemplate }
if ((type(vartymp))=='string') then
strtext = mw.text.unstrip (vartymp) -- result may be empty
end--if
end--if (vartymp=='1') then
if (strtext=='') then
numerr = 4 -- #E04 empty template
end--if
end--if
lfdshowvar (numerr,'numerr','done with template expansion')
lfdshowvar (strtext,'strtext')
---- COPY INTO TABLE STRIPPING OFF ALL BLANK LINES ----
-- * note that incoming "strtext" may be quasi-empty (empty lines and EOL:s)
-- * we always add an EOL to the text
if (numerr==0) then
do -- scope
local varkernel = 0
local strsingle = ''
local numsrclen = 0
local numsrcpos = 1 -- ONE based
local numtabinx = 0 -- ONE based -- INC-before-write
strtext = strtext .. string.char(10)
numsrclen = string.len(strtext) -- at least 2
while true do
if (numsrcpos>=numsrclen) then
break -- no chance for a non-empty line anymore
end--if
varkernel = string.find(strtext,string.char(10),numsrcpos,true)
if (type(varkernel)~="number") then
numerr = 1 -- #E01 internal
break -- ??
end--if
strsingle = ''
if (numsrcpos<varkernel) then
strsingle = string.sub(strtext,numsrcpos,(varkernel-1)) -- omit the LF
end--if
numsrcpos = varkernel + 1
if (strsingle~='') then
if (string.len(strsingle)==1) then
numerr = 5 -- #E05 empty or bad line -- too short (<2)
break
end--if
if ((string.byte(strsingle,1,1)==32) or (string.byte(strsingle,-1,-1)==32)) then
numerr = 5 -- #E05 empty or bad line -- leading or trailing space
break
end--if
numtabinx = numtabinx + 1
tabinput[numtabinx] = strsingle
end--if
end--while
numlines3mi = numtabinx
end--do scope
if ((numlines3mi<2) or (mathmod(numlines3mi,2)==1)) then
numerr = 6 -- #E06 -- at least 2 lines required and must be even
end--if
end--if
lfdshowvar (numerr,'numerr','done with tabelization')
lfdshowvar (numlines3mi,'numlines3mi','at least 2 lines required and must be even')
---- CORE HARD WORK WALKING THROUGH OUR TABLE AND PROCESSING LINES ----
if (numerr==0) then
do -- scope
local varoden = 0
local strprevhed = '' -- stripped previous heading used for hit & dupe
local stroneroot = ''
local numindelx = 1 -- ONE-based read index in table
local numlinnlen = 0
local numrootiex = 0 -- ONE-based read octet position in line
local numhitindx = 0 -- number of found hits -- INC-before-write
local boonowhead = true -- on "false" it is list with roots
while true do -- outer loop over alternating lines
if (numindelx>numlines3mi) then
break -- done
end--if
strtmp = tabinput [numindelx] -- pick line
if ((string.byte(strtmp,1,1)==61)~=boonowhead) then -- "="
numerr = 7 -- #E07 overall pattern bad (alternating lines needed)
break
end--if
if (boonowhead) then
if (string.len(strtmp)<7) then
numerr = 19 -- #E19 heading too short
break
end--if
if ((string.sub(strtmp,1,3)~="== ") or (string.sub(strtmp,-3,-1)~=" ==")) then
numerr = 19 -- #E19 bad equal signs
break
end--if
strtmp = string.sub(strtmp,4,-4) -- at least ONE char left
if (taballheadingsnodupe[strtmp]) then
numerr = 20 -- #E20 dupe heading
break
end--if
taballheadingsnodupe[strtmp] = true -- we can't do this twice !!!
strprevhed = strtmp -- maybe we will need it later
else
tablistwithrootsnodupe = {} -- reset on every line
numrootiex = 1 -- ONE-based -- reset on every line
strtmp = strtmp .. ' ' -- have always a termination space
numlinnlen = string.len(strtmp) -- at least 3
while true do -- inner loop over roots
if (numrootiex>=numlinnlen) then
break -- no chance for a root anymore -- exit inner loop
end--if
varoden = string.find(strtmp,' ',numrootiex,true)
if (type(varoden)~="number") then
numerr = 1 -- #E01 internal
break -- ??
end--if
if (numrootiex<varoden) then
stroneroot = string.sub(strtmp,numrootiex,(varoden-1)) -- avoid the space
else
numerr = 22 -- #E22 empty root (leading or doubled space) !!!FIXME!!! dashes
break
end--if
numrootiex = varoden + 1
if (tablistwithrootsnodupe[stroneroot]) then
numerr = 24 -- #E24 dupe root
strdupesexx = strprevhed
strduperoot = stroneroot
break
end--if
tablistwithrootsnodupe[stroneroot] = true -- we can't do this twice !!!
if (stroneroot==strinrootin) then -- !!! HERE WE GOT A HIT !!!
numhitindx = numhitindx + 1
tabheadingshit [numhitindx] = strprevhed -- do NOT abort search
end--if
end--while -- inner loop over roots
if (numerr~=0) then
break -- don't forget to exit outer loop too
end--if
end--if (boonowhead) else
numindelx = numindelx + 1 -- ONE-based read index in table
boonowhead = not boonowhead -- alternate
end--while -- outer loop over alternating lines
numhitshits = numhitindx
end--do scope
end--if (numerr==0) then
lfdshowvar (numerr,'numerr','done with processing lines')
lfdshowvar (tabheadingshit,'tabheadingshit')
lfdshowvar (numhitshits,'numhitshits')
---- BREW VISIBLE TEXT ----
-- "strinrootin" contains the raw morpheme or word but dashes can persist
-- based on "tabheadingshit" and "numhitshits"
-- here we fill "strvisred" and "strvisgud"
if (numerr==0) then
if (numhitshits==0) then
strvisred = contabvisi [3] -- "ne troveblas" ...
else
strvisred = contabvisi [2] -- "troveblas en " follows list
numdcba = 1
while true do -- at least ONE iteration
if (numdcba>numhitshits) then
break
end--if
if (numdcba~=1) then
strvisred = strvisred .. ' kaj '
end--if
strvisred = strvisred .. tabheadingshit [numdcba]
numdcba = numdcba + 1
end--while
end--if (numhitshits==0) else
strvisgud = contabvisi [0] .. strinrootin .. contabvisi [1] .. strtypofrut .. ') ' .. strvisred .. '.'
end--if (numerr==0) then
---- BREW ROOT CAT:S ----
-- "strinrootin" contains the raw morpheme or word but dashes can persist
-- based on "tabheadingshit" and "numhitshits"
-- all will receive the same sorting hint ... if the morpheme or word
-- begins with a dash then remove it so that "-o" falls under "O" not
-- under "-" ... still keep possible irrelevant trailing dash
if ((numerr==0) and (not boonocat)) then
do -- scope
local strcathint = ''
local strfoundheading = ''
strcathint = strinrootin
if (string.byte(strcathint,1,1)==45) then
strcathint = string.sub (strcathint,2,-1) -- remove Boulder Dash
end--if
if (numhitshits==0) then
strinvkat = lfikataldigu('Kategorio',contabcats [0],strcathint)
else
strinvkat = lfikataldigu('Kategorio',contabcats [1],strcathint)
numdcba = 1
while true do -- at least ONE iteration
if (numdcba>numhitshits) then
break
end--if
strfoundheading = tabheadingshit [numdcba]
vartymp = contabcats [strfoundheading] -- risk of type "nil"
if (type(vartymp)=="string") then
strfoundheading = vartymp -- use the translated one
end--if
strinvkat = strinvkat .. lfikataldigu('Kategorio',strfoundheading,strcathint)
numdcba = numdcba + 1
end--while
end--if (numhitshits==0) else
end--do scope
end--if ((numerr==0) and (not boonocat)) then
---- BREW TRACKING CAT:S #E02...#E99 ----
-- no tracking cat:s for #E01
-- "nocat=true" suppresses even tracking cat:s
if ((numerr>1) and (not boonocat)) then
strtrakat = lfikataldigu('Kategorio','Eraro (mrooteo)') -- !!!FIXME!!!
end--if
---- WHINE IF YOU MUST #E01...#E99 ----
if (numerr~=0) then
strviserr = '<br><b>Eraro #E' .. string.format('%02d',numerr) .. '</b>' -- two digits as in "#E06" -- !!!FIXME!!!
end--if
if (numerr==21) then -- #E21
strviserr = strviserr .. '<br>Root list bad char (other than eo lowercase)' -- !!!FIXME!!! NOT yet detected
end--if
if (numerr==24) then -- #E24
strviserr = strviserr .. '<br>Dupe in section "' .. strdupesexx .. '" with root "' .. strduperoot .. '".'
end--if
---- RETURN THE STRING OR INNER TABLE ----
-- on #E02 and higher we risk partial results in "strvisgud" and "strinvkat"
lfdtracemsg ('Ready to return string glued together from 1 + 1 + 4 parts or table')
lfdshowvar (strvisgud,'strvisgud')
lfdshowvar (strvisred,'strvisred','table only')
lfdshowvar (strinvkat,'strinvkat')
lfdshowvar (strviserr,'strviserr')
lfdshowvar (strtrakat,'strtrakat')
if (boogivet) then
varret = { [0]=numerr, strvisgud, strinvkat, strvisred, qstrtrace }
else
if (numerr==0) then
varret = strvisgud .. strinvkat
else
varret = strviserr .. strtrakat
end--if
if (qboodetrc) then -- "qstrtrace" declared separately outside main function
varret = "<br>" .. qstrtrace .. "<br><br>" .. varret
end--if
end--if
return varret
end--function
---- RETURN THE JUNK OUTER TABLE ----
return exporttable
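--[==[
ILLUSTRATION ONLY (not part of the module, kept inside a comment)

A minimal standalone sketch of the "walk through alternating lines" step
above, for readers who want to try the idea outside of the wiki. The name
"findheadings", its parameters and the sample headings used with it are
hypothetical and do not exist in "mrooteo". It assumes the same input
layout (a heading line "== xx ==" followed by one line with roots) and
reuses the error numbers 7, 19, 20 and 24 with the same meaning as above.
Simplifications: no check for an even number of lines (#E06), and doubled
spaces are silently tolerated instead of giving #E22.

```lua
-- Hypothetical helper: returns a table with all headings whose root line
-- contains "strroot", or nil plus an error number on a malformed list.
local function findheadings (tablines, strroot)
  local tabhits = {}          -- headings where the root was found
  local tabseenheadings = {}  -- used to detect duplicate headings
  local strheading = ''       -- most recent heading, stripped
  local boohead = true        -- lines alternate: heading, roots, ...
  for _, strline in ipairs(tablines) do
    if ((string.sub(strline,1,1)=="=")~=boohead) then
      return nil, 7 -- alternating pattern broken
    end
    if (boohead) then
      if ((string.len(strline)<7) or (string.sub(strline,1,3)~="== ") or (string.sub(strline,-3,-1)~=" ==")) then
        return nil, 19 -- heading too short or bad equal signs
      end
      strheading = string.sub(strline,4,-4)
      if (tabseenheadings[strheading]) then
        return nil, 20 -- duplicate heading
      end
      tabseenheadings[strheading] = true
    else
      local tabseenroots = {} -- reset on every line with roots
      for strone in string.gmatch(strline,"%S+") do
        if (tabseenroots[strone]) then
          return nil, 24 -- duplicate root within one line
        end
        tabseenroots[strone] = true
        if (strone==strroot) then
          tabhits[#tabhits+1] = strheading -- do NOT abort search
        end
      end
    end
    boohead = not boohead
  end
  return tabhits
end
```

Possible use: findheadings ({"== verbo ==","kapt mov"},"mov") gives a
table holding the single heading "verbo". As in the module itself, the
search continues after the first hit, since one root may be listed under
several headings.
]==]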