Wiktionary:Beer parlour/2024/January: difference between revisions

:::::::::The example {{m|cel-bry-pro|*amm-}} is unreliable since ''*ambi-'' often lost the ''*i'' to syncope before it could cause i-affection.<ref>{{R:cel:Schrijver|pages=268-276}}</ref> Presence or absence of ''i''-affection of a prefix in Brittonic is not diagnostic of when it was prefixed. — ''Ceso femmuin mbolgaig mbung'', ''[[User:Mellohi!|mello]]'''''[[User talk:Mellohi!|hi!]]''' ([[Special:Contributions/Mellohi!|投稿]]) 01:26, 13 January 2024 (UTC)
:::::::::: 1000%, the ''*i'' in ''*ambi-'' was syncopated before ''i''-umlaut took place. I'm referring to ''i''-umlaut from the root, not from the prefix itself. I assumed that was clear, but thanks for the painfully obvious reference. -- [[User:Sokkjo|Sokkjō]] 09:21, 13 January 2024 (UTC)


===References===

Revision as of 09:21, 13 January 2024


Proto-Berber

This topic actually includes two proposals. The first is to remove the hyphens from entries such as Reconstruction:Proto-Berber/am-an. After all, we don't add hyphens to Indo-European words like *wĺ̥kʷ-os. The second is to treat Numidian as a dialect of Proto-Berber, as it seems to have been. Therefore, we could treat the Numidian GLD as an attested form of Proto-Berber *aǵăllid. Ελίας (talk) 13:52, 1 January 2024 (UTC)[reply]

User:USERNAME for confirmed group

Tim Utikal — This unsigned comment was added by Tim Utikal (talk • contribs).

@Tim Utikal: Did you make a mistake here? Are you trying to be whitelisted for certain user rights? —Justin (koavf)TCM 21:38, 1 January 2024 (UTC)[reply]
yes I'm sorry Tim Utikal (talk) 21:39, 1 January 2024 (UTC)[reply]
Per Wiktionary:Confirmed users, this can be granted in exceptional cases, but it will also just naturally occur to your account after a few days and several edits. Is there a particular need to have it quicker? —Justin (koavf)TCM 22:02, 1 January 2024 (UTC)[reply]
@Tim Utikal You are already in "autoconfirmed users" so there should be no need for this. Benwing2 (talk) 23:17, 1 January 2024 (UTC)[reply]
Not really, I just find it strange that I didn't already get it, since I have met the requirements for quite a while now. I just realized anyway, sorry for bothering. Tim Utikal (talk) 00:00, 2 January 2024 (UTC)[reply]

Deprecating Latnx

For background, Latnx is a script code used for the “extended” Latin script, meaning it covers the whole range of Latin-script characters, while Latn only covers the common ones. Originally, the two were given different CSS styles, because most unusual Latin characters (such as those in IPA) were only supported by specialist fonts, and we didn’t want to use those fonts for languages that didn’t need it. That hasn’t been an issue for a long time.

So far as I can tell, the Latnx script code does absolutely nothing which is distinct from the normal Latn code. There are currently no special styles assigned to it in MediaWiki:Gadget-LanguagesAndScripts.css, and I can’t find any other special uses that make it a necessity. It’s dead weight, and has been since at least some time before 2019: @Erutuon removed all special styles assigned to it in this diff (when it was handled by MediaWiki:Common.css), but according to their edit summary this was “because the rule before my recent edits had no effect”, implying that it had been defunct for some time before then.

A few other things to consider:

  1. It adds clutter, which is annoying.
  2. It duplicates stuff unnecessarily in Module:scripts/data.
  3. Most languages which use characters only covered by Latnx are currently only set to use Latn, and fixing this would be a massive headache. This is normally okay, but becomes a problem with single-character entries for those characters, since the script module checks for Latn, finds no characters match, so assigns them the script code None, which causes browsers to not render them properly.
  4. There is no performance advantage to this, so far as I can tell.

Given all of this, I think we should just get rid of it. Theknightwho (talk) 00:10, 2 January 2024 (UTC)[reply]
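As an illustration of point 3 above, here is a minimal Python sketch of the fallback being described: a term is tested against the scripts listed for its language, and gets the code None when no character matches any of them. The character sets and the name `find_best_script` are hypothetical stand-ins for illustration only; they are not the actual contents or API of Module:scripts or Module:scripts/data.

```python
import string

# Hypothetical, heavily truncated character sets (assumption for illustration).
BASIC_LATIN = set(string.ascii_letters)
SCRIPT_CHARS = {
    "Latn": BASIC_LATIN,
    # Two Latin Extended-D letters stand in for the "unusual" characters.
    "Latnx": BASIC_LATIN | {"ꞎ", "ꝺ"},
}


def find_best_script(text, lang_scripts):
    """Return the listed script code matching the most characters in text.

    Falls back to the string "None" when no listed script matches any
    character, which is the situation described for single-character entries.
    """
    best, best_count = "None", 0
    for code in lang_scripts:
        count = sum(1 for ch in text if ch in SCRIPT_CHARS[code])
        if count > best_count:
            best, best_count = code, count
    return best
```

With these made-up sets, a language listed only with Latn gets "None" for the single-character entry "ꞎ", while the same entry resolves normally once the characters are covered by a script the language actually lists.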

Given that it does nothing, Support. CitationsFreak (talk) 00:15, 2 January 2024 (UTC)[reply]
@Theknightwho: I'd like to have User:Erutuon and User:This, that and the other weigh in to verify that this removal is OK; if so, I would support its removal. Benwing2 (talk) 02:29, 2 January 2024 (UTC)[reply]
@Benwing2:, shall I ping them for the other two script-deprecating votes? CitationsFreak (talk) 03:32, 2 January 2024 (UTC)[reply]
@CitationsFreak No need; by mentioning them I've already pinged them. Benwing2 (talk) 03:51, 2 January 2024 (UTC)[reply]
Ah. CitationsFreak (talk) 06:26, 2 January 2024 (UTC)[reply]
@Benwing2 @Theknightwho I can confirm that Latnx does nothing special styling-wise, but 35 Lua modules mention this script code, and it's possible that one of them is doing some special handling when this script code is found. I'll leave the Lua stuff in your hands. This, that and the other (talk) 05:48, 3 January 2024 (UTC)[reply]
I still need to do a thorough check, but I should note:
  1. A significant number of these are users’ private modules, which are the owner’s responsibility to update to account for any changes. I think they’re mostly sandboxes, anyway.
  2. Several more are language data modules, which we would expect: any that contain a language using Latnx will mention it.
  3. Yet more only mention it alongside Latn to make sure both are treated in the same way.
  4. There are no special modules associated with Latnx.
I haven’t seen anything (so far) which gives me cause for concern.
Theknightwho (talk) 17:35, 3 January 2024 (UTC)[reply]
Given that we now know deleting the code will have no practical effect, I'm going to go ahead and start actioning this. Theknightwho (talk) 16:56, 6 January 2024 (UTC)[reply]

Deprecating xzh-Tibt

The script code xzh-Tibt is only used for the Zhang-Zhung language. It does precisely one thing differently from Tibt, which is that it adds “BabelStone Tibetan sMar-chen” to its list of fonts, which displays Tibetan writing as though it were the Marchen script (which is one of its daughter scripts).

However, as explained on BabelStone’s website, this font was created in 2007 as a stopgap until the Marchen script had been encoded in Unicode, which it was back in 2016. The correct script code for this is Marc, and it’s already set up on our system. That leaves us with xzh-Tibt as a confusing duplicate, for the sake of an unnecessary font which almost no-one has anyway.

We only have 3 lemmas for Zhang-Zhung. Looking at the source, they seem to have been recorded in the Tibetan script, so no moves are necessary. Theknightwho (talk) 01:10, 2 January 2024 (UTC)[reply]

Delete. Perhaps a pedantic point, but I note that the .xzh-Tibt font-family rule doesn't add that font to the list, it entirely substitutes the .Tibt, .xzh-Tibt font list with its own one-font list. This, that and the other (talk) 05:52, 3 January 2024 (UTC)[reply]
@This, that and the other Good point. It’s also problematic because Zhang-Zhung is primarily attested in the conventional Tibetan script in documents post-dating their fall to the Tibetan Empire, so we should simply change one of its scripts from this to Tibt and be done with it.
There are, in fact, other unencoded scripts that were also used for Zhang-Zhung (such as Marchung, among several others), but again, this code doesn’t facilitate those. Theknightwho (talk) 17:21, 3 January 2024 (UTC)[reply]

Deprecating pjt-Latn

This is only used by the Pitjantjatjara language of Australia. However, I see no reason why it has a separate script code: the only difference is that it has three fonts listed in MediaWiki:Gadget-LanguagesAndScripts.css, which are Microsoft Sans Serif, Tahoma, and Code2000, with a fallback of using a sans-serif font (which we use for everything anyway).

The justification given in the CSS file is Pitjantjatjara (ḻ ṉ ṟ ṯ and capitals), which I assume is because font support used to be poor for these; no doubt that’s why Code2000 is listed, which supports many unusual characters. This has long been unnecessary, and they’re even included in Latn, not just Latnx. It’s clearly just a holdover from the long-deleted Template:pjt-Latn, created before we even had modules. Theknightwho (talk) 01:33, 2 January 2024 (UTC)[reply]

The "Georgia" font used for page titles by the default (Vector) skin doesn't have these characters. The heading on ngiṉṯaka, displayed here without the pjt-Latn override font:
ngiṉṯaka
This looks pretty ugly to me; the retroflex consonants look markedly smaller than the surrounding letters. I'm inclined to hold onto the special rules for now, although we could definitely choose a nicer font. This, that and the other (talk) 05:58, 3 January 2024 (UTC) 22:07, 3 January 2024 (UTC)[reply]
@This, that and the other I’m happy to have special rules, but I don’t think it necessitates having a separate script code. We can just set the rules to apply to Latn when used with language pjt. This issue will come up with a lot of other languages, so the current solution is unwieldy and scales poorly. Theknightwho (talk) 06:02, 3 January 2024 (UTC)[reply]
@Theknightwho So long as this works, and gets applied to the page title specifically, I'll be satisfied. This, that and the other (talk) 06:07, 3 January 2024 (UTC)[reply]
@This, that and the other: The method we use now in Module:headword (display title with script class) won't fix the page title (header level 1) in ngiṉṯaka because the display title doesn't contain the language code. Always adding script class and language code to the display title with pjt wouldn't make sense because some Pitjantjatjara entries have other entries on the same page. Maybe it would be okay to include the language code whenever there are letters with a line below, though that requires some new field for unusual characters in Module:headword or language data, and might end up with display title conflicts (last one wins, but there might be an error message) if there are multiple languages' entries on a page with the same unusual characters. — Eru·tuon 20:59, 9 January 2024 (UTC)[reply]
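The idea of keeping special font rules without a separate script code can be sketched as a lookup keyed on the language and script together, falling back to the script alone. This is only a rough Python illustration of the approach; the table names and the function `font_stack` are hypothetical, and the default stack for Latn is an assumption based on the remark that a sans-serif font is used for everything anyway.

```python
# Hypothetical override table: font rules attached to a (language, script)
# pair instead of a dedicated script code such as pjt-Latn.
FONT_OVERRIDES = {
    ("pjt", "Latn"): ["Microsoft Sans Serif", "Tahoma", "Code2000", "sans-serif"],
}

# Assumed per-script defaults (illustrative only).
SCRIPT_DEFAULTS = {
    "Latn": ["sans-serif"],
}


def font_stack(lang, script):
    """Return the font list for a language/script pair.

    A language-specific override wins; otherwise the script default is used.
    """
    override = FONT_OVERRIDES.get((lang, script))
    if override is not None:
        return override
    return SCRIPT_DEFAULTS.get(script, [])
```

Adding another language with poorly supported characters would then mean one more entry in the override table rather than a new script code.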

Petition to upgrade Medieval Greek

from Sarri.greek. Notifying, especially for Ancient Greek @Mahagaja, Erutuon, JohnC5, @Atelaes, ObsequiousNewt; also @Benwing2, Chuck Entz. Happy 2024 to everyone. Could en.wikt, please reconsider for Medieval Greek[…]

  • 1) using the linguistic term Medieval Greek instead of the historical term 'Byzantine' for its name (Module:languages). It is also visible at {{grc-IPA}}.
  • 2) upgrading Medieval Greek from etymology language (currently under grc) to autonomous language section? This is needed to correct the title Ancient Greek over words of 6th century onwards at Cat:Medieval Greek (Category:Byzantine Greek). I know that this is a nuisance for modules, but please consider updating; it is an omission of too many centuries for Greek.

All reference sources are listed at the previous petition (2023). The 2019 Cambridge Grammar of Medieval and Early Modern Greek (DOI, intro) presents information in English. At el.wiktionary Cat:Med.Greek we use the code gkm. It is treated in polytonic script, and no other templates are needed. The period runs from Justinian's Novellae (those written in Greek) to medieval texts extending to Late Medieval, equivalent to Early Modern. Thank you in advance for taking time to look into this. From el.wiktionary, ‑‑Sarri.greek  I 09:20, 2 January 2024 (UTC)[reply]

I have no objection, but WT:RFM is the usual venue for requesting splits (in this case, splitting gkm Medieval Greek out from grc Ancient Greek). —Mahāgaja · talk 09:31, 2 January 2024 (UTC)[reply]
Thank you very much @Mahagaja, I will make application there (my browser has problem there, too long page). ‑‑Sarri.greek  I 09:40, 2 January 2024 (UTC)[reply]

Removing Old Galician-Portuguese references/further readings in Galician entries

Pinging @Stríðsdrengur, @MedK1, @Sarilho1 and @Froaringus.

Recently I've been looking for Galician entries with quotations prior to 1500, thus considered Old Galician-Portuguese, and in the meantime also came across Galician entries (most of them) with OGP references/further readings, such as Corpus Xelmírez and Dicionario de Dicionarios do galego medieval. I think they should be removed, since they're used for OGP. Do you guys agree? Amanyn (talk) 17:03, 2 January 2024 (UTC)[reply]

Pinging @Froaringus because the last ping didn't work (I think). Amanyn (talk) 17:17, 2 January 2024 (UTC)[reply]
@Amanyn, I agree with this wholeheartedly, but Froaringus does not — he added the references there and wants them to stay where they are, so right now we're in a bit of an impasse (you can look at our convo here). I believe the current de facto treatment is to keep them in both pages. MedK1 (talk) 01:56, 3 January 2024 (UTC)[reply]
@MedK1: I may be 100% mistaken, but as I recall, pinging only works if you use it and sign your post, so going back to a previous post and adding in {{Ping|Foo}} doesn't work. (Again, I am an ignorant person, so I am wrong often.) —Justin (koavf)TCM 03:52, 3 January 2024 (UTC)[reply]
Yeah, Chuck Entz even corrected me in Wiktionary:Information desk/2023/December for doing that. Amanyn (talk) 16:27, 3 January 2024 (UTC)[reply]
Oh, that's good to know! Thank you! MedK1 (talk) 01:15, 4 January 2024 (UTC)[reply]
@MedK1 Alright then. I think we should discuss that with Froaringus again later, but now something a bit off-topic (since I've already pinged some OGP contributors, they might also see this): don't you think quotations should be added to OGP alternative forms/spellings? While they make sense (the ones I came across so far), I think it just makes sense to prove they existed with a quotation. For example, I believe that coelho was an OGP word, but shouldn't there be a quotation to prove it? And that applies sometimes to main forms too: the quotation used for cõelho, for example, has cõello instead, and the same for meninho, whose quotation has menỹo. Of course it makes sense that if cõello existed cõelho also did (and the same for meninho, menỹo being an abbreviation, and the "y" used instead because the author was Spanish(?)), but I think a quotation with the word written as it is in the title would make more sense and fit it better. Amanyn (talk) 16:23, 3 January 2024 (UTC)[reply]
@Amanyn I agree with that too! The "infrastructure" for OGP right now is super rudimentary; we only have 900 lemmas so far, too. Something like what you're proposing would definitely be an improvement here, especially since we're treating each version as its own lemma (see WT:AROA-OPT). MedK1 (talk) 01:15, 4 January 2024 (UTC)[reply]
@MedK1 Alright then! I won't start with it right now since I don't have any experience with quotations, but I'll look into it later. Amanyn (talk) 20:01, 4 January 2024 (UTC)[reply]

List of verbs by conversion of final voiceless /s/ into voiced /z/

An example is excuse with the homographs: verb /ɪkˈskjuːz/ vs noun /ɪkˈskjuːs/. Similarly, close (verb /kloʊz/ vs adj./adv. /kloʊs/), use (verb /juz/ vs noun /jus/), and advice vs advise.

I'd also create a list for nouns derived by a change of stress to initial position (e.g., record, present, protest, rebel, refuse, etc.). JMGN (talk) 20:19, 2 January 2024 (UTC)[reply]

@JMGN are you asking for help to make such a list, or are you looking to make it yourself? These words should all be in Category:English heteronyms. If you are looking to contribute, this content may be in scope for our Appendix. You can do a search for the page title, including the Appendix: prefix, and click the red link to start contributing. This, that and the other (talk) 05:32, 3 January 2024 (UTC)[reply]
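If such a list were compiled from pronunciation data, the pattern described at the top of this section could be checked mechanically. The Python sketch below is only a rough illustration under the assumption that each word comes with one transcription per part of speech; the function name `is_final_voicing_pair` and the data layout are hypothetical, and the transcriptions are the ones quoted in this thread.

```python
def is_final_voicing_pair(voiceless_ipa, voiced_ipa):
    """True if two transcriptions differ only in a final /s/ versus /z/."""
    a = voiceless_ipa.strip("/")
    b = voiced_ipa.strip("/")
    return (
        len(a) == len(b)
        and a[:-1] == b[:-1]      # everything before the last segment matches
        and a.endswith("s")
        and b.endswith("z")
    )


# (word, noun/adjective pronunciation, verb pronunciation), as quoted above.
CANDIDATES = [
    ("excuse", "/ɪkˈskjuːs/", "/ɪkˈskjuːz/"),
    ("close", "/kloʊs/", "/kloʊz/"),
    ("use", "/jus/", "/juz/"),
]

matches = [word for word, other, verb in CANDIDATES
           if is_final_voicing_pair(other, verb)]
```

Stress-shift pairs such as record or protest would need a different test, since they differ in stress placement rather than in the final consonant.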

Google Groups to stop archiving new Usenet posts

A banner has appeared on Google Groups:

Effective from 22 February 2024, Google Groups will no longer support new Usenet content. Posting and subscribing will be disallowed, and new content from Usenet peers will not appear. Viewing and searching of historical data will still be supported as it is done today.

You can read more at [1].

This will make Google Groups useless for citing new terms and hot words via Usenet as time moves on. Given that Usenet has not seen substantial message traffic for some years, the impact on Wiktionary will be minimal.

However, as a matter of practicality, unless anyone else is able to find another Usenet archive, I think it would be worth changing CFI so that Usenet messages are only considered durably archived up to 21 February 2024 (perhaps to be voted on after Google goes ahead with the change). This, that and the other (talk) 05:15, 3 January 2024 (UTC)[reply]

Just another "killedbygoogle". G has been crap for Usenet access for years anyway: they folded it into their inferior Google Groups offering, and then lost interest in Groups (which was even in the beginning an inferior clone of what Yahoo! had). Anyone else miss DejaNews? Equinox 05:20, 3 January 2024 (UTC)[reply]
You can still access new Usenet posts, just on a different client. CitationsFreak (talk) 09:08, 3 January 2024 (UTC)[reply]
Google reminds me of the strangler fig, a hemiepiphyte, which rapidly grows up trees in tropical rain forests (not having to waste energy on forming a strong trunk itself), often eventually killing the tree. Not all strangler figs kill their support/host. Are they doing to Wikimedia what they have done to Usenet? DCDuring (talk) 20:51, 3 January 2024 (UTC)[reply]
We're not owned by Google. I think we won't be a victim of Google. CitationsFreak (talk) 23:16, 3 January 2024 (UTC)[reply]
Strangler figs don't own the trees they rely on to reach the sun before those trees, weakened by the sap the figs have taken, die in their shade. (I think there's a rich metaphor here.) DCDuring (talk) 01:35, 4 January 2024 (UTC)[reply]
@CitationsFreak are you aware of any ongoing archiving effort for Usenet posts? This, that and the other (talk) 23:30, 3 January 2024 (UTC)[reply]
There's https://www.usenetarchives.com/ (no search function, however) and (for the 80s Usenet stuff) https://usenet.trashworldnews.com . CitationsFreak (talk) 23:34, 3 January 2024 (UTC)[reply]
No full-text search = practically useless for our purposes.
Incidentally, my curiosity was piqued by the part of Google's statement which says that Usenet is now mainly used for sharing binaries (files) rather than text-based emails. Seemingly the binaries being shared involve "Linux ISOs": piracy of software, movies, porn, etc. In this context, it seems that the Usenet community of today has an incentive not to keep comprehensive, searchable archives. This, that and the other (talk) 06:33, 4 January 2024 (UTC)[reply]
Honestly, while full-text search would be a great thing to have for our purposes (as well as in general), you can find some slang from certain Usenet subgroups (or whatever they're called) in the relevant subgroup [1]. This is what the OED did, as a matter of fact.
[1] As an example, you could expect to find more skateboarding slang in alt.rec.skateboards than in comp.os.linux.setup, although that doesn't mean you wouldn't find any. CitationsFreak (talk) 07:06, 4 January 2024 (UTC)[reply]
@CitationsFreak Good point. However, I was poking around the website and I can't find a single message from 2023, or even the second half of 2022. Are you sure they are still archiving? This, that and the other (talk) 02:03, 7 January 2024 (UTC)[reply]

The Winter/Summer 2024 Competition is here!

After looking at the response I got from the original idea I wrote, I have decided to officially make it a contest that Wiktionarians can partake in. The goal is to, in the format of a play, define as many English words as possible before we reach "zzzs".

RULES

1. All definitions must be in alphabetical order, starting at "a" and ending at "zzzs". (That means that, for the "trumspringa" example above, you couldn't define "a" or "1984" as your next definition, since "a" comes before "trumspringa" and "1984" starts with a number.)

2. Each person can use three sentences for their definition, two for the use and definition of a word and one for some optional stage directions. You do not need to use two sentences to use and define a word, although it is encouraged. Sign your entries like this: ([Wiktionarian]) after what you wrote.

3. Each line of dialogue must be formatted like a play script would be. This is the speaker's name, in bold and all-caps, followed by a colon and then some dialogue. In addition, the word that is being defined must be in bold. Here is an example:

RANDOM CITIZEN #1: I have trumspringa. I have the desire to give up my job for farming. (RandomCitizen1)

Stage directions should be italicized, as so:

RANDOM CITIZEN #1 tosses his briefcase to the left-hand side of the stage.

4. The next word you define should come fairly close after the previous word defined (something like 600 lemmas after the word that's defined). Here is an example of something that would be valid:

RANDOM CITIZEN #1: I have trumspringa. I have the desire to give up my job for farming. (RandomCitizen1) Being the ts is a hard job. Being the person who something something is a hard job. (RandomCiti2)

And here is something that would not:

RANDOM CITIZEN #2: I have trumspringa. I have the desire to give up my job for farming. (RandomCitizen1) This is because you can't zyxt. This is because you can't see. (MidEngFan)

This is because "trumspringa" and "ts" are less than 600 entries away from each other (if they existed with the defined meaning on Wikt), and "zyxt" is more than that (if it was an English lemma).

5. I'm allowed to make up new rules when I feel that they are needed.

So, that's it. Have any questions on them? CitationsFreak (talk) 06:35, 4 January 2024 (UTC)[reply]
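Rule 4 can be checked mechanically against a sorted list of lemmas. The following is a minimal Python sketch, assuming the lemma list is available as a plain sorted Python list and that both words appear in it; the function names are hypothetical, and plain string ordering is an assumption that may not match how the category page actually sorts entries.

```python
from bisect import bisect_left


def lemma_index(lemmas, word):
    """Return the position of word in the sorted list, or raise ValueError."""
    i = bisect_left(lemmas, word)
    if i == len(lemmas) or lemmas[i] != word:
        raise ValueError(f"{word!r} is not in the lemma list")
    return i


def is_valid_next_word(lemmas, previous, candidate, max_gap=600):
    """True if candidate comes after previous, at most max_gap entries later."""
    gap = lemma_index(lemmas, candidate) - lemma_index(lemmas, previous)
    return 0 < gap <= max_gap
```

Since the rule says "something like 600 lemmas", the `max_gap` default is a guideline rather than a hard limit, and a human judge would still have the final say.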

Sounds fun! Can you link to it though (or create a page, if you haven't already)? How are we determining whether something is about 600 entries away? Is there a particular list of lemmas we could use as a gauge? Andrew Sheedy (talk) 07:06, 4 January 2024 (UTC)[reply]
Here's the list of every English lemma we have at Wiktionary. Page will be made shortly. CitationsFreak (talk) 07:08, 4 January 2024 (UTC)[reply]
@Andrew Sheedy Here's the link: Wiktionary Winter/Summer Competition 2023. CitationsFreak (talk) 07:17, 4 January 2024 (UTC)[reply]
Great, thanks! Could we maybe make it a rule that every sentence of stage directions also has to include a word within about 600 lemmas of the previous word (though without defining it)? Otherwise I think stage directions risk becoming either boring or stealing the show from the dialogue. So following your example, a stage direction that would work instead could be RANDOM CITIZEN #1 puts on a pair of swim trunks. Andrew Sheedy (talk) 07:19, 4 January 2024 (UTC)[reply]
I shoulda said it earlier but I think the original goal of the Wiktionary "competitions" was to add more content to the project. Making learning fun!! Anyway, we have moved beyond the time where "find a word beginning with such-and-such a letter pair" is a challenge involving research and creation, since Wikt is quite huge now; so maybe legitimate games are all that are left :) I hope Wonderfool won't ruin your play (or maybe he will make it great). Equinox 21:51, 6 January 2024 (UTC)[reply]

Bit concerned about User:Mynewfiles

Smells like our old buddy "Pass a Method". Look at the huge torrent of made-up USA terms tonight. None with citations, none with anything, just pure Polyfilla. Equinox 07:50, 7 January 2024 (UTC)[reply]

I can guarantee you 10,000% that I am not "Pass a method". Mynewfiles (talk) 07:54, 7 January 2024 (UTC)[reply]
This isn't PaM, just a3a0. — SURJECTION / T / C / L / 09:28, 7 January 2024 (UTC)[reply]
Looks like Wonderfool to me. Denazz (talk) 13:44, 10 January 2024 (UTC)[reply]

Affix segmentation with hyphens in derived terms lists in proto-languages

For PIE entries we use hyphens to separate morphological segments in the Derived terms section. I then extended this formatting to Proto-Celtic, given that e.g.:

  • This increases reconstruction transparency and readability. This goes triple for Proto-Celtic since it had many stacked-prefix words which become hard to parse if run together.
  • Celtic allows particles and pronouns to intervene between the first and second morphemes of a verb, so making the insertion position clearer with prefix separation by hyphens would be a benefit.
  • Importing formatting qualities of our well-regarded PIE entries should lift Proto-Celtic entries to a similar quality level as them.

But Victar deleted a set of hyphens on one Celtic verb where I formatted like that, and told me to discuss the issue over here. So should PIE-style hyphenation of affixes be used in derived terms sections of other proto-languages? — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 18:13, 9 January 2024 (UTC)[reply]

There shouldn't be hyphens in Reconstruction entry names, but there's nothing wrong with displaying them with hyphens, e.g. {{l|cel-pro|*atinoweti|*ati-noweti}}. —Mahāgaja · talk 07:33, 10 January 2024 (UTC)[reply]
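The pattern in that template call (hyphen-free link target, hyphenated display form) can be generated from the segmented form alone. Here is a minimal Python sketch; the function name `segmented_link` is hypothetical, and it assumes that every hyphen in the input marks a morpheme boundary, so it would not suit affix entries whose names legitimately end in a hyphen.

```python
def segmented_link(lang_code, segmented_form):
    """Build an {{l}} call that links the unsegmented entry name
    but displays the morpheme-segmented form."""
    target = segmented_form.replace("-", "")  # entry name carries no hyphens
    return "{{l|" + "|".join([lang_code, target, segmented_form]) + "}}"


print(segmented_link("cel-pro", "*ati-noweti"))
# prints {{l|cel-pro|*atinoweti|*ati-noweti}}
```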
@Mellohi! Why are you bypassing links to the Proto-Celtic entries? The reason we do this on PIE entries is because they are often too speculative to create actual entries for, but that is far less the case with Proto-Celtic.
I am also still not in support of hyphens in derived terms lists, which breaks with how we format most languages, in addition to being more difficult to read, in my opinion. But may I suggest an alternative? What I've done on some Proto-Iranian entries is add {{q|+ prefix}} to list entry redlinks, see RC:Proto-Iranian/hmáwčati. Is this close enough to what you're going for? -- Sokkjō 23:30, 10 January 2024 (UTC)[reply]
I think we should strive for consistency. We display PIE terms segmented by morphemes, because it is universally considered useful. I think that principle should be extended to other reconstructed languages. I think separating prefixes with hyphens is imperative at any rate, for the reasons listed above. (Victar's alternative suggestion looks messy and has no precedent.) If anything, the argument should be about whether we should display *uφor-φi-φoik-e, but I don't currently have a view on that.
(Aside: Mellohi!'s formatting {{l|cel-pro||*uɸor-ɸiɸoike}}, without redlink, is preferable, as it isn't clear whether the term existed in that form in Proto-Celtic or was derived somewhere along the way to Proto-Brythonic. That's the way it's done for PIE. In my view our current guidelines concerning this are lacking, but that's a discussion for another day.) —Caoimhin ceallach (talk) 12:57, 11 January 2024 (UTC)[reply]
If the goal is purely to "strive for consistency", then PIE entries should have their hyphens removed, as hyphen-less derived terms lists are the overwhelming standard. I also don't see many editors for Germanic, Latin or Greek supporting this format for those languages. The argument for cold consistency is flawed from the start because each language has its own linguistic needs and community preferences, both on en.Wikt and academically. -- Sokkjō 22:06, 11 January 2024 (UTC)[reply]
We're only talking about reconstructed languages. I said we should segment words of reconstructed languages with hyphens, to the extent that we consider it useful, as this is a means of clarifying word structure that we already use for PIE. This is not 'cold consistency'. —Caoimhin ceallach (talk) 22:15, 11 January 2024 (UTC)[reply]
We have both Latin and Greek reconstructions as well. Also, your comment on redlinks is not accurate as we can usually rather easily tell when a compound is formed in Celtic or Brythonic by various sound changes. But even so, we always try to reconstruct compounds at the lowest level unless it's obvious they were formed earlier. There is absolutely no reason to void out the link for a PC term in a derived terms list. -- Sokkjō 22:21, 11 January 2024 (UTC)[reply]
For attested languages I think we should stick to the existing orthographic conventions as much as possible. Fully reconstructed languages are different.
I'm not aware of a way to tell when the compounds in question were formed. Unless there is clear evidence that a term existed, I think there should be no redlink. —Caoimhin ceallach (talk) 22:45, 11 January 2024 (UTC)[reply]
So not on RC:Latin/hendo but yes on RC:Proto-Germanic/erþō. Consistent. Some examples are finding *amm- instead of *ėmm- in i-umlaut environments, or consonant clusters *mβ instead of *mm. Do you also believe that Proto-Germanic entries should void out links in their derived terms lists, for, you know, consistency? -- Sokkjō 21:18, 12 January 2024 (UTC)[reply]
@Sokkjo Then let's change PIE, if consistency is key. If not, why not? The logic of what's being proposed is pretty clear: when a language is not attested, reconstructions should use hyphens. Theknightwho (talk) 21:34, 12 January 2024 (UTC)[reply]
It is not. See my comments above. -- Sokkjō 21:44, 12 January 2024 (UTC)[reply]
@Sokkjo I've read them - what I was responding to was your comment that seemed to purposefully misunderstand @Caoimhin ceallach. Theknightwho (talk) 21:45, 12 January 2024 (UTC)[reply]
I don't understand what you're trying to say. —Caoimhin ceallach (talk) 22:10, 12 January 2024 (UTC)[reply]
Which part? I was explaining how you can tell whether a compound was formed in PBry. or later, rather than in PC. -- Sokkjō 22:34, 12 January 2024 (UTC)[reply]
The example *amm- is unreliable since *ambi- often lost the *i to syncope before it could cause i-affection.[1] Presence or absence of i-affection of a prefix in Brittonic is not diagnostic of when it was prefixed. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 01:26, 13 January 2024 (UTC)[reply]
1000%, the *i in *ambi- was syncopated before i-umlaut took place. I'm referring to i-umlaut from the root, not from the prefix itself. I assumed that was clear, but thanks for the painfully obvious reference. -- Sokkjō 09:21, 13 January 2024 (UTC)[reply]

References

  1. ^ Schrijver, Peter C. H. (1995) Studies in British Celtic historical phonology (Leiden studies in Indo-European; 5), Amsterdam, Atlanta: Rodopi, pages 268-276

Reusing references: Can we look over your shoulder?

Apologies for writing in English.

The Technical Wishes team at Wikimedia Deutschland is planning to make reusing references easier. For our research, we are looking for wiki contributors willing to show us how they are interacting with references.

  • The format will be a 1-hour video call, where you would share your screen. More information here.
  • Interviews can be conducted in English, German or Dutch.
  • Compensation is available.
  • Sessions will be held in January and February.
  • Sign up here if you are interested.
  • Please note that we probably won’t be able to have sessions with everyone who is interested. Our UX researcher will try to create a good balance of wiki contributors, e.g. in terms of wiki experience, tech experience, editing preferences, gender, disability and more. If you’re a fit, she will reach out to you to schedule an appointment.

We’re looking forward to seeing you, Thereza Mengs (WMDE)

I was a bit puzzled, but the German text[2] makes clear that this is about how editors use the same reference more than once in an article. This does not appear relevant to Wiktionary entries, but doesn't everybody use <ref name="..."> for this purpose, if needed?  --Lambiam 13:37, 10 January 2024 (UTC)[reply]
Yes, I use <ref name="..."> on Wiktionary entries. I learned this technique on English Wikipedia circa 2018-2019. This usually happens in my editing here on Wiktionary when there is an authoritative source relevant to both the etymology and pronunciation of a term. I used this technique with Amnok, Tuman, Tumen and Yalu for instance, and I know there are some other ones like these four. --Geographyinitiative (talk) 13:44, 10 January 2024 (UTC) (Modified)[reply]
I use <ref name="..."> too. It's the only method shown at Help:Footnotes#Multiple_citations_of_the_same_reference_or_footnote, and I don't recall seeing approaches other than this one. Voltaigne (talk) 16:06, 10 January 2024 (UTC)[reply]
While <ref name="..."> mostly works, one occasionally needs to make references to different pages of the same work, especially for grammatical details. We even have some templates that are set up for taking multiple pages, though that's rarely appropriate for inline references. One trick I've used in the other place is to use an inline notation for pages (basically, adding colon plus page number), using Wikipedia's {{rp}}, but that has the disadvantage of not being invented here. Another method is to have one table of page references and another of the referenced works themselves. --RichardW57m (talk) 17:15, 10 January 2024 (UTC)[reply]
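For readers unfamiliar with the technique discussed in this thread, here is a minimal sketch of a named reference as provided by MediaWiki's Cite extension. The reference name "smith2001", the citation text and the surrounding sentences are made up for illustration and do not come from any entry mentioned above.

```wikitext
===Etymology===
Borrowed from an older place name.<ref name="smith2001">Smith (2001), a hypothetical source used here only as an example.</ref>

===Pronunciation===
The stress falls on the first syllable.<ref name="smith2001" />

===References===
<references />
```

The first use defines the footnote and the second, self-closing tag reuses it, so the source appears only once in the list. As noted above, this does not by itself let the two uses point to different pages of the same work.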

Words formed by substitution: new template suggestion

In Blood and Crip slang it's common for words to be respelled to start with "B" and "C", respectively. Some examples are kick to bick, cool to bool, Compton to Bompton (currently redlinked, see [3]), as well as (in the other direction) bro to cro and brodie to crodie. The problem is how to explain this in an etymology. On crodie I originally used {{blend|en|Crip|brodie|t1=a member of the Crips gang}} which isn't correct, and doing {{blend|en|C|brodie}} doesn't feel right either since it doesn't make much sense to say a word is being blent with a single letter. I propose a template {{substitution}} or {{subs}} which would function like this:

{{subs|en|kick|B|alt1=(k)ick}}, producing "substitution of B into (k)ick"

Other words in English formed similarly are medireview and cdesign proponentsist. I'm curious if there exist similar cases in other languages. Ioaxxere (talk) 03:30, 11 January 2024 (UTC)[reply]

Is this only orthographic or is bick also pronounced with /b/?
Some more examples of either this or a related phenomenon are the (sometimes jocular, sometimes serious/academic) substitution of cis for trans, e.g. Transformers → Cisformers and translate → cislate (where I said in an HTML comment just a few hours ago that "blend" didn't seem right); also atmosphere → atmosflat; which are also pronounced differently. I am ambivalent about whether this needs to be templatized. - -sche (discuss) 04:53, 11 January 2024 (UTC)[reply]
Right, we have to talk about it. I wondered why you haven’t just created an entry for “prefix” b-, maybe not enough examples. Where is the line though?
It’s not orthographic, this derives from speech. Those gangbangers by default aren’t even as literate as we imagine well-behaved citizens to be, so I could hear lots of audio examples in my playlists in the background that I have not gotten around to quoting. no bap is as often said as no kizzy, but nibba is hardly the like situation, though playing upon the older GM use. Either can be argued separately, whether we should have them at all. Here we see the limits of attestation again: quoting from texts misses out on the legitimacy insofar as it depends upon speech. But I think the case is lost since we include bowdlerisations and starred sh*t. Fay Freak (talk) 12:11, 11 January 2024 (UTC)[reply]
How is this different from pig Latin and double Dutch? Is there a limit on which words can be so modified? As a taboo avoidance, it also reminds me of minced oaths and things like you see in the etymology of bear, where another word is substituted. There's also a phenomenon I've heard of with religious Jews where they change sounds/letters in divine names because they're too sacred to say or write, i.e. "Elokim" instead of "Elohim". Chuck Entz (talk) 14:46, 11 January 2024 (UTC)[reply]
Just now I’ve had the sudden realization that this “substitution” is a simulfix. I added that two and a half years ago to the modules after overviewing the affix types. This part of speech is hitherto exclusively used in the entry 🅱️. Fay Freak (talk) 15:14, 11 January 2024 (UTC)[reply]

Moravian

(Notifying Solvyn, Atitarev, Benwing2, Hergilei, Zhnka, Jan.Kamenicek): and perhaps @Thadh, @Sławobóg, and @Mahagaja as people potentially able to comment: should we consider splitting Moravian as an L2? Have there been discussions about this in the past? Vininn126 (talk) 21:20, 11 January 2024 (UTC)[reply]

@Vininn126 Yes. Sławobóg (talk) 21:24, 11 January 2024 (UTC)[reply]
Shows how good my memory is! We only had one commenter on Moravian (and I don't even necessarily agree with the comment @Bezimenen), so I think we should give it a little more attention and get input from Czech editors. Vininn126 (talk) 21:28, 11 January 2024 (UTC)[reply]
Political argument is bad argument. Sławobóg (talk) 21:51, 11 January 2024 (UTC)[reply]
I'm not making any arguments - I disagree with the idea that a billion can pop up. I have no idea how similar the two lects are and I'm trying to suss that out. Vininn126 (talk) 21:56, 11 January 2024 (UTC)[reply]
I'm referring to his "+ I don't want to give food for thought to Z-Russians". Sławobóg (talk) 22:17, 11 January 2024 (UTC)[reply]
Ah, sorry! Yes, I feel that's completely irrelevant. Vininn126 (talk) 22:19, 11 January 2024 (UTC)[reply]
In fact there is nothing like a unified Moravian language or dialect, there are just many different dialects in Moravia. They can be grouped into five basic groups, which differ from each other very significantly, see File:Moravian_dialects.png. --Jan Kameníček (talk) 22:21, 11 January 2024 (UTC)[reply]
Are they related at least typologically? How different are they from each other? Vininn126 (talk) 22:23, 11 January 2024 (UTC)[reply]
They still definitely have, and in the past had, Czech as Dachsprache, and it would never be wrong to add Moravian to Czech, like it rarely turns out wrong to add regionalisms of Arabic dialects under Arabic; here it is even more unintuitive to split and not necessary a priori. One consideration when separating the dialect was not to make up too much stuff that passer-by editors don’t expect: there are people who don’t read instruction manuals and get started with assembling their furniture or playing the board-game right away, though I am not one of them. Hence I at least don’t like a split. Fay Freak (talk) 12:24, 12 January 2024 (UTC)[reply]
It might still be worth it to at least give it an etym-only code. Vininn126 (talk) 12:59, 12 January 2024 (UTC)[reply]
@Vininn126 If as per User:Jan.Kamenicek there is no unified Moravian lect, then IMO it makes no sense to have a "Moravian" etym code, but it might make sense to have several etym codes, one per major dialect area. I don't know. (Although there may be no need for even this; for example, I am doing some work on Catalan now, and Catalan is essentially a pluricentric language with a standardized dialectal norm for Valencian plus various other dialects outside of the Central Catalan standard, e.g. Balearic, Algherese, Northern, Northwestern, but so far we've found no need for etym codes for these dialects; categorizing labels and accent qualifiers are enough.) But I agree with User:Fay Freak that an L2 split is unlikely to make sense given the situation. Benwing2 (talk) 17:19, 12 January 2024 (UTC)[reply]