
User talk:Citation bot

From Wikipedia, the free encyclopedia
(Redirected from User talk:Citation bot 4)

Note that the bot's maintainer and assistants (Thing 1 and Thing 2) can go weeks without logging in to Wikipedia. The code is open source and interested parties are invited to assist with the operation and extension of the bot. Before reporting a bug, please note: Addition of DUPLICATE_xxx= to citation templates by this bot is a feature. When there are two identical parameters in a citation template, the bot renames one to DUPLICATE_xxx=. The bot is pointing out the problem with the template. The solution is to choose one of the two parameters and remove the other one, or to convert it to an appropriate parameter. A 503 error means that the bot is overloaded and you should try again later – wait at least 15 minutes and then complain here.
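As a rough illustration of the DUPLICATE_xxx= behaviour described above, here is a minimal Python sketch. It is not the bot's actual code (the bot is a separate open-source project); the function name `rename_duplicates` is hypothetical, and the choice to rename the later occurrence rather than the earlier one is an assumption, since the note above only says that one of the two is renamed.

```python
def rename_duplicates(params):
    """Rename repeated parameter names to DUPLICATE_<name>.

    params is a list of (name, value) pairs in template order.
    Assumption: the first occurrence keeps its name and later ones are renamed.
    """
    seen = set()
    result = []
    for name, value in params:
        if name in seen:
            # Flag the repeat so a human can decide which value to keep.
            result.append((f"DUPLICATE_{name}", value))
        else:
            seen.add(name)
            result.append((name, value))
    return result


flagged = rename_duplicates([("title", "A"), ("year", "2020"), ("title", "B")])
print(flagged)
```

Neither value is discarded, which fits the stated intent: the bot points out the problem and leaves the choice to an editor.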

Submit a Bug Report

Or, for a faster response from the maintainers, submit a pull request with an appropriate code fix on GitHub, if you can write the needed code.


Feature requests

  • Implement support to expand from https://doi.org/10.1093/ww/9780199540884.013.U192476 to {{Who's Who}}
    Example: https://en.wikipedia.org/w/index.php?title=Friern_Hospital&diff=prev&oldid=1167644213
  • Implement support to convert cite web to {{BioRef}} and {{GBIF}}
  • Use https://www.crossref.org/blog/news-crossref-and-retraction-watch/
  • https://www.ncbi.nlm.nih.gov/books/NBK25497/ set NLM_APIKEY and NLM_EMAIL
  • journal/publisher that only differ by 'and' and '&' should be treated as identical https://en.wikipedia.org/w/index.php?title=Congenital_cartilaginous_rest_of_the_neck&diff=prev&oldid=1199200383
  • Free archive.org links such as curl -sH "Accept: application/json" "https://scholar.archive.org/search?q=doi:10.1080/14786449908621245" | jq -r .results[0].fulltext.access_url
  • Use GET instead of POST for better proxy caches when talking to databases when possible.
  • Start to convert Google Books URL to "new" format https://www.google.com/books/edition/_/m8W2AgAAQBAJ?gbpv=1&pg=PA379
  • When encountering a {{cite journal}} or {{citation}} with |journal=bioRxiv or |journal=bioRxiv: The Preprint Server for Biology [case insensitive], the bot should convert the citation to a proper {{cite bioRxiv}}, i.e.
    • {{cite journal |last1=Larivière |first1=Vincent |last2=Kiermer |first2=Véronique |last3=MacCallum |first3=Catriona J. |last4=McNutt |first4=Marcia |last5=Patterson |first5=Mark |last6=Pulverer |first6=Bernd |last7=Swaminathan |first7=Sowmya |last8=Taylor |first8=Stuart |last9=Curry |first9=Stephen |date=2016-07-05 |title=A simple proposal for the publication of journal citation distributions |journal=bioRxiv |page=062109 |url=http://biorxiv.org/lookup/doi/10.1101/062109 |language=en |doi=10.1101/062109 |hdl=1866/23301 |s2cid=64293941 |hdl-access=free}}
    • Larivière, Vincent; Kiermer, Véronique; MacCallum, Catriona J.; McNutt, Marcia; Patterson, Mark; Pulverer, Bernd; Swaminathan, Sowmya; Taylor, Stuart; Curry, Stephen (5 July 2016). "A simple proposal for the publication of journal citation distributions". bioRxiv: 062109. doi:10.1101/062109. hdl:1866/23301. S2CID 64293941.
    • The bot should keep |author/last/first/date/year/title/language=, convert |doi= to |biorxiv=, and throw the rest away.
    • {{cite bioRxiv |last1=Larivière |first1=Vincent |last2=Kiermer |first2=Véronique |last3=MacCallum |first3=Catriona J. |last4=McNutt |first4=Marcia |last5=Patterson |first5=Mark |last6=Pulverer |first6=Bernd |last7=Swaminathan |first7=Sowmya |last8=Taylor |first8=Stuart |last9=Curry |first9=Stephen |date=2016-07-05 |title=A simple proposal for the publication of journal citation distributions |language=en |biorxiv=10.1101/062109}}
    • Larivière, Vincent; Kiermer, Véronique; MacCallum, Catriona J.; McNutt, Marcia; Patterson, Mark; Pulverer, Bernd; Swaminathan, Sowmya; Taylor, Stuart; Curry, Stephen (5 July 2016). "A simple proposal for the publication of journal citation distributions". bioRxiv 10.1101/062109.
    • If it was from a {{citation}}, append |mode=cs2 to it.
    • To be extra safe, this should only be done when the DOI starts with 10.1101. Headbomb {t · c · p · b} 08:20, 20 June 2024 (UTC)[reply]
    • The same applies to |journal=medRxiv or |journal=medRxiv: The Preprint Server for Health Sciences [case insensitive], with conversion to {{cite medrxiv}}
  • If encountering a {{cite bioRxiv}} that is fully published, convert it to a {{cite journal}}
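The bioRxiv conversion requested in the list above can be sketched in Python as follows. This is only an illustration of the stated rules, not the bot's implementation: the function name `to_cite_biorxiv` and the dict-based template representation are hypothetical, and the check for the prefix `10.1101/` (with the slash) is a slightly stricter reading of "starts with 10.1101".

```python
import re

# Journal names that trigger the conversion (compared case-insensitively).
BIORXIV_JOURNALS = {"biorxiv", "biorxiv: the preprint server for biology"}

# Parameters to keep: author/last/first (optionally numbered), date, year, title, language.
KEEP = re.compile(r"(?:(?:author|last|first)\d*|date|year|title|language)")


def to_cite_biorxiv(template_name, params):
    """Return ("cite bioRxiv", new_params), or None if the rules do not apply.

    params maps parameter names to values, in template order.
    """
    name = template_name.strip().lower()
    if name not in ("cite journal", "citation"):
        return None
    journal = params.get("journal", "").strip().lower()
    doi = params.get("doi", "").strip()
    # Extra safety from the request: only act on bioRxiv's own DOI prefix.
    if journal not in BIORXIV_JOURNALS or not doi.startswith("10.1101/"):
        return None
    new_params = {k: v for k, v in params.items() if KEEP.fullmatch(k)}
    new_params["biorxiv"] = doi  # |doi= becomes |biorxiv=
    if name == "citation":
        new_params["mode"] = "cs2"  # preserve Citation Style 2 rendering
    return "cite bioRxiv", new_params
```

Everything not on the keep list (|page=, |url=, |hdl=, |s2cid= and so on) is dropped, as the request asks. The medRxiv case would follow the same pattern with its own journal names and target template.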

Changing every citation of a publisher's webpage to Cite book


I have remained silent on this issue even though it has irritated me for a while now. And now that there is discussion above about the widespread useless cosmetic edits this bot continues to waste everyone's time with, I'll raise it: Why must every citation of a publisher's webpage be changed to Cite book? I can only speak for myself, but every time I cite such book webpages I am not citing the book itself. I am specifically referencing the information published on the webpage. So of course I do not want the citation to be changed to Cite book with a bunch of parameters of the book itself (ISBN, date, etc) added. So I inevitably stop the bot or replace the reference with a third-party source. I realise the defense will be "It doesn't hurt" or that some users are actually citing the book. And I realise this is not the most pressing issue, but why must the bot come to its own conclusion of the editor's intent? I see another user complained of this issue last year. Οἶδα (talk) 22:25, 27 September 2023 (UTC)[reply]

This may be the kind of situation where it's safest to explicitly tell citation bot not to muck with the citation. It's hard to automatically judge whether the human editor actually wanted "cite web" or "cite book". (There are many examples of people using "cite web" to cite resources that should actually be books, journal articles, etc.) –jacobolus (t) 01:38, 28 September 2023 (UTC)[reply]
I understand. But it still feels like another unnecessary task for this bot to insert itself into every article it can possibly find. For example, this edit is completely useless and actually corrupts my intention of the citation. Call me crazy but I don't want or need a bot telling me what I am citing (and actively altering my citations accordingly). Οἶδα (talk) 21:32, 13 October 2023 (UTC)[reply]
When I've quoted publisher blurbs in the past, I usually set |type=publisher's blurb for clarity. In the specific case you've linked just above, another option would be not to cite the publisher's landing page at all, and add the book to a "Selected works" subsection or something. Indeed, the altered citation is sequential to another one, and so seems a bit superfluous. Or, alternatively, use "Citation bot bypass" somewhere in your citation as suggested by jacobolus above.
Given the overall lazy referencing culture of less experienced editors, it's likely that in the majority of cases, people who drop a link to a publisher landing page are probably trying to cite the book itself, so this behaviour of assuming that's the case is net beneficial. Folly Mox (talk) 22:13, 13 October 2023 (UTC)[reply]
I cannot personally maintain that the majority of users citing a publisher's webpage are lazily intending to cite the book itself. My experience suggests otherwise which is why I have taken issue, but I realise my editing purview might be skewed. However, if that is observably true then I will resign to accepting this as a forgivable externality. Οἶδα (talk) 06:35, 14 October 2023 (UTC)[reply]
In fairness to your point, I haven't looked into the data about how frequently this sort of change is appropriate; it could be the case that my own perspective is the skewed one. Folly Mox (talk) 08:32, 14 October 2023 (UTC)[reply]
I couldn't find a list of tasks that the bot has been approved for (other than the very first approval) nor a thorough description of all of its mystical activities. I was surprised to find it would change "Cite web" to "Cite book" (for unclear reasons). The only cure, if the bot is unchanged, seems to be the <!-- Citation bot bypass--> mechanism documented at User:Citation_bot#Stopping_the_bot_from_editing - R. S. Shaw (talk) 04:12, 6 December 2023 (UTC)[reply]

Why is Citation Bot removing a page # from a cite's URL


On Charles Clinton, Citation bot removes "?seq=9" from this URL. That bit of code gives the Page # within the larger cite, so why does Citation bot remove it? It makes sense to me to leave that bit of code in there but the bot doesn't seem to think so. It's removed it twice, once here and once here, so maybe I'm wrong... Would appreciate some clarification. Thanks, Shearonink (talk) 03:36, 4 March 2024 (UTC)[reply]

Also, if by some chance I am correct, is there any way to stop people from running the Bot needlessly on this supposed issue? Thanks, Shearonink (talk) 03:37, 4 March 2024 (UTC)[reply]
The landing page is the same in either case. Headbomb {t · c · p · b} 04:09, 4 March 2024 (UTC)[reply]
It isn't the same for me... The one without the ?seq lands me on the main page, the URL with the ?seq=9" lands me on the exact page with the quoted text... Shearonink (talk) 04:55, 4 March 2024 (UTC)[reply]
Yes, I also see the preview page showing page 9 of 17 with the ?seq parameter. —David Eppstein (talk) 08:06, 4 March 2024 (UTC)[reply]
Oh good David Eppstein it isn't just me... The ?seq code might be taking us & other registered editors to the exact page because we have a JSTOR account through the WP Library I guess... But even if people don't have a JSTOR account the *code* should be left there, otherwise the URL seems useless. I like to give readers the option of going down the rabbithole of verifiability if they want to. Why is WP giving readers an URL that is to the entire book or article as the Citation bot default when the bot is run on the article? Shearonink (talk) 15:49, 4 March 2024 (UTC)[reply]
Actually currently JSTOR thinks it is providing me access through UT Dallas, I guess because I was there for a conference last summer. But yes, this should be left in place, like the pg= parameter of Google Books links, for the same reason. —David Eppstein (talk) 17:27, 4 March 2024 (UTC)[reply]
@Headbomb 2409:4070:4381:EF12:0:0:1BF5:A5 (talk) 12:34, 29 April 2024 (UTC)[reply]
@Headbomb. I just undid the edit the bot had done at @Jay8g's request. RememberOrwell (talk) 06:04, 18 September 2024 (UTC)[reply]
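One way to get the behaviour asked for in this thread is to strip only query parameters known not to affect the landing page, leaving page-locating ones such as ?seq= (JSTOR) and pg= (Google Books) untouched. The Python sketch below shows that approach under stated assumptions: `strip_tracking` and the contents of `TRACKING_PARAMS` are illustrative choices, not the bot's actual rules, and the JSTOR URL in the usage line is a made-up placeholder.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: an illustrative deny-list of parameters that do not change what the reader sees.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "refreqid"}


def strip_tracking(url):
    """Drop deny-listed query parameters and keep every other parameter in order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


# Placeholder URL: the seq parameter survives, the tracking parameter does not.
print(strip_tracking("https://www.jstor.org/stable/0000000?refreqid=abc&seq=9"))
```

A deny-list errs on the side of keeping anything unrecognised, which suits the concern raised here: an unknown parameter might be the one that takes the reader to the quoted page.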

CITEVAR and manually formatted references


I asked this in the discussion of an earlier bug but it was archived without providing an answer. Can you please explain

  1. How is it not a violation of WP:CITEVAR for Citation bot to convert manually-formatted references into templates, as it is doing e.g. at Special:Diff/1216926071? A human might do this but a bot automatically doing it is completely something else, especially in cases such as here where it does not even improve the consistency of formatting (the article is still a mix of CS1, CS2, and manually-formatted references).
  2. For those of us who might deliberately format references manually because we don't want bots messing with our citations, or we made a deliberate decision that the citation templates were inadequate for some specific citation, do we now have to start explicitly locking the bots out of articles altogether?
  3. Where is this included in the BAG-approved tasks for this bot?
  4. I find the bot's edit summary "Changed bare reference" to be significantly misleading. This is not a bare-url reference. It is a well-formatted reference that happens to be manually formatted. Where is there any guideline or policy suggesting that such references are a problem that needs to be fixed?

David Eppstein (talk) 20:29, 2 April 2024 (UTC)[reply]

There's already citation templates on that page. No CITEVAR violation happened. Headbomb {t · c · p · b} 23:09, 2 April 2024 (UTC)[reply]
I mix manually formatted citations and template-formatted citations on pages all the time, deliberately. I would be extremely annoyed if a bot took it upon itself to change that deliberate decision. —David Eppstein (talk) 23:17, 2 April 2024 (UTC)[reply]
It should, however, preserve the editors. Headbomb {t · c · p · b} 23:11, 2 April 2024 (UTC)[reply]

STILL creating new CS1 errors


Changing an incorrect cite journal to cite book [1]: Good (although would have been better as cite conference).

Creating a new CS1 error where there was none before, because it left the paper title in the book title parameter and did not change the journal parameter to a book title parameter: doubleplusungood.

Stop it.

Posting as a message rather than a new bug because this is not a new bug. It is an old bug that has been ignored far too long by the developers (see #Causing template errors, above). It needs to be fixed. —David Eppstein (talk) 23:07, 20 June 2024 (UTC)[reply]

It's not creating error, it's flagging errors that were already there, but not reported. |journal=FM 2014: Formal Methods was wrong before. That the bot didn't manage to fix it doesn't make it a new error. Now the error is reported. This is an improvement, even though ideally the bot would be able to figure out and fix the error itself. Headbomb {t · c · p · b} 23:11, 20 June 2024 (UTC)[reply]
INCORRECT. It is creating an error, because formerly readers could see the paper title, see the book title (called a journal, but still formatted in italics the way readers would expect a book title to look), and see that it was a paper in a book with that title. After the edit, readers were presented only with the paper title, formatted as a book title, falsely telling them both in visible appearance and reference metadata that the reference was to an entire book-length work. It is not merely that it is creating CS1 errors, although that is bad enough. It is also making the reference less accurate in both its metadata and in its visible appearance. —David Eppstein (talk) 23:20, 20 June 2024 (UTC)[reply]
I've gotten really exhausted with this category of error introduced by Citation bot, which I encounter every day I edit. I used to creep its contributions and clean up after it, but I've started just reverting its edits that cause this kind of template error, regardless of any value added, and only sometimes actually fix up the citations myself. Few of the editors who call Citation bot on large sets of pages ever check in after it to see if it's causing errors, so typically no one notices my reverts.
I saw a few weeks back that for one subset of conferences (IEEE maybe? or SPIE?) Citation bot has successfully been changing {{cite journal}} to {{cite book}} without introducing errors and growing the backlogs. So there has been a partial fix, but it's pretty frustrating that this known error has been perpetuated in thousands of edits spanning months.
Citation bot does not have an approved BRFA task to change citation template types, and changing to {{cite book}} has been the one that's particularly fraught and error-prone ever since support for the aliases of |periodical= was dropped from {{cite book}} a year ago. The easiest thing would be if support were readded, but that seems highly unlikely. I do think that eventually, if this bug isn't fixed, I'll end up asking BAG to ban Citation bot changing template type to {{cite book}}. Disabling the functionality would be an improvement over the current situation. Folly Mox (talk) 00:02, 21 June 2024 (UTC)[reply]

Still ongoing failure to remove journal= from conversions to cite book, creating new CS1 errors and wasted time for human editors: Special:Diff/1245112056. —David Eppstein (talk) 06:53, 18 September 2024 (UTC)[reply]
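The fix being requested in this thread amounts to remapping two parameters whenever a paper in a proceedings volume is moved from cite journal to cite book: the paper title becomes |chapter= and the old |journal= value becomes the book |title=. A minimal Python sketch follows, with the hypothetical helper name `journal_to_book` and a dict standing in for the template; it illustrates the remapping only and makes no claim about how the bot is structured.

```python
def journal_to_book(params):
    """Remap cite journal parameters for a paper published in a book.

    Returns the remapped dict, or None when there is no |journal= to use as
    the book title, in which case the template is better left unconverted.
    """
    if "journal" not in params or "title" not in params:
        return None
    remapped = {}
    for key, value in params.items():
        if key == "title":
            remapped["chapter"] = value  # the paper is a chapter of the book
        elif key == "journal":
            remapped["title"] = value  # the proceedings name is the book title
        else:
            remapped[key] = value
    return remapped


# "Example paper title" is a placeholder; the journal value is the one quoted above.
print(journal_to_book({"title": "Example paper title", "journal": "FM 2014: Formal Methods"}))
```

With this remapping no |journal= is left behind in the cite book, so no new CS1 error appears, and the paper title is no longer presented as a book-length work.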

Mathematical Reviews is not a book

Status
new bug
Reported by
David Eppstein (talk) 06:48, 3 July 2024 (UTC)[reply]
What happens
Converts correct cite journal, describing a book review published in the journal Mathematical Reviews (which more or less coincides with the modern MathSciNet online database, but genuinely used to be a journal) into cite book, replacing the given title for the review with the title of the book, after a previous pass of a bot (probably the same bot) helpfully and incorrectly added the book doi to the review references. In the process a CS1 error is generated because the citations to the wrong reference of the wrong type still have a leftover journal parameter.

As I keep saying, this is the type of damage that can be predicted to happen when bots run over the same reference over and over and over and over, probably making improvements on the first pass or two but also introducing minor mistakes that they then amplify into major mistakes until eventually the reference is totally garbled. The whole process of repeatedly polishing citations so many times needs to be rethought. Get it right the first time and then stop.
What should happen
Not that.
We can't proceed until
Feedback from maintainers


This is likely about [2] where a reference to MR is confused with a reference to the work reviewed by MR. Headbomb {t · c · p · b} 20:12, 3 July 2024 (UTC)[reply]

The majority of that reference is to the book itself (DOI, ISBN, volume, etc) and not the MR. AManWithNoPlan (talk) 00:59, 4 July 2024 (UTC)[reply]
You are completely missing the point.
After multiple passes of citation-cleaning bots including Citation bot and OAbot, what was originally a reference purely to a review in Mathematical Reviews gradually became more and more borked, in the process resembling a reference to the reviewed work. The most recent pass of Citation bot took a reference that, by then, resembled a citation to a book and made it look more like a citation to the book. But that was only the latest step of this borkage. Sometime longer ago a bot planted a turd in the citation and then the bots kept on polishing it, making it shinier and shinier but not any less smelly.
The problem here is not the individual edit. The problem is that when bots repeatedly replace and replace and replace bits of citations, without intelligence or oversight, they have a tendency to amplify their earlier mistakes. All it takes is a month or two of a bug where bad dois or bad hdls get added to citations (and we've seen such bugs, not just in this bot) and then later iterations take that as gospel and keep massaging the citation to more closely resemble that bad piece of the reference. One or two passes of Citation bot is usually an improvement. After that, further passes are as likely to break things and make more work for human editors as they are to make anything better.
We need some sort of cone of shame that can stop the bots from continuing to worry the same sore spots over and over, without keeping them away from new citations in need of bot cleanup. —David Eppstein (talk) 19:19, 5 July 2024 (UTC)[reply]
The reciprocal operation seems more common in my experience: DOIs to book reviews where the citation points to the reviewed book. Perhaps the least fun is where the same content is published originally in a journal and later as a book chapter, and the citation scripts pick the opposite publication to the original editor, resulting in wholly mixy-match metadata that can take twenty or thirty minutes to untangle.
Whenever I find myself fixing citations that Citation bot has mucked up in this way (which can often as not be blamed on Crossref), I'll drop a hidden html comment so it ignores the citation in the future, but it would be nice not to have to do that every time. However, bots sprinkling |script-embargo-date= or suchlike all over doesn't feel like a super premium solution either. Folly Mox (talk) 16:40, 7 July 2024 (UTC)[reply]
There is no such parameter as |script-embargo-date=. What did you really mean?
Trappist the monk (talk) 16:46, 7 July 2024 (UTC)[reply]
Sorry. I was workshopping ideas of how to slow down or arrest the process of citation scripts "repeatedly replace and replace and replace bits of citations", and what it might look like to implement "some sort of cone of shame that can stop the bots from continuing to worry the same sore spots over and over, without keeping them away from new citations in need of bot cleanup".
I think I skipped a step where I typed out the immediately rejected ideas of scripts keeping track of which citations they had previously edited (too resource intensive), or checking revision histories for their own activity (ditto). Then I leapt straight into rejecting the third idea, where bots drop themselves and each other little reminder notes using an invented parameter for the purpose.
Unlike a few other problems that get mention on this talkpage, I don't have any clear idea how to prevent the sort of error described in this bug report. I forgot to type out some of my unclear bad ideas, probably due to being in an IRL conversation during the edit. Folly Mox (talk) 17:13, 7 July 2024 (UTC)[reply]

can we please not "Upgrade ISBN10 to 13"?


As far as I can tell this has no practical advantage at all, and only serves to make the opaque identifier take up more space at readers' expense. –jacobolus (t) 23:48, 15 July 2024 (UTC)[reply]

Searching for the 13-digit ISBN leads to more Google hits, oddly. AManWithNoPlan (talk) 00:47, 22 August 2024 (UTC)[reply]
I don't understand your reply. Can you clarify? I agree with jacobolus. RememberOrwell (talk) 06:06, 18 September 2024 (UTC)[reply]
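For context on what the "upgrade" involves: an ISBN-13 is derived mechanically from an ISBN-10 by prefixing 978, dropping the old check digit, and computing a new one with alternating weights of 1 and 3. The Python sketch below shows that standard calculation; the function name `isbn10_to_isbn13` is hypothetical, and the output is unhyphenated because correct hyphenation needs registration-group tables that are out of scope here.

```python
def isbn10_to_isbn13(isbn10):
    """Convert an ISBN-10 string (hyphens or spaces allowed) to an unhyphenated ISBN-13."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if len(digits) != 10:
        raise ValueError("an ISBN-10 has exactly 10 characters")
    core = "978" + digits[:9]  # drop the old check digit, which may be "X"
    if not core.isdigit():
        raise ValueError("the first nine characters must be digits")
    # ISBN-13 check digit: weights alternate 1, 3, 1, 3, ... over the first 12 digits.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)


print(isbn10_to_isbn13("0-306-40615-2"))  # → 9780306406157
```

The conversion adds no information that was not already in the ISBN-10, which is consistent with the point made above that the change mainly affects length and searchability.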

Can we please not add bibcode when it contains no useful information?


Citation Bot has recently been adding more bibcodes to various citations, but nearly every time I click through the bibcode turns out to contain zero new information. That is, the bibcode has some metadata already included in the Wikipedia citation plus an abstract already included at the publisher's website linked from a DOI, and nothing else whatsoever. Adding these bibcodes to citations seems like a waste of space which is at best useless, or at worst wastes readers' time. Sometimes bibcode links contain full text or some other useful information, so I wouldn't say bibcode should never be added, but it seems very unhelpful to add it just because it happens to exist. –jacobolus (t) 04:24, 27 July 2024 (UTC)[reply]

Bibcodes always contain useful information. Like every other identifier, if you don't like them, ignore them. That doesn't make them useless to others who know how to use them. No different than PMIDs in medicine. Headbomb {t · c · p · b} 04:25, 27 July 2024 (UTC)[reply]
Which useful information is it that they contain, exactly? How does such a knowing person "use" them? PMIDs are also often useless, I agree. Adding an extra half-dozen opaque identifiers which all point to the same identical information does a disservice to readers and is harmful to the project overall, because it makes the citations harder to read and forces readers to carefully sift through chaff to find the links they are looking for. Anyone who cares about these identifiers for their own sake, for whatever reason, can find them absolutely trivially. –jacobolus (t) 19:31, 27 July 2024 (UTC)[reply]
Perhaps it would be useful to add an invisible-by-default view for all of these extra identifiers which could be revealed in CSS to the trivial number of "others who know how to use them" without needing to shove a bunch of line noise in everyone else's face. –jacobolus (t) 19:33, 27 July 2024 (UTC)[reply]
Perhaps there should be some way of distinguishing bibcodes or other ids that provide useful information (like full article text) from the ones that merely point to other ids, so that the useful ones can be shown and the useless ones can be hidden.
But this may be reader-dependent. For instance MathSciNet codes are useful to people with subscription access to MathSciNet (who are shown reviews of the works) but useless to non-subscribers (who get a landing page with a bare citation). In such cases I don't think Wikipedia is capable of determining which readers can make use of the id. —David Eppstein (talk) 19:38, 27 July 2024 (UTC)[reply]
The bibcodes I'm talking about are opaque IDs pointing at a web page which includes: author, title, journal name/issue, date, page numbers, DOI (all included already in the wikipedia citation), plus an abstract (included on the DOI page), but no other information at all. I don't see any benefit to anyone in clicking through to such a page, unless someone's goal is to find the bibcode itself for some (obscure, niche, irrelevant to wikipedia) purpose.
As a concrete example, here is a Wikipedia citation after Citation bot added a bibcode:
Vincenty, Thaddeus (1 April 1975). "Direct and Inverse Solutions of Geodesics on the Ellipsoid with Application of Nested Equations" (PDF). Survey Review. 23 (176). Kingston Road, Tolworth, Surrey: Directorate of Overseas Surveys: 88–93. Bibcode:1975SurRv..23...88V. doi:10.1179/sre.1975.23.176.88. Retrieved 21 July 2008.
If it were up to me, this should instead be:
Vincenty, Thaddeus (1975). "Direct and Inverse Solutions of Geodesics on the Ellipsoid with Application of Nested Equations" (PDF). Survey Review. 23 (176): 88–93. doi:10.1179/sre.1975.23.176.88.
The publisher and their location are not essential or even useful information to include in journal citations like this when we can include them in a wiki page about the journal (though frankly even a wikilink to Survey Review has only marginal value here), but the bibcode especially is pointless, because when we click through we find the following info on the bibcode page:
Where the latter link just points to the same place as doi:10.1179/sre.1975.23.176.88.
There is literally no new useful information at the bibcode link.
Remember, from WP:NOT, Wikipedia is an encyclopedia, not an indiscriminate collection of information. The purpose of citations is to help readers locate a source for particular claims being made in articles, and that's it. Any information beyond that should be carefully considered and balanced against the significant cost imposed on readers who don't care when we add extra links and opaque identifiers.
I often feel like the main project of Citation bot and some of its friends and supporters is to turn the bottom of every Wikipedia page into a comprehensive bibliographic cross-reference of citation index identifiers. But in my opinion this is not what Wikipedia is for, and they really have no community mandate to impose this vision across the site. –jacobolus (t) 19:56, 27 July 2024 (UTC)[reply]
Agree that Bibcode feels like the worst offender of the unnecessary stable identifiers, mostly due to aesthetics: s2cid is equally useless (unless we count doing Semantic Scholar's work for them) but at least they're not an almost intelligible word followed by a mishmash of letters, numbers, and dots.
I'm sure not all of that awful example citation is Citation bot's fault: it doesn't typically add street addresses or access dates. It would be nice if we could have some sort of discussion somewhere about what is and isn't desirable for citation scripts to add to references, although I doubt anyone who isn't already active on this talkpage would care.
And to answer your edit summary, no, he never checks on the results of his bot runs, and calls Citation bot so profusely that whenever I type "Abductive", my text prediction suggests "who never checks their work" from all the edit summaries I've left cleaning up after him. Folly Mox (talk) 20:25, 27 July 2024 (UTC)[reply]
I think Citation bot backed off on s2cids, or at least I haven't found as many being added recently. (If so, thanks for the change!) –jacobolus (t) 20:45, 27 July 2024 (UTC)[reply]
The Bibcode link does provide information about what the paper cites and what has cited it. Whether or not the DOI resolves to a page that also provides such information depends upon the journal and publisher (that stuff is paywalled by default on Physical Review websites, for example). XOR'easter (talk) 23:50, 27 July 2024 (UTC)[reply]
The list of citing papers at a bibcode link is extremely incomplete though. For example, for this particular paper the bibcode page lists 110 results whereas the publisher's page lists 750 results, Semantic Scholar lists 1219 results, Google Scholar lists 1742 results, and I'm sure there are other citation indices including this paper in their graph. I don't think a list of citations alone is enough to justify the space it takes to link any of these citation index pages (beyond the publisher page or sometimes a third-party page including a preprint or similar). Anyone who wants to hop around the literature graph starting from this paper, as part of their research process, is capable of going to their preferred citation index and typing in the title or other basic metadata to find this paper. –jacobolus (t) 00:42, 28 July 2024 (UTC)[reply]
It's not comprehensive, for sure; in my experience, the thoroughness varies by field. The only point I wanted to make is that it provides more than absolutely nothing. (Also, I kind of like bibcodes just because they are alphanumeric-punctuation mishmashes. They give a bibliography a 3l33t h4x0r feel.) XOR'easter (talk) 19:09, 28 July 2024 (UTC)[reply]
As XOR'easter says, "but no other information at all" is wrong.
It also contains how many papers cite it, and how this varies over the year (e.g. [3].)
So far this is no different than including PMIDs.
But additionally, bibcodes will also often contain/host papers itself (e.g. Bibcode:1995ApJS..100..473K), and point to preprints, and related papers (for example, Bibcode:2007A&A...470..685L is the 2nd paper in a series of 3).
Again, that you don't personally like Bibcode or find it useful is not a reason to deprive the reader of easy access to this resource. Headbomb {t · c · p · b} 00:50, 28 July 2024 (UTC)[reply]
"how many papers cite it" – or to be precise, a very significant undercount by more than an order of magnitude of how many papers cite it. If we just wanted that we should link Google scholar, but I don't think this information justifies any citation index link; it's not relevant to locating the paper, which is the primary purpose of Wikipedia citations.
"bibcodes will also often ..." – this is not sufficient justification to include every possible bibcode. It only offers a supporting reason to occasionally add a bibcode when it hosts a paper not available from the publisher or some other source which has the right to host it. If the bot cannot determine these cases programmatically, then it should leave it to humans to decide them.
"you don't personally" – It's not about what I personally like, it's about what is worth spending very valuable Wikipedia readers' attention on. There is certainly no site-wide consensus about adding this type of metadata at every possible opportunity, so what you are really arguing for is that bot authors should get to unilaterally make sweeping controversial decisions to match their own preference; I think that approach runs counter to the spirit of the Wikipedia project. In my opinion, every bit of metadata, especially anything added by bots, has to have some strong and clear benefit to justify the space it takes up, and just "it exists and some people sometimes like it" is not good enough reason to mass spam these site-wide. –jacobolus (t) 04:05, 28 July 2024 (UTC)[reply]

Adds cs1-formatted reference to article whose references are entirely in cs2

Status
new bug
Reported by
David Eppstein (talk) 21:20, 28 July 2024 (UTC)[reply]
What happens
In this edit the bot turned a bare-url reference, in an article all of whose many templated references were in Citation Style 2 (some using cite templates with mode=cs2), into a cite web template in Citation Style 1
What should happen
Not that. There is no reason to use cite web when the citation template works ok. In this case it could have been cite report if the bot were more intelligent, but that's above and beyond the bug in question
We can't proceed until
Feedback from maintainers


It should be enough to do a pass for new {{cite xxx}} being added in the edit if every other cite was {{citation}} (or {{cite xxx|mode=cs2}}). The exception should be that {{cite arxiv}}, {{cite bioRxiv}}, {{cite citeseerx}}, {{cite medrxiv}}, and {{cite ssrn}} all have |mode=cs2 added to them instead of being converted to {{citation}}.
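
A rough sketch of such a pass, written in Python rather than in the bot's own code. The function names and the regular expression are hypothetical, and nested templates are deliberately ignored; only the list of exempt templates comes from the comment above.

```python
import re

# Templates with no {{citation}} equivalent: these get |mode=cs2 instead
# (list taken from the comment above).
KEEP_AS_CITE = {"arxiv", "biorxiv", "citeseerx", "medrxiv", "ssrn"}

# Matches a simple, non-nested {{cite xxx |...}} template.
CITE_RE = re.compile(
    r"\{\{\s*cite\s+([^|{}]+?)(\s*\|[^{}]*)?\s*\}\}", re.IGNORECASE
)


def article_uses_cs2(existing):
    """True if every existing template is {{citation}} or carries mode=cs2."""
    def is_cs2(template):
        return bool(
            re.match(r"\{\{\s*citation\b", template, re.IGNORECASE)
            or re.search(r"\|\s*mode\s*=\s*cs2\b", template, re.IGNORECASE)
        )
    return bool(existing) and all(is_cs2(t) for t in existing)


def to_cs2(new_template):
    """Convert one newly added {{cite xxx}} template to Citation Style 2."""
    match = CITE_RE.fullmatch(new_template.strip())
    if match is None:
        return new_template  # not a simple {{cite xxx}}; leave untouched
    kind, params = match.group(1), match.group(2) or ""
    if re.search(r"\|\s*mode\s*=", params):
        return new_template  # an explicit mode is already set
    if kind.lower() in KEEP_AS_CITE:
        return "{{cite " + kind + params.rstrip() + " |mode=cs2}}"
    return "{{citation" + params + "}}"
```

The conversion would only be applied to templates added in the current edit, and only when `article_uses_cs2` holds for the templates that were already on the page.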

Undoes DELIBERATE formatting of conference-proceedings-in-journal-special issue as cite journal, violating CITEVAR and reintroducing previously-fixed CS1 errors

Status
new bug
Reported by
David Eppstein (talk) 01:32, 14 August 2024 (UTC)[reply]
What happens
In Kenneth E. Iverson, one of the references is to a paper in the "Conference proceedings on APL as a tool of thought - APL '89", published in a special issue of APL Quote Quad, a periodical. It had a cite conference format but a journal= parameter, a CS1 error, which I fixed in Special:Diff/1238853961. It is not possible to simultaneously format it as a book and a periodical; I deliberately chose one, using cite journal to get most of the metadata correct and using department= to provide the remaining metadata, that this journal issue is a conference proceedings and the name of the proceedings. In the very next edit Special:Diff/1239895881, Citation bot under the control of User:Headbomb edit-warred to restore the citation to its unfixed CS1 error state as a cite book with erroneous journal parameter, but now with an extra erroneous department parameter. Incidentally, all of the major computer graphics conferences and many database conferences now publish their conference proceedings as special issues of journals (and have done so for years), and there are many older programming language conference proceedings published in ACM SIGPLAN Notices. This is something we must handle properly, not a weird one-off situation that we can handle by marking it as special. The situation that the citation templates do not make it easy or convenient to cite such things should not be exacerbated by the citation bot not understanding these things and lobotomizing the citations to fit its poor understanding.
What should happen
Not that
We can't proceed until
Feedback from maintainers


David Eppstein, {{Cite conference}} supports both book and journal parameters. That's what I use to cite conference proceedings published as special journal issues. Not sure why Citation bot doesn't. Would certainly be a quicker fix than publisher by publisher. Also, {{cite journal}} supports |isbn=, so I'm not sure why conference proceedings keep getting mistranslated into {{cite book}}s. Folly Mox (talk) 01:40, 14 August 2024 (UTC)[reply]
It's interesting that cite conference allows both conference= and journal=, and that would also be an acceptable way of formatting the citation, but it doesn't format the citation as a publication in a periodical the way cite journal does. (It spells out the volume and issue instead of using the abbreviated format of cite journal.) —David Eppstein (talk) 01:46, 14 August 2024 (UTC)[reply]
It would be nice if {{citation}} supported something like a "conference" parameter. –jacobolus (t) 07:58, 21 August 2024 (UTC)[reply]

First, if you do something deliberately weird, then the 'solution', so to speak, is to follow User:Citation bot#Stopping the bot from editing (bullet #2), not block the bot on the entire article. Second, the issue here is that you're trying to have two citations in one. The first is from the doi:10.1145/75145.75170:

  • Hagamen, W.; Berry, P. C.; Iverson, K. E.; Weber, J. C. (1989). "Processing natural language syntactic and semantic mechanisms". ACM SIGAPL APL Quote Quad. 19 (4): 184–189. doi:10.1145/75145.75170.

The second is from the ISBN 0897913272/doi:10.1145/75144.75170 (note 10.1145/75144.75170 vs 10.1145/75145.75170):

  • Hagamen, W.; Berry, P. C.; Iverson, K. E.; Weber, J. C. (1989). "Processing natural language syntactic and semantic mechanisms". In Kertész, Ádám; Shaw, Lynne C. (eds.). APL '89 Conference Proceedings: APL as a tool of thought; New York City, August 7–10, 1989. New York, NY: ACM. pp. 184–189. ISBN 0897913272.

Headbomb {t · c · p · b} 03:16, 14 August 2024 (UTC)[reply]

Perhaps you failed to read my message. Perhaps you are unaware how annoying it is to put time and effort into cleaning up problems only to have some editor-with-bot fuck up the article in exactly the same way again. But regardless, you are incorrect. They are not two publications. They are a single publication, of a paper in a conference proceedings in a journal. Conference proceedings get published in journals, all the time. Get over it and stop making work for others when you don't understand things. That goes for the bot, too. —David Eppstein (talk) 04:50, 14 August 2024 (UTC)[reply]
Clearly these have been published both in a book (doi:10.1145/75144.75170) and in APL Quote Quad (doi:10.1145/75145.75170) and the core of the issue is that you've mixed the journal DOI with the book ISBN. These two should not be present in the same citation. That you insist that they are is the root cause of your issues.
I'll flip things back on you, because you clearly are unaware of how annoying it is to be accused of bad-faith behaviour and get reflexively reverted [4] without understanding what it is you reverted. Headbomb {t · c · p · b} 05:26, 14 August 2024 (UTC)[reply]
Ok, now this is rising to the level of WP:IDIDNTHEARTHAT. The original publication of the book was as an issue of APL Quote Quad. Perhaps you have never subscribed to an ACM SIG newsletter and sometimes received surprise conference proceedings in your mailbox when the conference published its proceedings as an issue of the newsletter. I have. It used to be a standard way to publish the proceedings of minor ACM conferences (the major ones got a separately published proceedings volume). Maybe they separated them later and decided to give them separate dois. Did not the fact that they had identical page numbers give you any second thoughts? Who would reprint a whole conference proceedings as a second, separate publication, and why? —David Eppstein (talk) 05:40, 14 August 2024 (UTC)[reply]
They have identical publication dates (1 July 1989) as well as page numbers, and both PDFs are marked with "APL QUOTE QUAD" in the page footers. XOR'easter (talk) 23:01, 14 August 2024 (UTC)[reply]
Canceling deliberate bot exclusions, as in special:diff/1240206999, without consensus/discussion, seems way, way out of line. –jacobolus (t) 07:56, 21 August 2024 (UTC)[reply]

Don't replace |title= with |chapter= when not adding a new title

Status
new bug
Reported by
:Jay8g [VTE] 06:27, 5 September 2024 (UTC)[reply]
What happens
[5]
What should happen
The title should be kept as is to avoid creating a CS1 error
We can't proceed until
Feedback from maintainers


Although I agree that the bot's edit was bad, maybe the bot was confused by doi:10.5040/9781472597540.0007 which looks like it should go to chapter 7 within the book (whatever title that chapter might have)? doi:10.5040/9781472597540 appears to refer to the entire book. — Preceding unsigned comment added by David Eppstein (talk • contribs)
Probably caused by the chapter and booktitle being the same. Headbomb {t · c · p · b} 17:35, 19 September 2024 (UTC)[reply]

Partial enumeration (maybe not really a bug)

Status
new bug or not-bug
Reported by
Folly Mox (talk) 18:25, 8 September 2024 (UTC)[reply]
What happens
When Citation bot alters |first=|last= to |first1=|last1= in the presence of a second (or more) author, it leaves |author-link= rather than |author1-link=.
None of these combinations actually confuse the aliasing in Module:CS1 – the correction of a non-error generates a different non-error – but for consistency I'm wondering why change some of the parameters but not all? (Haven't verified whether ∅ → 1 leaves |author-mask= or |author1-mask=, but it might show the same behaviour.)
Relevant diffs/links
Special:Diff/1244047255
We can't proceed until
Feedback from maintainers
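
For illustration, a minimal sketch of a rename that treats the first author's parameters as a group. It is Python rather than the bot's code, the function name and the parameter mapping are assumptions made for the example, and it operates on an already-parsed parameter dictionary.

```python
import re

# Unnumbered author parameters and their enumerated spellings. Module:CS1
# accepts both; the point is only to rename them together.
UNNUMBERED_TO_FIRST = {
    "first": "first1",
    "last": "last1",
    "author-link": "author1-link",
    "author-mask": "author1-mask",
}


def enumerate_first_author(params):
    """Rename unnumbered author parameters when a second author is present."""
    has_second = any(re.fullmatch(r"(first|last|author)2", k) for k in params)
    if not has_second:
        return dict(params)
    renamed = {}
    for name, value in params.items():
        new_name = UNNUMBERED_TO_FIRST.get(name, name)
        # Never clobber a parameter that already exists under the new name.
        if new_name != name and new_name in params:
            new_name = name
        renamed[new_name] = value
    return renamed
```

With a single mapping table, |author-link= and |author-mask= follow |first= and |last= automatically instead of being left behind.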


Probably an edge case that's not worth fixing

Status
new bug
Reported by
Ed [talk] [OMT] 06:46, 10 September 2024 (UTC)[reply]
What should happen
Good question!
Relevant diffs/links
[6]
We can't proceed until
Feedback from maintainers


This citation references an online-only supplement that is not in the journal and therefore not in the article's page range. I suspect this is rare enough to not need any bot code changes, or if there's a better way to input the citation template I am all ears. Ed [talk] [OMT] 06:46, 10 September 2024 (UTC)[reply]

caps

Status
new bug
Reported by
Jonatan Svensson Glad (talk) 19:56, 16 September 2024 (UTC)[reply]
What happens
|title=Phylogenetic Placement and Circumscription of Tribes Inuleae s. STR. And Plucheeae (Asteraceae): Evidence from Sequences of Chloroplast Gene NDHF
What should happen
|title=Phylogenetic Placement and Circumscription of Tribes Inuleae s. str. and Plucheeae (Asteraceae): Evidence from Sequences of Chloroplast Gene ndhF
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Plucheeae&diff=prev&oldid=1246081258
We can't proceed until
Feedback from maintainers


I think Citation bot is too aggressive in its capitalization of every three-/four-letter combination and of words following a dot. Also, on other references it often incorrectly capitalizes words inside parentheses. Jonatan Svensson Glad (talk) 19:56, 16 September 2024 (UTC)[reply]

Specific issues fixed. Some special code for parentheses does need to be added. AManWithNoPlan (talk) 14:30, 19 September 2024 (UTC)[reply]

Unreal page numbers from PubMed

Status
new bug
Reported by
Jc3s5h (talk) 04:27, 2 October 2024 (UTC)[reply]
What happens
numbers that are not page numbers are cited as page numbers
What should happen
The publication described on the "link showing what happens" line does not appear to use page numbers so no page number should be reported.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Voter_identification_laws_in_the_United_States&diff=1248756987&oldid=1248584828#cite_note-h933-163
We can't proceed until
Feedback from maintainers


That's an article number and should be formatted using |article-number= instead of |pages=. See § Added page parameter when it should be article-number, above. —David Eppstein (talk) 05:43, 2 October 2024 (UTC)[reply]

Added page parameter when it should be article-number, based upon CrossRef

Status
new bug
Reported by
David Eppstein (talk) 07:17, 26 September 2024 (UTC)[reply]
What happens
In an earlier version of quasicrystal, reference [42] Paßens et al (Nature Communications) had |pages=15367 (obviously incorrect). Using AWB, a week or so ago, User:Srich32977 made it worse by changing this to |📃=15367 causing an invalid parameter error. Then in Special:Diff/1246321827, Citation bot noticed the missing parameter and added |page=15367, better than before but still not correct. In this instance, 15367 is an article number, not a page number, so it should have been |article-number=15367. (Citation bot left Srich32977's garbage parameter in place but I do not think that is a bug.)
We can't proceed until
Feedback from maintainers


Thank you for this example. Unlike many of the other ones, this one actually has the article number as an article and not just a page in CrossRef. I now have an example to work with. AManWithNoPlan (talk) 14:20, 8 October 2024 (UTC)[reply]
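
A small sketch of the choice being discussed, assuming the metadata arrives as a dictionary with Crossref-style `article-number` and `page` keys. The function name is hypothetical and the key names should be checked against the actual API response before relying on them.

```python
import re


def locator_param(record):
    """Pick the citation parameter for a work's locator.

    Returns a (parameter name, value) pair, or None if the record has
    neither an article number nor a page.
    """
    article_number = str(record.get("article-number", "")).strip()
    page = str(record.get("page", "")).strip()
    if article_number:
        # Prefer the explicit article number, even if a "page" field
        # repeats the same digits.
        return ("article-number", article_number)
    if page:
        # A hyphen or en dash between digits indicates a page range.
        is_range = re.search(r"\d\s*[-–]\s*\d", page) is not None
        return ("pages" if is_range else "page", page)
    return None
```

When the record only carries a bare number in `page`, nothing in the metadata distinguishes an article number from a one-page article, so this sketch falls back to |page= in that case.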

Multiple missed opportunities in a single article (bad titles and url type)

Status
new bug
Reported by
David Eppstein (talk) 06:07, 5 October 2024 (UTC)[reply]
What happens
Special:Diff/1249477062
What should happen
Special:Diff/1249492278
We can't proceed until
Feedback from maintainers


Title part now done. URL not done yet. AManWithNoPlan (talk) 13:53, 1 November 2024 (UTC)[reply]

Invented access-date

Status
new bug
Reported by
:Jay8g [VTE] 23:50, 8 October 2024 (UTC)[reply]
What happens
[7] - the bot turned the nonsense string access-date02024-10-07 into the reasonable-looking but incorrect access-date=7 October 2004. I have no idea where it got that date from.
What should happen
access-date=2024-10-07
We can't proceed until
Feedback from maintainers


{{cite web|url=x |access-date02024-10-07 }}{{Use dmy dates}} is a minimal reproducer. The problem is that that date gets "cleaned" to match the DMY format. AManWithNoPlan (talk) 23:34, 22 October 2024 (UTC)[reply]
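
A conservative repair could look like the sketch below (Python, hypothetical function name). It assumes the damage is exactly one stray character where the equals sign should be, and it refuses to guess in every other case rather than producing a plausible-looking date.

```python
import re
from datetime import date

# "access-date", one stray non-"=" character, then a strict ISO date.
BROKEN = re.compile(r"\s*(access-date)([^=\s])(\d{4}-\d{2}-\d{2})\s*")


def repair_access_date(raw):
    """Return 'access-date=YYYY-MM-DD' if raw is unambiguously repairable.

    Returns None when the text does not fit the pattern or the date is
    not a real calendar date, so the caller can leave it untouched.
    """
    match = BROKEN.fullmatch(raw)
    if match is None:
        return None
    name, _stray, iso = match.groups()
    try:
        date.fromisoformat(iso)  # rejects e.g. month 13
    except ValueError:
        return None
    return f"{name}={iso}"
```

Reformatting to match {{Use dmy dates}} would then be a separate step that runs on the repaired, validated value.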

Citations with DOIs that start with 10.2307/j.<foobar> are books

Status
new bug
Reported by
Headbomb {t · c · p · b} 08:14, 12 October 2024 (UTC)[reply]
What should happen
[8]
We can't proceed until
Feedback from maintainers
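
The check itself is small. A sketch in Python, with a hypothetical function name, taking the rule in the section heading at face value:

```python
import re

# DOIs of the form 10.2307/j.<something>, per the section heading.
JSTOR_BOOK_DOI = re.compile(r"^10\.2307/j\.", re.IGNORECASE)


def looks_like_jstor_book(doi):
    """True if the DOI has the 10.2307/j. prefix described above."""
    return JSTOR_BOOK_DOI.match(doi.strip()) is not None
```

The sample DOIs used to exercise this are made up for illustration and are not taken from the linked diff.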


Delete stray <formula>...<formula/> and <roman>...<roman/>

Status
new bug
Reported by
Headbomb {t · c · p · b} 19:51, 18 October 2024 (UTC)[reply]
What should happen
[9]
We can't proceed until
Feedback from maintainers


How common is this, and do you have a search to find them? AManWithNoPlan (talk) 19:23, 2 November 2024 (UTC)[reply]

No idea how common it is. The bot added them here. Headbomb {t · c · p · b} 00:38, 9 November 2024 (UTC)[reply]
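
A minimal sketch of the cleanup, in Python with a hypothetical function name. It assumes the text between the tags should be kept and only the tags dropped, which the request does not spell out.

```python
import re

# Opening, closing and the malformed "<formula/>" style of closing tag.
STRAY_TAG = re.compile(r"</?\s*(?:formula|roman)\s*/?>", re.IGNORECASE)


def strip_stray_tags(title):
    """Remove <formula> and <roman> tags from a title, keeping their content."""
    return STRAY_TAG.sub("", title)
```

The same pattern could back a database search for affected titles, which would help answer the question of how common they are.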

URL removed

Status
new bug
Reported by
Mika1h (talk) 10:57, 30 October 2024 (UTC)[reply]
What happens
Bot replaces cite web with cite book, it removes the URL completely
What should happen
Nothing, the ref cited a Library Journal review that's listed on the Amazon site for the book, now it cites just the book, there's no link to click to see the review.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Shatnerverse&diff=prev&oldid=1254239468
We can't proceed until
Feedback from maintainers


Identical to § Changing every citation of a publisher's webpage to Cite book above. While the choice of formatting may be questioned (can't the Library Journal review be located somewhere less objectionable than Amazon?) the behaviour here is the same underlying misfeature of altering any webpage citation where a book's bibliographic information is presented, as if the citation was meant to be to content of the book rather than e.g. a publisher's blurb or library listing. I think there are more discussions of this in the talkpage archives here; I used to favour this feature, but I'm no longer so sure it's a net positive. Folly Mox (talk) 11:36, 30 October 2024 (UTC)[reply]
Apart from User talk:Citation bot/Archive 32 § Web->Book: I don't think that it was right in this case... (May 2022) linked in the thread above, there was some conversation at User talk:Citation bot/Archive 39 § Introduces ref error when citing Penguin publisher website (May 2024). There could be others. I have to go to work. Folly Mox (talk) 13:01, 30 October 2024 (UTC)[reply]

Date format

Status
new bug
Reported by
ChaseKiwi (talk) 15:26, 2 November 2024 (UTC)[reply]
What happens
The bot changed a date in a reference on the page Philippines–Taiwan relations from doi-broken-date=2024-08-20 to doi-broken-date=1 November 2024
What should happen
Keep the original date and its format; do not change the date format, which is the one adopted elsewhere on the page, and certainly do not change the date itself
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Philippines%E2%80%93Taiwan_relations&curid=39377200&diff=1254952941&oldid=1242823249
We can't proceed until
Feedback from maintainers


It certainly should be updated, since the broken date is the last time checked, and not the first time found to be dead. AManWithNoPlan (talk) 16:28, 2 November 2024 (UTC)[reply]

A point, although it has been known for DOIs by genuine journals never to be issued. Whatever the case, the change in date format is bad practice. ChaseKiwi (talk) 21:43, 2 November 2024 (UTC)[reply]
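
The two positions are compatible: the date can be refreshed while the format already in use is kept. A sketch in Python, with a hypothetical function name and only three recognised formats:

```python
from datetime import date, datetime

# Formats recognised by this sketch: ISO, day-month-year, month-day-year.
FORMATS = ("%Y-%m-%d", "%d %B %Y", "%B %d, %Y")


def refresh_date(old_value, today):
    """Render `today` in the same format that `old_value` was written in."""
    detected = None
    for fmt in FORMATS:
        try:
            datetime.strptime(old_value.strip(), fmt)
        except ValueError:
            continue
        detected = fmt
        break
    month = today.strftime("%B")  # English month name in the default locale
    if detected == "%d %B %Y":
        return f"{today.day} {month} {today.year}"
    if detected == "%B %d, %Y":
        return f"{month} {today.day}, {today.year}"
    # ISO input, or an unrecognised format: fall back to ISO.
    return today.isoformat()
```

For the case reported here, an old value of 2024-08-20 checked on 1 November 2024 would become 2024-11-01 rather than 1 November 2024.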

Stuck in an endless loop on certain pages

Status
new bug
Reported by
:Jay8g [VTE] 20:06, 2 November 2024 (UTC)[reply]
What happens
On List of assassinations in the Philippines, Citation Bot gets stuck in an endless loop and eventually crashes. The results page is filled with thousands of
   ~Renamed "work" -> "agency"
   ~Renamed "agency" -> "work"
We can't proceed until
Feedback from maintainers


I have fixed the page, which had invalid information. I will look at fixing the bot to deal with that. AManWithNoPlan (talk) 22:14, 2 November 2024 (UTC)[reply]
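
One generic guard against this kind of ping-pong is to remember every state already visited and give up when one repeats. The sketch below is Python, not the bot's code; `rename` and `apply_rules` are hypothetical helpers, and the two conflicting rules merely mimic the log lines quoted above.

```python
def rename(old, new):
    """Build a rule that renames parameter `old` to `new` when possible."""
    def rule(params):
        if old in params and new not in params:
            return {(new if k == old else k): v for k, v in params.items()}
        return params
    return rule


def apply_rules(params, rules, max_steps=100):
    """Apply rules one at a time until stable; bail out on a cycle.

    Returns (resulting parameters, True) when a stable state is reached,
    or (original parameters, False) when a state repeats or the step
    limit is hit.
    """
    current = dict(params)
    seen = {frozenset(current.items())}
    for _ in range(max_steps):
        for rule in rules:
            updated = rule(current)
            if updated != current:
                break
        else:
            return current, True  # no rule fired: stable
        state = frozenset(updated.items())
        if state in seen:
            return dict(params), False  # cycle detected: keep the original
        seen.add(state)
        current = updated
    return dict(params), False
```

With `rename("work", "agency")` and `rename("agency", "work")` both active, the second step returns to the starting state and the function stops instead of looping until the process crashes.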

MathML

Status
new bug
Reported by
Jonatan Svensson Glad (talk) 17:05, 3 November 2024 (UTC)[reply]
What happens
Weird math, mrow and nowiki tags in title
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=User%3AJosve05a%2Fsandbox&diff=prev&oldid=1255192629
We can't proceed until
Feedback from maintainers


That math/mrow syntax is MathML, which Wikimedia does not support directly. You can format it using Wikimedia math syntax (<math>S\to F</math> and <math>S\to J</math>) but in this case I would prefer template syntax, "S → F and S → J" ({{math|''S'' → ''F''}} and {{math|''S'' → ''J''}}) because Wikimedia math does not work well within linked text. I'm not convinced that the bot understands these issues well enough to translate the MathML into Wikimedia syntax. Probably the easiest is just to drop the tags and keep the text within them, giving "S→F and S→J". —David Eppstein (talk) 18:31, 3 November 2024 (UTC)[reply]
The bot explicitly wraps incoming titles with certain math items in nowiki tags so that they are human readable for the most part and also obviously in need of fixing. AManWithNoPlan (talk) 12:55, 8 November 2024 (UTC)[reply]
That seems like an entirely reasonable way to handle this sort of markup, to me, maybe enough to label this as wontfix. —David Eppstein (talk) 21:44, 8 November 2024 (UTC)[reply]
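
For reference, the "drop the tags and keep the text" option can be sketched in a few lines of Python. The function name and the list of tag names are assumptions for the example, and the sample markup is invented rather than copied from the sandbox diff.

```python
import re

# A few common MathML element names; anything else is left alone.
MATHML_TAG = re.compile(
    r"</?(?:math|mrow|mi|mo|mn|mtext|msub|msup|mfenced)\b[^>]*>",
    re.IGNORECASE,
)


def strip_mathml(title):
    """Drop MathML tags from a title and keep the text inside them."""
    return MATHML_TAG.sub("", title)
```

Flattening works for simple expressions like these but silently loses structure such as subscripts and superscripts, which is one argument for the current nowiki approach that leaves the markup visible for a human to fix.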

For some reason this link doesn't open a page on Google Books, but this one does. Citation bot will change the latter to the former, and say it is anonymizing links. Any way we can get it to keep the bsq though? Andre🚐 10:16, 7 November 2024 (UTC)[reply]

Not sure what your issues are, but they work for me. AManWithNoPlan (talk) 17:13, 7 November 2024 (UTC)[reply]
In the first link, there's no page opened with the query. While the latter does. I tried in an incognito too and same thing. Andre🚐 19:43, 7 November 2024 (UTC)[reply]
I can confirm the reported behaviour here. Headbomb {t · c · p · b} 00:35, 9 November 2024 (UTC)[reply]
[10] also works, but [11] doesn't. Removing the hl=en I suppose is anonymization, but can it just leave the bsq and the gbpv, and I guess it would have to also not shorten the URL as it does? [12] doesn't work either. Andre🚐 05:42, 9 November 2024 (UTC)[reply]
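
A sketch of allowlist-based cleanup that would keep bsq and gbpv while still dropping hl. It is Python with a hypothetical function name; the set of parameters to keep is an assumption for illustration, not the bot's actual list, and the URL in the usage note is made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Query parameters to preserve (assumed list); everything else is dropped.
KEEP = ("id", "pg", "q", "dq", "bsq", "gbpv")


def anonymize_books_url(url):
    """Drop query parameters that are not on the allowlist."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in KEEP
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
```

For example, `https://www.google.com/books/edition/_/abc123?hl=en&gbpv=1&bsq=some+phrase` would come out with `hl=en` removed and the other two parameters intact. Whether the shortened URL form can carry these parameters at all is a separate question this sketch does not address.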

you have one of the best bots.. keep up the amazing work

Bot Operator's Barnstar
The Citation bot is a Tremendous Bot, keep up the amazing work on editing.
Status
new bug
Reported by
David Eppstein (talk) 21:43, 8 November 2024 (UTC)[reply]
What happens
In Special:Diff/1256146187 the only change is to alter a double-spaced period in a reference title to a single-spaced period, something that makes no difference in the final rendered appearance.
What should happen
Not that.
We can't proceed until
Feedback from maintainers
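
A possible pre-save check, sketched in Python with a hypothetical function name. It relies on the fact that runs of spaces collapse to a single space in rendered HTML, and it only looks at spaces and tabs, not at newlines, which can be significant in wikitext.

```python
import re


def is_cosmetic_only(old_text, new_text):
    """True if the two texts differ only in runs of spaces or tabs."""
    def collapse(text):
        return re.sub(r"[ \t]+", " ", text)
    return old_text != new_text and collapse(old_text) == collapse(new_text)
```

An edit for which this returns True for the whole page could simply be skipped.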