Or "&" is rendered as "amp;", "<" as "lt;" and ">" as "gt;".
Version: unspecified
Severity: critical
(In reply to comment #0)
> Or "&" is rendered as "amp;", "<" as "lt;" and ">" as "gt;".
Should be "If '&' is expected there, '&' is eaten and the rest is shown as plain text. Otherwise it triggers an error and the whole line is not rendered."
Reproducible on mw.org, but not on http://leuksman.com/mw
On leuksman.com <math>a<b</math> renders correctly, while on mw.org a box with a<b (or \displaystyle a<b, depending on where you insert the formula) is displayed.
(In reply to comment #2)
> Reproducible on mw.org, but not on http://leuksman.com/mw
> On leuksman.com <math>a<b</math> renders correctly, while on mw.org a box with
> a<b (or \displaystyle a<b, depending on where you insert the formula) is
> displayed.
Guessing it's related to Tidy.
(In reply to comment #2)
> Reproducible on mw.org, but not on http://leuksman.com/mw
> On leuksman.com <math>a<b</math> renders correctly, while on mw.org a box with
> a<b (or \displaystyle a<b, depending on where you insert the formula) is
> displayed.
By the way, on leuksman.com, before the MathJax transformation is applied, raw HTML (with escaped < stuff) is displayed making it not so pretty.
mal.malego wrote:
Come on. This was resolved over two years ago in my mathJax user script (en.wikipedia.org/wiki/User:Nageh/mathJax). Why does it reappear in the MediaWiki code? I wish the devs would give a little more feedback about what they are doing when they reuse my code but then cut parts out of it, seemingly out of pure ignorance. If you want to try a working MathJax implementation, try my user script.
@Nageh, please do not jump to conclusions that quickly. You might not have realized it, but looking into it, it seems your script was actually creating invalid HTML for the "math/tex" script element. The content is not HTML escaped.
The Wikipedia developers (in this case, brion and 2 volunteer patch contributors (one of them me)) are accustomed to writing certain things in certain ways. Apparently one of the developers added escaping to the HTML element creation, because that's what he's used to doing. That brought to light that the script element had never been properly created and read back. Now it's written correctly, but of course reading has broken, which is why this ticket was filed. Stuff like that happens, it's just part of the development cycle.
From en.wikipedia with Nageh MathJax script:
<script type="math/tex" id="MathJax-Element-408">\displaystyle u'' + p(x)u' + q(x)u=f(x),\quad x>a </script>
Should be:
<script type="math/tex" id="MathJax-Element-408">\displaystyle u'' + p(x)u' + q(x)u=f(x),\quad x&gt;a </script>
mal.malego wrote:
Fair enough, and obviously my criticism was unwarranted. Sorry for my attack.
At the same time this is actually something that MathJax should be expected to take care of. Even in the standard installation of MathJax, no matter whether you write < or &lt;, the symbol will be put unescaped into the script element. I will report this as a bug to the MathJax devs.
mal.malego wrote:
I really should have thought further before posting my last message. There is nothing wrong with leaving the <, >, and & symbols un-escaped because the maths text will be added as a text node(!) to the DOM, and thus will NOT be interpreted by the HTML parser and will NOT create invalid HTML, and my script was NOT broken. ;)
mal.malego wrote:
As I said, comments by others are always being ignored. Now 1.20wmf1 has been deployed on the English Wikipedia, and all the TeX code is broken. < gets mangled to &lt; and then mangled again to &amp;lt;. Sigh.
(In reply to comment #9)
> As I said, comments by others are always being ignored. Now 1.20wmf1 has been
> deployed on the English Wikipedia, and all the TeX code is broken. < gets
> mangled to &lt; and then mangled again to &amp;lt;. Sigh.
This sounds rather serious. Do you have a link to an example of such breakage?
mal.malego wrote:
Just select "Leave it as TeX" in the Preferences->Appearance menu. Then open any page that includes TeX code with any of <, >, or &, and view the source (or the text that is displayed). Example page: http://en.wikipedia.org/wiki/Decimal_representation . I have implemented a work-around for the mathJax user script for the moment.
(In reply to comment #11)
> Just select "Leave it as TeX" in the Preferences->Appearance menu. Then open
> any page that includes TeX code with any of <, >, or &, and view the source (or
> the text that is displayed). Example page:
> http://en.wikipedia.org/wiki/Decimal_representation . I have implemented a
> work-around for the mathJax user script for the moment.
Ah, that makes more sense. This is only a problem for people with that particular user preference set. So instead of seeing (for example) "$ r_n\leq x &lt; r_n+\frac{1}{10^n}.\, $", people should be seeing "$ r_n\leq x < r_n+\frac{1}{10^n}.\, $"? Is that correct?
This alone wouldn't be a high priority. Are you saying that the escaping is also messing up the MathJax user gadget?
mal.malego wrote:
Yes, that is correct. Take a look at the HTML source and you'll notice that the reason you see &lt; is that the ampersand is encoded as &amp;, which is followed by "lt;" as normal text. However, the HTML source should simply contain &lt;, which would get rendered as <.
Also, this change was only introduced with 1.20wmf1, so it's quite disappointing to hear that reverting this flawed change isn't a high priority.
The change was also messing up the mathJax user script, but I have implemented a work-around. So whether you decide to fix this or not, I don't know, but I'm not the only one to note that the MediaWiki dev community is pretty remote from its users (see comments at sections http://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#New_.22diff.22_view_is_horrible_and_illegible and http://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#Change_notifcations_to_editors ).
I've now switched to the experimental MathJax mode, and this bug shows itself in the differential equations example in Help:Formula http://en.wikipedia.org/wiki/Help:Formula#Differential_equation. Rather than the correct output, it's displaying
\displaystyle u'' + p(x)u' + q(x)u=f(x),\quad x&gt;a
with the &gt; instead of >.
It is also messing with arrays and matrices:
\begin{matrix}
x & y \\
z & w
\end{matrix}
The & is converted to &amp; before being passed to MathJax, so it renders as
x amp;y
z amp;w
This is more serious as all matrices, case statements, multiline equations, and tables are broken. See http://en.wikipedia.org/wiki/Help:Formula#Fractions.2C_matrices.2C_multilines for a whole section of broken examples.
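To see where the stray "amp;" in the cells comes from, here is a rough JavaScript sketch. The helper names escapeHtml and splitAlignmentRow are made up for this illustration, and the splitting rule (cut at every & that is not preceded by a backslash) is only a crude stand-in for how a TeX parser treats alignment tabs. It is not MathJax's actual code.

```javascript
// Hypothetical helper: escapes the three characters discussed in this bug.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Crude stand-in for TeX alignment handling: a bare & separates cells,
// while \& is a literal ampersand and must not split the row.
function splitAlignmentRow(row) {
  return row.split(/(?<!\\)&/).map((cell) => cell.trim());
}

const row = "x & y";
const cellsFromRaw = splitAlignmentRow(row);
const cellsFromEscaped = splitAlignmentRow(escapeHtml(row));

console.log(cellsFromRaw);     // two cells: "x" and "y"
console.log(cellsFromEscaped); // the second cell now starts with "amp;"
```

If the consumer reads the escaped text without decoding it first, the & of "&amp;" still acts as the cell separator and the remaining "amp;" lands in the next cell, which matches the "x amp;y" output above.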
In terms of articles containing formulas, I would consider the appropriate "Severity" for this bug is "blocker" for the use of MathJax on Wikipedia, since every formula containing <, >, or & is broken.
Setting the fields accordingly, since this is consistent with the description of the fields at [[mw:Bug management/Bugzilla usage#Priority]]. I would set also "priority=highest" but I'm not sure about the availability of devs for fixing this... So I'll leave it to some developer to check the appropriate priority for this bug.
See also reports on
(In reply to comment #18)
See also reports on
- [[Wikipedia talk:WikiProject Mathematics#MathJax]]
- [[de:Portal Diskussion:Mathematik#Mathjax wird getestet!]]
Fixed the double escaping, but the generated DOM is still unescaped, which I don't really like.
That probably requires changes in MathJax though.
I think it is change I6d548d06 (https://gerrit.wikimedia.org/r/#/c/9739/)
Not sure what the desired output for MW_MATH_SOURCE should be. This will not be parsed by MathJax, so we end up with illegal and untreated HTML. Perhaps better to pass it to htmlspecialchars().
Just noticed that MathJax supports \lt and \gt for < and >. These could solve the problem with < and >.
http://www.onemathematicalcat.org/MathJaxDocumentation/TeXSyntax.htm#L
The & used in matrices and arrays need not be an error.
According to http://www.w3.org/TR/html5/tokenization.html#data-state "& " is legal and is interpreted as an ampersand followed by a space.
There are a few subtleties which we might need to watch for. \& is a literal ampersand, \> is an alternate medium space. I don't know if HTML entities are allowed, but they seem to work: x ⊝ y gives x circled minus y.
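A minimal JavaScript sketch of the \lt / \gt idea, taking the two subtleties above into account. The function name useLtGtMacros is hypothetical, and this only illustrates the rewrite itself; whether such a step belongs in the extension at all is a separate question.

```javascript
// Rewrites bare < and > to the \lt and \gt macros, leaving every
// backslash-escaped character (\>, \&, \\, ...) exactly as it was.
function useLtGtMacros(tex) {
  // Match either a backslash plus the character after it (kept as a unit)
  // or a bare angle bracket (rewritten).
  return tex.replace(/\\[\s\S]|[<>]/g, (token) => {
    if (token === "<") return "\\lt ";
    if (token === ">") return "\\gt ";
    return token;
  });
}

console.log(useLtGtMacros("a<b"));     // bare < becomes the \lt macro
console.log(useLtGtMacros("x \\> y")); // \> (medium space) is left alone
```

The trailing space after each macro keeps it from running into a following letter. The sketch knows nothing about constructs such as \left<, where the bracket is a delimiter rather than a relation, so those would need separate handling.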
mal.malego wrote:
The data is content of a script(!) element, of course unescaped entities are legal there! Do you escape your < and > signs in javascript code???
Marking as fixed. There're already two patches which are independently worked out and almost the same. Reopen if they don't fix this issue.
mal.malego wrote:
Yes, it is extremely easy to fix. That's why I didn't understand the cold shoulders I got.
(In reply to comment #30)
> Marking as fixed. There're already two patches which are independently worked
> out and almost the same. Reopen if they don't fix this issue.
Is it really appropriate to mark it as fixed when the patches haven't been reviewed or merged yet?
I don't think the fix is quite correct as it introduces problems for the source option. What I think you want is:
if( $this->mode == MW_MATH_SOURCE ) {
    # No need to render or parse anything more!
    # New lines are replaced with spaces, which avoids confusing our parser (bugs 23190, 22818)
    return Xml::element( 'span',
        $this->_attribs( 'span', array( 'class' => 'tex', 'dir' => 'ltr' ) ),
        '$ ' . str_replace( "\n", " ", htmlspecialchars( $this->tex ) ) . ' $' );
}
if( $this->mode == MW_MATH_MATHJAX ) {
    # No need to render or parse anything more!
    # New lines are replaced with spaces, which avoids confusing our parser (bugs 23190, 22818)
    return Xml::element( 'span',
        $this->_attribs( 'span', array( 'class' => 'tex', 'dir' => 'ltr' ) ),
        '$ ' . str_replace( "\n", " ", $this->tex ) . ' $' );
}
You want to call htmlspecialchars in MW_MATH_SOURCE but not MW_MATH_MATHJAX. You might also want to mention the bug in the comments.
(In reply to comment #33)
> You want to call htmlspecialchars in MW_MATH_SOURCE but not MW_MATH_MATHJAX.
I don't understand why. It will be further escaped by Xml::element(), so you'll still see double-escaped TeX in source mode.
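The objection can be illustrated with a small JavaScript model. Here htmlSpecialChars and element are hypothetical stand-ins for PHP's htmlspecialchars() and Xml::element(); the sketch assumes, as stated in this comment, that the element builder escapes its text content on its own.

```javascript
// Stand-in for PHP's htmlspecialchars(), limited to the three
// characters discussed in this bug.
function htmlSpecialChars(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Stand-in for an element builder that escapes its text content itself,
// which is the behaviour this comment attributes to Xml::element().
function element(tag, text) {
  return `<${tag}>${htmlSpecialChars(text)}</${tag}>`;
}

const tex = "a<b";
const escapedOnce = element("span", `$ ${tex} $`);
const escapedTwice = element("span", `$ ${htmlSpecialChars(tex)} $`);

console.log(escapedOnce);  // contains a&lt;b, which a browser shows as a<b
console.log(escapedTwice); // contains a&amp;lt;b, which a browser shows as a&lt;b
```

Under that assumption, passing the raw TeX to the builder is already enough for source mode, and pre-escaping it reproduces the double escaping this bug is about.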