title: Git Rev News Edition 97 (March 31st, 2023)
layout: default
date: 2023-03-31 12:06:51 +0100
author: chriscool
categories: news
navbar: false

Git Rev News: Edition 97 (March 31st, 2023)

Welcome to the 97th edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.

This edition covers what happened during the months of February 2023 and March 2023.

Discussions

Support

  • Bug report: symbolic-ref --short command echos the wrong text while use Chinese language

    Mengzi Yi (孟子易) sent a bug report to the mailing list saying that when a Chinese name like 测试-加-增加-加-增加 was given to a branch, then calling git symbolic-ref --short HEAD on that branch didn't result in the right output (for example 测试-� instead of maybe 测试-加).

    Peff, alias Jeff King, replied that he couldn't reproduce the issue on Linux and wondered whether it was related to MacOS, whose HFS+ filesystem might perform some Unicode normalization. He said that it might alternatively be related to the shortening code in shorten_unambiguous_ref() treating the names as bytes instead of characters. Another possibility he mentioned was that the shortening code, which used sscanf(), assumed that the resulting string could not be longer than the input, an assumption that might not hold when Unicode normalization and locales are involved.

    Eric Sunshine replied to Peff saying he was able to reproduce the bug on MacOS 10.13 (while Mengzi used MacOS 13.2), but that it didn't appear to be related to HFS+ Unicode normalization as on disk the bytes of the branch name he got were the same as what Peff got on Linux.

    Peff replied to Eric asking if he could test a patch that added debug output and allocated twice as much memory for the buffer receiving the shortened name from sscanf() as for that function's input. Peff said the debug output on his Linux machine showed that the input was 39 bytes long while the output was 28 bytes long.

    Eric tested Peff's patch and initially reported 39 and 9 bytes for the input and output respectively. When setting LANG=zh-CN.UTF-8, however, he got the same input and output lengths as Peff, which indicated that sscanf() was indeed the culprit.

    Junio Hamano, the Git maintainer, replied to Eric's findings saying "Well, that's ... criminal." and wondering if setting LANG to $ANY_VALID_ONE.UTF-8 would work the same way.

    This made Eric realize that the zh-CN language code he had used was invalid (it should have been zh_CN, with an underscore instead of a dash). Eric nevertheless found that using valid LANG codes like en_US, fr_FR, de_DE, ru_RU and zh_CN resulted in the truncated 测试-? output, while using LANG=C yielded the correct 测试-加-增加-加-增加 output.

    Junio, Peff and Eric discussed these results further, wondering what sscanf() on MacOS could be doing wrong. Peff then suggested replacing the call to this function with some manual parsing, and sent a sample in-email patch to do that.

    Eric tested Peff's patch and reported that it looked correct, worked nicely and fixed the issue. He also agreed with the approach of getting rid of scanf() calls in general.

    Peff then sent a regular small patch series based on his previous patch, which fixed a leak and made the changes easier to follow.

    Junio and Eric reviewed the series and then discussed with Peff a bug Junio found in it. Then Peff sent a version 2 of the patch series that fixed the bug and added tests.

    In the meantime, Torsten Bögershausen tried to reproduce the original bug and discussed how to do that with Eric. He also commented on the new tests in version 2 of the patch series, as he found it unclear in which context the bug could appear. Junio suggested some clarifications that were approved by others. The resulting patches were merged and included in the recent Git v2.40.0 release.
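
The byte counts quoted in the thread, and the general idea of shortening a ref by manual parsing instead of a scanf-style pattern, can be sketched in a few lines of Python. This is only an illustration, not Peff's patch and not Git's code: the `RULES` table and the `shorten_ref` function are hypothetical names, the rule list is a simplified assumption, and the ambiguity check that Git performs is left out entirely. It also assumes that the 9-byte output Eric saw was the first 9 bytes of the short name, which is consistent with the 测试-� output in the bug report.

```python
# Illustrative sketch only; RULES and shorten_ref are hypothetical names,
# not Git's data structures or API.
BRANCH = "测试-加-增加-加-增加"
FULL_REF = ("refs/heads/" + BRANCH).encode("utf-8")

# Assumed, simplified rule list: each rule is a (prefix, suffix) pair
# surrounding the part of the ref name that should be kept.
RULES = [
    (b"refs/remotes/", b"/HEAD"),
    (b"refs/remotes/", b""),
    (b"refs/heads/", b""),
    (b"refs/tags/", b""),
    (b"refs/", b""),
]


def shorten_ref(refname: bytes) -> bytes:
    """Strip the first matching prefix/suffix pair, working on raw bytes.

    Plain byte comparison never consults the locale, so multi-byte
    UTF-8 sequences pass through untouched.
    """
    for prefix, suffix in RULES:
        if len(refname) <= len(prefix) + len(suffix):
            continue  # nothing would be left to keep
        if refname.startswith(prefix) and refname.endswith(suffix):
            return refname[len(prefix):len(refname) - len(suffix)]
    return refname  # no rule matched; leave the name as-is


short = shorten_ref(FULL_REF)
truncated = short[:9]  # assumed shape of the 9-byte result in the failing case

print(len(FULL_REF), len(short))  # byte lengths of the input and the output
print(len(BRANCH))                # the same name counted in characters
# Decoding the truncated bytes leniently shows a cut in the middle of a character.
print(truncated.decode("utf-8", errors="replace"))
```

Counting bytes rather than characters matches the numbers in the discussion: the full ref is 39 bytes, the short name is 28 bytes, and a 9-byte prefix of the short name ends two bytes into the three-byte encoding of 加. The byte at which that prefix stops is 0xA0; the summary above does not establish why the MacOS implementation stopped there.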

Releases

Other News

Various

Light reading

Git tools and sites

Credits

This edition of Git Rev News was curated by Christian Couder, Jakub Narębski, Markus Jansen and Kaartic Sivaraam with help from Bruno Brito.