Re: WIN32 pg_import_system_collations

Lists: pgsql-hackers
From: Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>
To: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2022-01-24 21:23:38
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Juan José,

I a bit tested this feature and have small doubts about block:

+/*
+ * Windows will use hyphens between language and territory, where POSIX
+ * uses an underscore. Simply make it POSIX looking.
+ */
+ hyphen = strchr(localebuf, '-');
+ if (hyphen)
+ *hyphen = '_';

After this block modified collation name is used in function

GetNLSVersionEx(COMPARE_STRING, wide_collcollate, &version)

(see win32_read_locale() -> CollationFromLocale() -> CollationCreate()
call). Is it correct to use (wide_collcollate = "en_NZ") instead of
(wide_collcollate = "en-NZ") in GetNLSVersionEx() function?

1) Documentation [1], [2], quote:
If it is a neutral locale for which the script is significant,
the pattern is <language>-<Script>.

2) Conversation [3], David Rowley, quote:
Then, since GetNLSVersionEx()
wants yet another variant with a - rather than an _, I've just added a
couple of lines to swap the _ for a -.

On my computer (Windows 10 Pro 21H2 19044.1466, MSVC2019 version
16.11.9) work correctly both variants ("en_NZ", "en-NZ").

But David Rowley (MSVC2010 and MSVC2017) replaced "_" to "-"
for the same function. Maybe he had a problem with "_" on MSVC2010 or
MSVC2017?

[1]
https://docs.microsoft.com/en-us/windows/win32/api/winnls/nf-winnls-getnlsversionex
[2] https://docs.microsoft.com/en-us/windows/win32/intl/locale-names
[3]
https://www.postgresql.org/message-id/flat/CAApHDvq3FXpH268rt-6sD_Uhe7Ekv9RKXHFvpv%3D%3Duh4c9OeHHQ%40mail.gmail.com

With best regards,
Dmitry Koval.


From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2022-01-25 10:40:45
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 24.01.22 22:23, Dmitry Koval wrote:
> +/*
> + * Windows will use hyphens between language and territory, where POSIX
> + * uses an underscore. Simply make it POSIX looking.
> + */
> + hyphen = strchr(localebuf, '-');
> + if (hyphen)
> +    *hyphen = '_';
>
> After this block modified collation name is used in function
>
> GetNLSVersionEx(COMPARE_STRING, wide_collcollate, &version)
>
> (see win32_read_locale() -> CollationFromLocale() -> CollationCreate()
> call). Is it correct to use (wide_collcollate = "en_NZ") instead of
> (wide_collcollate = "en-NZ") in GetNLSVersionEx() function?

I don't really know if this is necessary anyway. Just create the
collations with the names that the operating system presents. There is
no requirement to make the names match POSIX.

If you want to make them match POSIX for some reason, you can also just
change the object name but leave the collcollate/collctype fields the
way they came from the OS.


From: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2022-01-25 14:49:01
Message-ID: CAC+AXB0kd+YgF408sy5c2Bsa3PQdiSS6YJZLq6TMx33_WTJ-rA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 25, 2022 at 11:40 AM Peter Eisentraut <
peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:

> On 24.01.22 22:23, Dmitry Koval wrote:
>

Thanks for looking into this.

> > +/*
> > + * Windows will use hyphens between language and territory, where POSIX
> > + * uses an underscore. Simply make it POSIX looking.
> > + */
> > + hyphen = strchr(localebuf, '-');
> > + if (hyphen)
> > + *hyphen = '_';
> >
> > After this block modified collation name is used in function
> >
> > GetNLSVersionEx(COMPARE_STRING, wide_collcollate, &version)
> >
> > (see win32_read_locale() -> CollationFromLocale() -> CollationCreate()
> > call). Is it correct to use (wide_collcollate = "en_NZ") instead of
> > (wide_collcollate = "en-NZ") in GetNLSVersionEx() function?
>

The problem that David Rowley addressed was coming from Windows collations
in the shape of "English_New Zealand", GetNLSVersionEx() will work with
both "en_NZ" and "en-NZ". You can check collversion in pg_collation in the
patched version.

>
> I don't really know if this is necessary anyway. Just create the
> collations with the names that the operating system presents. There is
> no requirement to make the names match POSIX.
>
> If you want to make them match POSIX for some reason, you can also just
> change the object name but leave the collcollate/collctype fields the
> way they came from the OS.
>

I think there is some value in making collation names consistent across
different platforms, e.g. making user scripts more portable. So, I'm doing
that in the attached version, just changing the object name.

Regards,

Juan José Santamaría Flecha

Attachment Content-Type Size
v4-0001-WIN32-pg_import_system_collations.patch application/octet-stream 52.3 KB

From: Andres Freund <andres(at)anarazel(dot)de>
To: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2022-03-22 01:00:57
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2022-01-25 15:49:01 +0100, Juan José Santamaría Flecha wrote:
> So, I'm doing that in the attached version, just changing the object name.

Currently fails to apply, please rebase: http://cfbot.cputube.org/patch_37_3450.log

Marked as waiting-on-author.

- Andres


From: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2022-04-11 12:20:30
Message-ID: CAC+AXB1v_H68VQoXK=Pdnc7VwUC-9ifdQEtvyx_qEkMiBGJKbw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 22, 2022 at 2:00 AM Andres Freund <andres(at)anarazel(dot)de> wrote:

>
> Currently fails to apply, please rebase:
> http://cfbot.cputube.org/patch_37_3450.log
>
> Marked as waiting-on-author.
>
> Please, find attached a rebased version, no other significant change.

Regards,

Juan José Santamaría Flecha

Attachment Content-Type Size
v5-0001-WIN32-pg_import_system_collations.patch application/octet-stream 52.3 KB

From: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2022-07-12 19:32:33
Message-ID: CAC+AXB1ha0tnWcqMdjaBq_Lkw346E+h_5UqeYK5NZ29WR8a-bQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Please find attached a rebased version. I have split the patch into two
parts trying to make it easier to review, one with the code changes and the
other with the test.

Other than that, there are minimal changes from the previous version to the
code due to the update of _WIN32_WINNT and enabling the test on cirrus.

Regards,

Juan José Santamaría Flecha

>

Attachment Content-Type Size
v6-0001-WIN32-pg_import_system_collations.patch application/octet-stream 7.0 KB
v6-0002-Add-collate.windows.win1252-test.patch application/octet-stream 45.9 KB

From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
Cc: Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2022-10-31 14:09:36
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12.07.22 21:32, Juan José Santamaría Flecha wrote:
> Please find attached a rebased version. I have split the patch into two
> parts trying to make it easier to review, one with the code changes and
> the other with the test.
>
> Other than that, there are minimal changes from the previous version to
> the code due to the update of _WIN32_WINNT and enabling the test on cirrus.

I'm not familiar with Windows, so I'm just looking at the overall
structure of this patch. I think it pretty much makes sense. But we
need to consider that this operates on the confluence of various
different operating system interfaces that not all people will be
familiar with, so we need to really get the documentation done well.

Consider this function you are introducing:

+/*
+ * Create a collation if the input locale is valid for so.
+ * Also keeps track of the number of valid locales and collations created.
+ */
+static int
+CollationFromLocale(char *isolocale, char *localebuf, int *nvalid,
+ int *ncreated, int nspid)

This declaration is incomprehensible without studying all the callers
and the surrounding code.

Start with the name: What does "collation from locale" mean? Does it
make a collation? Does it convert one? Does it find one? There should
be a verb in there.

(I think in the context of this file, a lower case name would be more
appropriate for a static function.)

Then the arguments. The input arguments should be "const". All the
arguments should be documented. What is "isolocale", what is
"localebuf", how are they different? What is being counted by "valid"
(collatons?, locales?), and what makes a thing valid and invalid? What
is being "created"? What is nspid? What is the return value?

Please make another pass over this.

Also consider describing in the commit message what you are doing in
more detail, including some of the things that have been discussed in
this thread.


From: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2022-11-04 22:08:24
Message-ID: CAC+AXB078_wJumwG9nOP--m3cTWKhAUeC22S73rUMVhXknstUg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Oct 31, 2022 at 3:09 PM Peter Eisentraut <
peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:

Thanks for taking a look into this patch.

>
> Consider this function you are introducing:
>
> +/*
> + * Create a collation if the input locale is valid for so.
> + * Also keeps track of the number of valid locales and collations created.
> + */
> +static int
> +CollationFromLocale(char *isolocale, char *localebuf, int *nvalid,
> + int *ncreated, int nspid)
>
> This declaration is incomprehensible without studying all the callers
> and the surrounding code.
>
> Start with the name: What does "collation from locale" mean? Does it
> make a collation? Does it convert one? Does it find one? There should
> be a verb in there.
>
> (I think in the context of this file, a lower case name would be more
> appropriate for a static function.)
>
> Then the arguments. The input arguments should be "const". All the
> arguments should be documented. What is "isolocale", what is
> "localebuf", how are they different? What is being counted by "valid"
> (collatons?, locales?), and what makes a thing valid and invalid? What
> is being "created"? What is nspid? What is the return value?
>
> Please make another pass over this.
>
> Ok, I can definitely improve the comments for that function.

> Also consider describing in the commit message what you are doing in
> more detail, including some of the things that have been discussed in
> this thread.
>
> Going through the thread for the commit message, I think that maybe the
collation naming remarks were not properly addressed. In the current
version the collations retain their native name, but an alias is created
for those with a shape that we can assume a POSIX equivalent exists.

Please find attached a new version.

Regards,

Juan José Santamaría Flecha

Attachment Content-Type Size
v7-0002-Add-collate.windows.win1252-test.patch application/octet-stream 45.7 KB
v7-0001-WIN32-pg_import_system_collations.patch application/octet-stream 8.4 KB

From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
Cc: Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2022-11-07 15:08:17
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 04.11.22 23:08, Juan José Santamaría Flecha wrote:
> Ok, I can definitely improve the comments for that function.
>
> Also consider describing in the commit message what you are doing in
> more detail, including some of the things that have been discussed in
> this thread.
>
> Going through the thread for the commit message, I think that maybe the
> collation naming remarks were not properly addressed. In the current
> version the collations retain their native name, but an alias is created
> for those with a shape that we can assume a POSIX equivalent exists.

This looks pretty good to me. The refactoring of the non-Windows parts
makes sense. The Windows parts look reasonable on manual inspection,
but again, I don't have access to Windows here, so someone else should
also look it over.

A small style issue: Change return (TRUE) to return TRUE.

The code

+ if (strlen(localebuf) == 5 && localebuf[2] == '-')

might be too specific. At least on some POSIX systems, I have seen
locales with a three-letter language name. Maybe you should look with
strchr() and not be too strict about the exact position.

For the test patch, why is a separate test for non-UTF8 needed on
Windows. Does the UTF8 one not work?

+ version() !~ 'Visual C\+\+'

This probably won't work for MinGW.


From: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2022-11-08 23:02:39
Message-ID: CAC+AXB2VTU3Udv0nPMB3xO30rnpxcEsFuog7q30S_VmErS4b8g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 7, 2022 at 4:08 PM Peter Eisentraut <
peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:

>
> This looks pretty good to me. The refactoring of the non-Windows parts
> makes sense. The Windows parts look reasonable on manual inspection,
> but again, I don't have access to Windows here, so someone else should
> also look it over.
>
> I was going to say that at least it is getting tested on the CI, but I
have found out that meson changes version(). That is fixed in this version.

> A small style issue: Change return (TRUE) to return TRUE.
>
> Fixed.

> The code
>
> + if (strlen(localebuf) == 5 && localebuf[2] == '-')
>
> might be too specific. At least on some POSIX systems, I have seen
> locales with a three-letter language name. Maybe you should look with
> strchr() and not be too strict about the exact position.
>
> Ok, in this version the POSIX alias is created unconditionally.

> For the test patch, why is a separate test for non-UTF8 needed on
> Windows. Does the UTF8 one not work?
>
> Windows locales will retain their CP_ACP encoding unless you change the OS
code page to UFT8, which is still experimental [1].

> + version() !~ 'Visual C\+\+'
>
> This probably won't work for MinGW.
>
> When I proposed this patch it wouldn't have worked because of the
project's Windows minimum version requirement, now it should work in MinGW.
It actually doesn't because most locales are failing with "skipping locale
with unrecognized encoding", but checking what's wrong
with pg_get_encoding_from_locale() in MiNGW is subject for another thread.

[1]
https://stackoverflow.com/questions/56419639/what-does-beta-use-unicode-utf-8-for-worldwide-language-support-actually-do

Regards,

Juan José Santamaría Flecha

Attachment Content-Type Size
v8-0001-WIN32-pg_import_system_collations.patch application/octet-stream 8.4 KB
v8-0002-Add-collate.windows.win1252-test.patch application/octet-stream 45.6 KB

From: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2022-11-10 10:08:32
Message-ID: CAC+AXB3Wx3CH-1OBEg_1uWcAOzYQ-S+B3_c0emhmgejgen7iNw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 9, 2022 at 12:02 AM Juan José Santamaría Flecha <
juanjo(dot)santamaria(at)gmail(dot)com> wrote:

> On Mon, Nov 7, 2022 at 4:08 PM Peter Eisentraut <
> peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:
>
>>
>> This looks pretty good to me. The refactoring of the non-Windows parts
>> makes sense. The Windows parts look reasonable on manual inspection,
>> but again, I don't have access to Windows here, so someone else should
>> also look it over.
>>
>> I was going to say that at least it is getting tested on the CI, but I
> have found out that meson changes version(). That is fixed in this version.
>

Now is currently failing due to [1], so maybe we can leave this patch on
hold until that's addressed.

[1]
https://www.postgresql.org/message-id/CAC%2BAXB1wJEqfKCuVcNpoH%3Dgxd61N%3D7c2fR3Ew6YRPpSfEUA%3DyQ%40mail.gmail.com

Regards,

Juan José Santamaría Flecha


From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
Cc: Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2022-12-01 07:46:41
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10.11.22 11:08, Juan José Santamaría Flecha wrote:
> This looks pretty good to me.  The refactoring of the
> non-Windows parts
> makes sense.  The Windows parts look reasonable on manual
> inspection,
> but again, I don't have access to Windows here, so someone else
> should
> also look it over.
>
> I was going to say that at least it is getting tested on the CI, but
> I have found out that meson changes version(). That is fixed in this
> version.
>
>
> Now is currently failing due to [1], so maybe we can leave this patch on
> hold until that's addressed.
>
> [1]
> https://www.postgresql.org/message-id/CAC%2BAXB1wJEqfKCuVcNpoH%3Dgxd61N%3D7c2fR3Ew6YRPpSfEUA%3DyQ%40mail.gmail.com <https://www.postgresql.org/message-id/CAC%2BAXB1wJEqfKCuVcNpoH%3Dgxd61N%3D7c2fR3Ew6YRPpSfEUA%3DyQ%40mail.gmail.com>

What is the status of this now? I think the other issue has been addressed?


From: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2022-12-09 12:48:53
Message-ID: CAC+AXB2Y4zi-U5L3agNgbFmejV0A3x+YB+bb4KmBGigSZBVvbA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Dec 1, 2022 at 8:46 AM Peter Eisentraut <
peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:

>
> What is the status of this now? I think the other issue has been
> addressed?
>

Yes, that's addressed for MSVC builds. I think there are a couple of
pending issues for MinGW, but those should have their own threads.

The patch had rotten, so PFA a rebased version.

Regards,

Juan José Santamaría Flecha

Attachment Content-Type Size
v9-0002-Add-collate.windows.win1252-test.patch application/x-patch 45.1 KB
v9-0001-WIN32-pg_import_system_collations.patch application/x-patch 8.4 KB

From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
Cc: Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2023-01-03 13:48:56
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 09.12.22 13:48, Juan José Santamaría Flecha wrote:
> On Thu, Dec 1, 2022 at 8:46 AM Peter Eisentraut
> <peter(dot)eisentraut(at)enterprisedb(dot)com
> <mailto:peter(dot)eisentraut(at)enterprisedb(dot)com>> wrote:
>
>
> What is the status of this now?  I think the other issue has been
> addressed?
>
>
> Yes, that's addressed for MSVC builds. I think there are a couple of
> pending issues for MinGW, but those should have their own threads.
>
> The patch had rotten, so PFA a rebased version.

committed


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
Cc: Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2023-02-26 21:02:38
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 2023-01-03 Tu 08:48, Peter Eisentraut wrote:
> On 09.12.22 13:48, Juan José Santamaría Flecha wrote:
>> On Thu, Dec 1, 2022 at 8:46 AM Peter Eisentraut
>> <peter(dot)eisentraut(at)enterprisedb(dot)com
>> <mailto:peter(dot)eisentraut(at)enterprisedb(dot)com>> wrote:
>>
>>
>>     What is the status of this now?  I think the other issue has been
>>     addressed?
>>
>>
>> Yes, that's addressed for MSVC builds. I think there are a couple of
>> pending issues for MinGW, but those should have their own threads.
>>
>> The patch had rotten, so PFA a rebased version.
>
> committed
>
>

Now that I have removed the barrier to testing this in the buildfarm,
and added an appropriate locale setting to drongo, we can see that this
test fails like this:

diff -w -U3 c:/prog/bf/root/HEAD/pgsql.build/src/test/regress/expected/collate.windows.win1252.out c:/prog/bf/root/HEAD/pgsql.build/src/test/regress/results/collate.windows.win1252.out
--- c:/prog/bf/root/HEAD/pgsql.build/src/test/regress/expected/collate.windows.win1252.out 2023-01-23 04:39:06.755149600 +0000
+++ c:/prog/bf/root/HEAD/pgsql.build/src/test/regress/results/collate.windows.win1252.out 2023-02-26 17:32:54.115515200 +0000
@@ -363,16 +363,17 @@

-- to_char
SET lc_time TO 'de_DE';
+ERROR: invalid value for parameter "lc_time": "de_DE"
SELECT to_char(date '2010-03-01', 'DD TMMON YYYY');
to_char
-------------
- 01 MRZ 2010
+ 01 MAR 2010
(1 row)

SELECT to_char(date '2010-03-01', 'DD TMMON YYYY' COLLATE "de_DE");
to_char
-------------
- 01 MRZ 2010
+ 01 MAR 2010
(1 row)

-- to_date

The last of these is especially an issue, as it doesn't even throw an error.

See
<https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2023-02-26%2016%3A56%3A30>

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
Cc: Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2023-02-27 12:09:57
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 2023-02-26 Su 16:02, Andrew Dunstan wrote:
>
>
> On 2023-01-03 Tu 08:48, Peter Eisentraut wrote:
>> On 09.12.22 13:48, Juan José Santamaría Flecha wrote:
>>> On Thu, Dec 1, 2022 at 8:46 AM Peter Eisentraut
>>> <peter(dot)eisentraut(at)enterprisedb(dot)com
>>> <mailto:peter(dot)eisentraut(at)enterprisedb(dot)com>> wrote:
>>>
>>>
>>>     What is the status of this now?  I think the other issue has been
>>>     addressed?
>>>
>>>
>>> Yes, that's addressed for MSVC builds. I think there are a couple of
>>> pending issues for MinGW, but those should have their own threads.
>>>
>>> The patch had rotten, so PFA a rebased version.
>>
>> committed
>>
>>
>
> Now that I have removed the barrier to testing this in the buildfarm,
> and added an appropriate locale setting to drongo, we can see that
> this test fails like this:
>
>
> diff -w -U3 c:/prog/bf/root/HEAD/pgsql.build/src/test/regress/expected/collate.windows.win1252.out c:/prog/bf/root/HEAD/pgsql.build/src/test/regress/results/collate.windows.win1252.out
> --- c:/prog/bf/root/HEAD/pgsql.build/src/test/regress/expected/collate.windows.win1252.out 2023-01-23 04:39:06.755149600 +0000
> +++ c:/prog/bf/root/HEAD/pgsql.build/src/test/regress/results/collate.windows.win1252.out 2023-02-26 17:32:54.115515200 +0000
> @@ -363,16 +363,17 @@
>
> -- to_char
> SET lc_time TO 'de_DE';
> +ERROR: invalid value for parameter "lc_time": "de_DE"
> SELECT to_char(date '2010-03-01', 'DD TMMON YYYY');
> to_char
> -------------
> - 01 MRZ 2010
> + 01 MAR 2010
> (1 row)
>
> SELECT to_char(date '2010-03-01', 'DD TMMON YYYY' COLLATE "de_DE");
> to_char
> -------------
> - 01 MRZ 2010
> + 01 MAR 2010
> (1 row)
>
> -- to_date
>
>
> The last of these is especially an issue, as it doesn't even throw an
> error.
>
> See
> <https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2023-02-26%2016%3A56%3A30>
>
>
>

Further investigation shows that if we change the two instances of
"de_DE" to "de-DE" the tests behave as expected, so it appears that
while POSIX style aliases have been created for the BCP 47 style
locales, using the POSIX aliases doesn't in fact work. I cant see
anything that turns the POSIX locale name back into BCP 47 at the point
of use, which seems to be what's needed.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


From: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2023-02-27 22:05:23
Message-ID: CAC+AXB3gf1Uho2S9BZ7vBAaqa5Z-3N34cMc7V60EkcO6-dXcZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 27, 2023 at 1:10 PM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:

> On 2023-02-26 Su 16:02, Andrew Dunstan wrote:
>
> Now that I have removed the barrier to testing this in the buildfarm, and
> added an appropriate locale setting to drongo, we can see that this test
> fails like this:
>
>
> diff -w -U3 c:/prog/bf/root/HEAD/pgsql.build/src/test/regress/expected/collate.windows.win1252.out c:/prog/bf/root/HEAD/pgsql.build/src/test/regress/results/collate.windows.win1252.out
> --- c:/prog/bf/root/HEAD/pgsql.build/src/test/regress/expected/collate.windows.win1252.out 2023-01-23 04:39:06.755149600 +0000
> +++ c:/prog/bf/root/HEAD/pgsql.build/src/test/regress/results/collate.windows.win1252.out 2023-02-26 17:32:54.115515200 +0000
> @@ -363,16 +363,17 @@
>
> -- to_char
> SET lc_time TO 'de_DE';
> +ERROR: invalid value for parameter "lc_time": "de_DE"
> SELECT to_char(date '2010-03-01', 'DD TMMON YYYY');
> to_char
> -------------
> - 01 MRZ 2010
> + 01 MAR 2010
> (1 row)
>
> SELECT to_char(date '2010-03-01', 'DD TMMON YYYY' COLLATE "de_DE");
> to_char
> -------------
> - 01 MRZ 2010
> + 01 MAR 2010
> (1 row)
>
> -- to_date
>
>
> The last of these is especially an issue, as it doesn't even throw an
> error.
>
> See
> <https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2023-02-26%2016%3A56%3A30>
> <https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2023-02-26%2016%3A56%3A30>
>
>
> Further investigation shows that if we change the two instances of "de_DE"
> to "de-DE" the tests behave as expected, so it appears that while POSIX
> style aliases have been created for the BCP 47 style locales, using the
> POSIX aliases doesn't in fact work. I cant see anything that turns the
> POSIX locale name back into BCP 47 at the point of use, which seems to be
> what's needed.
>

The command that's failing is "SET lc_time TO 'de_DE';", and that area of
code is untouched by this patch. As mentioned in [1], the problem seems to
come from a Windows bug that the CI images and my development machines have
patched out.

I think we should change the locale name to make the test more robust, as
the attached. But I don't see a problem with making an alias for the
collations.

Regards,

Juan José Santamaría Flecha

Attachment Content-Type Size
0001-change-locale-name-for-test-collate.windows.win1252.patch application/octet-stream 1.5 KB

From: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2023-02-27 22:20:51
Message-ID: CAC+AXB2aTuHZpa2RGtkSR6N78NA6YQCRvHZMePYQwfsxMF+uJw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

El lun, 27 feb 2023, 23:05, Juan José Santamaría Flecha <
juanjo(dot)santamaria(at)gmail(dot)com> escribió:

>
> The command that's failing is "SET lc_time TO 'de_DE';", and that area of
> code is untouched by this patch. As mentioned in [1], the problem seems to
> come from a Windows bug that the CI images and my development machines have
> patched out.
>

What I wanted to post as [1]:

https://www.postgresql.org/message-id/CAC%2BAXB1agvrgpyHEfqbDr2MOpcON3d%2BWYte_SLzn1E4TamLs9g%40mail.gmail.com

> Regards,
>
> Juan José Santamaría Flecha
>


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2023-02-28 11:55:17
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 2023-02-27 Mo 17:20, Juan José Santamaría Flecha wrote:
>
>
> El lun, 27 feb 2023, 23:05, Juan José Santamaría Flecha
> <juanjo(dot)santamaria(at)gmail(dot)com> escribió:
>
>
> The command that's failing is "SET lc_time TO 'de_DE';", and that
> area of code is untouched by this patch. As mentioned in [1],
> the problem seems to come from a Windows bug that the CI images
> and my development machines have patched out.
>
>
> What I wanted to post as [1]:
>
> https://www.postgresql.org/message-id/CAC%2BAXB1agvrgpyHEfqbDr2MOpcON3d%2BWYte_SLzn1E4TamLs9g%40mail.gmail.com

Hmm, yeah. I'm not sure I understand the point of this test anyway:

SELECT to_char(date '2010-03-01', 'DD TMMON YYYY' COLLATE "de_DE");

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


From: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2023-02-28 16:40:35
Message-ID: CAC+AXB0Pj9=mtk+-oQJg-X5GyiTOfn_cOWYS4VBt-w3+J5rX5w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 28, 2023 at 12:55 PM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:

> On 2023-02-27 Mo 17:20, Juan José Santamaría Flecha wrote:
>
>
> Hmm, yeah. I'm not sure I understand the point of this test anyway:
>
>
> SELECT to_char(date '2010-03-01', 'DD TMMON YYYY' COLLATE "de_DE");
>

Uhm, they probably don't make much sense except for "tr_TR", so I'm fine
with removing them. PFA a patch for so.

Regards,

Juan José Santamaría Flecha

Attachment Content-Type Size
0002-remove-useless-test-from-collate.windows.win1252.patch application/octet-stream 2.6 KB
0001-change-locale-name-for-test-collate.windows.win1252.patch application/octet-stream 1.5 KB

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2023-02-28 20:26:07
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 2023-02-28 Tu 11:40, Juan José Santamaría Flecha wrote:
>
> On Tue, Feb 28, 2023 at 12:55 PM Andrew Dunstan <andrew(at)dunslane(dot)net>
> wrote:
>
> On 2023-02-27 Mo 17:20, Juan José Santamaría Flecha wrote:
>
>
> Hmm, yeah. I'm not sure I understand the point of this test anyway:
>
>
> SELECT to_char(date '2010-03-01', 'DD TMMON YYYY' COLLATE "de_DE");
>
>
> Uhm, they probably don't make much sense except for "tr_TR", so I'm
> fine with removing them. PFA a patch for so.
>
>

I think you missed my point, which was that the COLLATE clause above
seemed particularly pointless. But I agree that all these are not much
use, so I'll remove them as you suggest.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


From: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIN32 pg_import_system_collations
Date: 2023-03-01 08:49:52
Message-ID: CAC+AXB0JrfLjm935fEJ+yuEV+TZzJ-E3cZbv3UpztfpPsP8hBQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 28, 2023 at 9:26 PM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:

> I think you missed my point, which was that the COLLATE clause above
> seemed particularly pointless. But I agree that all these are not much use,
> so I'll remove them as you suggest.
>

Maybe there has been some miscommunication, please let me try to explain
myself a little better. The whole test is an attempt to mimic
collate.linux.utf8, which has that same command, only for collate 'tr_TR',
and so does collate.icu.utf8 but commented out.

I've seen that you have committed this and now drongo is green, which is
great. Thank you for taking care of it.

Regards,

Juan José Santamaría Flecha