ENH: Create infrastructure for translations #61220

Draft
wants to merge 2 commits into
base: main

melissawm
Contributor

Hi all,

This PR is a proposal for adding the translations infrastructure to the pandas web page.

Following the discussion in #56301, we (a group of folks working on the Scientific Python grant) have been working to set up infrastructure and translate the contents of the pandas website. As of this moment, the pandas website is 100% translated into Spanish and Brazilian Portuguese, with other languages available for translation (depending on volunteer translators).

What this PR does:

  • Reorganizes the website source file structure for multilanguage support, with a new "pt" folder which, in the future, can hold Brazilian Portuguese translations pulled in from Crowdin.
  • Adds a language switcher to the top of the page
  • Adds language option to web pages command line builder
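As a rough illustration of the layout described above, the sketch below maps one page to its per-language URLs, which is the information a language switcher needs. The function name `switcher_links`, the language labels, and the URL scheme (default language at the site root, other languages under a `/<code>/` prefix) are assumptions made for illustration, not the code in this PR.

```python
from pathlib import PurePosixPath

# Hypothetical labels; the real list would come from the site configuration.
LANGUAGES = {"en": "English", "pt": "Português"}


def switcher_links(page, languages=None, default="en"):
    """Return {label: url} for one page, e.g. 'about/team.html'.

    Assumes the default language is served from the site root and every
    other language from a '/<code>/' prefix.
    """
    languages = LANGUAGES if languages is None else languages
    root = PurePosixPath("/")
    links = {}
    for code, label in languages.items():
        prefix = root if code == default else root / code
        links[label] = str(prefix / page)
    return links


links = switcher_links("about/team.html")
```

With this scheme, adding a language only means adding an entry to the mapping and a matching folder of translated sources.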

What this PR does not do:

  • Add actual translations for the full contents of the website. This needs to be done in a follow-up.

This PR is a draft, as we are looking for feedback on the approach and appetite for this change. We would love to have more languages added, and we firmly believe having the translations infrastructure may help recruit new translators, who will then see their work published on the actual website. We can also work on adding a "Translations team" to the pandas website if desired, with data pulled in automatically from Crowdin.

To build, this will require the following command:

python pandas_web.py pandas/content --target-path build --languages en pt
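
For readers unfamiliar with multi-value command line options, here is a minimal sketch of how a `--languages` option like the one in the command above could be parsed with `argparse`. The parser below is hypothetical and only mirrors the command shown; it is not the actual argument handling in `pandas_web.py`, and the defaults are assumptions.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical parser mirroring the command shown above; not the
    # actual argument handling in pandas_web.py.
    parser = argparse.ArgumentParser(description="Build the website.")
    parser.add_argument("source_path", help="directory with the site sources")
    parser.add_argument("--target-path", default="build",
                        help="directory where the built site is written")
    # nargs="+" collects one or more language codes into a list.
    parser.add_argument("--languages", nargs="+", default=["en"],
                        help="language codes to build, e.g. en pt")
    return parser.parse_args(argv)


args = parse_args(["pandas/content", "--target-path", "build",
                   "--languages", "en", "pt"])
print(args.languages)  # ['en', 'pt']
```

Defaulting to `["en"]` in this sketch keeps an invocation without `--languages` building only the English site.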

If you want to check out other related work, please take a look at scipy/scipy.org#617

Some of this is still a work in progress, and @goanpeca is working on automations to make synchronizing and updating the translations easier; he can also help answer questions on the overall integration with Crowdin.

Any feedback is appreciated, and we are happy to answer questions and discuss more if needed.

Cheers!

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

* Reorganizes file structure for multilanguage support
* Adds a language switcher to the top of the page
* Adds language option to web pages command line builder
@mroeschke
Member

Thanks for starting this @melissawm

  1. Reviewing the feedback in the original issue About the internationalization of documentation #56301, it appears the pandas core devs (including myself) would prefer translations to live outside the core repo. Am I understanding correctly that the pt directory, or any other new abbreviated language directory, would mean the translations would live in this repo?
  2. If docs in the en folder get modified, the --languages flag will automatically update the changed docs to the target language?

@melissawm
Contributor Author

Hi @mroeschke !

  1. I think we could devise a way to build the website pulling in the translations from the https://github.com/Scientific-Python-Translations/pandas-translations repo, although that may complicate your CI set up. It's your call though, happy to explore that.
  2. As far as I understand, no - changes to the en folder are propagated to the https://github.com/Scientific-Python-Translations/pandas-translations repo, which in turn is passed over to translators, and that will update the other languages. Maybe @goanpeca can help me with this one.

@goanpeca

goanpeca commented Apr 9, 2025

Hi @melissawm

Regarding:

If docs in the en folder get modified, the --languages flag will automatically update the changed docs to the target language?

No. Currently a GitHub Action is set to run daily (this could be modified as needed) to check whether the content has changed; if it has, we copy the changes over to https://github.com/Scientific-Python-Translations/pandas-translations, where the Crowdin integration is set up.

A different action, which runs once per week (this can also be modified as needed), checks whether the translations are over a certain completion threshold (95% by default). If they are, and there are new strings available, a PR with the translated content is merged automatically at https://github.com/Scientific-Python-Translations/pandas-translations.
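
To make the weekly check concrete, here is a minimal sketch of the decision it describes. The function name `should_sync`, the progress numbers, and the choice to treat exactly 95% as passing are assumptions for illustration; the real automation may compute completion differently.

```python
def should_sync(translated, total, has_new_strings, threshold_percent=95):
    """Decide whether a language's translations are ready to be merged.

    Integer arithmetic avoids floating-point surprises at the boundary.
    Treating "exactly at the threshold" as passing is an assumption.
    """
    if total <= 0 or not has_new_strings:
        return False
    return translated * 100 >= threshold_percent * total


# Hypothetical progress: {language: (translated, total, has_new_strings)}
progress = {
    "es": (200, 200, True),
    "pt": (191, 200, True),
    "fr": (120, 200, True),
    "ja": (200, 200, False),
}
ready = [lang for lang, stats in progress.items() if should_sync(*stats)]
print(ready)  # ['es', 'pt']
```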

I think we could devise a way to build the website pulling in the translations from the https://github.com/Scientific-Python-Translations/pandas-translations repo, although that may complicate your CI set up. It's your call though, happy to explore that.

Regarding this, we could indeed pull the translations from the repo at build time to avoid having that content in this repo. And as @melissawm said, it would be a bit more involved for CI on this side, but we can make that work.
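
One possible shape for pulling translations at build time is sketched below: shallow-clone the translations repo into a temporary directory and copy one language folder into the site sources before building. The helper names and the assumption that the repo keeps each language in a top-level `<code>/` folder are hypothetical; the actual repository layout and CI steps may differ.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

TRANSLATIONS_REPO = (
    "https://github.com/Scientific-Python-Translations/pandas-translations"
)


def clone_command(repo_url, dest):
    """Shallow-clone command; only the latest commit is needed for a build."""
    return ["git", "clone", "--depth", "1", repo_url, str(dest)]


def fetch_translations(language, content_dir, repo_url=TRANSLATIONS_REPO):
    """Copy one language folder from the translations repo into content_dir.

    Assumes each language lives in a top-level '<code>/' folder of the
    repo, which is a guess about its layout.
    """
    target = Path(content_dir) / language
    with tempfile.TemporaryDirectory() as tmp:
        checkout = Path(tmp) / "translations"
        subprocess.run(clone_command(repo_url, checkout), check=True)
        shutil.copytree(checkout / language, target, dirs_exist_ok=True)
    return target
```

A CI job could call something like `fetch_translations("pt", "pandas/content")` before running the build command, so that translated content never needs to be committed to this repo.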

3 participants