Opened 6 years ago

Closed 6 years ago

Last modified 2 years ago

#9959 closed defect (fixed)

BridgeDB seems to be missing English translations

Reported by: isis
Owned by: isis
Priority: Immediate
Milestone:
Component: Circumvention/BridgeDB
Version:
Severity: Normal
Keywords: bridgedb-ui, bridgedb-translations, translations, bridgedb-https
Cc: sysrqb, aagbsn
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

See my last comment on #9157. English seems to be missing. I've never used babel or pybabel before, or worked with .pot/.po/.mo files, so it's likely that I just did something stupid when I last updated the translation files. Every time I ask on IRC, no one is able to tell me how this thing is supposed to work... any help is much appreciated.

Change History (8)

comment:1 Changed 6 years ago by aagbsn

Hi Isis --

the default language is English, see: lib/bridgedb/I18n.py:6, which IIRC simply doesn't replace any strings (that is, it defaults to the original text) if the specified language is not found.

I'd be glad to help you understand how the .pot/.po/.mo files work.

Firstly, all the templates (html) and code (py) files have some syntactic sugar wrapping any strings that you want to translate. It looks like _("A String I Want To Translate").

The second thing you do is extract all of the above such strings and dump them into a .pot file. pybabel provides a set of message extractors, and these are added to setup.py along with a set of paths (message_extractors) that tell pybabel what sort of files it should hunt for strings in.
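As a rough sketch, the translation-related part of a setup.py might look like the following. The package path, the glob patterns and the "mako" extractor name are assumptions for illustration, not copied from BridgeDB's setup.py. The setup() call itself is left as a comment so the snippet runs without Babel installed.

```python
# Mapping from a package directory to (glob pattern, extractor name, options)
# rules. All paths and extractor names below are illustrative assumptions.
message_extractors = {
    "lib/bridgedb": [
        ("**.py", "python", None),            # _() calls in Python code
        ("templates/**.html", "mako", None),  # _() calls in HTML templates
    ],
}

setup_kwargs = {
    "name": "bridgedb",
    "message_extractors": message_extractors,
}

# In a real setup.py with Babel installed, this would continue roughly as:
#
#   from setuptools import setup
#   from babel.messages import frontend as babel
#   setup(cmdclass={"extract_messages": babel.extract_messages,
#                   "init_catalog": babel.init_catalog,
#                   "update_catalog": babel.update_catalog,
#                   "compile_catalog": babel.compile_catalog},
#         **setup_kwargs)

for package, rules in message_extractors.items():
    for pattern, extractor, options in rules:
        print("%s: %s -> %s" % (package, pattern, extractor))
```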

When you invoke extract_messages, pybabel will automagically output a .pot file inside the project path under projectname/i18n/templates/projectname.pot (this path is static; I guess pybabel could use a patch if you want to change it).

For each language you want a translation for, you need to run init_catalog -l LANG. That will generate an untranslated .po file in the appropriate path. You're going to replace that file anyway with the one that transifex produces, though.

The .pot template file is handed off to transifex/pootle/hand edited, and a set of corresponding .po files are produced, one for each translation. If you take a look at each of the files, you'll see they correspond to the template (remember, you should always redo the extract_messages command any time you add new strings for translation), where each extracted string has the original string followed by the translated string. After these files are merged into the appropriate path (see init_catalog), you just need to compile the catalog to produce a binary format (.mo) that works with gettext to efficiently look up the translated string.
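To make the compile step concrete, here is a toy sketch of what a compiled catalog contains and how gettext reads it back. In practice pybabel's compile_catalog does this job. The write_mo helper, the "bridgedb" domain name and the Spanish example string are assumptions for illustration only.

```python
import gettext
import os
import struct
import tempfile


def write_mo(messages, path):
    """Write a minimal little-endian GNU .mo file for a {msgid: msgstr} dict."""
    # The empty msgid carries the catalog header; gettext reads the charset from it.
    entries = {"": "Content-Type: text/plain; charset=UTF-8\n"}
    entries.update(messages)
    keys = sorted(entries, key=lambda key: key.encode("utf-8"))

    ids = b""
    strs = b""
    spans = []
    for key in keys:
        msgid = key.encode("utf-8")
        msgstr = entries[key].encode("utf-8")
        spans.append((len(msgid), len(ids), len(msgstr), len(strs)))
        ids += msgid + b"\0"
        strs += msgstr + b"\0"

    count = len(keys)
    id_table = 7 * 4                   # the header is seven 32-bit integers
    str_table = id_table + count * 8   # each table row is (length, offset)
    ids_start = str_table + count * 8
    strs_start = ids_start + len(ids)

    # Magic number, format version 0, entry count, table offsets, no hash table.
    out = struct.pack("<Iiiiiii", 0x950412DE, 0, count, id_table, str_table, 0, 0)
    for id_len, id_off, str_len, str_off in spans:
        out += struct.pack("<ii", id_len, ids_start + id_off)
    for id_len, id_off, str_len, str_off in spans:
        out += struct.pack("<ii", str_len, strs_start + str_off)
    out += ids + strs

    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as handle:
        handle.write(out)


localedir = tempfile.mkdtemp()
write_mo({"Hello": "Hola"},
         os.path.join(localedir, "es", "LC_MESSAGES", "bridgedb.mo"))

lang = gettext.translation("bridgedb", localedir=localedir, languages=["es"])
print(lang.gettext("Hello"))
print(lang.gettext("Not in the catalog"))
```

A msgid that is absent from the catalog comes back unchanged, which is why untranslated strings still show up in the source language.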

At runtime, you figure out what locale the user wants (in BridgeDB see HTTPServer.py setLocaleFromRequestHeader) and tell gettext which language you want to translate for (lang.install()). Then, whenever the _() wrapper is encountered, gettext goes off and searches the corresponding .mo file (you did update all the messages, and reinstall BridgeDB to put all the .mo files in the right paths, right?) to find the matching string and returns that instead of the original text.
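A minimal sketch of that runtime step with the standard library's gettext module follows. It is not BridgeDB's actual code: the domain name, the temporary directory standing in for the installed catalog path, and the get_translator helper are all assumptions. The catalog directory is deliberately empty, to show what happens when no .mo file exists for the requested language.

```python
import gettext
import tempfile

DOMAIN = "bridgedb"             # assumed catalog name
LOCALEDIR = tempfile.mkdtemp()  # stand-in for the install path; empty on purpose


def get_translator(languages):
    """Hypothetical helper: pick a catalog for the given language list."""
    # fallback=True yields a NullTranslations object when no .mo file matches,
    # so the original source strings are returned untouched.
    return gettext.translation(DOMAIN, localedir=LOCALEDIR,
                               languages=languages, fallback=True)


lang = get_translator(["en"])
lang.install()  # binds _() in builtins for the whole process
print(_("A String I Want To Translate"))

# Without fallback=True, a missing catalog is an error rather than a no-op.
try:
    gettext.translation(DOMAIN, localedir=LOCALEDIR, languages=["en"])
    missing_catalog_is_error = False
except OSError as error:
    missing_catalog_is_error = True
    print("no catalog found:", error)
```

Whether a missing English catalog is harmless or fatal therefore depends on how the translation object is requested, which may be worth checking when English suddenly "needs" a .mo file.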

Hopefully this makes it a bit clearer what is going on. If you're having troubles with English missing some of the time, dump the output of the expected locale to a debug log and make sure it is set to None or "en" so that the default (untranslated) locale will be returned. It could be that BridgeDB isn't setting the locale properly in some context (global state + twisted async madness? who knows..). If English is always missing, git bisect is your friend :)
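One way to do that kind of debug logging, sketched with the standard library only: gettext.find() reports which .mo file would be loaded for a list of languages, or None if there is none. The logger name, the log_catalog helper and the fake directory tree are assumptions for illustration.

```python
import gettext
import logging
import os
import tempfile

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("i18n-debug")  # hypothetical logger name


def log_catalog(domain, localedir, languages):
    """Log which .mo file gettext would load for these languages, if any."""
    path = gettext.find(domain, localedir, languages)
    log.debug("languages=%r -> catalog=%r", languages, path)
    return path


# Fake install tree with a Spanish catalog only. An empty file is enough
# here, because find() only checks that the file exists.
localedir = tempfile.mkdtemp()
es_dir = os.path.join(localedir, "es", "LC_MESSAGES")
os.makedirs(es_dir)
open(os.path.join(es_dir, "bridgedb.mo"), "wb").close()

found = log_catalog("bridgedb", localedir, ["es"])
missing = log_catalog("bridgedb", localedir, ["en"])
```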

comment:2 in reply to:  1 ; Changed 6 years ago by isis

Status: new → needs_review

Replying to aagbsn:

Hi Isis --

Hey aagbsn! Thanks for helping!

the default language is English, see: lib/bridgedb/I18n.py:6, which IIRC simply doesn't replace any strings (that is, it defaults to the original text) if the specified language is not found.

Alright. At least I'm not going crazier. I was worried for a minute that there had always been English "translations" and that I must have somehow deleted them and obscured that from myself in the git history.

I'd be glad to help you understand how the .pot/.po/.mo files work.

Firstly, all the templates (html) and code (py) files have some syntactic sugar wrapping any strings that you want to translate. It looks like _("A String I Want To Translate").

First, it's fixed already, in my branch feature/9959-pas-danglais. All of the .pot/.po/.mo stuff I think that I've already understood (I rewrote the README months ago, so it now includes detailed instructions for dealing with all that).

Two things, however, are still marked XXX in the README from when I was working this out:

[xxx outdated, these commands seem to not exist...]

     python setup.py trans && python setup.py install_data

I'm not sure what trans was supposed to do, nor can I find reference to a distutils command class anywhere for that, so I assumed it was something from an old python gettext-ish module that was removed/replaced at some point.

install_data doesn't do anything (even though the command class for it is present) because modern versions of setuptools now handle installing data files in the .egg through the pkg_resources API. I should probably just remove both of these, but I didn't know what trans was so I left it hanging out.

Hopefully this makes it a bit clearer what is going on. If you're having troubles with English missing some of the time, dump the output of the expected locale to a debug log and make sure it is set to None or "en" so that the default (untranslated) locale will be returned. It could be that BridgeDB isn't setting the locale properly in some context (global state + twisted async madness? who knows..). If English is always missing, git bisect is your friend :)


Hrm. I fixed it by creating "untranslated" .po files for en, en_GB and en_US. Which seems like madness to me, and I can't for the life of me (I already bisected twice) figure out why BridgeDB all of a sudden wants translation files for English. But it's working again at least.

One strange thing that I noticed, if you look at my comment on #9517 which links to this ticket, is that all of the languages in the Accept-Language header have a q weight assigned to them -- all except for en, the first one. I had assumed that this was an implied en;q=1 meaning "primary preference"...but maybe it just wasn't ever assigned a weight at all. If that is the case...beetlejuice, that bug could be anywhere...in BridgeDB, in Twisted, in Firefox, in pybabel...ugh. But it's fixed so "no happy; be worry" as they always say, right?
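For reference, the HTTP specification says that a language range without a q parameter defaults to q=1, so a bare en at the front is the highest preference. A minimal sketch of that rule follows. It is not how Twisted or BridgeDB parses the header, and parse_accept_language is a made-up name.

```python
def parse_accept_language(header):
    """Return language tags ordered by descending q weight.

    Tags with equal weight keep their order from the header, and tags with
    q=0 (explicitly "not acceptable") are dropped.
    """
    weighted = []
    for index, item in enumerate(header.split(",")):
        parts = [part.strip() for part in item.split(";")]
        tag = parts[0]
        if not tag:
            continue
        quality = 1.0  # no q parameter means q=1
        for param in parts[1:]:
            name, sep, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as unacceptable
        if quality > 0:
            weighted.append((-quality, index, tag))
    return [tag for neg_quality, index, tag in sorted(weighted)]


print(parse_accept_language("en, de;q=0.8, fr;q=0.9"))
```

Logging the parsed list next to the locale that finally gets installed would show whether the header or the catalog lookup is at fault.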

comment:3 in reply to:  2 ; Changed 6 years ago by aagbsn

Replying to isis:

Replying to aagbsn:

Hi Isis --

Hey aagbsn! Thanks for helping!

the default language is English, see: lib/bridgedb/I18n.py:6, which IIRC simply doesn't replace any strings (that is, it defaults to the original text) if the specified language is not found.

Alright. At least I'm not going crazier. I was worried for a minute that there had always been English "translations" and that I must have somehow deleted them and obscured that from myself in the git history.

I'd be glad to help you understand how the .pot/.po/.mo files work.

Firstly, all the templates (html) and code (py) files have some syntactic sugar wrapping any strings that you want to translate. It looks like _("A String I Want To Translate").

First, it's fixed already, in my branch feature/9959-pas-danglais. All of the .pot/.po/.mo stuff I think that I've already understood (I rewrote the README months ago, so it now includes detailed instructions for dealing with all that).

Two things, however, are still marked XXX in the README from when I was working this out:

[xxx outdated, these commands seem to not exist...]

     python setup.py trans && python setup.py install_data

I believe these were how we did translation before, and install_data points at the method installData in setup.py that is made redundant by the package_data directive in setup().

I'm not sure what trans was supposed to do, nor can I find reference to a distutils command class anywhere for that, so I assumed it was something from an old python gettext-ish module that was removed/replaced at some point.

I think that sounds accurate.

install_data doesn't do anything (even though the command class for it is present) because modern versions of setuptools now handle installing data files in the .egg through the pkg_resources API. I should probably just remove both of these, but I didn't know what trans was so I left it hanging out.

I think you can drop it and keep only the commands that are necessary for translating and installing BridgeDB.

Hopefully this makes it a bit clearer what is going on. If you're having troubles with English missing some of the time, dump the output of the expected locale to a debug log and make sure it is set to None or "en" so that the default (untranslated) locale will be returned. It could be that BridgeDB isn't setting the locale properly in some context (global state + twisted async madness? who knows..). If English is always missing, git bisect is your friend :)


Hrm. I fixed it by creating "untranslated" .po files for en, en_GB and en_US. Which seems like madness to me, and I can't for the life of me (I already bisected twice) figure out why BridgeDB all of a sudden wants translation files for English. But it's working again at least.

Well, sounds like that will work. Were translations broken even after you rolled back to a prior commit and did a clean install?

One strange thing that I noticed, if you look at my comment on #9517 which links to this ticket, is that all of the languages in the Accept-Language header have a q weight assigned to them -- all except for en, the first one. I had assumed that this was an implied en;q=1 meaning "primary preference"...but maybe it just wasn't ever assigned a weight at all. If that is the case...beetlejuice, that bug could be anywhere...in BridgeDB, in Twisted, in Firefox, in pybabel...ugh. But it's fixed so "no happy; be worry" as they always say, right?

Hm, how many languages are present in your Accept-Language header?

comment:4 in reply to:  3 ; Changed 6 years ago by isis

Replying to aagbsn:

I believe these were how we did translation before, and install_data points at the method installData in setup.py that is made redundant by the package_data directive in setup().

I'm not sure what trans was supposed to do, nor can I find reference to a distutils command class anywhere for that, so I assumed it was something from an old python gettext-ish module that was removed/replaced at some point.

I think that sounds accurate.

Okay, I'll just remove those then.


Hrm. I fixed it by creating "untranslated" .po files for en, en_GB and en_US. Which seems like madness to me, and I can't for the life of me (I already bisected twice) figure out why BridgeDB all of a sudden wants translation files for English. But it's working again at least.

Well, sounds like that will work. Were translations broken even after you rolled back to a prior commit and did a clean install?

Yes, which is why I believe that this is a bug or API change in something else, in Twisted/FF/pybabel/etc.


One strange thing that I noticed, if you look at my comment on #9517 which links to this ticket, is that all of the languages in the Accept-Language header have a q weight assigned to them -- all except for en, the first one. I had assumed that this was an implied en;q=1 meaning "primary preference"...but maybe it just wasn't ever assigned a weight at all. If that is the case...beetlejuice, that bug could be anywhere...in BridgeDB, in Twisted, in Firefox, in pybabel...ugh. But it's fixed so "no happy; be worry" as they always say, right?

Hm, how many languages are present in your Accept-Language header?

Something around twenty. Is that a problem?

comment:5 in reply to:  4 ; Changed 6 years ago by aagbsn

Replying to isis:

Hrm. I fixed it by creating "untranslated" .po files for en, en_GB and en_US. Which seems like madness to me, and I can't for the life of me (I already bisected twice) figure out why BridgeDB all of a sudden wants translation files for English. But it's working again at least.

Well, sounds like that will work. Were translations broken even after you rolled back to a prior commit and did a clean install?

Yes, which is why I believe that this is a bug or API change in something else, in Twisted/FF/pybabel/etc.

Ah, but the requirements.txt in bridgedb has all of the versions specified (see output of pip freeze). Are you using the same versions of these packages every time?

Hm, how many languages are present in your Accept-Language header?

Something around twenty. Is that a problem?

I just wondered if you had the same problem(s) with an off-the-shelf TBB (en_US, en, iirc).

comment:6 in reply to:  5 Changed 6 years ago by isis

Replying to aagbsn:

Replying to isis:

Hrm. I fixed it by creating "untranslated" .po files for en, en_GB and en_US. Which seems like madness to me, and I can't for the life of me (I already bisected twice) figure out why BridgeDB all of a sudden wants translation files for English. But it's working again at least.

Well, sounds like that will work. Were translations broken even after you rolled back to a prior commit and did a clean install?

Yes, which is why I believe that this is a bug or API change in something else, in Twisted/FF/pybabel/etc.

Ah, but the requirements.txt in bridgedb has all of the versions specified (see output of pip freeze). Are you using the same versions of these packages every time?

Some of those requirements did not exist according to pip (they were renamed); others got updates but that was months before this bug appeared. Probably the most major thing with all the requirements was switching to the newest setuptools (1.1.x), and fixing everything so that the databases/scripts/modules could run in a virtualenv. There were a couple glitches fixing all that up where the setuptools on my dev machine was 1.1.x and then there was a slight bugginess that would manifest in the virtualenv setuptools being a 0.6.x version...but this should all be fixed, and both versions are 1.1.x. Other than that I cannot think of anything, unless it was the RTL patches from #9157 that somehow made BridgeDB forget that English is the default.

Hm, how many languages are present in your Accept-Language header?

Something around twenty. Is that a problem?

I just wondered if you had the same problem(s) with an off-the-shelf TBB (en_US, en, iirc).

I also tried with an unmodified TBB; it just failed to render the page. Sorry, forgot to mention that.

comment:7 Changed 6 years ago by isis

Resolution: fixed
Status: needs_review → closed

It was never clear why, but this bug turned out to be due to the .po/.mo files missing for English (odd, because they were never needed before). This was fixed in my branch, feature/9959-pas-danglais, and then a couple further changes were added in my hotfix/9959-pas-danglais_install-data branch.

comment:8 Changed 2 years ago by teor

Severity: Normal

Set all tickets without a severity to "Normal"
