Fallback charset enables fingerprinting of bundle localization
|Reported by:||dcf||Owned by:||mikeperry|
|Keywords:||tbb-fingerprinting, tbb-pref, MikePerry201402R||Cc:||gk|
|Actual Points:||Parent ID:|
Torbutton's spoof_english pref sets the Accept-Language header to en-us,en;q=0.5, which hides which localized bundle you are using. But localized bundles still differ in their default (fallback) charset, which the browser uses to decode pages that do not declare an encoding. By observing which characters a given byte sequence decodes to, a page can find out which charset is in use.
It looks like our current bundles may come with any of 6 different default charsets:
- utf-8: ar fa
- iso-8859-1: de es-ES fr it nl pt-PT vi
- iso-8859-2: pl
- windows-1251: ru
- euc-kr: ko
- gbk: zh
I found these by grepping the langpacks' unpacked *.xpi files for "intl.charset.default".
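The grep step can be approximated in Python. This is only a sketch: it assumes the *.xpi files have already been unpacked under a single directory (the `root` argument is a hypothetical path, not one from this ticket) and it reports every line that mentions the pref name, without assuming any particular file format.

```python
from pathlib import Path

PREF_KEY = "intl.charset.default"


def find_default_charsets(root):
    """Return (relative path, line) pairs for each line mentioning PREF_KEY.

    `root` is assumed to hold the unpacked langpack files; the layout
    underneath it is not assumed.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # Langpack files are not guaranteed to be UTF-8, so undecodable
        # bytes are replaced instead of raising an error.
        text = path.read_text(encoding="utf-8", errors="replace")
        for line in text.splitlines():
            if PREF_KEY in line:
                hits.append((str(path.relative_to(root)), line.strip()))
    return hits
```

Calling `find_default_charsets("unpacked-langpacks")` on such a directory would list one hit per file that sets the pref, which is enough to build a table like the one above.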
As an example of how byte sequences can be variously decoded, here are decodings of "\xc3\xa3":
- utf-8: ã
- iso-8859-1: Ã£
- iso-8859-2: ĂŁ
- windows-1251: ГЈ
- euc-kr: 찾
- gbk: 茫
That is, an HTML page that does not declare its encoding can contain the sequence "\xc3\xa3", and it will render as different characters depending on the fallback charset in effect.
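The decodings listed above can be reproduced with a short Python sketch. It uses Python's built-in codecs as stand-ins for the browser's decoders (an assumption: the two may differ on edge cases, though not on this byte pair as far as I know). The function names are made up for illustration.

```python
# The six fallback charsets found in the bundles, by Python codec name.
CHARSETS = ["utf-8", "iso-8859-1", "iso-8859-2", "windows-1251", "euc-kr", "gbk"]


def decode_under_charsets(data, charsets=CHARSETS):
    """Decode the same bytes under each charset and return {charset: text}."""
    return {name: data.decode(name) for name in charsets}


def guess_charset(rendered, data=b"\xc3\xa3", charsets=CHARSETS):
    """Return the charsets under which `data` decodes to `rendered`.

    This is the fingerprinting direction: from the text a client ended up
    with, work back to the fallback charset it must have used.
    """
    decoded = decode_under_charsets(data, charsets)
    return [name for name, text in decoded.items() if text == rendered]
```

Because all six decodings of "\xc3\xa3" differ, `guess_charset` returns exactly one charset for each of them; for example, `guess_charset("ã")` gives `["utf-8"]`.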
A possible solution is just to force intl.charset.default to UTF-8 in all localizations. Here are some Mozilla bugs I found that are relevant to setting this pref to UTF-8: 910165 406498 536506 910169.
Also see https://developer.mozilla.org/en-US/docs/Localizations_and_character_encodings#Specifying_the_fallback_encoding, which indicates that Firefox's behavior with respect to the fallback charset will change:
"As of Firefox 28, this section is obsolete, since the preference intl.charset.default no longer exists. The mapping from locales onto fallback encodings is now built into Gecko itself."
In the best case, this could be interpreted to mean that the spoof_english setting will become sufficient, and the fallback will become as it would be for en-US. Or it might just mean that the preference is moved to somewhere inside Gecko. It seems the relevant bug is 910192: Get rid of intl.charset.default as a localizable pref and deduce the fallback....
Change History (23)
comment:11 Changed 16 months ago by mikeperry
- Keywords MikePerry201402R added; MikePerry201401R removed
comment:15 Changed 15 months ago by mikeperry
- Resolution set to fixed
- Status changed from needs_review to closed