Opened 4 years ago

Closed 4 years ago

#18297 closed defect (fixed)

Tor browser uses Chinese-style glyphs to display Japanese

Reported by: cypherpunks
Owned by: tbb-team
Priority: Medium
Milestone:
Component: Applications/Tor Browser
Version:
Severity: Normal
Keywords: tbb-fingerprinting-fonts, TorBrowserTeam201602R, tbb-5.5-regression
Cc: arthuredelstein
Actual Points:
Parent ID: #18097
Points:
Reviewer:
Sponsor:

Description

In Tor Browser 5.5.1 (en_US), if I visit https://en.wikipedia.org/wiki/Han_unification#Examples_of_language-dependent_glyphs the browser correctly shows Japanese-style glyph variants in the ja column.

However, on most actual Japanese web pages, Chinese-style glyphs end up being used. This happens even if I select "Japanese" in the View->Character Encoding menu (I know that character encoding and glyph variant are two separate things, but a Japan-specific character encoding should be taken as an indicator that Japanese glyph variants should be used).

For example, if I search for the characters from the Wikipedia table on this page - http://www.aozora.gr.jp/cards/001779/files/56648_58207.html - in every instance the Simplified Chinese variant is used instead of the Japanese glyph.

Change History (8)

comment:1 Changed 4 years ago by gk

Cc: arthuredelstein added
Parent ID: #18097
Status: new → needs_information

What operating system are you using? Does this happen with a ja bundle as well?

comment:2 Changed 4 years ago by cypherpunks

I am using 32-bit Linux.
The ja bundle exhibits the same behavior (e.g. on the aozora bunko page referenced above).
In fact, the ja bundle uses simplified Chinese glyph variants even in the (Japanese) UI.

comment:3 Changed 4 years ago by arthuredelstein

We are currently using Noto Sans CJK SC Regular. This font supports Simplified Chinese, Traditional Chinese, Japanese and Korean glyphs, and provides alternative "localized forms" via the OpenType locl feature so that the appropriate glyph is displayed for the language of the text.

This works properly when the web page uses

<meta charset="UTF-8"/>

such as on https://en.wikipedia.org/wiki/Han_unification#Examples_of_language-dependent_glyphs

but it fails to work properly when it uses

<meta http-equiv="Content-Type" content="text/html;charset=Shift_JIS" />

such as on http://www.aozora.gr.jp/cards/001779/files/56648_58207.html
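
To make the difference between the two pages concrete, here is a minimal Python sketch of the reporter's expectation that a Japan-specific encoding should hint at Japanese glyph variants. It is not how Firefox selects glyphs; the function names (declared_charset, language_hint) and the mapping table are assumptions made up for illustration.

```python
import re
from typing import Optional

# Hypothetical mapping from a legacy encoding label to a language hint.
# Illustrative only; this table is an assumption, not taken from Firefox.
CHARSET_LANG_HINT = {
    "shift_jis": "ja",
    "euc-jp": "ja",
    "iso-2022-jp": "ja",
    "gbk": "zh-CN",
    "big5": "zh-TW",
    "euc-kr": "ko",
}

# Matches both <meta charset="..."> and the http-equiv form, where the
# charset appears inside the content attribute.
_META_CHARSET = re.compile(
    r"""<meta[^>]*?charset\s*=\s*["']?([A-Za-z0-9_\-]+)""",
    re.IGNORECASE,
)


def declared_charset(html: str) -> Optional[str]:
    """Return the lowercased charset declared in a <meta> tag, if any."""
    match = _META_CHARSET.search(html)
    return match.group(1).lower() if match else None


def language_hint(html: str) -> Optional[str]:
    """Return a language hint implied by the declared charset, or None.

    UTF-8 covers every language, so it yields no hint; such pages have to
    rely on other signals (for example a lang attribute).
    """
    charset = declared_charset(html)
    return CHARSET_LANG_HINT.get(charset) if charset else None
```

Under these assumptions, the Shift_JIS declaration above yields the hint "ja", while the UTF-8 declaration yields no hint at all.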

A solution I found to this problem is to remove the unified Noto Sans CJK font and instead include a separate font for each of SC, TC, Japanese, and Korean, namely:

Noto Sans JP Regular
Noto Sans TC Regular
Noto Sans SC Regular
Noto Sans KR Regular

Doing this will add ~1 MB (zipped) to the bundle, which I think is not too bad. I'm working on writing patches to do this.
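
The idea behind the multi-font approach can be sketched in Python as a lookup from language tag to one of the four fonts listed above. This is only an illustration: the real selection happens inside the browser's font machinery, and the function name (font_for_language), the language tags used as keys, and the fallback default are all assumptions.

```python
# Hypothetical lookup from a language tag to one of the per-language fonts
# named in this comment. The keys and the default are assumptions.
FONT_BY_LANG = {
    "ja": "Noto Sans JP Regular",
    "zh-tw": "Noto Sans TC Regular",
    "zh-cn": "Noto Sans SC Regular",
    "ko": "Noto Sans KR Regular",
}


def font_for_language(lang: str, default: str = "Noto Sans SC Regular") -> str:
    """Pick a font for a language tag, trying progressively shorter prefixes.

    "ja-JP" falls back to "ja"; a tag with no matching prefix gets `default`.
    """
    tag = lang.strip().lower().replace("_", "-")
    while tag:
        if tag in FONT_BY_LANG:
            return FONT_BY_LANG[tag]
        # Drop the last subtag: "ja-jp" -> "ja", "ja" -> "".
        tag = tag.rpartition("-")[0]
    return default
```

Because each language gets its own font file, the right glyph variant no longer depends on the locl feature being applied correctly.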

(Another alternative would be to try to fix the Shift_JIS/locl bug in Firefox or whatever is the offending font library. I reported the bug here: https://bugzilla.mozilla.org/show_bug.cgi?id=1247479)

Last edited 4 years ago by arthuredelstein

comment:4 Changed 4 years ago by cypherpunks

You did not say if such fix will work on the chrome as well. Will it?

comment:5 in reply to:  4 Changed 4 years ago by arthuredelstein

Replying to cypherpunks:

You did not say if such fix will work on the chrome as well. Will it?

Yes, I just confirmed that the multi-font solution works for chrome.

comment:6 Changed 4 years ago by arthuredelstein

Keywords: tbb-fingerprinting-fonts TorBrowserTeam201602R tbb-5.5-regression added
Status: needs_information → needs_review

comment:7 Changed 4 years ago by gk

I think we can go with this idea for both stable and the alphas. If the Mozilla bug gets fixed, we either wait for the next ESR that contains the fix or, if it is not too involved, backport it and start using Noto Sans CJK SC Regular again.

comment:8 Changed 4 years ago by gk

Resolution: fixed
Status: needs_review → closed

Fixed on tor-browser-bundle master, maint-5.5 and hardened-builds (04ac5b14d07a37a99880c35209b7f13415183bd1, 603a12865c01ac95b701f17b24f87bcf3cfa9f16 and b0ca06e94cef2668fc92783e2c2e3d16d44930da) and on tor-browser-38.6.1esr-6.0-1 and tor-browser-38.6.1esr-5.5-1 (94474093f792979682e81b96ecd5b00ed1476170 and 1597c6e1fc5973cbe967b5364eb55ca480729610).
