Opened 6 years ago

Last modified 3 weeks ago

#10831 assigned enhancement

Captchas are not accessible for blind users

Reported by: PZajda
Owned by: juggy
Priority: Medium
Milestone:
Component: Circumvention/BridgeDB
Version:
Severity: Normal
Keywords: bridgedb-reportbug, bridgedb-ui, s30-o22a2, anti-censorship-roadmap-2020
Cc: isis, brade, mcs, contact@…
Actual Points:
Parent ID: #31279
Points: 5
Reviewer:
Sponsor: Sponsor30-can

Description

Hi,

There is no way for blind users to solve captchas on bridges.torproject.org. Would it be possible to add one, please?
One solution would be to provide an audio captcha that reads out a combination of numbers or, in any case, something spelled out.
I think reCAPTCHA offers this option, but I last used it a long time ago, so I am not sure that is still the case.

Change History (30)

comment:1 Changed 6 years ago by isis

There are currently two ways to give CAPTCHAs to a BridgeDB user:

  1. Request a CAPTCHA from a reCaptcha API server using either BridgeDB's IP or a random fake IP, steal the image and the 'recaptcha_challenge_string' form field from the response (the code for this is here), and then serve it to the client. The client's CAPTCHA solution is then sent back to the reCaptcha API server for verification.
  2. There is a branch for #10809 which switches to using a local cache of CAPTCHAs, which is created with Gimp. I think we intend to go the latter route of using homebrewed CAPTCHAs, and adding audio CAPTCHA support would be excellent. The scripts which generate the CAPTCHAs cannot be run on BridgeDB, because Gimp requires X to be installed. The script produces a directory of image files which are named for the CAPTCHA answer, e.g. aT2bXvw7.jpg.

#2 is the better way to go, I think, as BridgeDB is switching to that. Though having support for reCaptcha's audio CAPTCHAs (#1) in BridgeDB would be good too.

For #2: I am uncertain of the best way to do this.

  • One idea would be to convert the image filenames to audio, by extending the gimp-captcha scripts to also produce audio files. I have not looked into Python TTS engine wrapping modules lately, and so I have little advice to give there.
  • Another idea, which might be more resource friendly, would be to ignore the filename completely and generate a random string, then use some TTS module to create the CAPTCHA (doing all this only if the audio CAPTCHA has been requested by a user).
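
A minimal sketch of the second idea, under stated assumptions: `synthesize` is a hypothetical callable standing in for whichever TTS or voice-sample module ends up being chosen, and the alphabet is assumed to match the mixed-case alphanumeric answers of the image CAPTCHAs. None of this is existing BridgeDB code.

```python
import secrets
import string

# Assumption: same mixed-case alphanumeric alphabet as the image CAPTCHAs.
ALPHABET = string.ascii_letters + string.digits


def random_challenge(length=8):
    """Return a random CAPTCHA answer, independent of any image filename."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def audio_captcha_on_demand(synthesize, length=8):
    """Create an audio CAPTCHA only when a client asks for one.

    `synthesize` is a hypothetical callable mapping the answer string to
    audio bytes; it stands in for the TTS module that still has to be chosen.
    """
    answer = random_challenge(length)
    return answer, synthesize(answer)
```

Because nothing is generated until a client requests audio, no audio files need to be stored next to the cached images.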

comment:2 Changed 6 years ago by isis

Status: new → assigned
Type: defect → enhancement

comment:3 Changed 5 years ago by isis

Keywords: isis2015Q3Q4 isisExB isisExC added

comment:4 Changed 5 years ago by isis

Keywords: bridgedb-ui added

comment:5 Changed 3 years ago by isis

Severity: Blocker
Status: assigned → new

comment:6 Changed 3 years ago by isis

Severity: Blocker → Normal

comment:7 Changed 3 years ago by mcs

Cc: brade mcs added

comment:8 Changed 3 years ago by unknown_artist

Can we use Python's captcha library for generating audio captchas? Also, we can use the same library for generating image captchas because it doesn't require X to be installed and hopefully we can run it on BridgeDB.

comment:9 Changed 3 years ago by Samdney

Cc: contact@… added

comment:10 Changed 3 years ago by Samdney

Cc: contact@… removed

comment:11 Changed 3 years ago by Samdney

Cc: contact@… added

comment:12 in reply to:  8 Changed 3 years ago by Samdney

Replying to unknown_artist:

When you comment on a ticket, please give more specific information. If I search the web for "python captcha", I find several Python libraries.

Hence, nobody knows exactly which one you are talking about.

Please write down, for example, the exact name of the library, the functions you would use, a sketch of the code you want to implement, etc.

comment:13 Changed 3 years ago by unknown_artist

I am planning to use https://pypi.python.org/pypi/captcha for generating captchas. As per the documentation, we can do something like this to generate audio captchas:

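from captcha.audio import AudioCaptcha

# voicedir holds one sub-directory of .wav samples per character.
audio = AudioCaptcha(voicedir='/path/to/voices')
# Write an audio CAPTCHA whose correct answer is 'aT2bXvw7'.
audio.write('aT2bXvw7', 'aT2bXvw7.wav')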

The above code snippet will generate an audio captcha whose correct answer is aT2bXvw7.
The voice directory should contain one directory per character, each named after that single character, for example:

  • a/
  • b/
  • c/

These directories should contain 8-bit PCM .wav files. Each character directory may contain any number of .wav files, and one of them will be chosen at random when a captcha is generated.
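
As a rough pre-flight check, the layout above could be verified with the standard-library `wave` module before generating any captchas. This is only a sketch: `missing_voice_chars` is a hypothetical helper name and not part of the captcha library.

```python
import os
import wave


def missing_voice_chars(voicedir, alphabet):
    """Return the characters in `alphabet` lacking a usable 8-bit PCM sample."""
    missing = []
    for char in alphabet:
        chardir = os.path.join(voicedir, char)
        usable = False
        if os.path.isdir(chardir):
            for name in os.listdir(chardir):
                if not name.endswith(".wav"):
                    continue
                try:
                    with wave.open(os.path.join(chardir, name), "rb") as sample:
                        # getsampwidth() is in bytes: 1 byte == 8-bit PCM.
                        if sample.getsampwidth() == 1:
                            usable = True
                except wave.Error:
                    # Not a readable PCM .wav file; ignore it.
                    continue
        if not usable:
            missing.append(char)
    return missing
```

Calling it with the full answer alphabet shows which recordings still have to be made. Note that on a case-insensitive filesystem the directories for `a` and `A` would collide, which matters for mixed-case answers such as aT2bXvw7.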

comment:14 in reply to:  13 Changed 3 years ago by isis

Replying to unknown_artist:

Hi unknown_artist!

Thanks for looking into this! It looks good. We'd need to make the recordings as part of this ticket, since their default voice only includes characters 0-9. From their README, it looks like they'd appreciate an upstream contribution of voice files as well.

We'll also need to update the interface at https://bridges.torproject.org/bridges to have some button people can click to hear audio, and probably have a hidden directive for screen readers before the header bar at the top of the page, e.g. something like:

.screen-reader-text { 
   clip: rect(1px, 1px, 1px, 1px); 
   height: 1px; 
   width: 1px; 
   overflow: hidden; 
   position: absolute !important;
}
<span class="screen-reader-text">Instructions for those using screen readers: 
please use access key 'a' to play an audio captcha, enter the characters you 
hear into the form which is accessible via access key 't', and then press 
enter. Please be aware that the audio captcha is in English.</span>

The American Foundation for the Blind has some helpful tips for making web things easier on people with braille terminals and screen readers.

We may also want to put a screen reader note (on the page which contains the actual bridges) to let them know what the access key is for the "Select All" button to copy the bridge lines. (It also doesn't appear to have an access key right now.)

Oh, all of the strings above should also be translated; you can do that by making them constants in bridgedb/strings.py.

Let me know if you need any help!

Last edited 3 years ago by isis

comment:15 Changed 17 months ago by gaba

Keywords: isis2015Q3Q4 isisExB isisExC removed
Owner: isis deleted
Points: 5
Status: new → assigned

comment:16 Changed 16 months ago by gaba

Sponsor: Sponsor19

comment:17 Changed 13 months ago by gaba

Keywords: anti-censorship-roadmap-2019 added

comment:18 Changed 12 months ago by phw

Sponsor: Sponsor19 → Sponsor30-can

Moving from Sponsor 19 to Sponsor 30.

comment:19 Changed 12 months ago by gaba

Keywords: anti-censorship-roadmap added; anti-censorship-roadmap-2019 removed

comment:20 Changed 8 months ago by gaba

Parent ID: #31279

comment:21 Changed 8 months ago by gaba

Keywords: s30-o22a2 added

comment:22 Changed 4 months ago by gaba

Keywords: anti-censorship-roadmap-2020Q1 added; anti-censorship-roadmap removed

comment:23 Changed 3 months ago by teor

Status: assigned → new

Change tickets that are assigned to nobody to "new".

comment:24 Changed 8 weeks ago by juggy

I wrote a sample web server https://github.com/jugheadjones10/bridgedb-audio-captcha that serves the original BridgeDB captcha page with audio captchas (using suggestions from the comments here). Could I receive some feedback about any naive code or problems that might arise if this is integrated into BridgeDB? Thank you!

Last edited 8 weeks ago by juggy

comment:25 Changed 8 weeks ago by pili

Owner: set to juggy
Status: new → assigned

comment:26 Changed 8 weeks ago by juggy

Status: assigned → needs_revision

comment:27 Changed 8 weeks ago by juggy

Status: needs_revision → needs_review

comment:28 in reply to:  24 Changed 8 weeks ago by phw

Status: needs_review → new

Replying to juggy:

Thanks for working on this! I gave it a shot and it worked for me. Here are some thoughts:

  • The size of a single audio CAPTCHA seems to be approximately 85 KB. It should be straightforward to add the audio CAPTCHA to bridges.torproject.org, but if possible, we should also make it available over moat. We could encode it in Base64 and send it in the HTTP response to a moat request. However, more than 85 KB extra per request sounds expensive for a CAPTCHA that only a small fraction of users would use, although we may be able to reduce the size.
  • The library's default voice is English, which is a potential usability problem. It would be neat if we had multiple languages but this doesn't strike me as a critical issue. Most people will recognise English numbers.
  • Your GitHub repository contains the following question:

    A concern : Given the simple input-output nature of the Python audio captcha library, it seems like it wouldn't take long to train a simple model to accurately crack the audio captcha.

    That's true but I wouldn't expect the audio CAPTCHA to be easier to break than the visual CAPTCHA, or am I missing something? As long as it doesn't make our distributor easier to attack, I see no problem in deploying it.
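
To put a number on the Base64 idea: padded Base64 emits 4 output bytes for every 3 input bytes, so the payload grows by about a third. A small sketch, assuming "85 KB" means 85 × 1024 bytes (`base64_size` is a hypothetical helper, not moat code):

```python
import base64
import math


def base64_size(raw_bytes):
    """Size in bytes of the padded Base64 encoding of `raw_bytes` bytes."""
    return 4 * math.ceil(raw_bytes / 3)


raw = 85 * 1024  # assumption: "85 KB" read as 85 KiB
encoded = base64_size(raw)

# Cross-check the formula against the real encoder on a dummy payload.
assert encoded == len(base64.b64encode(bytes(raw)))
print(f"{raw} raw bytes -> {encoded} Base64 bytes")
```

Under that assumption, an 85 KB CAPTCHA would come to roughly 113 KB once encoded, before any size reduction.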

comment:29 Changed 7 weeks ago by juggy

Status: new → assigned

comment:30 Changed 3 weeks ago by gaba

Keywords: anti-censorship-roadmap-2020 added; anti-censorship-roadmap-2020Q1 removed

No more Q1 for 2020.
