BridgeDB email responder is not interactive

added bridgedb-email component::circumvention/bridgedb owner::isis parent::7547 priority::medium resolution::fixed status::closed type::defect labels

Replying to aagbsn:

BridgeDB's email response mentions that it supports queries for ipv6 bridges and bridges with specified transports, but the rate limiting feature prevents the responder from being used in an interactive way.

We could modify the rate limiting feature to allow several requests before responding negatively.

Alternately, the first response could include a brief set of instructions only, and only apply rate limiting to subsequent queries for bridges. However, this might be confusing, especially if not all translations are initially available.

What if we were to do separate rate limits? Something like:

a stricter limit (fewer queries allowed) for the 'get bridges' command
a more permissive rate limit for all other valid commands
an eventual blocked-for-X-amount-of-time for some threshold of non-valid commands
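
The three tiers listed above could be sketched roughly as follows. This is only an illustration, not BridgeDB code: the class name `CommandRateLimiter`, the command kinds, and every numeric limit are hypothetical placeholders, and state is kept in memory rather than in BridgeDB's database.

```python
import time
from collections import defaultdict, deque

# Hypothetical limits as (max requests, window in seconds); values are illustrative only.
LIMITS = {
    "get_bridges": (1, 3 * 60 * 60),   # stricter: one request per three hours
    "other_valid": (5, 60 * 60),       # more permissive: five requests per hour
}
INVALID_THRESHOLD = 10                 # hypothetical number of invalid commands tolerated
BLOCK_SECONDS = 24 * 60 * 60           # hypothetical "blocked for X amount of time"


class CommandRateLimiter:
    """Per-sender rate limiter with a separate budget for each command kind."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._history = defaultdict(deque)   # (sender, kind) -> request timestamps
        self._invalid = defaultdict(int)     # sender -> count of invalid commands
        self._blocked_until = {}             # sender -> time at which the block expires

    def check(self, sender, kind):
        """Return 'allow', 'limited', 'invalid' or 'blocked' for this request."""
        now = self._clock()
        if self._blocked_until.get(sender, 0) > now:
            return "blocked"
        if kind == "invalid":
            self._invalid[sender] += 1
            if self._invalid[sender] >= INVALID_THRESHOLD:
                self._blocked_until[sender] = now + BLOCK_SECONDS
                self._invalid[sender] = 0
                return "blocked"
            return "invalid"
        max_requests, window = LIMITS[kind]
        stamps = self._history[(sender, kind)]
        # Forget requests that have fallen out of the sliding window.
        while stamps and now - stamps[0] >= window:
            stamps.popleft()
        if len(stamps) >= max_requests:
            return "limited"
        stamps.append(now)
        return "allow"
```

A caller would classify each incoming mail into one of the kinds first, then answer, warn, or drop the mail depending on the returned string.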

Trac:
Status: new to needs_information
Cc: N/A to isis@torproject.org

Replying to isis:

Replying to aagbsn:

BridgeDB's email response mentions that it supports queries for ipv6 bridges and bridges with specified transports, but the rate limiting feature prevents the responder from being used in an interactive way.

We could modify the rate limiting feature to allow several requests before responding negatively.

Alternately, the first response could include a brief set of instructions only, and only apply rate limiting to subsequent queries for bridges. However, this might be confusing, especially if not all translations are initially available.

What if we were to do separate rate limits? Something like:

a stricter limit (fewer queries allowed) for the 'get bridges' command

a more permissive rate limit for all other valid commands

an eventual blocked-for-X-amount-of-time for some threshold of non-valid commands

This is a fine strategy, though it might be easier to just relax the rate limit to something like 5 requests per hour.

We should also consider replying with obfs2,3 bridges by default in each mail.

Replying to aagbsn:

Replying to isis:

What if we were to do separate rate limits? Something like:

a stricter limit (fewer queries allowed) for the 'get bridges' command

a more permissive rate limit for all other valid commands

an eventual blocked-for-X-amount-of-time for some threshold of non-valid commands

This is a fine strategy, though it might be easier to just relax the rate limit to something like 5 requests per hour.

We should also consider replying with obfs2,3 bridges by default in each mail.

This sounds good. We should definitely start replying with obfs2/3 bridges (can we whip up another quick hack?). The user won't be able to retrieve new bridges within a certain time period in any case, so providing the ability to send multiple commands will be useful. However, this could also be confusing to a user if these limits aren't explicitly defined, so we need to make it obvious to the user that they must wait three hours between 'get bridges' requests.

Another option is that when we receive a request from a 'first-time' user (we don't have a hash of their email address in the DB) we respond to their request with a welcome email which provides instructions on how to format emails and which features we support, and we record that we sent that instructional mail. Then on receipt of a subsequent mail which contains 'get bridges' we process it normally and return bridges as appropriate.

Maybe we also add a 'get help' command which is a request to resend the welcome email?

With this, I think command processing can easily be rate-limited to 5/hour as aagbsn suggested. Is this too complex?
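
A minimal sketch of the first-time-user flow proposed above, under stated assumptions: the `Responder` class and the returned labels are hypothetical, SHA-256 stands in for whatever hash the DB actually stores, and an in-memory set stands in for the DB itself.

```python
import hashlib


class Responder:
    """Decide which kind of reply a mail should get (illustrative only)."""

    def __init__(self):
        # Hashes of addresses that have already been sent the welcome mail.
        self._welcomed = set()

    @staticmethod
    def _digest(address):
        # Assumption: addresses are normalised and hashed before being stored.
        return hashlib.sha256(address.strip().lower().encode("utf-8")).hexdigest()

    def handle(self, address, body):
        """Return 'welcome', 'bridges' or 'unrecognized'."""
        key = self._digest(address)
        command = body.strip().lower()
        # First contact, or an explicit 'get help': (re)send the instructions.
        if key not in self._welcomed or command == "get help":
            self._welcomed.add(key)
            return "welcome"
        if command.startswith("get bridges"):
            return "bridges"
        return "unrecognized"
```

Under this sketch the first mail never returns bridges, even if it already contains a well-formed 'get bridges' command.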

Is the rate limiting based on the IP of the client?

Also, what is the point of rate limiting in BridgeDB? A user with a single IP shouldn't be able to get more than a bunch of bridges anyway, right?

Replying to asn:

Is the rate limiting based on the IP of the client?

It's based on the email address. Currently, an email address is allowed to request bridges every 3 hours (they won't receive new bridges with every request, though). If an email is received from the same address within a three hour period, the first email will be responded to with bridges, the second will contain a warning that says they are requesting bridges too frequently, and all subsequent emails will be ignored until the time period is passed.
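
The behaviour described above (bridges, then one warning, then silence until the period has passed) can be modelled like this. It is a sketch of the description only, not the actual BridgeDB implementation; `EmailThrottle` and its return labels are made-up names.

```python
import time

WINDOW_SECONDS = 3 * 60 * 60  # the three-hour period mentioned above


class EmailThrottle:
    """Track, per email address, how to answer the next request."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._state = {}  # address -> (time bridges were last sent, warning already sent?)

    def decide(self, address):
        """Return 'bridges', 'warning' or 'ignore'."""
        now = self._clock()
        entry = self._state.get(address)
        if entry is None or now - entry[0] >= WINDOW_SECONDS:
            # First request in a new period: answer with bridges.
            self._state[address] = (now, False)
            return "bridges"
        start, warned = entry
        if not warned:
            # Second request in the period: warn once.
            self._state[address] = (start, True)
            return "warning"
        # Everything else in the period is dropped silently.
        return "ignore"
```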

Also, what is the point of rate limiting in BridgeDB? A user with a single IP shouldn't be able to get more than a bunch of bridges anyway, right?

Right, maybe aagbsn (or arma, nickm, Karsten) have a better answer, because within a single time period we should return the same bridges. That being said, maybe the rate limiting is to reduce the number of emails bridgedb needs to process by disincentivizing users spamming it? I don't see a reason for bridgedb to respond to multiple emails within the time period if it will be responding with the same bridges each time.

Trac:
Cc: isis@torproject.org to isis@torproject.org, Matthew.Finkel@gmail.com

Replying to sysrqb:

Replying to asn:

Is the rate limiting based on the IP of the client?

It's based on the email address. Currently, an email address is allowed to request bridges every 3 hours (they won't receive new bridges with every request, though). If an email is received from the same address within a three hour period, the first email will be responded to with bridges, the second will contain a warning that says they are requesting bridges too frequently, and all subsequent emails will be ignored until the time period is passed.

Also, what is the point of rate limiting in BridgeDB? A user with a single IP shouldn't be able to get more than a bunch of bridges anyway, right?

Right, maybe aagbsn (or arma, nickm, Karsten) have a better answer, because within a single time period we should return the same bridges. That being said, maybe the rate limiting is to reduce the number of emails bridgedb needs to process by disincentivizing users spamming it? I don't see a reason for bridgedb to respond to multiple emails within the time period if it will be responding with the same bridges each time.

This. We also see a barrage of requests over HTTPS.

Sadly, the attackers/scrapers simply register "creative" names (somename0001@gmail.com, somename0002@gmail.com .. somename0020@gmail.com) and keep at it.

Any ideas? Text CAPTCHA? ASCII-art cats?

--Aaron

Replying to aagbsn:

Replying to sysrqb:

Replying to asn:

Also, what is the point of rate limiting in BridgeDB? A user with a single IP shouldn't be able to get more than a bunch of bridges anyway, right?

Right, maybe aagbsn (or arma, nickm, Karsten) have a better answer, because within a single time period we should return the same bridges. That being said, maybe the rate limiting is to reduce the number of emails bridgedb needs to process by disincentivizing users spamming it? I don't see a reason for bridgedb to respond to multiple emails within the time period if it will be responding with the same bridges each time.

This. We also see a barrage of requests over HTTPS.

Sadly, the attackers/scrapers simply register "creative" names (somename0001@gmail.com, somename0002@gmail.com .. somename0020@gmail.com) and keep at it.

Any ideas? Text CAPTCHA? ASCII-art cats?

--Aaron

Hrm, I do like the ASCII-art catz idea, that might do the trick.

OTOH, I think if we make some progress with #7520 (moved) (however it's designed) and require some sort of earned token/credit, then I think this is a step in the right direction. If we assume accounts for this social distributor are a limited resource, then (short of compromised account/impersonation/malicious users) we'll be able to drop all unauthenticated requests and spamming/abuse will be linkable. Originally, restricting requests to specific domains was an okay solution, but the justification really is not as applicable anymore, afaict. I feel like this is veering OT for this ticket, however. (sorry)

So, thoughts? More interactive email responder is good/bad? (choose one)

Replying to sysrqb:

Replying to aagbsn:

Replying to sysrqb:

Replying to asn:

Also, what is the point of rate limiting in BridgeDB? A user with a single IP shouldn't be able to get more than a bunch of bridges anyway, right?

Right, maybe aagbsn (or arma, nickm, Karsten) have a better answer, because within a single time period we should return the same bridges. That being said, maybe the rate limiting is to reduce the number of emails bridgedb needs to process by disincentivizing users spamming it? I don't see a reason for bridgedb to respond to multiple emails within the time period if it will be responding with the same bridges each time.

This. We also see a barrage of requests over HTTPS.

Sadly, the attackers/scrapers simply register "creative" names (somename0001@gmail.com, somename0002@gmail.com .. somename0020@gmail.com) and keep at it.

Any ideas? Text CAPTCHA? ASCII-art cats?

--Aaron

Hrm, I do like the ASCII-art catz idea, that might do the trick.

OTOH, I think if we make some progress with #7520 (moved) (however it's designed) and require some sort of earned token/credit, then I think this is a step in the right direction. If we assume accounts for this social distributor are a limited resource, then (short of compromised account/impersonation/malicious users) we'll be able to drop all unauthenticated requests and spamming/abuse will be linkable. Originally, restricting requests to specific domains was an okay solution, but the justification really is not as applicable anymore, afaict. I feel like this is veering OT for this ticket, however. (sorry)

So, thoughts? More interactive email responder is good/bad? (choose one)

I think that #7520 (moved) is likely to be a longer term project (several months at least), and would be a separate distributor from the current email distributor -- BridgeDB allocates fractions of its bridges to different distribution strategies, we evaluate which bridges see more use, and in turn allocate larger amounts of bridges to those strategies.

Any strategy that makes bridge scraping harder buys a window of time, and we learn the scrapers' response time (and their abilities).

Btw, another item we might want to add is exposing the rate at which bridges are served via each distributor, so we can graph the data and see how effective any scraping mitigations actually are.
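
One way the per-distributor serving rate could be collected for graphing is sketched below. `ServeStats`, the hourly bucket size, and the distributor names in the usage note are hypothetical; nothing here reflects an existing BridgeDB interface.

```python
from collections import Counter


class ServeStats:
    """Count bridges handed out per distributor in fixed time buckets."""

    def __init__(self, bucket_seconds=3600):
        self._bucket = bucket_seconds
        self._counts = Counter()  # (distributor, bucket index) -> bridges served

    def record(self, distributor, n_bridges, timestamp):
        """Note that n_bridges were served by distributor at timestamp."""
        self._counts[(distributor, int(timestamp // self._bucket))] += n_bridges

    def series(self, distributor):
        """Return sorted (bucket start time, bridges served) pairs for plotting."""
        return sorted(
            (bucket * self._bucket, count)
            for (name, bucket), count in self._counts.items()
            if name == distributor
        )
```

Calling `record("email", 3, time.time())` on every reply and periodically dumping `series("email")` and `series("https")` would give one time series per distributor to compare before and after a mitigation is deployed.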

Replying to sysrqb:

Replying to aagbsn:

Replying to sysrqb:

Replying to asn:

Also, what is the point of rate limiting in BridgeDB? A user with a single IP shouldn't be able to get more than a bunch of bridges anyway, right?

Right, maybe aagbsn (or arma, nickm, Karsten) have a better answer, because within a single time period we should return the same bridges. That being said, maybe the rate limiting is to reduce the number of emails bridgedb needs to process by disincentivizing users spamming it? I don't see a reason for bridgedb to respond to multiple emails within the time period if it will be responding with the same bridges each time.

This. We also see a barrage of requests over HTTPS.

Sadly, the attackers/scrapers simply register "creative" names (somename0001@gmail.com, somename0002@gmail.com .. somename0020@gmail.com) and keep at it.

Any ideas? Text CAPTCHA? ASCII-art cats?

--Aaron

Hrm, I do like the ASCII-art catz idea, that might do the trick.

I'm still not sold that rate-limiting is necessary for BridgeDB to be able to process mails. I would have thought that a huge number of mails would be needed to overload a modern computer (especially without expensive crypto happening during processing). But there was probably a reason that the rate-limiting was introduced in the first place, so I guess it's fine.

OTOH, I think if we make some progress with #7520 (moved) (however it's designed) and require some sort of earned token/credit, then I think this is a step in the right direction. If we assume accounts for this social distributor are a limited resource, then (short of compromised account/impersonation/malicious users) we'll be able to drop all unauthenticated requests and spamming/abuse will be linkable. Originally, restricting requests to specific domains was an okay solution, but the justification really is not as applicable anymore, afaict. I feel like this is veering OT for this ticket, however. (sorry)

So, thoughts? More interactive email responder is good/bad? (choose one)

Like aagbsn, I also see #7520 (moved) as a long-term project (my guess is that it won't be deployed in the next 6 months).

I'd say that the interactive email responder might be easy-ish to implement, and it will greatly improve the experience of its users, so it's probably worth pursuing IMO.

Replying to asn:

Replying to sysrqb:

I'm still not sold that rate-limiting is necessary for BridgeDB to be able to process mails. I would have thought that a huge number of mails would be needed to overload a modern computer (especially without expensive crypto happening during processing). But there was probably a reason that the rate-limiting was introduced in the first place, so I guess it's fine.

I don't think it's something we absolutely need, but I can't think of a reason why a person needs to send a query for the same bridges over and over and over again. And, if they do, I don't see why we need to respond to them. The most intense part of the response is the process of hashing the email address and determining from where in the hash ring their bridges should be selected.
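
For readers unfamiliar with the hash ring step mentioned above, here is a rough sketch of the general technique. It assumes HMAC-SHA256 positions and a per-period lookup key; the `BridgeRing` class, the `period` parameter, and the default of three bridges are illustrative choices and may differ from what BridgeDB actually does.

```python
import bisect
import hashlib
import hmac


def _position(key, data):
    """Map a string to a point on the ring (a hex digest, compared as text)."""
    return hmac.new(key, data.encode("utf-8"), hashlib.sha256).hexdigest()


class BridgeRing:
    """Place bridges on a ring and pick the ones following a client's position."""

    def __init__(self, key, bridges):
        self._key = key
        self._ring = sorted((_position(key, b), b) for b in bridges)
        self._positions = [pos for pos, _ in self._ring]

    def get_bridges(self, email, period, count=3):
        """Return up to count bridges; stable for a given address and period."""
        if not self._ring:
            return []
        # Hash the normalised address together with the time period.
        pos = _position(self._key, f"{period}:{email.strip().lower()}")
        start = bisect.bisect_left(self._positions, pos)
        n = min(count, len(self._ring))
        # Walk clockwise from the client's position, wrapping around the ring.
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(n)]
```

Because the lookup is a hash plus a binary search, repeated queries from one address within a period are cheap and always yield the same bridges, which matches the point that answering them again adds nothing for the user.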

So, thoughts? More interactive email responder is good/bad? (choose one)

Like aagbsn, I also see #7520 (moved) as a long-term project (my guess is that it won't be deployed in the next 6 months).

Agreed, but I think designing this should be a priority for BridgeDB.

I'd say that the interactive email responder might be easy-ish to implement, and it will greatly improve the experience of its users, so it's probably worth pursuing IMO.

Sounds good.

Trac:
Status: needs_information to accepted
Owner: N/A to sysrqb
Parent: N/A to #8616 (moved)

Trac:
Parent: #8616 (moved) to #7547 (moved)
Keywords: N/A deleted, bridgedb-email added

Fixed, see this comment on #5463 (moved). I'm marking this as fixed, unless there is some specific usability or interactivity issue from the branches in #5463 (moved) which justifies further work on this ticket.

Trac:
Owner: sysrqb to isis
Status: accepted to assigned

post scriptum: It's now a mix between interactive and non-interactive, depending on whether it looks like the client knows what they're doing. See this comment on #5463 (moved) to see what the current responses look like.

Trac:
Status: assigned to closed
Resolution: N/A to fixed

closed

mentioned in issue #8241 (moved)

mentioned in issue #10813 (moved)

moved to tpo/anti-censorship/bridgedb#7550 (closed)
