Opened 6 years ago

Closed 4 years ago

#7550 closed defect (fixed)

BridgeDB email responder is not interactive

Reported by: aagbsn
Owned by: isis
Priority: Medium
Milestone:
Component: Obfuscation/BridgeDB
Version:
Severity:
Keywords: bridgedb-email
Cc: isis@…, Matthew.Finkel@…
Actual Points:
Parent ID: #7547
Points:
Reviewer:
Sponsor:

Description

BridgeDB's email response mentions that it supports queries for ipv6 bridges and bridges with specified transports, but the rate limiting feature prevents the responder from being used in an interactive way.

We could modify the rate limiting feature to allow several requests before responding negatively.

Alternately, the first response could include a brief set of instructions only, and only apply rate limiting to subsequent queries for bridges. However, this might be confusing, especially if not all translations are initially available.

Change History (14)

comment:1 in reply to:  description ; Changed 5 years ago by isis

Cc: isis@… added
Status: new → needs_information

Replying to aagbsn:

BridgeDB's email response mentions that it supports queries for ipv6 bridges and bridges with specified transports, but the rate limiting feature prevents the responder from being used in an interactive way.

We could modify the rate limiting feature to allow several requests before responding negatively.

Alternately, the first response could include a brief set of instructions only, and only apply rate limiting to subsequent queries for bridges. However, this might be confusing, especially if not all translations are initially available.

What if we were to do separate rate limits? Something like:

  1. a stricter rate limit (fewer queries allowed) for the 'get bridges' command
  2. a more permissive rate limit for all other valid commands
  3. an eventual blocked-for-X-amount-of-time for some threshold of non-valid commands
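
To make the three tiers above concrete, here is a minimal Python sketch. It is not BridgeDB code: the class name `TieredLimiter`, the limits in `LIMITS`, `INVALID_THRESHOLD`, and `BLOCK_SECONDS` are all made-up values for illustration, and an unparseable command is represented as `None`.

```python
import time
from collections import defaultdict, deque

# Hypothetical limits: (max requests, window in seconds) per command class.
LIMITS = {
    "get_bridges": (1, 3 * 3600),  # strict: one request per three hours
    "other": (5, 3600),            # permissive: five requests per hour
}
INVALID_THRESHOLD = 3      # invalid commands tolerated before blocking (assumed)
BLOCK_SECONDS = 24 * 3600  # the "X amount of time" (assumed)


class TieredLimiter:
    def __init__(self, clock=time.time):
        self.clock = clock
        self.history = defaultdict(deque)  # (sender, class) -> request timestamps
        self.invalid = defaultdict(int)    # sender -> count of invalid commands
        self.blocked_until = {}            # sender -> time the block expires

    def check(self, sender, command):
        """Return 'ok', 'rate_limited', 'invalid', or 'blocked'."""
        now = self.clock()
        if self.blocked_until.get(sender, 0) > now:
            return "blocked"
        if command is None:  # the mail did not contain a valid command
            self.invalid[sender] += 1
            if self.invalid[sender] >= INVALID_THRESHOLD:
                self.blocked_until[sender] = now + BLOCK_SECONDS
                self.invalid[sender] = 0
                return "blocked"
            return "invalid"
        cls = "get_bridges" if command == "get bridges" else "other"
        limit, window = LIMITS[cls]
        stamps = self.history[(sender, cls)]
        # Drop timestamps that have aged out of the window.
        while stamps and now - stamps[0] >= window:
            stamps.popleft()
        if len(stamps) >= limit:
            return "rate_limited"
        stamps.append(now)
        return "ok"
```

Keeping a separate history per command class means a user who has exhausted their 'get bridges' quota can still send other commands.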

comment:2 in reply to:  1 ; Changed 5 years ago by aagbsn

Replying to isis:

Replying to aagbsn:

BridgeDB's email response mentions that it supports queries for ipv6 bridges and bridges with specified transports, but the rate limiting feature prevents the responder from being used in an interactive way.

We could modify the rate limiting feature to allow several requests before responding negatively.

Alternately, the first response could include a brief set of instructions only, and only apply rate limiting to subsequent queries for bridges. However, this might be confusing, especially if not all translations are initially available.

What if we were to do separate rate limits? Something like:

  1. a stricter (less queries allowed) for the 'get bridges' command
  2. a more permissive rate limit for all other valid commands
  3. an eventual blocked-for-X-amount-of-time for some threshold of non-valid commands

This is a fine strategy, though it might be easier to just relax the rate limit to something like 5 requests per hour.

We should also consider replying with obfs2,3 bridges by default in each mail.

comment:3 in reply to:  2 Changed 5 years ago by sysrqb

Replying to aagbsn:

Replying to isis:

What if we were to do separate rate limits? Something like:

  1. a stricter (less queries allowed) for the 'get bridges' command
  2. a more permissive rate limit for all other valid commands
  3. an eventual blocked-for-X-amount-of-time for some threshold of non-valid commands

This is a fine strategy, though it might be easier to just relax the rate limit to something like 5 requests per hour.

We should also consider replying with obfs2,3 bridges by default in each mail.


This sounds good. We should definitely start replying with obfs2/3 bridges (can we whip up another quick hack?). The user won't be able to retrieve new bridges within a certain time period in any case, so providing the ability to send multiple commands will be useful. However, this could also be confusing to a user if these limits aren't explicitly defined, so we need to make it obvious to the user that they must wait three hours between 'get bridges' requests.

Another option is that when we receive a request from a 'first-time' user (we don't have a hash of their email address in the DB) we respond to their request with a welcome email which provides instructions on how to format emails and which features we support, and we record that we sent that instructional mail. Then on receipt of a subsequent mail which contains 'get bridges' we process it normally and return bridges as appropriate.

Maybe we also add a 'get help' command which is a request to resend the welcome email?

With this, I think command processing can easily be rate-limited to 5/hour as aagbsn suggested. Is this too complex?
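
A rough Python sketch of the welcome-mail flow described above, to judge how complex it would be. Everything here is hypothetical rather than existing BridgeDB code: the `Responder` class, the in-memory `seen` set standing in for the DB, the SHA-256 digest of the lowercased address, and the string return values that stand in for actual mail bodies.

```python
import hashlib


class Responder:
    def __init__(self):
        # Hashes of addresses that have already been sent the welcome mail.
        # A real deployment would persist this; a set is enough for a sketch.
        self.seen = set()

    @staticmethod
    def _digest(address):
        # Store only a hash of the (normalized) address, never the address.
        normalized = address.strip().lower()
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def respond(self, address, body):
        """Return which kind of reply to send: 'welcome', 'bridges', or 'unknown'."""
        key = self._digest(address)
        command = body.strip().lower()
        # First-time senders and explicit 'get help' requests get instructions.
        if key not in self.seen or command == "get help":
            self.seen.add(key)
            return "welcome"
        if command == "get bridges":
            return "bridges"
        return "unknown"
```

Rate limiting is deliberately left out of this sketch; it would wrap `respond` rather than live inside it.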

comment:4 Changed 5 years ago by asn

Is the rate limiting based on the IP of the client?

Also, what is the point of rate limiting in BridgeDB? A user with a single IP shouldn't be able to get more than a bunch of bridges anyway, right?

comment:5 in reply to:  4 ; Changed 5 years ago by sysrqb

Cc: Matthew.Finkel@… added

Replying to asn:

Is the rate limiting based on the IP of the client?

It's based on the email address. Currently, an email address is allowed to request bridges every 3 hours (they won't receive new bridges with every request, though). If several emails are received from the same address within a three-hour period, the first is answered with bridges, the second gets a warning that they are requesting bridges too frequently, and all subsequent emails are ignored until the period has passed.
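
As an illustration of that behaviour, here is a small Python sketch. It is not the actual BridgeDB implementation: `EmailThrottle` and `PERIOD` are hypothetical names, and it assumes the three-hour period is measured from the first email in the window.

```python
import time

PERIOD = 3 * 3600  # three hours, in seconds


class EmailThrottle:
    def __init__(self, clock=time.time):
        self.clock = clock
        self.state = {}  # sender -> (window start time, emails seen in window)

    def handle(self, sender):
        """Return 'bridges', 'warning', or 'ignore' for an incoming email."""
        now = self.clock()
        start, count = self.state.get(sender, (None, 0))
        if start is None or now - start >= PERIOD:
            # New sender, or the previous period has passed: start a new window.
            self.state[sender] = (now, 1)
            return "bridges"
        self.state[sender] = (start, count + 1)
        # Second email in the window gets a warning; the rest are dropped.
        return "warning" if count == 1 else "ignore"
```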

Also, what is the point of rate limiting in BridgeDB? A user with a single IP shouldn't be able to get more than a bunch of bridges anyway, right?

Right, maybe aagbsn (or arma, nickm, Karsten) have a better answer, because within a single time period we should return the same bridges. That being said, maybe the rate limiting is to reduce the number of emails bridgedb needs to process by disincentivizing users spamming it? I don't see a reason for bridgedb to respond to multiple emails within the time period if it will be responding with the same bridges each time.

comment:6 in reply to:  5 ; Changed 5 years ago by aagbsn

Replying to sysrqb:

Replying to asn:

Is the rate limiting based on the IP of the client?

It's based on the email address. Currently, an email address is allowed to request bridges every 3 hours (they won't receive new bridges with every request, though). If an email is received from the same address within a three hour period, the first email will be responded to with bridges, the second will contain a warning that says they are requesting bridges too frequently, and all subsequent emails will be ignored until the time period is passed.

Also, what is the point of rate limiting in BridgeDB? A user with a single IP shouldn't be able to get more than a bunch of bridges anyway, right?

Right, maybe aagbsn (or arma, nickm, Karsten) have a better answer, because within a single time period we should return the same bridges. That being said, maybe the rate limiting is to reduce the number of emails bridgedb needs to process by disincentivizing users spamming it? I don't see a reason for bridgedb to respond to multiple emails within the time period if it will be responding with the same bridges each time.

This. We also see a barrage of requests over HTTPS.

Sadly, the attackers/scrapers simply register "creative" names (somename0001@…, somename0002@… .. somename0020@…) and keep at it.

Any ideas? Text CAPTCHA? ASCII-art cats?

--Aaron

comment:7 in reply to:  6 ; Changed 5 years ago by sysrqb

Replying to aagbsn:

Replying to sysrqb:

Replying to asn:

Also, what is the point of rate limiting in BridgeDB? A user with a single IP shouldn't be able to get more than a bunch of bridges anyway, right?

Right, maybe aagbsn (or arma, nickm, Karsten) have a better answer, because within a single time period we should return the same bridges. That being said, maybe the rate limiting is to reduce the number of emails bridgedb needs to process by disincentivizing users spamming it? I don't see a reason for bridgedb to respond to multiple emails within the time period if it will be responding with the same bridges each time.

This. We also see a barrage of requests over HTTPS.

Sadly, the attackers/scrapers simply register "creative" names (somename0001@…, somename0002@… .. somename0020@…) and keep at it.

Any ideas? Text CAPTCHA? ASCII-art cats?

--Aaron

Hrm, I do like the ASCII-art catz idea, that might do the trick.

OTOH, I think if we make some progress with #7520 (however it's designed) and require some sort of earned token/credit, then I think this is a step in the right direction. If we assume accounts for this social distributor are a limited resource, then (short of compromised account/impersonation/malicious users) we'll be able to drop all unauthenticated requests and spamming/abuse will be linkable. Originally, restricting requests to specific domains *was* an okay solution, but the justification really is not as applicable anymore, afaict. I feel like this is veering OT for this ticket, however. (sorry)

So, thoughts? More interactive email responder is good/bad? (choose one)

comment:8 in reply to:  7 Changed 5 years ago by aagbsn

Replying to sysrqb:

Replying to aagbsn:

Replying to sysrqb:

Replying to asn:

Also, what is the point of rate limiting in BridgeDB? A user with a single IP shouldn't be able to get more than a bunch of bridges anyway, right?

Right, maybe aagbsn (or arma, nickm, Karsten) have a better answer, because within a single time period we should return the same bridges. That being said, maybe the rate limiting is to reduce the number of emails bridgedb needs to process by disincentivizing users spamming it? I don't see a reason for bridgedb to respond to multiple emails within the time period if it will be responding with the same bridges each time.

This. We also see a barrage of requests over HTTPS.

Sadly, the attackers/scrapers simply register "creative" names (somename0001@…, somename0002@… .. somename0020@…) and keep at it.

Any ideas? Text CAPTCHA? ASCII-art cats?

--Aaron

Hrm, I do like the ASCII-art catz idea, that might do the trick.

OTOH, I think if we make some progress with #7520 (however it's designed) and require some sort of earned token/credit, then I think this is a step in the right direction. If we assume accounts for this social distributor are a limited resource, then (short of compromised account/impersonation/malicious users) we'll be able to drop all unauthenticated requests and spamming/abuse will be linkable. Originally, restricting requests to specific domains *was* an okay solution, but the justification really is not as applicable anymore, afaict. I feel like this is veering OT for this ticket, however. (sorry)

So, thoughts? More interactive email responder is good/bad? (choose one)

I think that #7520 is likely to be a longer term project (several months at least), and would be a separate distributor from the current email distributor -- BridgeDB allocates fractions of its bridges to different distribution strategies, we evaluate which bridges see more use, and in turn allocate larger amounts of bridges to those strategies.

Any strategy for making bridge scraping harder buys a window of time, and we learn the response time of the scrapers (and their abilities).

Btw, another item we might want to add is exposing the rate at which bridges are served via each distributor, so we can graph the data and see how effective any scraping mitigations actually are.
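
Something as simple as the following Python sketch might be enough for the per-distributor rate. The names (`ServeStats`, `record`, `series`) and the hourly bucketing are assumptions for illustration, not an existing BridgeDB interface.

```python
from collections import Counter


class ServeStats:
    def __init__(self):
        # (distributor name, hour bucket) -> number of bridges served
        self.counts = Counter()

    def record(self, distributor, timestamp, n_bridges):
        # Bucket by hour so the output can be graphed as a time series.
        self.counts[(distributor, int(timestamp // 3600))] += n_bridges

    def series(self, distributor):
        """Return sorted (hour bucket, bridges served) pairs for one distributor."""
        return sorted(
            (hour, count)
            for (name, hour), count in self.counts.items()
            if name == distributor
        )
```

Comparing the series before and after a mitigation is deployed would show whether the serving rate actually changed.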

comment:9 in reply to:  7 ; Changed 5 years ago by asn

Replying to sysrqb:

Replying to aagbsn:

Replying to sysrqb:

Replying to asn:

Also, what is the point of rate limiting in BridgeDB? A user with a single IP shouldn't be able to get more than a bunch of bridges anyway, right?

Right, maybe aagbsn (or arma, nickm, Karsten) have a better answer, because within a single time period we should return the same bridges. That being said, maybe the rate limiting is to reduce the number of emails bridgedb needs to process by disincentivizing users spamming it? I don't see a reason for bridgedb to respond to multiple emails within the time period if it will be responding with the same bridges each time.

This. We also see a barrage of requests over HTTPS.

Sadly, the attackers/scrapers simply register "creative" names (somename0001@…, somename0002@… .. somename0020@…) and keep at it.

Any ideas? Text CAPTCHA? ASCII-art cats?

--Aaron

Hrm, I do like the ASCII-art catz idea, that might do the trick.

I'm still not sold that rate-limiting is necessary for BridgeDB to be able to process mails. I would have thought that a _huge_ amount of mails is needed to bloat a modern computer (especially without expensive crypto happening during processing). But there was probably a reason that the rate-limiting was introduced in the first place, so I guess it's fine.

OTOH, I think if we make some progress with #7520 (however it's designed) and require some sort of earned token/credit, then I think this is a step in the right direction. If we assume accounts for this social distributor are a limited resource, then (short of compromised account/impersonation/malicious users) we'll be able to drop all unauthenticated requests and spamming/abuse will be linkable. Originally, restricting requests to specific domains *was* an okay solution, but the justification really is not as applicable anymore, afaict. I feel like this is veering OT for this ticket, however. (sorry)

So, thoughts? More interactive email responder is good/bad? (choose one)

Like aagbsn, I also see #7520 as a long-term project (my guess is that it won't be deployed in the next 6 months).

I'd say that the interactive email responder might be easy-ish to implement, and it will greatly improve the experience of its users, so it's probably worth pursuing IMO.

comment:10 in reply to:  9 Changed 5 years ago by sysrqb

Replying to asn:

Replying to sysrqb:

I'm still not sold that rate-limiting is necessary for BridgeDB to be able to process mails. I would have thought that a _huge_ amount of mails is needed to bloat a modern computer (especially without expensive crypto happening during processing). But there was probably a reason that the rate-limiting was introduced in the first place, so I guess it's fine.

I don't think it's something we absolutely need, but I can't think of a reason why a person needs to send a query for the same bridges over and over and over again. And, if they do, I don't see why we need to respond to them. The most intense part of the response is the process of hashing the email address and determining from where in the hash ring their bridges should be selected.
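
For a sense of how cheap that step is, here is a generic hash-ring lookup sketched in Python. It is a simplified illustration, not BridgeDB's actual ring code: the `HashRing` class, the HMAC-SHA256 positions, the `period` argument, and lowercasing the whole address are all assumptions made for the example.

```python
import bisect
import hashlib
import hmac


def _position(key, value):
    # Keyed hash, so ring positions cannot be predicted without the key.
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


class HashRing:
    def __init__(self, key, bridges):
        self.key = key
        # Place every bridge on the ring at the hash of its identifier.
        self.ring = sorted((_position(key, b), b) for b in bridges)
        self.positions = [pos for pos, _ in self.ring]

    def select(self, email, period, n=3):
        """Return up to n bridges, walking clockwise from the address's position."""
        if not self.ring:
            return []
        # Same address and same period always map to the same ring position.
        pos = _position(self.key, f"{period}:{email.lower()}")
        start = bisect.bisect_left(self.positions, pos)
        n = min(n, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(n)]
```

The per-request cost is one HMAC plus a binary search, which supports the point that processing load alone is a weak argument for rate limiting.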

So, thoughts? More interactive email responder is good/bad? (choose one)

Like, aagbsn, I also see #7520 as a long-term project (my guess is that it won't be deployed in the next 6 months).

Agreed, but I think designing this should be a priority for BridgeDB.

I'd say that the interactive email responder might be easy-ish to implement, and it will greatly improve the experience of its users, so it's probably worth pursuing IMO.

Sounds good.

comment:11 Changed 5 years ago by sysrqb

Owner: set to sysrqb
Parent ID: #8616
Status: needs_information → accepted

comment:12 Changed 5 years ago by isis

Keywords: bridgedb-email added
Parent ID: #8616 → #7547

comment:13 Changed 4 years ago by isis

Owner: changed from sysrqb to isis
Status: accepted → assigned

Fixed, see this comment on #5463. I'm marking this as fixed, unless there is some specific usability or interactivity issue from the branches in #5463 which justifies further work on this ticket.

comment:14 Changed 4 years ago by isis

Resolution: fixed
Status: assigned → closed

post scriptum: It's now a mix between interactive and non-interactive, depending on whether it looks like the client knows what they're doing. See this comment on #5463 to see what the current responses look like.
