Opened 8 months ago

Last modified 3 weeks ago

#29207 assigned enhancement

New design for broker -- proxy protocol for snowflakes

Reported by: cohosh Owned by: cohosh
Priority: High Milestone:
Component: Circumvention/Snowflake Version:
Severity: Normal Keywords: snowflake, design, ex-sponsor-19, anti-censorship-roadmap
Cc: dcf, arlolra, cohosh Actual Points:
Parent ID: Points: 5
Reviewer: Sponsor: Sponsor28-must

Description (last modified by cohosh)

This is related to the Snowflake protocol design tickets #29206 and #29293.

We want to write these protocols in a way that is not Snowflake-specific but allows any type of bridge to connect to or poll our broker/BridgeDB bridge distribution service.

The idea is that in the beginning we will start with very reliable "bridge" (which could be snowflakes) that perhaps rotate IP addresses every month or so. After that we can collect measurements and move towards more ephemeral "bridges".

Some things to keep in mind are the types of information that the snowflakes give to the broker (such as proxy version/type) to allow for metrics. This information might change so we'll want a flexible and extensible protocol.

Child Tickets

TicketTypeStatusOwnerSummary
#29260tasknewShould Snowflake proxies have a way to identify themselves to the broker
#29734enhancementclosedcohoshBroker should receive country stats information from Proxy and Client
#29736enhancementassignedahfUse WebSocket protocol to communicate between snowflake proxies and broker

Change History (21)

comment:1 Changed 8 months ago by cohosh

Cc: cohosh added

comment:2 Changed 8 months ago by cohosh

Sponsor: Sponsor19

comment:3 Changed 8 months ago by cohosh

Description: modified (diff)
Keywords: snowflake design added

comment:4 Changed 8 months ago by gaba

Points: 5

comment:5 Changed 8 months ago by cohosh

As referenced in #29426, the broker currently gives proxies a 504 message if no client is available which is a questionable design:

2019/02/07 12:11:51 broker returns: 504
INFO: peerconnection.go:468: fired OnIceCandidateError: 143
2019/02/07 12:12:01 broker returns: 504

At the very least it makes logs confusing.

comment:6 Changed 7 months ago by gaba

Keywords: network-team-roadmap-2019-Q1Q2 added

comment:7 Changed 7 months ago by cohosh

We have some preliminary notes about what the protocols between each part of the Snowflake system should accomplish: https://github.com/ahf/snowflake-notes/blob/master/Protocol.markdown

comment:8 in reply to:  7 Changed 7 months ago by dcf

Replying to cohosh:

We have some preliminary notes about what the protocols between each part of the Snowflake system should accomplish: https://github.com/ahf/snowflake-notes/blob/master/Protocol.markdown

Regarding the question

Does the "token bucket" algorithm in Snowflake right now do what we think?

I suspect the answer is no. I'm pretty sure BucketRateLimit didn't work before ab34f8e889 (part of #28732), and I won't swear that the commit fixed it :) I was tempted, then, to rip out the rate-limit code for the sake of starting from a simpler basis, but decided not to.

I'm pretty sure BucketRateLimit and DummyRateLimit are based off objects of the same name in the flash proxy code. As I recall, they were working in flash proxy.

comment:9 Changed 5 months ago by gaba

Keywords: network-team-roadmap-2019-Q1Q2 removed

comment:10 Changed 4 months ago by gaba

Keywords: ex-sponsor-19 added

Adding the keyword to mark everything that didn't fit into the time for sponsor 19.

comment:11 Changed 4 months ago by phw

Sponsor: Sponsor19Sponsor28-must

Moving from Sponsor 19 to Sponsor 28.

comment:12 Changed 3 months ago by gaba

Keywords: anti-censorship-roadmap added
Owner: set to ahf
Status: newassigned

comment:13 in reply to:  5 ; Changed 5 weeks ago by serna

Replying to cohosh:

As referenced in #29426, the broker currently gives proxies a 504 message if no client is available which is a questionable design

I suggest changing the status to 204 No Content, it would represent that there's no error but it also no one to connect to.

comment:14 Changed 4 weeks ago by cohosh

Thanks serna! I like the idea of providing a 2XX response. This will also help with the error/warning messages produced in the browser-based proxy logs.

comment:15 in reply to:  13 ; Changed 4 weeks ago by dcf

Replying to serna:

Replying to cohosh:

As referenced in #29426, the broker currently gives proxies a 504 message if no client is available which is a questionable design

I suggest changing the status to 204 No Content, it would represent that there's no error but it also no one to connect to.

I disagree here. It's better not to hide any necessary information in the HTTP layer, because not every way of interacting with the broker will have those HTTP features. See for example comment:11:ticket:25985. Even assuming HTTP, codes like 204 are probably less likely to pass untouched by proxies. I propose to just always use status code 200, unless there is a genuine internal server error, and encode all necessary information in the HTTP body. We're not designing a REST interface here.

comment:16 in reply to:  15 Changed 4 weeks ago by cohosh

Replying to dcf:

Replying to serna:

Replying to cohosh:

As referenced in #29426, the broker currently gives proxies a 504 message if no client is available which is a questionable design

I suggest changing the status to 204 No Content, it would represent that there's no error but it also no one to connect to.

I disagree here. It's better not to hide any necessary information in the HTTP layer, because not every way of interacting with the broker will have those HTTP features. See for example comment:11:ticket:25985. Even assuming HTTP, codes like 204 are probably less likely to pass untouched by proxies. I propose to just always use status code 200, unless there is a genuine internal server error, and encode all necessary information in the HTTP body. We're not designing a REST interface here.

Okay, this makes sense to me. So the idea would be to change all codes to 200 and then do a redesign as suggested in #30704.

comment:17 Changed 3 weeks ago by cohosh

Owner: changed from ahf to cohosh

comment:18 Changed 3 weeks ago by cohosh

Priority: Very HighHigh
Type: taskenhancement

comment:19 Changed 3 weeks ago by cohosh

Summary of how things work now

Note: see Broker.markdown for documentation of the Snowflake broker. This is a more specific proxy-focused break down of the messages sent.

Proxy Poll

The proxy sends

POST [broker URL] HTTP [version]
X-Session-ID: [session id]

[session id]

and the broker confirms that the session id given in the header matches that given in the body.

The broker then responds with one of three messages:

  • If the session ID in the header did not match the session ID in the body, it sends:
     HTTP 400 Bad Request
    
  • If there is a client matched to the proxy, it sends:
     HTTP 200 OK
    
    {
        type: offer
        sdp: [WebRTC SDP]
    
    }
    
    

where the HTTP response body is a serialized WebRTC Session description offer

  • If there are no clients matched the proxy, it sends:
     HTTP 504 Gateway Timeout
    

Proxy Answers

The proxy sends

POST [broker URL] HTTP[version]
X-Session-ID: [session id]

{
    type: answer
    sdp: [WebRTC SDP]
}

where the HTTP response body is a serialized WebRTC Session description answer.

The broker then uses the provided session ID to match this answer with the correct snowflake and provides one of three responses:

  • If the proxy took too long to respond, it sends:
     HTTP 410 Gone
    
  • If the body of the POST request was empty or surpassed the read limit, it sends:
     HTTP 400 Bad Request
    
  • If the answer was sent to the client, it sends:
     HTTP 200 OK
    

comment:20 Changed 3 weeks ago by cohosh

Overall, we want to move to putting all relevant information in the HTTP response body and provide 200 OK responses from the broker for all valid interactions. I think it's okay to leave the 400 Bad Request responses alone for requests we catch that definitely deviate from the snowflake protocol.

We want to do this in a backwards compatable way to accomodate proxies that have not yet updated. Note that we have control over the broker deployment at this time. I propose the following changes:

Proxy Poll

Updated proxies send

POST [broker URL] HTTP [version]

{
    version: "1.0",
    sid: [session id]
}

The broker checks to see if the response body contains a JSON format with a version field that specifies version 1.0. If not, revert to old method but don't compare the HTTP header with the body (I'm not sure what the usefulness of that is).

The broker then responds with one of two messages:

  • If there are no clients matched the proxy, it sends:
     HTTP 200 OK
    
     {
         status: "no proxies"
     }
    

For old snowflakes (determined as described above), it sends the old HTTP 504 Gateway Timeout response.

We might be able to avoid some branching logic and send the new 200 response to old proxies. I think this is OK to do for backwards compatability because the call to webrtc.Deserialize will fail and return a null offer for proxy-go instances and browser-based proxies will also fail to received the offer and continue polling. However, it feels a bit risky.

  • If there is a client matched to the proxy, it sends to updated proxies:
     HTTP 200 OK
     {
         status: "client match",
         {
             type: offer,
             sdp: [WebRTC SDP]
    
         }
     }
    

and the old message to non updated proxies. I'm not sure if the new message will work, I have to read more about how the JSON functions in Go and node respond to this structure. To receive the offer, the proxy needs to be able to find the "type" and "sdp" fields in the JSON object.

Proxy Answers

Updated proxies send:

POST [broker URL] HTTP[version]

{
    version: "1.0",
    sid: [session id],
    {
        type: answer,
        sdp: [WebRTC SDP]
    }
}

The broker then checks the version as above to determine whether it's an old or new proxy and uses the provided session ID to match this answer with the correct snowflake and provides one of three responses:

  • If the proxy took too long to respond, it sends:
     HTTP 200 OK
    
     {
         status: "client gone"
     }
    
    to updated proxies and HTTP 410 Gone to old ones. We could send the new message to old proxies, however old proxies assume the answer was sent to a client and will wait for a data channel to open if they receive a 200 OK message. We do have a timeout built in now since #25688 and #31100 so the token will eventually free a slot for more clients.
  • If the body of the POST request was empty or surpassed the read limit, it still sends HTTP 400 Bad Request to all proxies.
  • If the answer was sent to the client, it sends to all proxies:
     HTTP 200 OK
    
     {
         status: "success"
     }
    
Last edited 3 weeks ago by cohosh (previous) (diff)

comment:21 in reply to:  15 Changed 3 weeks ago by dcf

Replying to dcf:

Replying to serna:

I suggest changing the status to 204 No Content, it would represent that there's no error but it also no one to connect to.

I disagree here. It's better not to hide any necessary information in the HTTP layer, because not every way of interacting with the broker will have those HTTP features. See for example comment:11:ticket:25985. Even assuming HTTP, codes like 204 are probably less likely to pass untouched by proxies. I propose to just always use status code 200, unless there is a genuine internal server error, and encode all necessary information in the HTTP body. We're not designing a REST interface here.

Let me slightly amend my statement here. I was mistakenly thinking that this was about the client–broker protocol, not the proxy–broker protocol. While I do think that it's important to avoid HTTP entanglement in the client–broker protocol (#29293), in the proxy–broker protocol there's probably no harm in assuming HTTP in the proxy–broker protocol. That said, there's also no harm in moving that information into the body, as in comment:20.

Note: See TracTickets for help on using tickets.