wiki:doc/meek

Version 2 (modified by dcf, 5 years ago)

Move App Engine notes from GoAgent page.

Meek is a transport that uses HTTP for carrying bytes and TLS for obfuscation. Traffic is relayed through a third-party server (Google App Engine). The TLS connection to the third party is made to look like a connection to an unblocked server: its IP address and SNI are those of www.google.com, while the real destination is named only inside the encrypted HTTP request.

Quick start

git clone https://www.bamsoftware.com/git/meek.git
cd meek/meek-client
export GOPATH=~/go
go get
go build
tor -f torrc

Overview

When meek-client receives a SOCKS request from the Tor client, it generates a random session id string. meek-client makes an HTTP POST to https://meek-reflect.appspot.com/, which is a special web app set up for this transport. The request looks something like this:

POST / HTTP/1.1
Host: meek-reflect.appspot.com
X-Session-Id: x5ej2h96frvLXeqgKNjIyRDRJidU8RMIeRPDzvLVG+E=
Content-Type: application/octet-stream
Content-Length: 100

<data payload follows>
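As a rough illustration of the session id step, here is a minimal Go sketch. The 32-byte length and the standard base64 encoding are assumptions chosen only because they produce a string of the same shape as the example id above; `newSessionID` is a hypothetical helper name, not necessarily what meek-client uses.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// newSessionID returns a random session id string.
// Assumption: 32 random bytes, standard base64 (44 characters with padding),
// which matches the shape of the example id in the request above.
func newSessionID() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf), nil
}

func main() {
	id, err := newSessionID()
	if err != nil {
		panic(err)
	}
	fmt.Println("X-Session-Id:", id)
}
```

The id is generated once per SOCKS request and then sent unchanged with every HTTP request belonging to that session.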

Normally a censor would be able to search for the string meek-reflect.appspot.com and block the connection. But the HTTP request, including its Host header, is encrypted inside HTTPS, and the IP address and SNI of the HTTPS connection are those of www.google.com.
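One way to express this split between the outer connection and the inner request in Go is to put www.google.com in the request URL and override the Host header separately. The sketch below only builds the request and does not send it; `newFrontedRequest` and the placeholder session id are hypothetical names for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// newFrontedRequest builds a POST whose URL names frontHost, so that DNS
// resolution and the TLS SNI use frontHost, while the Host header sent
// inside the encrypted connection names realHost.
func newFrontedRequest(frontHost, realHost, sessionID string, payload []byte) (*http.Request, error) {
	req, err := http.NewRequest("POST", "https://"+frontHost+"/", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	// For client requests, req.Host overrides the Host header.
	req.Host = realHost
	req.Header.Set("X-Session-Id", sessionID)
	req.Header.Set("Content-Type", "application/octet-stream")
	return req, nil
}

func main() {
	// "example-session-id" is a placeholder, not a real session id.
	req, err := newFrontedRequest("www.google.com", "meek-reflect.appspot.com",
		"example-session-id", []byte("hello"))
	if err != nil {
		panic(err)
	}
	fmt.Println("connects to:", req.URL.Host)
	fmt.Println("Host header:", req.Host)
	fmt.Println("Content-Length:", req.ContentLength)
}
```

Passing such a request to an `http.Client` would open the TLS connection to the host in the URL and send the overridden Host header only after the handshake.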

The web app running at meek-reflect.appspot.com is very simple: it just copies the POST request it receives, and makes an identical request to a meek-server running on a Tor relay. There is a meek-server instance running at http://tor1.bamsoftware.com:7002/. The request from App Engine to meek-server looks like:

POST / HTTP/1.1
Host: tor1.bamsoftware.com:7002
X-Session-Id: x5ej2h96frvLXeqgKNjIyRDRJidU8RMIeRPDzvLVG+E=
Content-Type: application/octet-stream
Content-Length: 100

<data payload follows>

When meek-server receives this request, it reads the session id string x5ej2h96frvLXeqgKNjIyRDRJidU8RMIeRPDzvLVG+E= and checks if it has an existing session (i.e., ORPort connection) by that name. If it does, it copies the data payload to the ORPort. Otherwise, it creates a new ORPort connection, and copies the data payload to it. In either case, it then tries to read from the ORPort, and sends an HTTP response back to App Engine with any data from the Tor relay:

HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 200

<data payload follows>
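The lookup-or-create step on the server side might be sketched as follows. This is not meek-server's actual code: `sessionMap`, `lookup`, `fakeConn`, and `fakeDialer` are hypothetical names, and the ORPort connection is replaced by an in-memory stand-in so the sketch runs without a Tor relay.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// sessionMap maps session id strings to open upstream connections.
// In meek-server the upstream would be an ORPort connection; dial is a
// hypothetical hook that opens one.
type sessionMap struct {
	mu    sync.Mutex
	dial  func() (io.ReadWriteCloser, error)
	conns map[string]io.ReadWriteCloser
}

func newSessionMap(dial func() (io.ReadWriteCloser, error)) *sessionMap {
	return &sessionMap{dial: dial, conns: make(map[string]io.ReadWriteCloser)}
}

// lookup returns the existing connection for id, or dials and records a
// new one if the id has not been seen before.
func (sm *sessionMap) lookup(id string) (io.ReadWriteCloser, error) {
	sm.mu.Lock()
	defer sm.mu.Unlock()
	if conn, ok := sm.conns[id]; ok {
		return conn, nil
	}
	conn, err := sm.dial()
	if err != nil {
		return nil, err
	}
	sm.conns[id] = conn
	return conn, nil
}

// fakeConn is an in-memory stand-in for an ORPort connection.
type fakeConn struct{ bytes.Buffer }

func (*fakeConn) Close() error { return nil }

// fakeDialer returns a dial function that counts its calls in *n.
func fakeDialer(n *int) func() (io.ReadWriteCloser, error) {
	return func() (io.ReadWriteCloser, error) {
		*n++
		return &fakeConn{}, nil
	}
}

func main() {
	dials := 0
	sm := newSessionMap(fakeDialer(&dials))
	for _, id := range []string{"alice", "alice", "bob"} {
		conn, err := sm.lookup(id)
		if err != nil {
			panic(err)
		}
		conn.Write([]byte("payload")) // copy the request body upstream
	}
	fmt.Println("upstream connections opened:", dials)
}
```

The mutex matters because HTTP handlers for different requests run concurrently and may touch the map at the same time.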

The web app at meek-reflect.appspot.com simply copies the HTTP response and sends it back to the client:

HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 200

<data payload follows>

When meek-client finally receives this response, it writes the data payload back to its SOCKS port. The process then repeats: the client tries to read from its SOCKS port, it makes a request to App Engine, App Engine copies the request to meek-server, meek-server writes to and reads from the ORPort, meek-server sends a response, App Engine copies the response to meek-client, and meek-client writes the response body to the SOCKS port.
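The middle step of this cycle, where the web app copies the request to meek-server and copies the response back, can be sketched with plain net/http. This is only an illustration of the copying logic, not the App Engine app itself (App Engine has its own outbound fetch API); `reflector` and `roundTrip` are hypothetical names, and the upstream here is a local test server standing in for meek-server.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// reflector returns a handler that copies each incoming request to
// upstreamURL and copies the upstream response back to the caller.
func reflector(upstreamURL string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		out, err := http.NewRequest(r.Method, upstreamURL, r.Body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		out.ContentLength = r.ContentLength
		for _, name := range []string{"X-Session-Id", "Content-Type"} {
			if v := r.Header.Get(name); v != "" {
				out.Header.Set(name, v)
			}
		}
		resp, err := http.DefaultClient.Do(out)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadGateway)
			return
		}
		defer resp.Body.Close()
		w.Header().Set("Content-Type", resp.Header.Get("Content-Type"))
		w.WriteHeader(resp.StatusCode)
		io.Copy(w, resp.Body)
	})
}

// roundTrip starts a stand-in meek-server and a reflector in front of it,
// sends one POST through the reflector, and returns the response body.
func roundTrip(sessionID, payload string) (string, error) {
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, _ := io.ReadAll(r.Body)
		w.Header().Set("Content-Type", "application/octet-stream")
		// Echo what was received so the caller can check the copy.
		fmt.Fprintf(w, "%s %s:%s", r.Method, r.Header.Get("X-Session-Id"), body)
	}))
	defer upstream.Close()
	front := httptest.NewServer(reflector(upstream.URL))
	defer front.Close()

	req, err := http.NewRequest("POST", front.URL, strings.NewReader(payload))
	if err != nil {
		return "", err
	}
	req.Header.Set("X-Session-Id", sessionID)
	req.Header.Set("Content-Type", "application/octet-stream")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	got, err := roundTrip("example-session-id", "hello")
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
}
```

The reflector keeps no per-session state of its own; the session id travels in the header and only meek-server interprets it.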

There is no way for the server to push data to the client. It has to wait for a request in order to send its data in the response. For this reason, the client polls the server, making a request periodically even if it has nothing to send. The polling starts out frequent and backs off exponentially while the server has nothing to say. The polling interval resets to the minimum every time the server sends some data.
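The polling schedule can be sketched as a small function. The concrete numbers below (initial interval, multiplier, and the existence of an upper bound) are illustrative assumptions, not meek's actual settings, and `nextPollInterval` is a hypothetical name.

```go
package main

import (
	"fmt"
	"time"
)

// Illustrative assumptions, not meek's real values.
const (
	minPollInterval = 100 * time.Millisecond
	maxPollInterval = 5 * time.Second
	pollMultiplier  = 2
)

// nextPollInterval returns the delay before the next poll. It resets to the
// minimum whenever the server sent data, and otherwise grows exponentially
// up to maxPollInterval.
func nextPollInterval(cur time.Duration, gotData bool) time.Duration {
	if gotData {
		return minPollInterval
	}
	next := cur * pollMultiplier
	if next > maxPollInterval {
		next = maxPollInterval
	}
	return next
}

func main() {
	interval := minPollInterval
	// Simulate three empty responses, one with data, then another empty one.
	for i, gotData := range []bool{false, false, false, true, false} {
		interval = nextPollInterval(interval, gotData)
		fmt.Printf("poll %d: gotData=%v, next interval %v\n", i, gotData, interval)
	}
}
```

Capping the interval bounds the worst-case latency for data that arrives at the server while the client is idle.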

Other clients operating simultaneously will use different session id strings, so the server can tell them apart. Stale session ids that have not had any traffic in a while are closed and forgotten.
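Expiring stale sessions might look like the following sketch, which tracks only a last-seen time per session id. The two-minute timeout is an assumption (the text above only says "a while"), and `expireSessions` and `demo` are hypothetical names; a real server would also close the ORPort connection for each expired id.

```go
package main

import (
	"fmt"
	"time"
)

// maxSessionStaleness is an assumed timeout, not meek-server's real value.
const maxSessionStaleness = 2 * time.Minute

// expireSessions deletes every entry of lastSeen that has been idle for
// longer than maxSessionStaleness and returns the ids it removed.
func expireSessions(lastSeen map[string]time.Time, now time.Time) []string {
	var removed []string
	for id, t := range lastSeen {
		if now.Sub(t) > maxSessionStaleness {
			delete(lastSeen, id) // deleting during range is safe in Go
			removed = append(removed, id)
		}
	}
	return removed
}

// demo builds one active and one stale session and expires the stale one.
// It returns the removed ids and the number of sessions left.
func demo() ([]string, int) {
	now := time.Now()
	lastSeen := map[string]time.Time{
		"active": now.Add(-10 * time.Second),
		"stale":  now.Add(-10 * time.Minute),
	}
	removed := expireSessions(lastSeen, now)
	return removed, len(lastSeen)
}

func main() {
	removed, remaining := demo()
	fmt.Println("removed:", removed, "remaining:", remaining)
}
```

Such a sweep could run periodically in the background, with the last-seen time refreshed on every request that carries the session id.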

Things to do

Build a PHP middle relay that can be used in the place of App Engine. (Easy and fun.)

General notes about App Engine

Quotas for unpaid apps:

You can pay to get higher quotas:

There is also a higher "premier" level of service that comes with support:

Paying for more bandwidth on a public proxy server would be nice in that users wouldn't have to set up their own instance, and we could hardcode the server inside a browser bundle. (It would be nice to know the cost per gigabyte of running App Engine and, say, an ordinary Tor relay.)

An idea to reduce overhead and eliminate polling is to use HTTP as a long-lived bidirectional channel, sending upstream data in a POST body and receiving data in the response body simultaneously. (That is, you send a POST with no Content-Length, the server reads your header and forwards the request to the relay, the server writes back a header, and after that you use the connection as an ordinary socket, with upstream and downstream data interleaved.) An implementation of this idea is at https://www.bamsoftware.com/git/meeker.git. The idea doesn't work with App Engine, for two reasons: 1) requests must be handled within 60 seconds, and 2) App Engine doesn't support streaming requests of this kind:

"App Engine calls the handler with a Request and a ResponseWriter, then waits for the handler to write to the ResponseWriter and return. When the handler returns, the data in the ResponseWriter's internal buffer is sent to the user. This is practically the same as when writing normal Go programs that use the http package. The one notable difference is that App Engine does not support streaming data in response to a single request."

App Engine doesn't even call your web app code until it has consumed the entire request body, and doesn't start flushing the response body until you close the output stream.

Ideas

The App Engine Channel API provides a way to have long-lived push connections to the client, subject to a restricted interface. (HTTP handlers are otherwise required to finish within 60 seconds.) The client could use HTTP request bodies to send data, and a channel to receive, and remove the need for polling. It would require us to reimplement the client JavaScript channel API in order to make use of the particular Comet-based protocol.

Paid apps can create outbound sockets. I don't think it helps us because then the web app would be responsible for managing the session id mapping.

GoAgent is similar in that it also uses App Engine as a middleman.
