Opened 10 months ago

Last modified 7 months ago

#9166 new defect

Write a UTP-based channel implementation

Reported by: nickm Owned by:
Priority: normal Milestone: Tor: unspecified
Component: Tor Version:
Keywords: tor-relay utp Cc: sjmurdoch, iang, robgjansen, jeroen@…
Actual Points: Parent ID: #9165
Points:

Description

Steven has a "utp" branch in https://gitweb.torproject.org/sjm217/tor.git that begins to implement a UTP-based transport. It requires a tweaked libutp with patches from https://github.com/sjmurdoch/libutp

It is probably close to good enough for initial performance testing. This ticket is about making it good enough to merge, if the initial simulations and testing are promising.

Child Tickets

Attachments (6)

sjm217-utp-combined.pdf (350.6 KB) - added by karsten 9 months ago.
sjm217-utp-combined.2.pdf (421.6 KB) - added by karsten 9 months ago.
sjm217-utp-64GB-seeds-combined.pdf (715.5 KB) - added by karsten 9 months ago.
sjm217-utp-64GB-traffic-combined.pdf (725.7 KB) - added by karsten 9 months ago.
sjm217-utp-4GB-socket.png (50.5 KB) - added by karsten 9 months ago.
utp-stats.pdf (68.3 KB) - added by karsten 8 months ago.
uTP stats from a client connecting to a private bridge over uTP

Change History (38)

comment:1 follow-up: Changed 10 months ago by hsn

QUIC would most likely be a better choice. It's backed by Google and it will get well tuned.

http://blog.chromium.org/2013/06/experimenting-with-quic.html

comment:2 follow-up: Changed 10 months ago by nickm

As I understand it, the branch today works by adding a UTP connection "on the side" of every or_connection_t. Whenever an OR connection is launched, so is a corresponding UTP connection. The branch seems to want to use TLS for initial key setup, and does not appear to do encryption on the UTP connections yet.

I think that's going to be good enough for simulation, though: the setup characteristics are going to be significantly different than you'd see in a more polished implementation, but the throughput characteristics will be accurate (modulo the effects of no crypto).

For a polished, mergeable version here, I would want to see:

  • A proposal.
  • IPv6 support
  • An actual UTP-based channel, advertised in descriptors and microdescriptors, and implemented as its own channel_t implementation.
  • Integration with the bandwidth accounting and rate-limiting features in the rest of Tor.
  • Resolution for the issues noted below.
  • Encryption, obviously. This could be with TLS-over-UTP, or something different.

More fine-grained issues:

816abfac378f979d718782e17c5185c1dba43f24 Add a uTP connection in parallel to the channel:

  • Got to comment the new functions, if that hasn't happened already.
  • It needs rate-limiting to work somehow.
  • Need to avoid magic numbers for inet_ntop buffer lengths.
  • Remove the magic address 128.232.10.129.
  • It needs to bind to something other than INADDR_ANY.
  • autoconf needs to check for libutp and make it optional.
  • Cast "userdata" to channel_tls_t in a local variable; don't say ((channel_tls_t*)userdata) more than once per function.

b1b18889cd4986a2d10e79137f99a9ad46006f08 -- Handle uTP reads using libevent

  • check all return values.

97aa42e4f235015ebd715ba673c475c978c48e67 -- Write payload data to uTP connection

  • utp_is_writable should get checked. Also, it looks like calling connection_or_write_cell_to_buf doesn't make sense when the write to tlschan->utp_write_buf is actually successful.

67589c6b4f036ca73b49135cf5f7ec5a708dce73 -- Tie up incoming uTP connections with the TLS counterpart

  • Aw geez. It's using the TLS master key as some kind of key material for the UTP session. That's pretty darned kludgy. I'd really want a key derivation function in there at least.
  • It doesn't appear that anything uses the TLS master key, though.
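To illustrate the kind of derivation step meant by the first bullet, here is a minimal sketch; the function name and label are made up, and crypto_digest256() is only a stand-in -- a real design should use a proper KDF such as HKDF:

/* Hypothetical sketch: never use the TLS master key directly as uTP key
 * material; mix it with a purpose label through a one-way function first.
 * crypto_digest256() is a stand-in, not a recommended KDF. */
static int
derive_utp_session_key(const char *tls_master_key, size_t key_len,
                       char *utp_key_out /* DIGEST256_LEN bytes */)
{
  static const char label[] = "tor-utp-session-key";
  size_t buflen = sizeof(label) + key_len;
  char *buf = tor_malloc(buflen);
  int r;

  memcpy(buf, label, sizeof(label));
  memcpy(buf + sizeof(label), tls_master_key, key_len);
  r = crypto_digest256(utp_key_out, buf, buflen, DIGEST_SHA256);
  memwipe(buf, 0, buflen);
  tor_free(buf);
  return r;
}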

comment:3 Changed 10 months ago by nickm

Looking at the issues above, the ones likeliest to affect performance measurements (assuming this actually works at all, which I haven't tested, but I believe Steven says it does) are the lack of rate-limiting and the possibility of sending cells redundantly. (If I'm reading the code right, data is actually sent on both the TCP *and* the UTP connection. Was that fixed?)

Please let me know what I've gotten wrong there.

comment:4 in reply to: ↑ 1 Changed 10 months ago by nickm

Replying to hsn:

QUIC would most likely be a better choice. It's backed by Google and it will get well tuned.

http://blog.chromium.org/2013/06/experimenting-with-quic.html

Quite possibly! UTP is more mature and more widely deployed for now, though. Another experimental thing to look at is MinimaLT (http://cr.yp.to/tcpip/minimalt-20130522.pdf).

In any case, this ticket isn't about picking the best possible channel backend protocol -- it's about (first) getting Steven's initial branch that he wrote last year good enough to test, and keeping track of what more would need to be done to merge it.

comment:5 in reply to: ↑ 2 Changed 10 months ago by sjmurdoch

Replying to nickm:

As I understand it, the branch today works by adding a UTP connection "on the side" of every or_connection_t. Whenever an OR connection is launched, so is a corresponding UTP connection. The branch seems to want to use TLS for initial key setup, and does not appear to do encryption on the UTP connections yet.

That's correct. Cells that are sent via channels go via uTP (and not TLS), but versions, netinfo, certs, and auth* cells get sent directly on an OR connection rather than a channel. I tried sending these via channels too, but this failed (my reverted attempt was in 4b674b6ba1a665703c2fbc4244eff40ae5e3e673).

  • Aw geez. It's using the TLS master key as some kind of key material for the UTP session. That's pretty darned kludgy. I'd really want a key derivation function in there at least.
  • It doesn't appear that anything uses the TLS master key, though.

Indeed this is horrid, but it is only necessary because the uTP and TCP connections need to be linked to the same OR connection. Once all necessary cells are sent over uTP, there will be no need for any such mechanism.

Replying to nickm:

Looking at the issues above, the ones likeliest to affect performance measurements (assuming this actually works at all, which I haven't tested, but I believe Steven says it does) are the lack of rate-limiting and the possibility of sending cells redundantly. (If I'm reading the code right, data is actually sent on both the TCP *and* the UTP connection. Was that fixed?)

I intended that all cells except handshake cells be sent over uTP, with handshake cells sent over TCP. The (badly described) commit fc63eca9ff26384b66707f3dfc15a4555c259a5d stops cells from being sent on the OR connection if a uTP connection is available.
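To illustrate the intended dispatch (this is not the actual commit; utp_write_buf and the helper shown are assumptions based on the discussion above), the logic is roughly:

/* Hypothetical sketch: once a uTP connection exists for a channel, queue
 * cells on its uTP write buffer only; handshake cells (and the no-uTP
 * case) still go out on the TLS OR connection. */
static void
write_cell_utp_or_tls(channel_tls_t *tlschan, const packed_cell_t *cell,
                      size_t cell_len)
{
  if (tlschan->utp && tlschan->utp_write_buf) {
    /* uTP path: buffer the packed cell and hand it to libutp.  (What
     * count UTP_Write should be given is discussed in comment:18.) */
    write_to_buf(cell->body, cell_len, tlschan->utp_write_buf);
    UTP_Write(tlschan->utp, buf_datalen(tlschan->utp_write_buf));
  } else {
    /* TLS path: write to the OR connection's outbuf as before. */
    connection_write_to_buf(cell->body, cell_len, TO_CONN(tlschan->conn));
  }
}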

comment:6 Changed 10 months ago by iang

  • Cc iang added

Changed 9 months ago by karsten

comment:7 Changed 9 months ago by karsten

I finally got Shadow simulations of Steven's utp branch running (with help from Steven and Rob). See the attached simulation results for a variant that uses uTP for all links and one that uses it for none of the links. This was in a "tiny" Shadow network with 20 relays, which took almost 2.5 hours per simulation.

Here's the code change I made to enable uTP (if (1 || ...)) or disable it (if (0 && ...)):

diff --git a/src/or/channeltls.c b/src/or/channeltls.c
index 0551b73..b7b36e1 100644
--- a/src/or/channeltls.c
+++ b/src/or/channeltls.c
@@ -418,7 +418,7 @@ channel_tls_connect(const tor_addr_t *addr, uint16_t port,
   /* Create a uTP connection */
   tor_addr_to_sockaddr(addr, port, (struct sockaddr*)&sin, sizeof(sin));
   tor_addr_to_str(addr_str, addr, sizeof(addr_str), 0);
-  if (!strncmp(addr_str, "128.232.10.129", sizeof(addr_str))) {
+  if (1 || !strncmp(addr_str, "128.232.10.129", sizeof(addr_str))) {
     log_info(LD_CHANNEL,
              "Trying uTP connection to %s", addr_str);
     tlschan->utp = UTP_Create(tor_UTPSendToProc, tlschan, (const struct sockaddr*)&sin,

Am I doing the simulation right? Should I apply different code changes to turn uTP on/off? Should I run this in a larger Shadow network?

comment:8 Changed 9 months ago by sjmurdoch

That looks fine to me Karsten (assuming you are using the LLVM hoist globals version of Shadow; prior ones probably won't work due to the issue described at https://github.com/bittorrent/libutp/issues/51).

Is 2.5 hours longer than it would take for a network of the same size without libutp? If so, it would be worth investigating, as it would probably indicate a deeper problem.

Changed 9 months ago by karsten

comment:9 Changed 9 months ago by karsten

Replying to sjmurdoch:

That looks fine to me Karsten (assuming you are using the LLVM hoist globals version of Shadow; prior ones probably won't work due to the issue described at https://github.com/bittorrent/libutp/issues/51).

I'm using Shadow master (91fc269). How do I know it worked correctly? Are there log messages indicating UDP vs. TCP traffic?

Is 2.5 hours longer than it would take for a network of the same size without libutp? If so, it would be worth investigating, as it would probably indicate a deeper problem.

2.5 hours is normal. I ran another simulation using vanilla 0.2.4.4-alpha, and that took as long as the other two simulations. See results attached.

So, I wonder, shouldn't the black and blue lines overlap more? Is there anything else I should change in your branch to turn off uTP?

Changed 9 months ago by karsten

Changed 9 months ago by karsten

comment:10 follow-up: Changed 9 months ago by karsten

  • Cc robgjansen added

I just finished new Shadow simulations using the large 64 GB network configuration. See attached two PDFs:

  • sjm217-utp-64GB-seeds-combined.pdf compares two runs with uTP for none of the links to two runs with uTP for all links. The two runs using the same configuration use different random seeds (using scallion vs. scallion -s 2). The motivation for this experiment was to see if the large 64 GB network configuration shows more significant performance differences than the small 16 GB network configuration, and to see if differences are real or random. Unfortunately, it looks like differences are mostly random.
  • sjm217-utp-64GB-traffic-combined.pdf compares the runs with seed 1 to similar runs that put additional non-Tor TCP traffic on the Shadow network. Additional TCP traffic is produced by adding a second instance of Shadow's filetransfer plugin called traffic, making all clients perform requests to fileservers directly, and removing all ".*[traffic-.*" lines from scallion.log before making graphs. The motivation for this experiment was to see if additional network load has any effect on Tor performance. Looks like this is not the case.

I'd like to do one more experiment: can Shadow somehow output what amount of traffic uses TCP vs. UDP? I'd like to re-run the simulation in a small or tiny network with uTP enabled and disabled, to confirm that enabling/disabling uTP actually worked. Rob, any idea how Shadow could provide this information?

comment:11 in reply to: ↑ 10 Changed 9 months ago by robgjansen

Replying to karsten:

I just finished new Shadow simulations using the large 64 GB network configuration. See attached two PDFs:

  • sjm217-utp-64GB-seeds-combined.pdf compares two runs with uTP for none of the links to two runs with uTP for all links. The two runs using the same configuration use different random seeds (using scallion vs. scallion -s 2). The motivation for this experiment was to see if the large 64 GB network configuration shows more significant performance differences than the small 16 GB network configuration, and to see if differences are real or random. Unfortunately, it looks like differences are mostly random.

Yeah, it definitely doesn't *look* like the experiments actually used a different transport.

  • sjm217-utp-64GB-traffic-combined.pdf compares the runs with seed 1 to similar runs that put additional non-Tor TCP traffic on the Shadow network. Additional TCP traffic is produced by adding a second instance of Shadow's filetransfer plugin called traffic, making all clients perform requests to fileservers directly, and removing all ".*[traffic-.*" lines from scallion.log before making graphs. The motivation for this experiment was to see if additional network load has any effect on Tor performance. Looks like this is not the case.

I wouldn't expect it to produce the results you want unless you put the extra TCP load on relays, clients, or fileservers that are bottlenecks -- you need to 'steal' bandwidth from the client's Tor application, or from a relay's Tor application. You could do this by adding the extra client applications on existing Tor client nodes and then reducing their bandwidth, to ensure the extra load leaves less bandwidth for Tor.

I'd like to do one more experiment: can Shadow somehow output what amount of traffic uses TCP vs. UDP? I'd like to re-run the simulation in a small or tiny network with uTP enabled and disabled, to confirm that enabling/disabling uTP actually worked. Rob, any idea how Shadow could provide this information?

It's your lucky day ;) It wasn't directly possible, but I just implemented it and pushed it to Shadow master. You need to include 'socket' in the value of the 'heartbeatloginfo' attribute of the 'node' element. So, something like this:

<node heartbeatloginfo="node,ram,socket" ... >

Careful, though: this prints out information for every active socket on every node for which you enable this feature. grep the logs for '\[shadow-heartbeat\] \[socket'.

If you are feeling particularly adventurous, you can enable for every node in the simulation at the same time with the shadow command line switch:

scallion --heartbeat-log-info=node,ram,socket ...

Changed 9 months ago by karsten

comment:12 follow-up: Changed 9 months ago by karsten

Thanks for adding the heartbeatloginfo socket thing! I just ran two simulations in the tiny network with the new feature enabled for all nodes. Unfortunately, I didn't find a single UDP entry in the logs. I summed up read and written bytes for all nodes and made a graph out of them, and it looks like both simulations used TCP for everything.

So, does Shadow not detect UDP traffic correctly, or did I not enable/disable uTP correctly?

comment:13 in reply to: ↑ 12 Changed 9 months ago by robgjansen

Replying to karsten:

So, does Shadow not detect UDP traffic correctly, or did I not enable/disable uTP correctly?

Sorry about that, it's Shadow's fault. I fixed it in Shadow master.

Test it with:

shadow -i socket --echo | grep _tracker_logSocket

You should now see logs for UDP sockets, some with non-zero transfer statistics.

comment:14 follow-up: Changed 9 months ago by karsten

I re-ran the 4 GB simulation with Shadow master, and I do see logs for UDP sockets, but transfer statistics are all zero. (I'm supposed to look at rx-bytes and tx-bytes, right?) Here's an example:

0:0:3:165168 [thread-0] 0:0:0:010000000 [shadow-message] [4uthority1-73.1.0.0] [_tracker_logSocket] [shadow-heartbeat] [socket-header] descriptor-number,protocol-string,hostname:port-peer,inbuflen-bytes,inbufsize-bytes,outbuflen-bytes,outbufsize-bytes,rx-bytes,tx-bytes;...
0:0:7:569507 [thread-0] 0:1:0:010000000 [shadow-message] [4uthority1-73.1.0.0] [_tracker_logSocket] [shadow-heartbeat] [socket] 1000007,UDP,UNSPEC:0,0,174760,0,131072,0,0;1000010,TCP,4uthority1:9112,0,16777216,0,16777216,574,2106;
2:29:45:053655 [thread-0] 0:59:0:080521716 [shadow-message] [4uthority1-73.1.0.0] [_tracker_logSocket] [shadow-heartbeat] [socket] 1000007,UDP,UNSPEC:0,0,174760,0,131072,0,0;1000033,TCP,exit4:9111,0,87380,31595,31595,249294,3432501;1000041,TCP,exit5:9111,0,87380,42775,42775,771244,10249000;1000049,TCP,exit6:9111,0,87380,88987,88987,777624,12352633;1000057,TCP,exit7:9111,0,521900,521687,521687,1583954,27365551;1000065,TCP,exit8:9111,0,268260,0,268260,52603126,3147344;1000238,TCP,nonexit11:9111,0,87380,0,54187,2154,2154;1000824,TCP,exit2:9111,0,87380,0,16384,0,0;1000908,TCP,nonexit8:9111,0,87380,0,24277,66,652;1000938,TCP,nonexit10:9111,0,87380,0,38340,66,652;1001007,TCP,nonexit3:9111,0,87380,0,117347,0,0;1001026,TCP,nonexit10:9111,0,87380,0,35145,0,0;

comment:15 in reply to: ↑ 14 Changed 9 months ago by robgjansen

Replying to karsten:

I re-ran the 4 GB simulation with Shadow master, and I do see logs for UDP sockets, but transfer statistics are all zero. (I'm supposed to look at rx-bytes and tx-bytes, right?)

Yes, that's right. Statistics are correctly printed in my test case, so I'm not sure what's wrong. You can also try turning the loglevel to debug for some nodes

<node loglevel="debug" ...>

and then grepping for 'udp_sendUserData' and 'udp_receiveUserData'. You should get log messages of these formats:

buffered %lu outbound UDP bytes from user
user read %lu inbound UDP bytes

comment:16 Changed 9 months ago by yawning

I've also linked this on #tor-dev, but eventually libutp will need to be changed.

Security analysis of the micro transport protocol with a misbehaving receiver
http://openaccess.city.ac.uk/1967/1/cyberc12%2Dflorian%2Dpaper%20pdf.pdf

Of particular note is:

Therefore a good countermeasure against the delay attack yields to a
complete redesign of the LEDBAT congestion control.  That is why we 
don not[sic] propose a countermeasure against this attack and point
here to further research.

LEDBAT by design will yield to competing traffic anyway, so it's not the right congestion control algorithm for Tor. Implementing Reno+SACK shouldn't be too difficult as a starting point (a more modern algorithm like CUBIC could also be used).

comment:17 Changed 9 months ago by karsten

I found out why the results of enabling or disabling uTP were so similar: my local changes to Steven's tor branch that enabled or disabled uTP were ignored by Shadow when preparing the simulation (see Shadow issue 161 for details). All simulations ran Steven's unchanged uTP branch, which only uses uTP when connecting to the public IP of Steven's test machine -- and that machine was of course not part of the Shadow networks. So, all results above that were meant to use uTP on all links are invalid. Bummer.

After fixing the Shadow setup on a 4 GB machine, I ran into a new issue: all tor nodes say they're attempting to use uTP for their connections (so, my local changes are accepted this time), but none of the file downloads succeeds. I uploaded the gzip'ed scallion.log here (8.8M): https://people.torproject.org/~karsten/volatile/utp-scallion.log.gz

comment:18 follow-up: Changed 9 months ago by karsten

After reading Tor/libutp code and staring at Tor/Shadow logs for about two days, I need to pause a bit. A quick update on the current roadblocks:

  • There's probably a problem in Shadow that prevents nodes from sending or receiving UDP packets. As a result, libutp's callbacks on_read and on_write are never called, so there's no way for nodes to communicate using uTP. I'll look more into this once the next roadblock is out of the way.
  • Steven suggested setting up a client and private bridge in a non-simulated environment and using uTP for the first hop from client to bridge. This works only if the client has cached-* files in its data directory before starting, and it takes about a minute for the first circuit to be built. I assume there's something wrong with buffers, but I don't know yet what it is. I identified a few places in Steven's code which might be wrong and asked Steven for feedback, though fixing them didn't fix the described problem. I guess I should simply mention them below:
  1. https://gitweb.torproject.org/sjm217/tor.git/blob/refs/heads/utp:/src/or/channeltls.c#l203

Are you really sure you're supposed to return the number of bytes *left* in your virtual 8K buffer, not the number of bytes *in* your buffer? The documentation in utp.h says "The uTP socket layer calls this to retrieve number of bytes currently in read buffer", and the code in utp.cpp, AFAIU, expects the number of bytes in the buffer, too (see the sketch after this list):

// Calculates the current receive window
size_t get_rcv_window() const
{
  ...
  // Trim window down according to what's already in buffer.
  const size_t numbuf = func.get_rb_size(userdata);
  assert((int)numbuf >= 0);
  return opt_rcvbuf > numbuf ? opt_rcvbuf - numbuf : 0;
}
  2. https://gitweb.torproject.org/sjm217/tor.git/blob/refs/heads/utp:/src/or/channeltls.c#l278

Could it be a problem that you're using malloc here, not tor_malloc_zero?

  3. https://gitweb.torproject.org/sjm217/tor.git/blob/refs/heads/utp:/src/or/channeltls.c#l900

Why do you think UTP_Write wants the total number of bytes in the buffer, not just the ones you just added?

  4. https://gitweb.torproject.org/sjm217/tor.git/blob/refs/heads/utp:/src/or/main.c#l1116

Why did you change this line from connection_or_write_cell_to_buf? Is this really related to the libutp stuff?
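Coming back to the first point above: if utp.h indeed wants the number of bytes currently buffered, the callback would look roughly like this hypothetical sketch (utp_read_buf is an assumed field name; buf_datalen() is tor's existing buffer-length helper):

/* Hypothetical sketch: report the bytes currently sitting in the uTP read
 * buffer, which get_rcv_window() then subtracts from opt_rcvbuf. */
static size_t
tor_UTPGetRBSize(void *userdata)
{
  channel_tls_t *tlschan = (channel_tls_t *)userdata;
  if (!tlschan || !tlschan->utp_read_buf)
    return 0;
  return buf_datalen(tlschan->utp_read_buf); /* bytes *in* the buffer */
}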

comment:19 in reply to: ↑ 18 Changed 9 months ago by robgjansen

Replying to karsten:

  • There's probably a problem in Shadow that prevents nodes from sending or receiving UDP packets. As a result, libutp's callbacks on_read and on_write are never called, so there's no way for nodes to communicate using uTP. I'll look more into this once the next roadblock is out of the way.

Can you describe the problem regarding Shadow and UDP any further (besides that libutp's callbacks on_read and on_write are never called) at this point? Are they using libevent to poll the kernel sockets?

comment:20 follow-up: Changed 8 months ago by karsten

I'm afraid I can only speculate about the problem. I only saw one part in the code that uses libevent, and that's in retry_utp_listener in src/or/connection.c where we listen for new incoming connections. I also think I found the place where we send out bytes to the network using sendto in tor_UTPSendToProc in src/or/channeltls.c. But I didn't find where we're informed about new incoming bytes from existing connections. Maybe this happens in a place that is not intercepted by Shadow?

Here's the full diff (which isn't that big), in case you want to have a look yourself: https://gitweb.torproject.org/gitweb?p=sjm217/tor.git;a=commitdiff;h=9628f86;hp=20912fb

comment:21 in reply to: ↑ 20 ; follow-up: Changed 8 months ago by karsten

Replying to karsten:

I'm afraid I can only speculate about the problem. I only saw one part in the code that uses libevent, and that's in retry_utp_listener in src/or/connection.c where we listen for new incoming connections. I also think I found the place where we send out bytes to the network using sendto in tor_UTPSendToProc in src/or/channeltls.c. But I didn't find where we're informed about new incoming bytes from existing connections. Maybe this happens in a place that is not intercepted by Shadow?

Correction: utp_read_callback isn't only called for new incoming connections, but also for new incoming bytes on existing connections. AFAIK.

comment:22 in reply to: ↑ 21 Changed 8 months ago by robgjansen

Replying to karsten:

Replying to karsten:

[...] But I didn't find where we're informed about new incoming bytes from existing connections. Maybe this happens in a place that is not intercepted by Shadow?

Correction: utp_read_callback isn't only called for new incoming connections, but also for new incoming bytes on existing connections. AFAIK.

This is the code that adds the libevent listener:

  ev = tor_event_new(tor_libevent_get_base(), utp_listener, EV_READ|EV_PERSIST, &utp_read_callback, NULL);
  retval = event_add(ev, NULL);
  log_notice(LD_NET, "Added uTP read event: %d, %d", ev!=NULL, retval);

This is telling libevent to call utp_read_callback whenever there is data ready to read (EV_READ) from the socket, and to pass utp_read_callback a NULL user argument. AFAIK, waiting for data on this utp_listener socket should be identical to any other socket, from Shadow's perspective.

Changed 8 months ago by karsten

uTP stats from a client connecting to a private bridge over uTP

comment:23 Changed 8 months ago by karsten

Some more progress from experimenting with a client and private bridge connected over uTP:

I wrote an updated utp branch with three code changes that fix issues with the libutp integration in tor. One of the changes was already described above; the three remaining issues from above turned out to be non-issues. I didn't find any other potential bugs in the utp branch.

However, using my branch, bootstrapping a client using a private bridge over uTP takes about 2 minutes. See the attached graph that visualizes libutp's logs. Let me explain the six events marked with dashed lines:

  1. The experiment starts at 00:00:00 with the client opening its log file. The bridge has been running for a few minutes at this point to bootstrap.
  2. The first directory connection between client and private bridge is established at 00:00:27, and the client reports in connection_edge_process_relay_cell_not_open that it received 'connected' after 26 seconds. In fact, this is only possible because I added CircuitStreamTimeout 30 to the torrc; the default for giving up on a connection is 10 or 15 seconds. Otherwise, the client would have given up on the connection earlier and started creating a new one. So, what happens? In the logs I can see cells sitting in uTP outbufs that are not sent for many seconds, including, for example, the RELAY_CONNECTED cell. I even tried forcing bytes out of the outbufs by calling UTP_Write once per second, but libutp refuses. No luck; it just thinks the connection is not writable, until it suddenly becomes writable and the bytes go through.
  3. It takes until 00:01:02 for the client to launch a microdesc networkstatus consensus download (this delay probably has to do with tor's directory fetch retry intervals). The client uses the existing connection for this request. I can see the bridge's outbuf filling quickly to over 250K, containing the compressed microdesc consensus. However, the download rate in the next 30+ seconds is ridiculously low. Here's why: the bridge sends just a single packet containing 1382 bytes to the client, the client processes the full cells from it, and then they both sit there waiting. Only when the client runs UTP_CheckTimeouts as part of its run_scheduled_events does it send a 20-byte uTP control message to the bridge, which immediately sends another 1382 bytes before going silent again. See the interesting pattern in wnduser from 00:01:02 to about 00:01:36. Then there's a short burst at about 00:01:07 when the bridge transfers a lot of bytes to the client. Shortly after this, there's again the 1-packet-per-second pattern for 15 seconds and then another, longer burst.
  4. The client receives its microdesc consensus at around 00:01:57 and then launches 48 requests for microdescs. These downloads are really, really fast, using the burst phase from before.
  5. The client says at 00:02:02 that it has enough directory information to build circuits.
  6. The client reports that it has successfully opened a circuit, so bootstrapped to 100%, at 00:02:03.

My current understanding is that we can fix the described delays by optimizing libutp's (and maybe tor's) configuration. It shouldn't take 26 seconds to flush 512 bytes from uTP's outbufs to finally get the RELAY_CONNECTED cell out, and the bridge shouldn't have to wait for the client's call to UTP_CheckTimeouts before sending the next data packet. I hope there are configuration parameters in uTP to improve this. For reference, I tried the "utp_file" example, which transferred 100 MiB of data from the client host to the private bridge host in just a few seconds.
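One possible mitigation -- only a sketch, not a tuned or confirmed fix; the 100 ms interval and the helper names other than UTP_CheckTimeouts() and tor's libevent wrappers are assumptions -- would be to drive libutp's timers from a dedicated periodic libevent timer instead of relying on tor's once-per-second run_scheduled_events():

/* Hypothetical sketch: call UTP_CheckTimeouts() more often than once per
 * second, so acks and retransmissions are not gated on tor's main
 * periodic callback.  The 100 ms interval is an arbitrary example. */
static void
utp_timeout_cb(evutil_socket_t fd, short what, void *arg)
{
  (void)fd; (void)what; (void)arg;
  UTP_CheckTimeouts();
}

static void
schedule_utp_timeouts(void)
{
  static struct event *utp_timer = NULL;
  struct timeval interval = { 0, 100*1000 }; /* 100 ms */
  if (!utp_timer) {
    utp_timer = tor_event_new(tor_libevent_get_base(), -1, EV_PERSIST,
                              utp_timeout_cb, NULL);
    event_add(utp_timer, &interval);
  }
}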

In the next step, we might have to tweak tor's configuration to adapt to uTP's characteristics. For example, setting CircuitStreamTimeout 30 is kinda sad, but maybe we have to make similar configuration changes to make tor work more reliably with uTP. Tweaking uTP would be my preference, though.

Oh, and once we've done that, we'll want to throw the new branch into Shadow, see if there are remaining issues in Shadow preventing us from simulating it, and simulate it in large Shadow networks.

Feedback much appreciated.

comment:24 Changed 8 months ago by karsten

Solved both timing issues:

  • Turns out the reason for the client waiting until 00:01:02 is unrelated to uTP, but is a general problem of clients bootstrapping via bridges (#9229).

Will look into simulating this updated branch in Shadow next.

comment:25 follow-up: Changed 8 months ago by karsten

Progress!

  • Fixed a nasty and well-hidden segfault that I observed on directory authorities when running this branch in Chutney. This problem never showed up in the client/private bridge setting, nor was it a problem in Shadow. But it was an actual bug which is now fixed.
  • Got a tiny network of 3 directory authorities, 5 relays, and 2 clients bootstrapped in Chutney. tcpdump confirms that there's quite some UDP traffic going on.
  • Still works in my client/private bridge setting and bootstraps within about 6 seconds from an empty data directory.

So far, so good.

(Adding a brief pause here for applause...)

However, I didn't make any progress on Shadow simulations using the new branch. Shadow simply doesn't want to call libutp's on_read and on_write. For comparison, here's how often the various tor_UTP* functions are called in the Chutney network:

ubuntu@ip-10-8-18-21:~/src/chutney$ grep tor_UTP net/nodes/????/debug.log | cut -d" " -f5 | sort | uniq -c
 193561 tor_UTPGetRBSize():
     86 tor_UTPGotIncomingConnection():
      1 tor_UTPOnErrorProc():
 368788 tor_UTPOnOverheadProc():
 109627 tor_UTPOnReadProc():
   3286 tor_UTPOnStateChangeProc():
 118472 tor_UTPOnWriteProc():
 184412 tor_UTPSendToProc():

And here are the same calls in the Shadow network:

ubuntu@ip-10-8-18-21:~/src/shadow/resource/examples/scallion/minimal/data$ grep tor_UTP scallion.log | cut -d" " -f8 | sort | uniq -c
     92 tor_UTPGetRBSize(void
     88 tor_UTPGotIncomingConnection(void
     75 tor_UTPOnErrorProc(void
    343 tor_UTPOnOverheadProc(void
    262 tor_UTPSendToProc(void

AFAICS, the Shadow tor nodes attempt to write cells, but only stuff them in their outbufs without ever sending them. Here's an example:

0:0:39:272981 [thread-0] 0:7:8:369751244 [tor-debug] [relay2-59.1.0.0] [scalliontor_logmsg_cb] int channel_tls_write_packed_cell_method(channel_t *, packed_cell_t *)(): Asked to write packed cell uTP 512 bytes (560), got is_writable: 0
0:0:39:273114 [thread-0] 0:7:8:369751244 [tor-debug] [relay2-59.1.0.0] [scalliontor_logmsg_cb] int channel_tls_write_packed_cell_method(channel_t *, packed_cell_t *)(): Asked to write packed cell uTP 512 bytes (1072), got is_writable: 0
0:0:39:289406 [thread-0] 0:7:8:463049592 [tor-debug] [relay2-59.1.0.0] [scalliontor_logmsg_cb] int channel_tls_write_packed_cell_method(channel_t *, packed_cell_t *)(): Asked to write packed cell uTP 512 bytes (1584), got is_writable: 0
0:0:39:335289 [thread-0] 0:7:9:346200001 [tor-debug] [relay2-59.1.0.0] [scalliontor_logmsg_cb] int channel_tls_write_packed_cell_method(channel_t *, packed_cell_t *)(): Asked to write packed cell uTP 512 bytes (2096), got is_writable: 0

The first line says that 512 bytes are to be sent, in addition to the 48-byte connection ID. The number in parentheses is the number of bytes in the outbuf after attempting to send. So, all 560 bytes remain in the buffer.

In the second line, another 512 bytes are to be sent, but they're again only added to the buffer. The same applies to the third and fourth lines. Tor then gives up on this connection and tries another one, without success.

I uploaded the full scallion.log here (7.5M).

Any idea what else I could try to make the utp branch work in Shadow?

comment:26 Changed 8 months ago by massar

  • Cc jeroen@… added

comment:27 in reply to: ↑ 25 ; follow-up: Changed 8 months ago by robgjansen

Replying to karsten:

Any idea what else I could try to make the utp branch work in Shadow?

Looking at your log file ('grep udp_ scallion-utp-minimal-2013-08-23.log') I'm only seeing udp_sendUserData and udp_receiveUserData getting called on 20 and 30 bytes of data at a time. I believe Steven mentioned this is internal utp protocol traffic. Indeed, Tor application data is not being sent to the Shadow network layer. The question is: why?

I'd like to trace this, but when building your new utp branch like this:

LDFLAGS="-L/home/rob/test/karsten_utp/libutp" CFLAGS="-I/home/rob/test/karsten_utp/libutp" LIBS="-lutp" ./configure --disable-asciidoc

I get this error:

src/or/libtor.a(channeltls.o): In function `tor_UTPOnReadProc':
channeltls.c:(.text+0x53e): undefined reference to `UTP_SetUserdata'

What am I missing?

Last edited 8 months ago by robgjansen (previous) (diff)

comment:28 in reply to: ↑ 27 ; follow-up: Changed 8 months ago by sjmurdoch

Replying to robgjansen:

I get this error:

src/or/libtor.a(channeltls.o): In function `tor_UTPOnReadProc':
channeltls.c:(.text+0x53e): undefined reference to `UTP_SetUserdata'

What am I missing?

You will need to use my fork of libutp: https://github.com/sjmurdoch/libutp (master branch, currently 569c044e8bd7b2a01476d4885c205f77be363d1f).

comment:29 follow-up: Changed 8 months ago by robgjansen

I spent some time looking into this problem. Here are some observations.

libutp does not have on_write and on_read functions that the OS (or Shadow) calls. In fact, it does not handle OS events in any way -- Tor is responsible for that (and it uses libevent to make this less painful). Tor does this in connection.c line 2025.

It appears that only the one utp_listener socket is used for each channeltls.

Tor creates a *READ* event for the utp listener, hands the event off to libevent, and then never looks at it again. This seems very fragile: all writes can only be triggered during a read event. Even worse, the return value for sendto is never checked and handled: I'm not sure of the implications of sendto failing here. (Failing could in fact mean EAGAIN, i.e. EWOULDBLOCK, and isn't necessarily a bad thing.)

utp_read_callback seems to be getting called by Shadow just fine. The problem is that the call to UTP_IsIncomingUTP triggers the sending of a reset here, meaning that libutp never thinks the socket is writable and so UTP_Write never writes.

I suggest the next step here is turning on libutp debug logs and trying to figure out why it's sending resets in UTP_IsIncomingUTP.

comment:30 in reply to: ↑ 29 Changed 7 months ago by karsten

Replying to robgjansen:

I spent some time looking into this problem.

Thanks! Much appreciated.

Here are some observations.

libutp does not have on_write and on_read functions that the OS (or Shadow) calls. In fact, it does not handle OS events in any way -- Tor is responsible for that (and it uses libevent to make this less painful). Tor does this in connection.c line 2025.

It appears that only the one utp_listener socket is used for each channeltls.

Tor creates a *READ* event for the utp listener, hands the event off to libevent, and then never looks at it again.

Yes, so far this all matches my understanding.

This seems very fragile: all writes can only be triggered during a read event.

Aha! This would explain why 346d38c makes the utp branch much faster. Of course, reverting that commit and fixing the actual issue would be much better. How would we fix this and make it less fragile?

Even worse, the return value for sendto is never checked and handled: I'm not sure of the implications of sendto failing here. (Failing could in fact mean EAGAIN, i.e. EWOULDBLOCK, and isn't necessarily a bad thing.)

Agreed that return values should be checked. I think I did that while debugging and didn't get a non-zero return value, but that was before making the other fixes. Worth adding this back in. What would we do if we can't send, other than logging this fact? Want to write a patch that I add to my utp branch, or shall I write that patch?
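For reference, here is a minimal sketch of the kind of check being discussed; the callback signature comes from libutp, and the function name matches the branch's tor_UTPSendToProc, but the body, the utp_listener variable, and the log messages are assumptions, not the branch's code:

/* Hypothetical sketch: check sendto()'s return value in the uTP send-to
 * callback, tolerate EAGAIN/EWOULDBLOCK, and warn on real errors instead
 * of ignoring them. */
static void
tor_UTPSendToProc(void *userdata, const byte *p, size_t len,
                  const struct sockaddr *to, socklen_t tolen)
{
  ssize_t n;
  (void)userdata;
  n = sendto(utp_listener, (const void *)p, len, 0, to, tolen);
  if (n < 0) {
    int e = tor_socket_errno(utp_listener);
    if (ERRNO_IS_EAGAIN(e)) {
      log_debug(LD_NET, "uTP sendto would block; dropping this packet for now");
    } else {
      log_warn(LD_NET, "uTP sendto failed: %s", tor_socket_strerror(e));
    }
  }
}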

utp_read_callback seems to be getting called by Shadow just fine. The problem is that the call to UTP_IsIncomingUTP triggers the sending of a reset here, meaning that libutp never thinks the socket is writable and so UTP_Write never writes.

I suggest the next step here is turning on libutp debug logs and trying to figure out why it's sending resets in UTP_IsIncomingUTP.

I'm not entirely sure how to turn on libutp debug logs. I think it should be possible to log libutp messages to tor's logs. Do you know how to do that? If so, do you mind writing a patch? And would it be useful if I created some libutp debug logs in my Chutney network for you to compare to?

comment:31 in reply to: ↑ 28 ; follow-up: Changed 7 months ago by karsten

Replying to sjmurdoch:

Replying to robgjansen:

I get this error:

src/or/libtor.a(channeltls.o): In function `tor_UTPOnReadProc':
channeltls.c:(.text+0x53e): undefined reference to `UTP_SetUserdata'

What am I missing?

You will need to use my fork of libutp: https://github.com/sjmurdoch/libutp (master branch, currently 569c044e8bd7b2a01476d4885c205f77be363d1f).

Speaking of, we might get rid of our custom libutp branch if we do the following two things:

  • Call the existing UTP_SetCallbacks instead of the newly added UTP_SetUserdata, passing the same UTPFunctionTable as we pass to the other call of UTP_SetCallbacks. This one is easy, I think (see the sketch below).
  • Fix the other problem that Steven found ("Needed -fPIC for compiling with Tor") without forking libutp. I don't know enough about this compiler flag to suggest a fix.

If somebody can suggest a fix to the second problem, I'll update the utp tor branch to use the official libutp branch.
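For the first bullet, a sketch of what the change might look like (UTP_SetCallbacks is stock libutp API; the helper and the utp_callbacks table name are assumptions):

/* Hypothetical sketch: attach the channel to an incoming uTP socket using
 * stock libutp's UTP_SetCallbacks() rather than the forked
 * UTP_SetUserdata(). */
extern struct UTPFunctionTable utp_callbacks; /* assumed name for the table
                                                 already used for outgoing
                                                 connections */
static void
utp_attach_channel(struct UTPSocket *utp_sock, channel_tls_t *tlschan)
{
  /* Before (forked libutp):  UTP_SetUserdata(utp_sock, tlschan); */
  UTP_SetCallbacks(utp_sock, &utp_callbacks, tlschan);
}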

comment:32 in reply to: ↑ 31 Changed 7 months ago by robgjansen

Replying to karsten:

If somebody can suggest a fix to the second problem, I'll update the utp tor branch to use the official libutp branch.

One way to handle the '-fPIC' issue without forking libutp is using ENV. That also requires a patch, but one that is more generic and will hopefully be accepted by the libutp devs.

I opened a pull request:
https://github.com/bittorrent/libutp/pull/55
