Investigate 10-second delay in TTFB
While looking into OnionPerf data I noticed a 10-second delay in time to first byte. I'll attach an ECDF shortly.
I started hunting down this issue and found that many of these cases (though not all of them) had their stream detached from a circuit and re-attached to another circuit following a 10-second timeout of some sort. The following example shows relevant controller events:
2019-05-05 09:55:00 1557046500.54 650 STREAM 45043 NEW 0 137.50.19.2:80 SOURCE_ADDR=127.0.0.1:36454 PURPOSE=USER
2019-05-05 09:55:00 1557046500.54 650 STREAM 45043 SENTCONNECT 29430 137.50.19.2:80
2019-05-05 09:55:00 1557046500.69 650 STREAM_BW 45043 13 2 2019-05-05T08:55:00.682587
^^ <- 10 second delay here
2019-05-05 09:55:10 1557046510.69 650 STREAM 45043 DETACHED 29430 137.50.19.2:80 REASON=TIMEOUT
2019-05-05 09:55:10 1557046510.69 650 STREAM 45043 SENTCONNECT 29411 137.50.19.2:80
2019-05-05 09:55:11 1557046511.12 650 STREAM 45043 REMAP 29411 137.50.19.2:80 SOURCE=EXIT
2019-05-05 09:55:11 1557046511.12 650 STREAM 45043 SUCCEEDED 29411 137.50.19.2:80
2019-05-05 09:55:11 1557046511.68 650 STREAM_BW 45043 55 10 2019-05-05T08:55:11.682353
2019-05-05 09:55:12 1557046512.68 650 STREAM_BW 45043 0 637971 2019-05-05T08:55:12.681636
2019-05-05 09:55:13 1557046513.21 650 STREAM_BW 45043 0 410673 2019-05-05T08:55:13.211188
2019-05-05 09:55:13 1557046513.21 650 STREAM 45043 CLOSED 29411 137.50.19.2:80 REASON=DONE
1% of measurements seems a lot to me, and I could imagine that these cases are particularly annoying to users. Maybe this timeout could be shorter or made more dynamic like other timeouts.
If the timeout cannot be changed, it would be nice to tell the user that their stream has just been attached to another circuit and that that's why they had to wait for the past 10 seconds.