Opened 7 years ago

Closed 20 months ago

#5334 closed task (wontfix)

Make simulator authors aware of "sparse exit traffic" and "varying directory load" wishlist items

Reported by: arma
Owned by:
Priority: Medium
Milestone:
Component: Metrics/Analysis
Version:
Severity:
Keywords:
Cc: robgjansen, kevin, tschorsch@…, karsten
Actual Points:
Parent ID: #4682
Points:
Reviewer:
Sponsor:

Description

Proposal 182 points out that we let the read bucket go negative (by reading the rest of the TLS record, which was already actually read by openssl so we might as well use what it says), but then we don't let the corresponding write occur, so we trap the cells inside Tor until the write bucket can catch up. The suggested fix is to allow writes based on how much we read, to avoid trapping cells.
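To make the trapping effect concrete, here is a toy sketch of the two bucket behaviors described above. It is not Tor's code and not the algorithm from Proposal 182: the function name `simulate`, the one-record-per-tick read pattern, and the choice of burst equal to rate are all assumptions made for illustration. With `couple_writes=False` the write bucket may not go negative, so bytes read by overdrawing the read bucket sit queued; with `couple_writes=True` the relay may write as much as it just read.

```python
def simulate(ticks, rate, record_size, couple_writes):
    """Return the bytes still queued inside the relay after each tick.

    Toy model (assumption): both buckets refill by `rate` per tick and are
    capped at `rate`; every read consumes one whole record of `record_size`.
    """
    read_tokens = write_tokens = rate
    queued = 0
    history = []
    for _ in range(ticks):
        # Read side: if any tokens remain, take the whole record and let
        # the read bucket go negative.
        bytes_read = 0
        if read_tokens > 0:
            bytes_read = record_size
            read_tokens -= record_size
        queued += bytes_read

        # Write side: either limited strictly by the write bucket, or
        # allowed to match what was just read.
        allowed = max(write_tokens, 0)
        if couple_writes:
            allowed = max(allowed, bytes_read)
        written = min(queued, allowed)
        write_tokens -= written
        queued -= written
        history.append(queued)

        # Refill both buckets for the next tick.
        read_tokens = min(rate, read_tokens + rate)
        write_tokens = min(rate, write_tokens + rate)
    return history


strict = simulate(ticks=4, rate=1000, record_size=4096, couple_writes=False)
coupled = simulate(ticks=4, rate=1000, record_size=4096, couple_writes=True)
print(strict)   # bytes trapped per tick when writes are limited strictly
print(coupled)  # bytes trapped per tick when writes may follow reads
```

In this toy both variants move the same number of bytes over the four ticks; the difference is only whether the data waits inside the relay while the write bucket catches up.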

But what situations occur where we end up with fewer write tokens in our token bucket than read tokens? Two big ones come to mind:

  • Sparse streams at exits, where we read a few bytes from a stream and package them into a whole cell. Fetching hundreds of tiny web bug images is a good example, if they're all on different streams. Another example would be an IRC connection, though most users probably don't know what IRC is. (An instant message conversation may or may not be a good example: if the conversation is balanced, the exit would also be reading whole cells from the Tor network and writing only a few bytes, which offsets the scarcity of write tokens.) In simulation, the "sparse exit traffic" issue could be addressed by having some fraction of the clients fetch that sort of traffic, to see what sort of impact it would have if the behavior became popular.
  • Directory fetches produce wild imbalances between reads and writes. The population of users in the simulations seems to fetch a little bit of network information once at the beginning, and then never need it after that. Great, but it means our simulated environment misses out on this aspect of the real network. There are some hackish heuristics in global_write_bucket_low() in src/or/connection.c to decide whether a given relay should decline to answer a directory request, based on rate limits, size of request, current bucket levels, whether we ran out last second, etc. Whatever we do to address proposal 182 should tune these heuristics to a) avoid answering if it would hurt our write bucket too much, yet b) make sure enough places still answer that everything still works smoothly. These contradictory goals seem like they need a realistic simulation framework, and I think neither of the current simulation frameworks handles the topic well?
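As a rough illustration of the first item, the sketch below estimates how many bytes an exit writes toward the Tor network for a given set of small reads from different streams. The constants are assumptions for illustration (a 512-byte cell carrying up to 498 bytes of stream data), the helper name `exit_write_cost` is hypothetical, and TLS and HTTP overhead are ignored.

```python
import math

CELL_LEN = 512           # assumed fixed cell size on the wire, in bytes
RELAY_PAYLOAD_LEN = 498  # assumed stream data bytes that fit in one cell


def exit_write_cost(stream_reads):
    """Return (bytes_read, bytes_written) for reads on separate streams.

    Each entry in `stream_reads` is the number of bytes read from one
    stream; data from different streams is never packed into the same cell.
    """
    bytes_read = sum(stream_reads)
    cells = sum(math.ceil(n / RELAY_PAYLOAD_LEN) for n in stream_reads if n > 0)
    return bytes_read, cells * CELL_LEN


# One hundred hypothetical 43-byte responses, each on its own stream.
read, written = exit_write_cost([43] * 100)
print(read, written, round(written / read, 1))
```

Under these assumptions a stream that delivers a full cell's worth of data costs the write bucket almost exactly what it cost the read bucket, while tiny reads cost the write bucket many times more, which is the imbalance the item describes.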

Change History (7)

comment:1 Changed 7 years ago by Flo

Cc: robgjansen kevin tschorsch@… added; robgjansen kevin removed

comment:2 Changed 7 years ago by kevin

This brings up a good question I don't yet have any answer to: What kind of traffic should we be generating in simulation/emulation? Both Shadow and ExperimenTor distinguish between "web" and "bulk", where "web" is a 320 KiB file download and "bulk" is a 5 MiB file download (of course, neither is close to realistic!). To help understand issues like "sparse exit streams", we first need a better traffic model. Ideas?

comment:3 Changed 7 years ago by robgjansen

I'd like to note that it's easy to make clients trigger the "sparse exit streams" effect, e.g. by having them download 1-byte files. The hard part is, as Kevin noted, coming up with an accurate client traffic model that has close to the right quantity of requests and distribution of requested file sizes.

While collecting client traffic characteristics in a privacy-preserving manner is an open research problem, we can probably do something in the short term that improves our "320KiB or 5MiB" model.

comment:4 Changed 7 years ago by arma

Cc: karsten added

Do the "IM users" in Rob's new client model produce the 'sparse exit streams' situation?

Does that mean we should do another run of #5336 using the client model that includes IM users?

comment:5 in reply to: 4; Changed 7 years ago by karsten

Replying to arma:

> Do the "IM users" in Rob's new client model produce the 'sparse exit streams' situation?
>
> Does that mean we should do another run of #5336 using the client model that includes IM users?

I ran the #5336 simulations 1 month ago with the network model in ~/shadow-git-clone/resource/scallion-hosts/large-m2.4xlarge.tar.xz. Shall I re-run them with a different model? I can do that this weekend if I know what to simulate.

Also, the November 1 deadline has passed. We should either complete this child ticket of #4682 very soon, or remove the parent ticket relationship to be able to close #4682.

comment:6 in reply to: 5; Changed 7 years ago by robgjansen

Replying to karsten:

> Replying to arma:
>
> > Do the "IM users" in Rob's new client model produce the 'sparse exit streams' situation?
> >
> > Does that mean we should do another run of #5336 using the client model that includes IM users?
>
> I ran the #5336 simulations 1 month ago with the network model in ~/shadow-git-clone/resource/scallion-hosts/large-m2.4xlarge.tar.xz. Shall I re-run them with a different model? I can do that this weekend if I know what to simulate.
>
> Also, the November 1 deadline has passed. We should either complete this child ticket of #4682 very soon, or remove the parent ticket relationship to be able to close #4682.

The "IM" client simply fetches a 1 KiB file every [1,5] seconds, and will not create your "sparse exit stream" situation as I understand it, since it will be reading and writing approximately the same amount.

We could configure a new client type that fetches 1-byte files one at a time from a web server, though not from several streams at once. Will that give you the effect you are looking for, or will there not be enough data transferred there to make a difference? (I suspect the latter...)

comment:7 Changed 20 months ago by karsten

Resolution: wontfix
Status: new → closed

Closing tickets in Metrics/Analysis that have been created 5+ years ago and not seen progress recently, except for the ones that "nickm-cares" about.
