Opened 17 months ago

Closed 6 months ago

#7359 closed enhancement (implemented)

Design/implement method for collecting/reporting statistics

Reported by: robgjansen Owned by:
Priority: normal Milestone: Tor: 0.2.5.x-final
Component: Tor Version:
Keywords: performance, simulation, statistics, tor-relay, tor-client Cc: arma, karsten, robgjansen, nickm
Actual Points: Parent ID: #7357
Points:

Description (last modified by robgjansen)

I envision a stats module that is responsible for processing all statistics gathered anywhere in Tor. This module would be responsible for making decisions on where they get published (ControlPort, DirAuth) and how much scrubbing/aggregation is required for those destinations.

If we like that approach, implementation likely involves porting currently collected stats to use the new "stats module knows best".

I'll be working in my github repo in branch "morestats":
https://github.com/robgjansen/torclone.git

Child Tickets

Attachments (1)

karsten-morestats4-mem-total.pdf (17.5 KB) - added by karsten 7 months ago.
Memory consumption when enabling morestats4 events

Download all attachments as: .zip

Change History (24)

comment:1 Changed 17 months ago by robgjansen

  • Type changed from defect to enhancement

comment:2 follow-up: Changed 17 months ago by robgjansen

  • Description modified (diff)

Should we be taking the approach of sending events to the ControlPort and letting an external process turn them into statistics for anything that we're not sending up to daddy? Or are we OK with doing the stats inside of Tor itself. I'm personally in favor of the latter as it minimizes effort required to get useful stats and requires less duplication of code in the long run.

comment:3 Changed 17 months ago by robgjansen

  • Cc nickm added

comment:4 Changed 17 months ago by robgjansen

  • Milestone changed from Tor: 0.2.4.x-final to Tor: unspecified

comment:5 in reply to: ↑ 2 Changed 17 months ago by karsten

Replying to robgjansen:

Should we be taking the approach of sending events to the ControlPort and letting an external process turn them into statistics for anything that we're not sending up to daddy? Or are we OK with doing the stats inside of Tor itself. I'm personally in favor of the latter as it minimizes effort required to get useful stats and requires less duplication of code in the long run.

Letting Tor do it sounds convenient, but only up to the point where we want to change something. Turning observations into statistics may depend on what we want to do with the statistics. If we change our mind there, we'll have to fix that in Tor. It's probably quite inconvenient for simulations if older Tor versions report different statistics than newer versions. I'd rather favor a variant where control port events contain the raw data and where it's up to the application, here Shadow, how to aggregate these data points to get something useful. I bet there's good support for processing control port events in stem that we can use here. (That's just my 2 cents.)

In general, having a common stats module that either prepares control port events, or aggregates data for being sent to the directory authorities, or both, makes sense. This is a major refactoring effort, so I'd be careful mixing that with adding new stats.

In #7358, I asked you for which statistics you were planning to emit asynchronous control port events whenever there's a change, or to emit events periodically. You asked me to move that question here. So, for example, the CIRC, STREAM, ORCONN, BUILDTIMEOUT_SET, and CIRC_MINOR events (Sections 4.1.1, .2, .3, .16, and .19) are emitted whenever there's a change, whereas BW and STREAM_BW (Sections 4.1.4 and .13) are emitted every second. In fact, the mentioned events already cover some of the statistics you were looking for. We'll probably want to design the remaining control port events similar to these.

comment:6 Changed 17 months ago by robgjansen

Replying to karsten:

Replying to robgjansen:

Should we be taking the approach of sending events to the ControlPort and letting an external process turn them into statistics for anything that we're not sending up to daddy? Or are we OK with doing the stats inside of Tor itself. I'm personally in favor of the latter as it minimizes effort required to get useful stats and requires less duplication of code in the long run.

Letting Tor do it sounds convenient, but only up to the point where we want to change something. Turning observations into statistics may depend on what we want to do with the statistics. If we change our mind there, we'll have to fix that in Tor. It's probably quite inconvenient for simulations if older Tor versions report different statistics than newer versions. I'd rather favor a variant where control port events contain the raw data and where it's up to the application, here Shadow, how to aggregate these data points to get something useful. I bet there's good support for processing control port events in stem that we can use here. (That's just my 2 cents.)

I've thought about this and reviewed the Tor controller spec. I agree. Thanks for your insight!

In general, having a common stats module that either prepares control port events, or aggregates data for being sent to the directory authorities, or both, makes sense. This is a major refactoring effort, so I'd be careful mixing that with adding new stats.

Yeah, good point. I'll focus on minimizing changes and only adding the desired event types to collect the new stats.

In #7358, I asked you for which statistics you were planning to emit asynchronous control port events whenever there's a change, or to emit events periodically. You asked me to move that question here. So, for example, the CIRC, STREAM, ORCONN, BUILDTIMEOUT_SET, and CIRC_MINOR events (Sections 4.1.1, .2, .3, .16, and .19) are emitted whenever there's a change, whereas BW and STREAM_BW (Sections 4.1.4 and .13) are emitted every second. In fact, the mentioned events already cover some of the statistics you were looking for. We'll probably want to design the remaining control port events similar to these.

I see. More soon on this...

comment:7 Changed 14 months ago by karsten

  • Milestone changed from Tor: unspecified to Tor: 0.2.5.x-final
  • Status changed from new to needs_review

We made some progress here. Proposal 218 defines four new controller events that shall help understand connection and circuit usage. Branch morestats3 in my public tor branch (which is based on Rob's morestats branch) has a tested implementation that could use some review before merging it into 0.2.5.x.

comment:8 Changed 11 months ago by karsten

Please review branch morestats4 in my public repo. Rebased to master and tested using Shadow.

comment:9 follow-up: Changed 11 months ago by nickm

  • Status changed from needs_review to needs_revision

quick comments:

General:

  • This needs to go along with a patch to control-spec.txt .

e54d664f7bb1205162c1df3495f8ebc30c23d867

  • The last time we added K=V arguments in the middle of controller event, it created trouble for controllers, even though it wasn't supposed to do so, since the spec says we can. Since nobody's actually hacking on Vidalia, would it be possible to just put the new K=V arguments at the end, to avoid possible trouble?

8d1f78c556abb570bb80ea84261c954d9746cf33

  • Instead of being controlled by TestingTorNetwork, should there be a separate option for EVENT_CONN_BW events?
  • The documentation for n_read and n_written should say that they're only used for EVENT_CONN_BW events, and only turned on conditionally. They should also probably have a name that implies that they are only for stats too.
  • The new code in control_update_global_event_mask should get extracted into a function.

c386d2d6ce4c4f58163acb385c7a5de1da8c5e30

  • "extern circuit_t *global_circuitlist" is really not a great way to mediate access to this thing; we should add an accessor function.
  • cell_command_to_string doesn't really belong in control.c
  • append_cell_stats_by_command doesn't seem to document how long the arrays are. I might be a bit more comfortable if it took a structure instead. This
  • Accessing num_used in smartlist_t is now how we do it. Call smartlist_len like every other place in the code.
  • include_if_positive is a weird name for an unsigned type. It can't be negative, after all.
  • control_event_circuit_cell_stats seems to be really huge and full of duplicated code. Can it be refactored into smaller functions?
  • Is it "exitward" or "exit_ward"? I prefer the former.
  • Prefer smartlist_add_asprintf() to tor_asprintf();smartlist_add().
  • Is this really slow in practice? It seems like it's likely to be really slow.
  • Ditto about having another controlling option. We don't want TestingTorNetwork to do everything at once.
  • CELL_COMMAND_MAX would be a better name than CELL_MAX
  • Prefer TOR_SIMPLEQ to manual list management for new queues.
  • This data structure ... how does it work out in practice? It looks like it's going to be hideously inefficient.
  • This code needs unit tests and functional abstraction badly.

dd5ce2157d8c47ffa3686b0579813e7b1aae8069

  • The documentation for the new global_*_emptied thing doesn't really say what they are. Do you mean "Last time at which the ... buckets were emptied in msec since midnight", or something else?
  • "connection_buckets_empty_ts" -- I can't read that name as a verb phrase. Our practice is to try to make function that do things have an active verb in their names.
  • The int cast in that function doesn't give me a great feeling.
  • Here too, TestingTorNetwork should not mean everything.
  • The code to convert time to "msec since midnight" is duplicated code. Let's try not to duplicate code.
  • Holy moly, it's a 13-argument function!!!!! Please, let's be better than that.

2925e2fe786dfd9a27c2dff503996712cd180de4

  • See above about the name and documentation for n_read and n_written.
  • The new code in control_update_global_event_mask should get extracted into a function.

comment:10 in reply to: ↑ 9 Changed 11 months ago by rransom

Replying to nickm:

e54d664f7bb1205162c1df3495f8ebc30c23d867

  • The last time we added K=V arguments in the middle of controller event, it created trouble for controllers, even though it wasn't supposed to do so, since the spec says we can. Since nobody's actually hacking on Vidalia, would it be possible to just put the new K=V arguments at the end, to avoid possible trouble?

It only created trouble for the PyTorCtl-based Python crap. Vidalia handled the new keyword arguments fine.

comment:11 follow-up: Changed 11 months ago by nickm

Started looking at some of the fixes when I meant to look at something else. Thought I'd review anyway, since why not.

a0c5080a48ed7612c72099c8c96dceef00be117e

  • Aren't we going to need a space before the ID= ? I don't see anything to put it there; am I missing it?

16e7f7889a7869220535b76af5c8ddb457068c20

  • Thanks! control_event_circuit_cell_stats() looks downright readable now.

comment:12 in reply to: ↑ 11 Changed 11 months ago by karsten

Replying to nickm:

Started looking at some of the fixes when I meant to look at something else. Thought I'd review anyway, since why not.

a0c5080a48ed7612c72099c8c96dceef00be117e

  • Aren't we going to need a space before the ID= ? I don't see anything to put it there; am I missing it?

Oops. Yes, I should put back the space where it belongs. (I didn't really test this branch yet, other than compiling it, because I still need to make more changes.)

16e7f7889a7869220535b76af5c8ddb457068c20

  • Thanks! control_event_circuit_cell_stats() looks downright readable now.

Cool!

I'll let you know when I'm done with your list of comments. It's a long list, but it's also a good list. Might keep me busy a few more hours.

comment:13 follow-up: Changed 11 months ago by karsten

  • Status changed from needs_revision to needs_review

Replying to nickm:

quick comments:

I finally made it through the list! Please see my updated morestats4 branch. It compiles, makes check-spaces happy, passes --enable-gcc-warnings, and runs in Shadow, but still I expect we'll need another iteration or two. Below I'm replying to all comments that require more than just: "done."

General:

  • This needs to go along with a patch to control-spec.txt .

Sure. I converted proposal 218 into a control-spec.txt patch here:

https://gitweb.torproject.org/user/karsten/torspec.git/shortlog/refs/heads/morestats4

e54d664f7bb1205162c1df3495f8ebc30c23d867

  • The last time we added K=V arguments in the middle of controller event, it created trouble for controllers, even though it wasn't supposed to do so, since the spec says we can. Since nobody's actually hacking on Vidalia, would it be possible to just put the new K=V arguments at the end, to avoid possible trouble?

rransom, I agree that sane controllers should handle this, but I also agree that we should change this, just to be safe.

8d1f78c556abb570bb80ea84261c954d9746cf33

  • Instead of being controlled by TestingTorNetwork, should there be a separate option for EVENT_CONN_BW events?

Added a new option TestingEnableConnBwEvent that depends on TestingTorNetwork being set, is 0 by default, and is changed to 1 if TestingTorNetwork is set. I also added similar options for the other new events that are specified for test networks only.

c386d2d6ce4c4f58163acb385c7a5de1da8c5e30

  • cell_command_to_string doesn't really belong in control.c

Moved to command.c, though I'm not entirely sure it belongs there. If not, where would you want it to be?

  • append_cell_stats_by_command doesn't seem to document how long the arrays are. I might be a bit more comfortable if it took a structure instead. This

Changed to a structure. How does the last sentence end?

  • Is this really slow in practice? It seems like it's likely to be really slow.

Do you mean slow or fast in your second sentence?

If you refer to the comment explaining why we're re-using arrays for all circuits, maybe I didn't just mean slow, but also bad for memory fragmentation.

  • Prefer TOR_SIMPLEQ to manual list management for new queues.
  • This data structure ... how does it work out in practice? It looks like it's going to be hideously inefficient.

I can't really think of a more efficient data structure. What do you have in mind? (I'll hold off changing the code to TOR_SIMPLEQ until I'm sure we want to use a queue there.)

  • This code needs unit tests and functional abstraction badly.

I think functional abstraction is improved and code is now easier to test. I didn't write tests yet, because I'm not quite sure where to start. Would you mind writing one example test that I could use as a start for these functions?

  • connection_buckets_note_empty_ts in connection.c
  • bucket_millis_empty in connection.c
  • sum_up_cell_stats_by_command in control.c
  • append_cell_stats_by_command in control.c
  • format_cell_stats in control.c

Are there any other new functions that need testing badly?

Thanks!

comment:14 in reply to: ↑ 13 ; follow-up: Changed 11 months ago by nickm

Replying to karsten:

Replying to nickm:

quick comments:

I finally made it through the list! Please see my updated morestats4 branch. It compiles, makes check-spaces happy, passes --enable-gcc-warnings, and runs in Shadow, but still I expect we'll need another iteration or two. Below I'm replying to all comments that require more than just: "done."

Okay. It's looking better. Thanks there!

  • append_cell_stats_by_command doesn't seem to document how long the arrays are. I might be a bit more comfortable if it took a structure instead. This

Changed to a structure. How does the last sentence end?

Probably explaining why a structure would be a good idea, I guess.

  • Is this really slow in practice? It seems like it's likely to be really slow.

Do you mean slow or fast in your second sentence?

If you refer to the comment explaining why we're re-using arrays for all circuits, maybe I didn't just mean slow, but also bad for memory fragmentation.

I'm asking here (and again below) what kind of a performance hit gets taken for turning this feature on.

  • Prefer TOR_SIMPLEQ to manual list management for new queues.
  • This data structure ... how does it work out in practice? It looks like it's going to be hideously inefficient.

I can't really think of a more efficient data structure. What do you have in mind? (I'll hold off changing the code to TOR_SIMPLEQ until I'm sure we want to use a queue there.)

Hm. Assuming a 64-bit architecture, this is going to eat an 8 byte pointer for the "next" element, plus 8 bytes for chunk element overhead, for every 5-byte entry we need to hold. We could do better by using some kind of dynamic array structure, which wouldn't have a per-element overhead...

but before we go and do that, we should know whether that actually matters. Hence my asking about how it actually works out in practice. If the memory consumption on a busy relay isn't much comparatively, then there's not much need to re-do this.

  • This code needs unit tests and functional abstraction badly.

I think functional abstraction is improved and code is now easier to test. I didn't write tests yet, because I'm not quite sure where to start. Would you mind writing one example test that I could use as a start for these functions?

  • connection_buckets_note_empty_ts in connection.c
  • bucket_millis_empty in connection.c
  • sum_up_cell_stats_by_command in control.c
  • append_cell_stats_by_command in control.c
  • format_cell_stats in control.c

Well, format_cell_stats is pretty easy: you would make one or more circuit_t* object with the cell statistics you want set on it, and then pass them to format_cell_stats, and then check whether the output is as you expect and whether the circuit is munged. You could stick this in an existing test_*.c module in src/tests/ , or add a new module as appropriate. For information on writing actual tinytest tests, see src/ext/tinytest*.

Is there more you need to know there? I'm not clear on what aspect of writing unit tests isn't making sense.

Are there any other new functions that need testing badly?

Our default should not be "let's test the functions that need it badly". Our default should be "test whatever new and changed functions we can."

comment:15 in reply to: ↑ 14 Changed 11 months ago by karsten

Replying to nickm:

Replying to karsten:

Replying to nickm:

  • Prefer TOR_SIMPLEQ to manual list management for new queues.
  • This data structure ... how does it work out in practice? It looks like it's going to be hideously inefficient.

I can't really think of a more efficient data structure. What do you have in mind? (I'll hold off changing the code to TOR_SIMPLEQ until I'm sure we want to use a queue there.)

Hm. Assuming a 64-bit architecture, this is going to eat an 8 byte pointer for the "next" element, plus 8 bytes for chunk element overhead, for every 5-byte entry we need to hold. We could do better by using some kind of dynamic array structure, which wouldn't have a per-element overhead...

I'm currently running somewhat bigger Shadow simulations to compare the overhead, but that's going to take a while.

but before we go and do that, we should know whether that actually matters. Hence my asking about how it actually works out in practice. If the memory consumption on a busy relay isn't much comparatively, then there's not much need to re-do this.

I wonder, if insertion_command_queue_t is an inefficient data structure, wouldn't that also apply to insertion_time_queue_t? Should we make both of them more efficient? If so, can you explain your dynamic array structure idea in some more detail? Maybe I can give that a try next week.

  • This code needs unit tests and functional abstraction badly.

I think functional abstraction is improved and code is now easier to test. I didn't write tests yet, because I'm not quite sure where to start. Would you mind writing one example test that I could use as a start for these functions?

  • connection_buckets_note_empty_ts in connection.c
  • bucket_millis_empty in connection.c
  • sum_up_cell_stats_by_command in control.c
  • append_cell_stats_by_command in control.c
  • format_cell_stats in control.c

Well, format_cell_stats is pretty easy: you would make one or more circuit_t* object with the cell statistics you want set on it, and then pass them to format_cell_stats, and then check whether the output is as you expect and whether the circuit is munged. You could stick this in an existing test_*.c module in src/tests/ , or add a new module as appropriate. For information on writing actual tinytest tests, see src/ext/tinytest*.

Is there more you need to know there? I'm not clear on what aspect of writing unit tests isn't making sense.

The part where these tests should go was a bit unclear to me, but now it's clearer, I think. Please see my updated morestats4 branch for a first attempt.

Are there any other new functions that need testing badly?

Our default should not be "let's test the functions that need it badly". Our default should be "test whatever new and changed functions we can."

Right, I meant the latter. :)

comment:16 follow-up: Changed 10 months ago by nickm

  • Status changed from needs_review to needs_revision

So, my questions about efficiency and performance are really questions, and not rhetorical. I'm asking: in practice, when you turn this feature on and run a reasonably busy simulation, is the performance/memory hit acceptable or not?

I'm asking that because I suspect that it's going to consume a whole lot of memory. And if it does, we can make it more efficient. But if it doesn't (and you've tested it out, so I'm hoping you will know), then there is not any point in putting the effort into making it more efficient.

Does that make sense?

More issues on review:

  • You're passing struct timeval on the stack; our convention is to pass around a "const struct timeval *" instead.
  • What keeps the various bandwidth events from getting enabled when not in Testing mode? I'm seeing that for some but not all of the other new event types.
  • The uint64_t * pointer arguments to append_cell_stats_by_command should be const. That function's documentation should document the lengths of those arrays.

Something to do AFTER merge:

  • Go through all the core functions that grew new if blocks here, and move the bodies of those if blocks into new functions. We have lots of places in Tor where ancillary functionality is crammed into core functions, but we shouldn't perpetuate that.


For the next branch like this:

  • Please learn the --fixup and --squah arguments to git commit.

comment:17 in reply to: ↑ 16 Changed 9 months ago by robgjansen

Replying to nickm:

So, my questions about efficiency and performance are really questions, and not rhetorical. I'm asking: in practice, when you turn this feature on and run a reasonably busy simulation, is the performance/memory hit acceptable or not?

I'm asking that because I suspect that it's going to consume a whole lot of memory. And if it does, we can make it more efficient. But if it doesn't (and you've tested it out, so I'm hoping you will know), then there is not any point in putting the effort into making it more efficient.

Of interest here is recent functionality I've added to Shadow. It now has the ability to track memory usage of simulated nodes over time (see this issue on GitHub). In the XML file, you include the 'heartbeatloginfo' attribute like

<node id="relay" heartbeatloginfo="node,ram,socket" ...>
...

For this ticket, you probably only need heartbeatloginfo="ram", and only on the nodes that you want to profile because Shadow will consume significantly more RAM to track the malloced pointers and sizes.

Once your simulation is complete, the contrib/analyze.py script will parse allocated, deallocated, and running total RAM usage over time from the scallion.log file, and can also be used to graph the results.

Changed 7 months ago by karsten

Memory consumption when enabling morestats4 events

comment:18 Changed 7 months ago by karsten

  • Status changed from needs_revision to needs_review

Replying to nickm:

So, my questions about efficiency and performance are really questions, and not rhetorical. I'm asking: in practice, when you turn this feature on and run a reasonably busy simulation, is the performance/memory hit acceptable or not?

I'm asking that because I suspect that it's going to consume a whole lot of memory. And if it does, we can make it more efficient. But if it doesn't (and you've tested it out, so I'm hoping you will know), then there is not any point in putting the effort into making it more efficient.

Does that make sense?

I finally have some Shadow results. This took much longer than expected, but the good news is that I found (and fixed) a segfault in my tor branch with Shadow.

Please see attached graph which shows memory usage over time. Black is the tor commit before my first morestats4 commit, red is my morestats4 branch where all the new events are disabled, and blue is my morestats4 branch with everything enabled. I don't see much of a difference in memory usage.

More issues on review:

  • You're passing struct timeval on the stack; our convention is to pass around a "const struct timeval *" instead.

Fixed in my updated morestats4 branch.

  • What keeps the various bandwidth events from getting enabled when not in Testing mode? I'm seeing that for some but not all of the other new event types.

The CIRC_BW event is not limited to testing mode, because it's only emitted for origin circuits. Should I make this clearer somewhere? Or do you think it's a problem that non-testing nodes can enable this event type?

  • The uint64_t * pointer arguments to append_cell_stats_by_command should be const. That function's documentation should document the lengths of those arrays.

Fixed in my updated morestats4 branch.

Something to do AFTER merge:

  • Go through all the core functions that grew new if blocks here, and move the bodies of those if blocks into new functions. We have lots of places in Tor where ancillary functionality is crammed into core functions, but we shouldn't perpetuate that.

Will do.

For the next branch like this:

  • Please learn the --fixup and --squah arguments to git commit.

So, I now know how to use those commands, and I agree I should use them. But I'm not sure how you'd want the morestats4 branch to look like in the end. Do you want just the five commits saying "Add XY" and the rest as fixup! commits? Or the "Tweak XY" commits as squash! commits, because they contain somewhat useful commit messages what fixes you suggested? Or something else? In any case, I'm happy to do a rebase and push a morestats5 branch before you merge, using a commit history that you like.

comment:19 Changed 6 months ago by nickm

The branch looks better now. One change that it needs: you *don't* need to cast uint64_t * to const uint64_t * implicitly. The cast is always safe, so C does it for you. Doing the cast explitictly can hide bugs if the input is secretly the wrong type.

So, I now know how to use those commands, and I agree I should use them. But I'm not sure how you'd want the morestats4 branch to look like in the end.

No need to use squash/apply with morestats4; but please use them the next time you're writing a branch

---

This is good to merge, except that it gives me a bunch of merge conflicts that I'm not sure how to resolve. Those surrounding cell_queue_append_packed_copy's interface are the most confusing. Do you want to make a merge branch for me? (To do this, start a new branch based on master, then merge morestats4 into it, resolve the conflicts, commit the merge, and push that branch.) This is better in this case than a rebase, since I'd need to review a rebased branch more or less from scratch and wouldn't have an easy time seeing what changed.

comment:20 Changed 6 months ago by karsten

How's my morestats5 branch? (Thanks!)

comment:21 Changed 6 months ago by nickm

  • Resolution set to implemented
  • Status changed from needs_review to closed

Added a clarification, and merged.

I couldn't squash the squash! safely, since it proceeded a merge. Next time, though, I'll be able to apply those. Thanks for starting here!

comment:22 Changed 6 months ago by karsten

  • Resolution implemented deleted
  • Status changed from closed to reopened

Thanks for reviewing, tweaking, and merging! Re-opening for two minor reasons:

  • I forgot to include a changes file. Please merge the latest commit in my tor morestats5 branch which contains one.

Thanks!

comment:23 Changed 6 months ago by nickm

  • Resolution set to implemented
  • Status changed from reopened to closed

Merged those; thanks!

Note: See TracTickets for help on using tickets.