Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#6177 closed task (implemented)

Refactor rend_service_introduce()

Reported by: andrea Owned by:
Priority: Medium Milestone: Tor: 0.2.4.x-final
Component: Core Tor/Tor Version:
Severity: Keywords: tor-relay
Cc: Actual Points:
Parent ID: #6313 Points:
Reviewer: Sponsor:

Description

This function looks like something that crawled out of Cthulhu's navel, loathesomely redolent of spheres. Sanity is a scarce resource and we shouldn't squander it thus.

Child Tickets

Attachments (1)

top_offenders.txt (6.5 KB) - added by nickm 7 years ago.
cyclomatic complexity >= 20, acccording to cyclo-2.0

Download all attachments as: .zip

Change History (31)

comment:1 Changed 7 years ago by nickm

Ia fhtagn!

We should probably also be on the lookout for other such functions. rend_service_introduce() is one of the worst, but it's not the only offender.

Do you think any of the automated code complexity metrics are worth using here?

comment:2 Changed 7 years ago by andrea

Which automated complexity metrics do you have in mind?

comment:3 in reply to:  2 Changed 7 years ago by nickm

Replying to andrea:

Which automated complexity metrics do you have in mind?

Frankly, I know very little about the field; I was hoping you might know some. The only one I could name off the top of my head is "Cyclomatic Complexity", which is pretty old. On the plus side, there seem to be a lot of tools to calculate it. Following the links from that article on Wikipedia turned up some other measures I hadn't heard of, some of which seem to be proprietary. A simpler measurements to look at might be loc-per-function; a more historically focused approach might be to look at the total number of git commits affecting a function (on the theory that volatility is problematic).

Personally, I think if we pick a not-too-stupid measurement and have a look at all the functions where it's high, we'll do a good job of finding functions to simplify--just so long as we don't make a religion out of minimizing the metric for its own sake.

comment:4 Changed 7 years ago by andrea

I must confess equal ignorance, but agree that those are all interesting metrics (and would probably favor whichever we don't have to write any new tools to measure), and with the point about non-religiosity.

comment:5 Changed 7 years ago by andrea

Observation: 'bare' LOC without any preprocessing is actually a dreadful metric because it penalizes comments.

comment:6 Changed 7 years ago by nickm

There are some tools of interest at http://www.maultech.com/chrislott/resources/cmetrics/ . cylco-2.0, with the accompanying patch, seems to generate cyclomatic complexity or a reasonable approximation thereto. It runs into problems with some of our files, but among the ones it *can* check out, rend_service_introduce is in the top 10 most complex.

We might want to check out some other tools from that page as well.

comment:7 Changed 7 years ago by nickm

Component: Tor Hidden ServicesTor Relay
Priority: minornormal
Summary: Refactor rend_service_introduce()Refactor complicated functions

Renaming the bug so it's about complicated functions in general.

I'm attaching a list of all the functions with cyclomatic complexity >= 20. 15 is considered a fairly permissive upper-bound by people who like this metric.

comment:8 Changed 7 years ago by rransom

That list would be slightly more useful if it included each function's filename. (For example, there are two main functions in two different files.)

comment:9 Changed 7 years ago by nickm

Here's a better one.

The horrible ugly pipeline I used to make it was:

 ( for f in src/tools/*.c src/tools/*/*.c src/or/*.c src/common/*.c ; do 1 gcc -E -I. -Isrc/common -Isrc/or $f |  ~/build/cyclo-2.0/mcstrip  | ~/build/cyclo-2.0/cyclo -c | perl -pe "print \"$f \""; done ) | sort -n -k3 |grep -v OpenBSD | tail -150

plus a little postprocessing.

Changed 7 years ago by nickm

Attachment: top_offenders.txt added

cyclomatic complexity >= 20, acccording to cyclo-2.0

comment:10 Changed 7 years ago by nickm

Summary: Refactor complicated functionsRefactor rend_service_introduce()

Changing this back to rend_service_introduce(), since otherwise this ticket will be infinite. Will add a new ticket for refactoring complex stuff in general.

Andrea has started on a branch to simplify this function, in user/andrea/tor.git branch bug6177.

So far, it implements a replaycache_t type, to move the replay cache logic out of the function.

Comments:

  • The functions need documentation
  • By convention, when we have a function like the *_internal() functions here, where the function would be static if we didn't want to unit test it, we wrap its declarations inside "#ifdef RENDCACHE_PRIVATE", and then "#define RENDCACHE_PRIVATE" in rendcache.c and test_rendcache.c
  • Similarly with the replaycache_t structure definition.
  • There's a bunch of internal space inside parenthesis where we wouldn't ordinarily put it in the rest of the code.
  • I wonder if replaycache_check() has a better verb or predicate name than "check" that we could use. replaycache_item_seen() maybe? replaycache_add_and_test()?
  • If one of the items in the sanity-check fails, we probably want to log a message in domain LD_BUG, so that we know that our program sucks and we can fix it.


Otherwise, looks good so far!

comment:11 Changed 7 years ago by nickm

Milestone: Tor: unspecifiedTor: 0.2.4.x-final
Parent ID: #6313

comment:13 Changed 7 years ago by andrea

Further requested changes made

comment:14 Changed 7 years ago by andrea

I've modified rend_service_introduce() to use replaycache_t in my bug6177 branch.

comment:15 Changed 7 years ago by andrea

I added a replaycache_add_test_and_elapsed() function for the replay cache which provides the elapsed time since the last time the digest in question was seen when there's a hit, to facilitate the log messages in that case in rend_service_introduce().

I've also added a counter for INTRODUCE2 cells seen, since the old implementation of intro_point_accepted_intro_count() relied on the size of the accepted_intro_rsa_parts digest map for that, which will not work reliably now that accepted_intro_rsa_parts is a replaycache_t and ages out old values.

comment:16 Changed 7 years ago by nickm

In 5200b60102e82, I think that accepted_intro_rsa_parts might need to have an infinite timeout (ie, entries must never be removed from the cache). That map needs to keep growing for so long as the introduction point exists, since if somebody is playing malleability games, we can't necessarily trust the timestamp value.

otherwise, it's still looking good to me.

comment:17 Changed 7 years ago by andrea

Okay, changed to add support for non-expiring cache.

comment:18 Changed 7 years ago by andrea

First successful connections using new INTRODUCE2 parser. There's still some room to reorganize and clean up rend_service_introduce() later on in the circuit-building code, though.

Is there any not-too-painful way to send older (pre-V3) cells to test those parts of the parser?

comment:19 Changed 7 years ago by andrea

Also, I found a discrepancy between the format actually handled by the old rend_service_introduce() and emitted in rendclient.c on the one hand, and the specification on the other. See bug 6415.

comment:20 in reply to:  18 Changed 7 years ago by nickm

Replying to andrea:

First successful connections using new INTRODUCE2 parser. There's still some room to reorganize and clean up rend_service_introduce() later on in the circuit-building code, though.

Is there any not-too-painful way to send older (pre-V3) cells to test those parts of the parser?

From what I can tell, we have supported v3 since 0.2.1, whereas the oldest supported Tor version is 0.2.2. So maybe (after confirming that) we should drop support for all the older introduce cell formats. I've added #6418 on that one.

The "easiest" ways I can think of to test these parts of the parser would be:

  • Add a unit test where you manually generate the cells, or use a canned value.

or

  • Hack up rend_client_send_introduction locally so that it can never detect that the v3 protocol is supported; try it out with a test hidden service.

comment:21 Changed 7 years ago by nickm

Just read the code and I like it so far. I think we can (and should!) add unit tests to exercise nearly all of the new functions as a part of this fix. Let me know if you'd like any suggestions for generating hybrid-encrypted stuff, or what to test, etc.

Minor notes:

I think you need to add replaycache.h to noinst_HEADERS.

By convention, our foo_free() functions *always* check for NULL; there's no need to do an additional check first if all you're doing is foo_free().

By our convention, only one-liner "ifs" with no elses get to have the condition and the branch on the same line. Not a big deal; no need to do a patch for this one.

In the future, it's easier for me to read squash! and fixup! commits if the commit message describes what they do to the commit they're modifying.

I think many of the log_warn instances want to use LOG_PROTOCOL_WARN, perhaps.

I it would be a good idea to uniformly using the name err_msg_out for a char function parameter where we might stick an error. (In some places, you're calling that "err_msg"; in some places, err_msg is a char *).

comment:22 Changed 7 years ago by andrea

Re: err_msg_out...

No, I'm using err_msg as a char * for a local variable to hold an error description in or receive one from a called function, and err_msg_out as a char for a parameter I write that out to before returning if it's not NULL.

comment:23 in reply to:  22 Changed 7 years ago by nickm

Replying to andrea:

Re: err_msg_out...

No, I'm using err_msg as a char * for a local variable to hold an error description in or receive one from a called function, and err_msg_out as a char for a parameter I write that out to before returning if it's not NULL.

I'm confused then. What about the err_msg parameters for rend_service_parse_intro_for_v* ? They're named err_msg, and declared as char .

comment:24 Changed 7 years ago by andrea

Okay, then I should probably rename those.

comment:25 Changed 7 years ago by andrea

So far:

  • err_msg/err_msg_out renaming done
  • *_free() calls changed
  • noinst_HEADERS in src/or/Makefile.am now has replaycache.h

Suggestions on constructing the encrypted cells for unit tests would be appreciated.

comment:26 Changed 7 years ago by andrea

I've added a set of unit tests for the rend_intro_cell_t parser. While developing them I discovered a bug in rend_service_free_intro() that could cause crashes with v3 cells with auth data. This has been fixed.

comment:27 Changed 7 years ago by nickm

Looks good to me at first glance. I'll read it again before merging.

BTW, have you tried running this through gcov? Did you like what you saw?

comment:28 Changed 7 years ago by nickm

Resolution: implemented
Status: needs_reviewclosed

okay. Squashed, tweaked whitespace, condensed test functions, and merged. Thanks!

comment:29 Changed 7 years ago by nickm

Keywords: tor-relay added

comment:30 Changed 7 years ago by nickm

Component: Tor RelayTor
Note: See TracTickets for help on using tickets.