Opened 3 years ago

Closed 5 weeks ago

#20218 closed defect (fixed)

Fix and refactor and redocument routerstatus_has_changed

Reported by: nickm Owned by: nickm
Priority: Medium Milestone: Tor: 0.4.3.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: ipv6, 029-proposed, tor-control, easy, spec-conformance, review-group-31, 034-triage-20180328, 034-removed-20180328 043-can
Cc: maurice_pibouin Actual Points: 0.6
Parent ID: Points: .1
Reviewer: teor Sponsor:

Description

The routerstatus_has_changed() function is used by controllers to to tell which rs entries are new. But it only looks at a fraction of the fields which might change in a routerstatus. Also, it only checks for semantic changes that tor cares about (though this is not documented).

Child Tickets

Attachments (2)

Change History (56)

comment:1 Changed 3 years ago by nickm

(do not change this till #19958 is merged)

comment:2 Changed 3 years ago by teor

Milestone: Tor: 0.2.???Tor: 0.3.???

Milestone renamed

comment:3 Changed 3 years ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.???Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:4 Changed 3 years ago by nickm

Keywords: tor-03-unspecified-201612 removed

Remove an old triaging keyword.

comment:5 Changed 3 years ago by dgoulet

Unify controller keyword to "tor-control".

comment:6 Changed 3 years ago by dgoulet

Keywords: tor-control added; controller removed

Unify "controller" keyword to "tor-control".

comment:7 Changed 3 years ago by nickm

Keywords: easy spec-conformance added

comment:8 Changed 2 years ago by aruna1234

So the issue is with that router_has_changed() is not properly documented. The fact that it only checks for semantic changes that tor cares about, is not documented?

comment:9 Changed 2 years ago by teor

There are a few different issues:

  • the function comment needs to say that the function checks for semantic changes
  • the function is used by two different parts of the code, and the controller part needs it to check for more changes

The controller part of the code wants all new descriptors:
https://gitweb.torproject.org/torspec.git/tree/control-spec.txt#n2144

The other part of the code just wants descriptors that are different enough to matter.

This means adding an argument to the function that makes it check for different kinds of changes. Or making two functions. But we try not to copy and paste code, we get functions to call other functions instead.

comment:10 Changed 2 years ago by aruna1234

By different kinds of changes, you mean checking whether there are any new descriptors or not? And then changing the function calls as well.

comment:11 Changed 2 years ago by teor

routerstatus_has_changed() checks for some differences in descriptors. This is the right thing to do when the function is called from some code. But it hides some changed descriptors from controllers.

When called from control port code, the function should check if the timestamp has changed. That will give the controller *all* the changed descriptors.

comment:12 Changed 2 years ago by aruna1234

Hey!

I went through the code of networkstatuc.c.Can the timestamp be changed by checking time_t now parameter?

comment:13 Changed 2 years ago by teor

No, the timestamp is in the router structure that is passed to the function.

comment:14 Changed 2 years ago by aruna1234

The definition of the routerstatus_t structure, as in or.h has only time_t published_on. Is that the required timestamp to check.
A check like, (a->published_on != b->published_on) would do the task?

comment:15 in reply to:  9 Changed 2 years ago by teor

Yes, this field is the right field.

Here's how to do this patch:

  1. Create a new function that just checks this field
  2. Find all calls to router_has_changed() in control.c and from controller helper functions
  3. Replace them with a call to this new function

comment:16 Changed 2 years ago by teor

But that check isn't the right check: you only want newer descriptors.

comment:17 Changed 2 years ago by aruna1234

Then this check would be the right one, I suppose:

/* Function to check for new descriptors. Returns 0 if changed, else 1 */

static int check_timestamp(const routerstatus_t *a)
{
	if(a->published_on == a->now)
		return 1;
	else
		return 0;
}
Last edited 3 months ago by arma (previous) (diff)

comment:18 Changed 2 years ago by aruna1234

Have added a patch file. Do let me know, if the changes are correct, although I doubt as 'now' is not a structure member.
But if I need to check for change in timestamp, and published_on is my parameter, then I need to compare it with something.

comment:19 Changed 2 years ago by teor

Your new function should have the same arguments and return type as routerstatus_has_changed():

static int
routerstatus_has_changed(const routerstatus_t *a, const routerstatus_t *b)

https://gitweb.torproject.org/tor.git/tree/src/or/networkstatus.c#n1541

So now that you have two routerstatus objects, what can you do?

comment:20 Changed 2 years ago by teor

Keywords: ipv6 added
Milestone: Tor: unspecifiedTor: 0.3.4.x-final
Status: newneeds_revision

Oh, I'm sorry, I've been giving you bad advice. I must have searched for the wrong function name.

We don't need a new function. We need to change the existing function to look at more fields.

Here's what we need to do:

The routerstatus_t struct has a lot of fields. routerstatus_has_changed() checks a few of them. But it needs to check every field that is output by networkstatus_getinfo_helper_single().

These missing fields can be compared using !=:

  • published
  • ipv6_orport
  • is_v2_dir
  • bandwidth_kb

This missing field must be compared using tor_addr_compare(..., ..., CMP_EXACT):

  • ipv6_addr

We don't need to worry about the summarised IPv4 exit policy output by networkstatus_getinfo_helper_single(). The descriptor_digest check will find changes in this field.

Then we need to add comments to routerstatus_has_changed() saying that it checks for changes that are output by the control port. And we need to add a comment to routerstatus_format_entry() saying that any extra control port fields need to be added to routerstatus_has_changed() as well.

comment:21 Changed 2 years ago by aruna1234

Have added the extra checks, and uploaded a new patch file.
Do let me know if there are any further changes.

comment:22 Changed 2 years ago by teor

This looks good, but please submit one patch file based on the master branch.

comment:24 Changed 2 years ago by teor

Hi, this branch doesn't compile, it has extra "+" at the start of some lines.
Please run "make check" before submitting patches,

Also, please remove this comment, because this patch fixes it:

   // XXXX this function needs a huge refactoring; it has gotten out
   // XXXX of sync with routerstatus_t, and it will do so again.

comment:25 Changed 2 years ago by aruna1234

Oh !!My bad:(
I have done the final changes:https://github.com/ArunaMaurya221B/tor/commits/Bug-20218

comment:26 Changed 2 years ago by teor

Thanks! The check for tor_addr_compare() is around the wrong way.
Just like ==, tor_addr_compare() returns true when its arguments are equal.

You need to add a ! before the function call.

Edit: autocorrect does not understand "its"

Last edited 2 years ago by teor (previous) (diff)

comment:28 Changed 2 years ago by teor

Actual Points: 0.5
Status: needs_revisionneeds_review

This looks fine to me, but it needs a review, and a changes file. It might also help if it had unit tests.

A question for the team:
How can we make sure we update functions when we add new members to a struct?

comment:29 Changed 2 years ago by nickm

Keywords: review-group-31 added

comment:30 Changed 2 years ago by nickm

Reviewer: nickm

comment:31 Changed 2 years ago by nickm

Status: needs_reviewneeds_revision

teor, I think you have tor_addr_compare backwards. Its documentation says:

 * Given two addresses <b>addr1</b> and <b>addr2</b>, return 0 if the two
 * addresses are equivalent under the mask mbits, less than 0 if addr1
 * precedes addr2, and greater than 0 otherwise.

So the original use of tor_addr_compare (without "!") is correct.

(Unit tests would have caught this bug; that's one reason why unit tests are so great. ;) Any chance of writing those?)

As a smaller issue, it seems that this patch says the same line twice:

         a->is_hs_dir != b->is_hs_dir ||
         a->is_hs_dir != b->is_hs_dir ||

---
Another question: how did you decide on the list of fields to check? It seems that some but not all of the fields in routerstatus_t are now checked for equality, but I don't understand the rationale about why the remaining ones (pv, exitsummary) are not. Is that intentional?

---

Teor asks:

How can we make sure we update functions when we add new members to a struct?

In some cases, using a constructor for a struct instance will work, since we have turned on the options that let us know about any uninitialized members. But in cases like this, I don't see an easy way to use a constructor.

One option here would be to stop assuming that we list all relevant the members here. We could instead change the code that we only list the irrelevant members, by:

  1. Making sure that when we construct routerstatus_t, we initialize the whole object to 0.
  2. In a function like this, using memcpy() to make a temporary copy of the routerstatus_t.
  3. Instead of listing all the relevant members in a comparison, we could just use fast_memeq() to compare. If there are some members we don't want to compare, we would set them to 0 before calling fast_memeq().

I'm not sure if this is actually a good idea, though.

comment:32 in reply to:  31 Changed 2 years ago by teor

Replying to nickm:

teor, I think you have tor_addr_compare backwards. Its documentation says:

 * Given two addresses <b>addr1</b> and <b>addr2</b>, return 0 if the two
 * addresses are equivalent under the mask mbits, less than 0 if addr1
 * precedes addr2, and greater than 0 otherwise.

So the original use of tor_addr_compare (without "!") is correct.

You're right. Oops!

(Unit tests would have caught this bug; that's one reason why unit tests are so great. ;) Any chance of writing those?)

I think we should have unit tests.

As a smaller issue, it seems that this patch says the same line twice:

         a->is_hs_dir != b->is_hs_dir ||
         a->is_hs_dir != b->is_hs_dir ||

---
Another question: how did you decide on the list of fields to check? It seems that some but not all of the fields in routerstatus_t are now checked for equality, but I don't understand the rationale about why the remaining ones (pv, exitsummary) are not. Is that intentional?

We only check the fields that are output via the control port.
We should revise this like to say "only" not "also":

/* Given two router status entries for the same router identity, return 1 if
 * if the contents have changed between them. Otherwise, return 0.
 * It also checks for changes for that are output by control port. */

https://github.com/ArunaMaurya221B/tor/commit/e29292dda5ef68e441f267864357a7568b29588e#diff-fcb162b79906521706b54d280b939d57R1540

---

Teor asks:

How can we make sure we update functions when we add new members to a struct?

In some cases, using a constructor for a struct instance will work, since we have turned on the options that let us know about any uninitialized members. But in cases like this, I don't see an easy way to use a constructor.

One option here would be to stop assuming that we list all relevant the members here. We could instead change the code that we only list the irrelevant members, by:

  1. Making sure that when we construct routerstatus_t, we initialize the whole object to 0.
  2. In a function like this, using memcpy() to make a temporary copy of the routerstatus_t.
  3. Instead of listing all the relevant members in a comparison, we could just use fast_memeq() to compare. If there are some members we don't want to compare, we would set them to 0 before calling fast_memeq().

I'm not sure if this is actually a good idea, though.

I think it is enough to comment the control port printing function, and this function, and say that they must be kept in sync.

comment:33 Changed 2 years ago by nickm

Keywords: 034-triage-20180328 added

comment:34 Changed 2 years ago by nickm

Keywords: 034-removed-20180328 added

Per our triage process, these tickets are pending removal from 0.3.4.

comment:35 Changed 23 months ago by nickm

Milestone: Tor: 0.3.4.x-finalTor: unspecified

These needs_revision, tickets, tagged with 034-removed-*, are no longer in-scope for 0.3.4. We can reconsider any of them, if somebody does the necessary revision.

comment:36 Changed 4 months ago by maurice_pibouin

Cc: maurice_pibouin added

comment:37 Changed 3 months ago by maurice_pibouin

Status: needs_revisionneeds_review

First patch on tor, please tell me if anything is wrong.
See https://github.com/vnepveu/tor/tree/ticket20218.
edit : change branch name

Last edited 3 months ago by maurice_pibouin (previous) (diff)

comment:38 Changed 3 months ago by teor

Milestone: Tor: unspecifiedTor: 0.4.3.x-final

If you can make a GitHub pull request against torproject/tor/master, the reviewer will find it easier to review your code.

comment:39 Changed 3 months ago by maurice_pibouin

Created pull request as suggested : https://github.com/torproject/tor/pull/1549

comment:40 Changed 3 months ago by teor

Reviewer: nickmnickm, teor
Status: needs_reviewneeds_revision

Just a few changes needed to the fields, comments, and changes file.

I also think we need unit tests for:

  • routerstatus_has_changed(), and
  • routerstatus_format_entry(..., NS_CONTROL_PORT, ...).

comment:41 Changed 2 months ago by maurice_pibouin

Status: needs_revisionneeds_review

Made changes according to reviews, will add unit tests at a later time if that's ok.

comment:42 Changed 2 months ago by teor

Status: needs_reviewneeds_revision

Looks fine to me, just a little tweak to the changes file.

We usually require unit tests before we merge.

Let us know if you need help writing tests :-)

comment:43 Changed 2 months ago by maurice_pibouin

I would actually love some help !
I am a complete beginner in C, so I don't know how to go around with writing the makefile for my test (just to see if my tests work). I have follwed tutorials on how to write makefiles, but I don't see how to handle the multiple depth of dependencies.
What steps should I take to write my test makefile ? Since I don't want to go through "making" the whole Tor project everytime I want to see if my test file works.
Also, is this the right place to ask for help ? It feels weird posting questions on an issue.
Thanks !

Last edited 2 months ago by maurice_pibouin (previous) (diff)

comment:44 Changed 8 weeks ago by nickm

Status: needs_revisionneeds_review

comment:45 Changed 7 weeks ago by teor

Owner: set to nickm
Reviewer: nickm, teorteor
Status: needs_reviewassigned

Nick is going to write some tests, and then I'll review and merge the whole branch.

comment:46 Changed 7 weeks ago by teor

Status: assignedneeds_revision

comment:47 Changed 6 weeks ago by nickm

Keywords: 043-can added

comment:48 Changed 6 weeks ago by nickm

Status: needs_revisionneeds_review

I've rebased on master (to sort out the merge history) and added a unit test, as ticket20218_rebased with PR at https://github.com/torproject/tor/pull/1670 .

I have some questions in the unit test about particular fields; please let me know what you think there, and I'll revise the test.

comment:49 Changed 5 weeks ago by teor

Status: needs_reviewneeds_revision

The implementation of routerstatus_has_changed() is correct, it's only designed to detect changes in the control port routerstatus format. Let's rename routerstatus_has_changed() to a name that contains "control"?

If you'd like to remove the named and unnamed code, let's do it in another ticket, and try to remove as much as we can.

comment:50 Changed 5 weeks ago by nickm

Status: needs_revisionneeds_review

Okay -- I've changed the comments, updated the tests to verify that routerstatus_format_entry() actually changed (or hasn't), and renamed the identifier. I went with "visibly" instead of "control", but I'm open to suggestions for other identifiers that aren't even longer.

comment:51 Changed 5 weeks ago by teor

Status: needs_reviewmerge_ready

There's a conflict with #32695, which removed networkstatus_consensus_has_ipv6() above routerstatus_has_visibly_clanged(). The conflict is stopping CI from running,

But the code looks good.

I'll resolve it, run CI, then merge.

comment:52 Changed 5 weeks ago by teor

Here's the PR merged into the current master, with conflicts resolved:

Waiting for CI to pass before merging.

comment:53 Changed 5 weeks ago by teor

(I also had to remove ROUTERSTATUS_FORMAT_NO_CONSENSUS_METHOD from the tests, also due to #32695.)

comment:54 Changed 5 weeks ago by teor

Actual Points: 0.50.6
Resolution: fixed
Status: merge_readyclosed

Merged to master.

Merged #32962, #32984, and #20218 together.

Note: See TracTickets for help on using tickets.