Opened 3 years ago

Last modified 14 months ago

#20218 needs_revision defect

Fix and refactor and redocument routerstatus_has_changed

Reported by: nickm Owned by:
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version:
Severity: Normal Keywords: ipv6, 029-proposed, tor-control, easy, spec-conformance, review-group-31, 034-triage-20180328, 034-removed-20180328
Cc: Actual Points: 0.5
Parent ID: Points: .1
Reviewer: nickm Sponsor:

Description

The routerstatus_has_changed() function is used by controllers to to tell which rs entries are new. But it only looks at a fraction of the fields which might change in a routerstatus. Also, it only checks for semantic changes that tor cares about (though this is not documented).

Child Tickets

Attachments (2)

Change History (37)

comment:1 Changed 3 years ago by nickm

(do not change this till #19958 is merged)

comment:2 Changed 3 years ago by teor

Milestone: Tor: 0.2.???Tor: 0.3.???

Milestone renamed

comment:3 Changed 2 years ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.???Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:4 Changed 2 years ago by nickm

Keywords: tor-03-unspecified-201612 removed

Remove an old triaging keyword.

comment:5 Changed 2 years ago by dgoulet

Unify controller keyword to "tor-control".

comment:6 Changed 2 years ago by dgoulet

Keywords: tor-control added; controller removed

Unify "controller" keyword to "tor-control".

comment:7 Changed 23 months ago by nickm

Keywords: easy spec-conformance added

comment:8 Changed 17 months ago by aruna1234

So the issue is with that router_has_changed() is not properly documented. The fact that it only checks for semantic changes that tor cares about, is not documented?

comment:9 Changed 17 months ago by teor

There are a few different issues:

  • the function comment needs to say that the function checks for semantic changes
  • the function is used by two different parts of the code, and the controller part needs it to check for more changes

The controller part of the code wants all new descriptors:
https://gitweb.torproject.org/torspec.git/tree/control-spec.txt#n2144

The other part of the code just wants descriptors that are different enough to matter.

This means adding an argument to the function that makes it check for different kinds of changes. Or making two functions. But we try not to copy and paste code, we get functions to call other functions instead.

comment:10 Changed 17 months ago by aruna1234

By different kinds of changes, you mean checking whether there are any new descriptors or not? And then changing the function calls as well.

comment:11 Changed 17 months ago by teor

routerstatus_has_changed() checks for some differences in descriptors. This is the right thing to do when the function is called from some code. But it hides some changed descriptors from controllers.

When called from control port code, the function should check if the timestamp has changed. That will give the controller *all* the changed descriptors.

comment:12 Changed 16 months ago by aruna1234

Hey!

I went through the code of networkstatuc.c.Can the timestamp be changed by checking time_t now parameter?

comment:13 Changed 16 months ago by teor

No, the timestamp is in the router structure that is passed to the function.

comment:14 Changed 16 months ago by aruna1234

The definition of the routerstatus_t structure, as in or.h has only time_t published_on. Is that the required timestamp to check.
A check like, (a->published_on != b->published_on) would do the task?

comment:15 in reply to:  9 Changed 16 months ago by teor

Yes, this field is the right field.

Here's how to do this patch:

  1. Create a new function that just checks this field
  2. Find all calls to router_has_changed() in control.c and from controller helper functions
  3. Replace them with a call to this new function

comment:16 Changed 16 months ago by teor

But that check isn't the right check: you only want newer descriptors.

comment:17 Changed 16 months ago by aruna1234

Then this check would be the right one, I suppose:

/* Function to check for new descriptors. Returns 0 if changed, else 1 */

static int check_tiemstamp(const routerstatus_t *a)
{

if(a->published_on == a->now)

return 1;

else

return 0;

}

comment:18 Changed 16 months ago by aruna1234

Have added a patch file. Do let me know, if the changes are correct, although I doubt as 'now' is not a structure member.
But if I need to check for change in timestamp, and published_on is my parameter, then I need to compare it with something.

comment:19 Changed 16 months ago by teor

Your new function should have the same arguments and return type as routerstatus_has_changed():

static int
routerstatus_has_changed(const routerstatus_t *a, const routerstatus_t *b)

https://gitweb.torproject.org/tor.git/tree/src/or/networkstatus.c#n1541

So now that you have two routerstatus objects, what can you do?

comment:20 Changed 16 months ago by teor

Keywords: ipv6 added
Milestone: Tor: unspecifiedTor: 0.3.4.x-final
Status: newneeds_revision

Oh, I'm sorry, I've been giving you bad advice. I must have searched for the wrong function name.

We don't need a new function. We need to change the existing function to look at more fields.

Here's what we need to do:

The routerstatus_t struct has a lot of fields. routerstatus_has_changed() checks a few of them. But it needs to check every field that is output by networkstatus_getinfo_helper_single().

These missing fields can be compared using !=:

  • published
  • ipv6_orport
  • is_v2_dir
  • bandwidth_kb

This missing field must be compared using tor_addr_compare(..., ..., CMP_EXACT):

  • ipv6_addr

We don't need to worry about the summarised IPv4 exit policy output by networkstatus_getinfo_helper_single(). The descriptor_digest check will find changes in this field.

Then we need to add comments to routerstatus_has_changed() saying that it checks for changes that are output by the control port. And we need to add a comment to routerstatus_format_entry() saying that any extra control port fields need to be added to routerstatus_has_changed() as well.

comment:21 Changed 16 months ago by aruna1234

Have added the extra checks, and uploaded a new patch file.
Do let me know if there are any further changes.

comment:22 Changed 16 months ago by teor

This looks good, but please submit one patch file based on the master branch.

comment:24 Changed 16 months ago by teor

Hi, this branch doesn't compile, it has extra "+" at the start of some lines.
Please run "make check" before submitting patches,

Also, please remove this comment, because this patch fixes it:

   // XXXX this function needs a huge refactoring; it has gotten out
   // XXXX of sync with routerstatus_t, and it will do so again.

comment:25 Changed 16 months ago by aruna1234

Oh !!My bad:(
I have done the final changes:https://github.com/ArunaMaurya221B/tor/commits/Bug-20218

comment:26 Changed 16 months ago by teor

Thanks! The check for tor_addr_compare() is around the wrong way.
Just like ==, tor_addr_compare() returns true when its arguments are equal.

You need to add a ! before the function call.

Edit: autocorrect does not understand "its"

Last edited 16 months ago by teor (previous) (diff)

comment:28 Changed 16 months ago by teor

Actual Points: 0.5
Status: needs_revisionneeds_review

This looks fine to me, but it needs a review, and a changes file. It might also help if it had unit tests.

A question for the team:
How can we make sure we update functions when we add new members to a struct?

comment:29 Changed 16 months ago by nickm

Keywords: review-group-31 added

comment:30 Changed 16 months ago by nickm

Reviewer: nickm

comment:31 Changed 16 months ago by nickm

Status: needs_reviewneeds_revision

teor, I think you have tor_addr_compare backwards. Its documentation says:

 * Given two addresses <b>addr1</b> and <b>addr2</b>, return 0 if the two
 * addresses are equivalent under the mask mbits, less than 0 if addr1
 * precedes addr2, and greater than 0 otherwise.

So the original use of tor_addr_compare (without "!") is correct.

(Unit tests would have caught this bug; that's one reason why unit tests are so great. ;) Any chance of writing those?)

As a smaller issue, it seems that this patch says the same line twice:

         a->is_hs_dir != b->is_hs_dir ||
         a->is_hs_dir != b->is_hs_dir ||

---
Another question: how did you decide on the list of fields to check? It seems that some but not all of the fields in routerstatus_t are now checked for equality, but I don't understand the rationale about why the remaining ones (pv, exitsummary) are not. Is that intentional?

---

Teor asks:

How can we make sure we update functions when we add new members to a struct?

In some cases, using a constructor for a struct instance will work, since we have turned on the options that let us know about any uninitialized members. But in cases like this, I don't see an easy way to use a constructor.

One option here would be to stop assuming that we list all relevant the members here. We could instead change the code that we only list the irrelevant members, by:

  1. Making sure that when we construct routerstatus_t, we initialize the whole object to 0.
  2. In a function like this, using memcpy() to make a temporary copy of the routerstatus_t.
  3. Instead of listing all the relevant members in a comparison, we could just use fast_memeq() to compare. If there are some members we don't want to compare, we would set them to 0 before calling fast_memeq().

I'm not sure if this is actually a good idea, though.

comment:32 in reply to:  31 Changed 16 months ago by teor

Replying to nickm:

teor, I think you have tor_addr_compare backwards. Its documentation says:

 * Given two addresses <b>addr1</b> and <b>addr2</b>, return 0 if the two
 * addresses are equivalent under the mask mbits, less than 0 if addr1
 * precedes addr2, and greater than 0 otherwise.

So the original use of tor_addr_compare (without "!") is correct.

You're right. Oops!

(Unit tests would have caught this bug; that's one reason why unit tests are so great. ;) Any chance of writing those?)

I think we should have unit tests.

As a smaller issue, it seems that this patch says the same line twice:

         a->is_hs_dir != b->is_hs_dir ||
         a->is_hs_dir != b->is_hs_dir ||

---
Another question: how did you decide on the list of fields to check? It seems that some but not all of the fields in routerstatus_t are now checked for equality, but I don't understand the rationale about why the remaining ones (pv, exitsummary) are not. Is that intentional?

We only check the fields that are output via the control port.
We should revise this like to say "only" not "also":

/* Given two router status entries for the same router identity, return 1 if
 * if the contents have changed between them. Otherwise, return 0.
 * It also checks for changes for that are output by control port. */

https://github.com/ArunaMaurya221B/tor/commit/e29292dda5ef68e441f267864357a7568b29588e#diff-fcb162b79906521706b54d280b939d57R1540

---

Teor asks:

How can we make sure we update functions when we add new members to a struct?

In some cases, using a constructor for a struct instance will work, since we have turned on the options that let us know about any uninitialized members. But in cases like this, I don't see an easy way to use a constructor.

One option here would be to stop assuming that we list all relevant the members here. We could instead change the code that we only list the irrelevant members, by:

  1. Making sure that when we construct routerstatus_t, we initialize the whole object to 0.
  2. In a function like this, using memcpy() to make a temporary copy of the routerstatus_t.
  3. Instead of listing all the relevant members in a comparison, we could just use fast_memeq() to compare. If there are some members we don't want to compare, we would set them to 0 before calling fast_memeq().

I'm not sure if this is actually a good idea, though.

I think it is enough to comment the control port printing function, and this function, and say that they must be kept in sync.

comment:33 Changed 14 months ago by nickm

Keywords: 034-triage-20180328 added

comment:34 Changed 14 months ago by nickm

Keywords: 034-removed-20180328 added

Per our triage process, these tickets are pending removal from 0.3.4.

comment:35 Changed 14 months ago by nickm

Milestone: Tor: 0.3.4.x-finalTor: unspecified

These needs_revision, tickets, tagged with 034-removed-*, are no longer in-scope for 0.3.4. We can reconsider any of them, if somebody does the necessary revision.

Note: See TracTickets for help on using tickets.