Opened 3 years ago

Last modified 7 months ago

#16849 needs_revision defect

clear_status_flags_on_sybil might want to clear more flags

Reported by: teor Owned by: ffmancera
Priority: High Milestone: Tor: unspecified
Component: Core Tor/Tor Version:
Severity: Normal Keywords: SponsorS-deferred, technical-debt, tor-dirauth, pending-disaster, review-group-32, review-group-34, 034-triage-20180328
Cc: Actual Points:
Parent ID: Points: small
Reviewer: nickm Sponsor:

Description

clear_status_flags_on_sybil contains a comment saying "it's easy to add a new flag but forget to add it to this clause."

It looks like we may have forgot the following flags:

  • is_hs_dir
  • version_known?
  • version_supports_extend2_cells?
  • has_bandwidth
  • has_exitsummary?
  • bw_is_unmeasured? (set to 1?)
  • bandwidth_kb
  • has_guardfraction
  • guardfraction_percentage

To deal with the root cause, should we instead zero out the entire routerstatus_t, then copy the fields we need back in?
(This would zero new fields on sybils by default.)

We could also implement a unit test for clear_status_flags_on_sybil that checks that certain (important?) flags are cleared, or that all flags are cleared (?).

Child Tickets

Change History (36)

comment:1 Changed 3 years ago by nickm

Seems like a good idea to me.

comment:2 Changed 3 years ago by nickm

Keywords: 028-triage added

comment:3 Changed 3 years ago by nickm

Keywords: SponsorS removed
Sponsor: SponsorS

Bulk-replace SponsorS keyword with SponsorS sponsor field in Tor component.

comment:4 Changed 3 years ago by nickm

Points: small

comment:5 Changed 3 years ago by nickm

Milestone: Tor: 0.2.8.x-finalTor: 0.2.???
Priority: normalminor

Actually, none of these flags is problematic not to be clearing.

comment:6 Changed 3 years ago by nickm

Keywords: SponsorS-deferred added
Sponsor: SponsorS

Remove the SponsorS status from these items, which we already decided to defer from 0.2.9. add the SponsorS-deferred tag instead in case we ever want to remember which ones these were.

comment:7 Changed 2 years ago by teor

Milestone: Tor: 0.2.???Tor: 0.3.???

Milestone renamed

comment:8 Changed 2 years ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.???Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:9 Changed 19 months ago by nickm

Keywords: tor-03-unspecified-201612 removed

Remove an old triaging keyword.

comment:10 Changed 19 months ago by nickm

Keywords: 028-triage removed

comment:11 Changed 18 months ago by nickm

Keywords: technical-debut tor-dirauth pending-disaster added
Priority: LowHigh
Severity: Normal

comment:12 Changed 18 months ago by nickm

Keywords: technical-debt added; technical-debut removed

comment:13 Changed 10 months ago by ffmancera

Owner: set to ffmancera
Status: newassigned

comment:14 Changed 10 months ago by ffmancera

Status: assignedneeds_review

Here is a patch! But I didn't add version_known and version_supports_extend2_cells because they are not member of routerstatus_t. All tests pass succesfully and Travis CI seems happy.

I hope everything is fine, check my github branch bug16849.

comment:15 Changed 10 months ago by nickm

Milestone: Tor: unspecifiedTor: 0.3.4.x-final

comment:16 in reply to:  description Changed 10 months ago by teor

Replying to teor:

clear_status_flags_on_sybil contains a comment saying "it's easy to add a new flag but forget to add it to this clause."

It looks like we may have forgot the following flags:

  • is_hs_dir
  • version_known?
  • version_supports_extend2_cells?
  • has_bandwidth
  • has_exitsummary?
  • bw_is_unmeasured? (set to 1?)
  • bandwidth_kb
  • has_guardfraction
  • guardfraction_percentage

It's been 2 years since this ticket was opened, are there's any new flags we should also clear?

To deal with the root cause, should we instead zero out the entire routerstatus_t, then copy the fields we need back in?
(This would zero new fields on sybils by default.)

Do you think we should zero the struct, then copy across the fields we want to keep?
Or do you think that touching the entire struct is a bad idea?

We could make it easier to clear the flags by putting them in their own struct. But that's a pretty extreme refactor.

We could also implement a unit test for clear_status_flags_on_sybil that checks that certain (important?) flags are cleared, or that all flags are cleared (?).

Would you like to write a unit test to make sure we clear important status flags?
(It's important that we clear consensus flags like Guard, Exit, V2Dir, …)

comment:17 Changed 10 months ago by teor

Here's a clever way to do a unit test:

Write all 1s to the struct. (That is, memset the struct to 0xff.)
Clear sybils.
Turn the consensus flags in the struct into a string using the standard function.
Make sure the string is empty.

Last edited 10 months ago by teor (previous) (diff)

comment:18 Changed 10 months ago by nickm

It's been 2 years since this ticket was opened, are there's any new flags we should also clear?

Probably!

Do you think we should zero the struct, then copy across the fields we want to keep?
Or do you think that touching the entire struct is a bad idea?

I think it's probably fine to do the zero-and-copy approach.

comment:19 Changed 10 months ago by nickm

Keywords: review-group-32 added

comment:20 Changed 10 months ago by teor

Reviewer: teor, nickm
Status: needs_reviewneeds_revision

Putting in needs_revision for the zero-and-copy approach, and the unit test

comment:21 Changed 10 months ago by ffmancera

As teor and I discussed at #tor-dev, doing the zero-and-copy approach will avoid new similar issues in the future with new flags, but we need to find out a way to check out in a unit test that we are not clearing other important struct fields.

Also, there is no new flags added to the routerstatus_t struct.

comment:22 Changed 10 months ago by ffmancera

Status: needs_revisionneeds_review

I have a patch, I think it is a good approach :)

I hope everything is fine, check my github branch bug16849.

comment:23 Changed 10 months ago by nickm

Keywords: review-group-34 added

comment:24 Changed 9 months ago by dgoulet

Reviewer: teor, nickmnickm

comment:25 Changed 9 months ago by dgoulet

Assigning reviewer for week 03/19.

comment:26 Changed 9 months ago by nickm

Hm. I think this is probably correct, but I wonder if it could be even more simple.

My main concern with this patch now is that we have a new failure mode: instead of maybe forgetting to clear a flag, we can now maybe forget to preserve a field.

Either approach would be okay if we could come up with some kind of unit test that ensures that we haven't missed any fields. I think we could build one -- just a second...

comment:27 Changed 9 months ago by nickm

oh, I did find bugs!

  • We should use memcpy() to copy the identity and descriptor digests: strncpy will stop copying if those digests contain any 0-valued bytes.
  • We should use strlcpy() and sizeof() to copy the nickname.

I think that we can have a somewhat safer approach. Hang on...

comment:28 Changed 9 months ago by nickm

Status: needs_reviewneeds_revision

So, the somewhat safer approach I had in mind is only really possible with C11, which we haven't standardized on. And my idea of how to test was pretty wrong: it relied on there being no padding holes within the routerstatus_t structure.

comment:29 Changed 9 months ago by ffmancera

Okay, let me fix these bugs and thanks for this review :-)

comment:30 Changed 9 months ago by nickm

Keywords: 034-triage-20180328 added

comment:31 Changed 9 months ago by nickm

Keywords: 034-removed-20180328 added

Per our triage process, these tickets are pending removal from 0.3.4.

comment:32 Changed 9 months ago by nickm

Keywords: 034-removed-20180328 removed

Oh! I think ffmancera is still working on this one.

comment:33 Changed 9 months ago by ffmancera

Status: needs_revisionneeds_review

We should use memcpy() to copy the identity and descriptor digests: strncpy will stop copying if those digests contain any 0-valued bytes.
We should use strlcpy() and sizeof() to copy the nickname.

Done!

comment:34 Changed 8 months ago by nickm

Status: needs_reviewneeds_revision

Okay; sorry for all of the back-and-forth on this one.

I've had another look at the code here, and talked it over with dgoulet, and we think that as it stands, this patch does not actually make the pattern here less error-prone than it was before. Previously, we had to be careful every time we added a new flag that would be cleared. With this patch, we must be careful every time we add a new field that would _not_ be cleared. The unit tests don't help us with the problem here, because they have no way to check for correct behavior of new fields in routerstatus_t.

Most of the patterns that we thought of to try to improve the tests in this regard seem like they would cause other problems in the code (like forcing us to rewrite everything that uses the routerstatus_t flags).

I'm putting this ticket into needs_revision, but it might need insight or a different approach entirely. Even just some comments and might be an improvement.

comment:35 Changed 8 months ago by teor

Maybe we should just rely on comments here?
Or, as an alternative, when we create our C-Rust coupled tool, let's make it do C-C coupled warnings as well?

comment:36 Changed 7 months ago by nickm

Keywords: easy removed
Milestone: Tor: 0.3.4.x-finalTor: unspecified

Moving to Unspecified: I don't think we have a good handle on what we actually want to do with this ticket.

Note: See TracTickets for help on using tickets.