Opened 5 months ago

Closed 7 weeks ago

#30649 closed defect (fixed)

Every few hours, relays [warn] Received circuit padding stop command for unknown machine.

Reported by: teor Owned by: mikeperry
Priority: Medium Milestone: Tor: 0.4.0.x-final
Component: Core Tor/Tor Version: Tor: 0.4.0.1-alpha
Severity: Normal Keywords: teor-merge, consider-backport-after-0415, 040-backport, tor-relay, circuitpadding, wtf-pad
Cc: asn, mikeperry, nickm Actual Points: 1
Parent ID: Points: 1
Reviewer: asn Sponsor: Sponsor2

Description

toralf says on #tor-relays:

I got 5x within last 36 ours with 0.4.1.1-alpha:  [warn] Received circuit padding stop command for unknown machine.        -shall I ignore that?

I'm marking this as 041-must, because we will get lots of bug reports from relay operators if we release a stable with this warning. At the very least, we must downgrade it to a protocol error.

Child Tickets

Change History (36)

comment:1 Changed 5 months ago by mikeperry

This is interesting. The actual log is due refactoring eliminating a return value propogation/check of free_circ_machine_infos_with_machinenum() in circpad_handle_padding_negotiate(). As a result of the check being removed, the logline is *always* printing even if the machine is valid and did exist. So adding the check back in will eliminate most of those.

It's not happening that often because normally client-side does not send STOP commands. The client will only send a STOP of the machine conditions cease applying. This can happen in the event of a circuit build timeout (and associated purpose change to MEASUREMENT), though.

So I think all we have to do is make free_circ_machine_infos_with_machinenum() return a success/fail result, and check it properly in circpad_handle_padding_negotiate(), before warning.

comment:2 Changed 5 months ago by mikeperry

Owner: set to mikeperry
Status: newaccepted

comment:3 Changed 5 months ago by teor

If the relay operator can't do anything to fix these warnings, we should downgrade the remaining warnings to protocol warnings.

comment:4 Changed 5 months ago by mikeperry

Status: acceptedneeds_review

comment:5 Changed 5 months ago by teor

Status: needs_reviewneeds_revision

I have some questions about the latest commit: some of the changes look like typos?

comment:6 Changed 5 months ago by mikeperry

Can you be more specific? Do you mean the origin-side messages? I changed those from LD_PROTOCOL because I think they should be LD_CIRC. But they are not relay side, they are origin side. It's my understanding that relay side messages are what LOG_PROTOCOL_WARN is for.

comment:7 Changed 5 months ago by teor

LOG_PROTOCOL_WARN Is for warnings about other nodes on the network disobeying the tor protocol.

So it should also be used for log messages on the origin side, when the message is caused by the relay behaving badly.

comment:8 Changed 5 months ago by mikeperry

Hrmm.. but there is something that origins can do.. They can use ExcludeNodes to blacklist nodes that violoate protocol. Based on the reasoning for having LOG_PROTOCOL_WARN, it sounds like origin-facing procotol warns that mention the idhex and recommend ExcludeNodes should be fine as regular warns.

The reasoning is that I think that some users would prefer having padding defenses work over using broken nodes that violate protocol. Right now, circpad does not kill the circuit if padding fails in a bizarre way. I am thinking it should, or at least should tell the user and advise them.

This ticket got a lot more complicated than fixing this bug.. Should we make a different ticket about the warns?

Last edited 5 months ago by mikeperry (previous) (diff)

comment:9 in reply to:  8 Changed 5 months ago by teor

Replying to mikeperry:

Hrmm.. but there is something that origins can do.. They can use ExcludeNodes to blacklist nodes that violoate protocol. Based on the reasoning for having LOG_PROTOCOL_WARN, it sounds like origin-facing procotol warns that mention the idhex and recommend ExcludeNodes should be fine as regular warns.

I think you should talk to the Tor Browser team and UX team about this kind of warning. It could confuse a lot of Tor Browser users.

The reasoning is that I think that some users would prefer having padding defenses work over using broken nodes that violate protocol. Right now, circpad does not kill the circuit if padding fails in a bizarre way. I am thinking it should, or at least should tell the user and advise them.

We need to design tor for the majority of our users, who do not edit their torrcs. But we also need to have advanced options available for users who need strict privacy. I don't know if there's a way to do both: we should talk to the UX team.

This ticket got a lot more complicated than fixing this bug.. Should we make a different ticket about the warns?

Yes, I think so.

comment:10 Changed 5 months ago by mikeperry

Status: needs_revisionneeds_review

I pushed more log protocol warn updates to the PR: now when padding negotiation fails due to either corrupt cell or "valid" refusal, we LOG_PROTOCOL_WARN. Those should not happen, but one of them was at info, so it was already quiet.

I will file a new ticket for evaluating if the origin-side LOG_PROTOCOL_WARNs here should be LOG_WARNs or not.

comment:11 Changed 4 months ago by dgoulet

Reviewer: asn

comment:12 Changed 4 months ago by asn

Seems reasonable. Please see my commit in https://github.com/torproject/tor/pull/1101.

I added a log for receiving a legit STOP command, and also promoted a protocol warn for receiving a padding from the wrong hop to a normal warn. I think if this happens we want to learn about it.

comment:13 Changed 4 months ago by asn

Status: needs_reviewneeds_revision

Setting to needs_rev as its waiting for action.

comment:14 Changed 4 months ago by mikeperry

Status: needs_revisionneeds_review

Asn's changes are fine with me. I'm not the reviewer though, so just setting back to needs_review.

comment:15 Changed 4 months ago by asn

Status: needs_reviewmerge_ready

OK, since mike ACKed it I'm moving it to merge_ready.

comment:16 Changed 4 months ago by nickm

Merged to 0.4.1 and forward.

comment:17 Changed 4 months ago by nickm

Resolution: fixed
Status: merge_readyclosed

comment:18 Changed 4 months ago by arma

Resolution: fixed
Status: closedreopened

A relay operator (strelnikov) reports getting this message on their 0.4.0.5 relay.

It looks like commit 9aaf72ea went into 0.4.0.1-alpha?

So (a) the bug *is* in our stable, and here is a user reporting it like teor predicted, and (b) we only pushed the fix into 0.4.1 above, right?

Reopening so it can get more eyes. Thanks!

comment:19 Changed 4 months ago by teor

Keywords: 040-backport added
Milestone: Tor: 0.4.1.x-finalTor: 0.4.0.x-final
Status: reopenedneeds_revision
Version: Tor: 0.4.1.1-alphaTor: 0.4.0.1-alpha

Please remember to add actual points :-)

comment:20 Changed 4 months ago by nickm

Keywords: 041-must removed

comment:21 Changed 3 months ago by asn

Made a branch based on 040 here: https://github.com/torproject/tor/pull/1174

comment:22 Changed 3 months ago by asn

Status: needs_revisionmerge_ready

comment:23 Changed 3 months ago by asn

Status: merge_readyneeds_revision

Gr! Compilation errors on travis. Aborting and putting back in needs_revision.

comment:24 Changed 3 months ago by asn

Status: needs_revisionmerge_ready

OK, pushed a better backport 040 PR. If CI passes, this can be merged.

comment:25 Changed 3 months ago by teor

Keywords: consider-backport-after-0415 added

This PR was merged into 0.4.1.4, so we will test it until 0.4.1.5 is released, then merge.

comment:26 Changed 8 weeks ago by teor

Status: merge_readyneeds_revision

I can't backport this branch, because:

  • the changes file has the wrong bugfix version:
    • please update the maint-0.4.0 PR with the changes file with the correct bugfix version 0.4.0.1-alpha
    • if you'd like, you can create a release-0.4.1 PR that fixes the wrong bug versions in the ChangeLog and ReleaseNotes
  • the PR doesn't merge cleanly into maint-0.4.1
    • please merge the maint-0.4.0 PR to maint-0.4.1, and make that merged branch into a separate PR on maint-0.4.1

comment:27 Changed 8 weeks ago by nickm

Mike is pretty hosed; I can pick this up unless somebody else wants to.

comment:28 Changed 8 weeks ago by teor

I think it might just need an "ours" merge, because the corresponding change was already merged to 0.4.1, before being backported.

But I'm really not sure.

comment:29 Changed 7 weeks ago by asn

I'm on it!

comment:30 in reply to:  26 ; Changed 7 weeks ago by asn

Status: needs_revisionmerge_ready

OK, here we go!

Replying to teor:

I can't backport this branch, because:

  • the changes file has the wrong bugfix version:
    • please update the maint-0.4.0 PR with the changes file with the correct bugfix version 0.4.0.1-alpha

The PR has been updated with a fixup: https://github.com/torproject/tor/pull/1174

  • if you'd like, you can create a release-0.4.1 PR that fixes the wrong bug versions in the ChangeLog and ReleaseNotes

A changelog fix has been pushed https://github.com/torproject/tor/pull/1282 , but I couldn't find the #30649 entry in the ReleaseNotes.

  • the PR doesn't merge cleanly into maint-0.4.1
    • please merge the maint-0.4.0 PR to maint-0.4.1, and make that merged branch into a separate PR on maint-0.4.1

An 041 PR can be found here: https://github.com/torproject/tor/pull/1283

If CI passes, we can go ahead!

comment:31 in reply to:  30 Changed 7 weeks ago by teor

Cc: nickm added

Replying to asn:

Replying to teor:

I can't backport this branch, because:

  • the changes file has the wrong bugfix version:
    • please update the maint-0.4.0 PR with the changes file with the correct bugfix version 0.4.0.1-alpha

The PR has been updated with a fixup: https://github.com/torproject/tor/pull/1174

  • if you'd like, you can create a release-0.4.1 PR that fixes the wrong bug versions in the ChangeLog and ReleaseNotes

A changelog fix has been pushed https://github.com/torproject/tor/pull/1282 , but I couldn't find the #30649 entry in the ReleaseNotes.

30649 is missing from the ReleaseNotes, because the changes file said it was a bug on 0.4.1.
Bugs don't go in the release notes unless they have been released in a stable version of Tor.

Please copy the ChangeLog entry for 30649 into the correct section of the release-0.4.1 ReleaseNotes.

nickm, do we also need to patch the master ReleaseNotes?

comment:32 Changed 7 weeks ago by teor

Apparently we need a branch for release-0.4,1 and master, or something special needs to happen with the merge.
(I tried some release notes fixes in #31461, but they never made it to master.)

Let's wait for nickm's advice here.

comment:33 Changed 7 weeks ago by nickm

In general, if we need to edit a changelog or releasenotes, we make the fix in one branch, and cherry-pick it to the others as needed. We also ask "is this fix really necessary"; having a fixed changelog and having an accurate changelog are both desirable.

comment:34 Changed 7 weeks ago by asn

I added the entry to ReleaseNotes in my release-0.4.1 branch (PR 1282).

Is there anything else that needs to be done?

comment:35 Changed 7 weeks ago by teor

Keywords: teor-merge added

Seems good to me.

Here's what I'll do to merge this backport:

comment:36 Changed 7 weeks ago by teor

Actual Points: 1
Resolution: fixed
Status: merge_readyclosed

This was a complex merge.
There were some trivial conflicts, which I fixed, see the merge commit messages for details.

Thanks for making it happen!

Note: See TracTickets for help on using tickets.