Opened 7 weeks ago

Last modified 8 days ago

#30649 needs_revision defect

Every few hours, relays [warn] Received circuit padding stop command for unknown machine.

Reported by: teor Owned by: mikeperry
Priority: Medium Milestone: Tor: 0.4.0.x-final
Component: Core Tor/Tor Version: Tor: 0.4.0.1-alpha
Severity: Normal Keywords: 040-backport, tor-relay, circuitpadding, wtf-pad
Cc: asn, mikeperry Actual Points:
Parent ID: Points: 1
Reviewer: asn Sponsor: Sponsor2

Description

toralf says on #tor-relays:

I got 5x within last 36 ours with 0.4.1.1-alpha:  [warn] Received circuit padding stop command for unknown machine.        -shall I ignore that?

I'm marking this as 041-must, because we will get lots of bug reports from relay operators if we release a stable with this warning. At the very least, we must downgrade it to a protocol error.

Child Tickets

Change History (23)

comment:1 Changed 7 weeks ago by mikeperry

This is interesting. The actual log is due refactoring eliminating a return value propogation/check of free_circ_machine_infos_with_machinenum() in circpad_handle_padding_negotiate(). As a result of the check being removed, the logline is *always* printing even if the machine is valid and did exist. So adding the check back in will eliminate most of those.

It's not happening that often because normally client-side does not send STOP commands. The client will only send a STOP of the machine conditions cease applying. This can happen in the event of a circuit build timeout (and associated purpose change to MEASUREMENT), though.

So I think all we have to do is make free_circ_machine_infos_with_machinenum() return a success/fail result, and check it properly in circpad_handle_padding_negotiate(), before warning.

comment:2 Changed 7 weeks ago by mikeperry

Owner: set to mikeperry
Status: newaccepted

comment:3 Changed 7 weeks ago by teor

If the relay operator can't do anything to fix these warnings, we should downgrade the remaining warnings to protocol warnings.

comment:4 Changed 7 weeks ago by mikeperry

Status: acceptedneeds_review

comment:5 Changed 7 weeks ago by teor

Status: needs_reviewneeds_revision

I have some questions about the latest commit: some of the changes look like typos?

comment:6 Changed 7 weeks ago by mikeperry

Can you be more specific? Do you mean the origin-side messages? I changed those from LD_PROTOCOL because I think they should be LD_CIRC. But they are not relay side, they are origin side. It's my understanding that relay side messages are what LOG_PROTOCOL_WARN is for.

comment:7 Changed 7 weeks ago by teor

LOG_PROTOCOL_WARN Is for warnings about other nodes on the network disobeying the tor protocol.

So it should also be used for log messages on the origin side, when the message is caused by the relay behaving badly.

comment:8 Changed 6 weeks ago by mikeperry

Hrmm.. but there is something that origins can do.. They can use ExcludeNodes to blacklist nodes that violoate protocol. Based on the reasoning for having LOG_PROTOCOL_WARN, it sounds like origin-facing procotol warns that mention the idhex and recommend ExcludeNodes should be fine as regular warns.

The reasoning is that I think that some users would prefer having padding defenses work over using broken nodes that violate protocol. Right now, circpad does not kill the circuit if padding fails in a bizarre way. I am thinking it should, or at least should tell the user and advise them.

This ticket got a lot more complicated than fixing this bug.. Should we make a different ticket about the warns?

Last edited 6 weeks ago by mikeperry (previous) (diff)

comment:9 in reply to:  8 Changed 6 weeks ago by teor

Replying to mikeperry:

Hrmm.. but there is something that origins can do.. They can use ExcludeNodes to blacklist nodes that violoate protocol. Based on the reasoning for having LOG_PROTOCOL_WARN, it sounds like origin-facing procotol warns that mention the idhex and recommend ExcludeNodes should be fine as regular warns.

I think you should talk to the Tor Browser team and UX team about this kind of warning. It could confuse a lot of Tor Browser users.

The reasoning is that I think that some users would prefer having padding defenses work over using broken nodes that violate protocol. Right now, circpad does not kill the circuit if padding fails in a bizarre way. I am thinking it should, or at least should tell the user and advise them.

We need to design tor for the majority of our users, who do not edit their torrcs. But we also need to have advanced options available for users who need strict privacy. I don't know if there's a way to do both: we should talk to the UX team.

This ticket got a lot more complicated than fixing this bug.. Should we make a different ticket about the warns?

Yes, I think so.

comment:10 Changed 6 weeks ago by mikeperry

Status: needs_revisionneeds_review

I pushed more log protocol warn updates to the PR: now when padding negotiation fails due to either corrupt cell or "valid" refusal, we LOG_PROTOCOL_WARN. Those should not happen, but one of them was at info, so it was already quiet.

I will file a new ticket for evaluating if the origin-side LOG_PROTOCOL_WARNs here should be LOG_WARNs or not.

comment:11 Changed 5 weeks ago by dgoulet

Reviewer: asn

comment:12 Changed 5 weeks ago by asn

Seems reasonable. Please see my commit in https://github.com/torproject/tor/pull/1101.

I added a log for receiving a legit STOP command, and also promoted a protocol warn for receiving a padding from the wrong hop to a normal warn. I think if this happens we want to learn about it.

comment:13 Changed 5 weeks ago by asn

Status: needs_reviewneeds_revision

Setting to needs_rev as its waiting for action.

comment:14 Changed 3 weeks ago by mikeperry

Status: needs_revisionneeds_review

Asn's changes are fine with me. I'm not the reviewer though, so just setting back to needs_review.

comment:15 Changed 3 weeks ago by asn

Status: needs_reviewmerge_ready

OK, since mike ACKed it I'm moving it to merge_ready.

comment:16 Changed 3 weeks ago by nickm

Merged to 0.4.1 and forward.

comment:17 Changed 3 weeks ago by nickm

Resolution: fixed
Status: merge_readyclosed

comment:18 Changed 2 weeks ago by arma

Resolution: fixed
Status: closedreopened

A relay operator (strelnikov) reports getting this message on their 0.4.0.5 relay.

It looks like commit 9aaf72ea went into 0.4.0.1-alpha?

So (a) the bug *is* in our stable, and here is a user reporting it like teor predicted, and (b) we only pushed the fix into 0.4.1 above, right?

Reopening so it can get more eyes. Thanks!

comment:19 Changed 2 weeks ago by teor

Keywords: 040-backport added
Milestone: Tor: 0.4.1.x-finalTor: 0.4.0.x-final
Status: reopenedneeds_revision
Version: Tor: 0.4.1.1-alphaTor: 0.4.0.1-alpha

Please remember to add actual points :-)

comment:20 Changed 10 days ago by nickm

Keywords: 041-must removed

comment:21 Changed 8 days ago by asn

Made a branch based on 040 here: https://github.com/torproject/tor/pull/1174

comment:22 Changed 8 days ago by asn

Status: needs_revisionmerge_ready

comment:23 Changed 8 days ago by asn

Status: merge_readyneeds_revision

Gr! Compilation errors on travis. Aborting and putting back in needs_revision.

Note: See TracTickets for help on using tickets.