Opened 5 months ago

Last modified 9 days ago

#29034 assigned defect

circuit: Cleanup an HS circuit when it is being re-purposed

Reported by: dgoulet Owned by: dgoulet
Priority: High Milestone: Tor: 0.4.1.x-final
Component: Core Tor/Tor Version: Tor: 0.3.2.1-alpha
Severity: Normal Keywords: tor-hs-reachability, 029-backport-maybe, 034-backport-maybe, 035-backport, 040-backport, postfreeze-ok, network-team-roadmap-2019-Q1Q2, 041-should
Cc: asn, mikeperry, dgoulet Actual Points:
Parent ID: #28634 Points: 3
Reviewer: asn Sponsor: Sponsor27-must

Description

Mike found out that when an IP/RP circuit fails to build in the right amount of time (for instance through circuit_expire_building()), it is re-purposed to become a measurement circuit.

The issue is that those HS circuits are set in the HS circuitmap and have an hs_ident or rend_data set to them that should really not linger in the circuit object if the circuit is not an HS one anymore.

Offenders: circuit_build_times_mark_circ_as_measurement_only() and pathbias_send_usable_probe().

Solution:

circuit_change_purpose() is probably the right place to make a callback within the HS subsystem specific to cleaning up a circuit for a purpose change. I think we need a new function that specifically does that and not use hs_circ_cleanup() since it won't remove the ident.

Lingering circuits in the HS circuitmap is bad and this bug could probably explain some of the issues we had with clients unable to establish connections because the IP auth key wouldn't match the one in the circuit ident.

I strongly believe this should be backported up to 0.3.5 at the very least.

Child Tickets

Change History (28)

comment:1 Changed 4 months ago by nickm

Keywords: postfreeze-ok added

comment:2 Changed 4 months ago by mikeperry

Keywords: tor-hs-reachability added; tor-hs removed

comment:3 Changed 3 months ago by nickm

Keywords: 040-must added

Marking tickets as 040-must based on triage with dgoulet.

comment:4 Changed 3 months ago by dgoulet

Owner: set to dgoulet
Status: newassigned

comment:5 Changed 2 months ago by teor

Owner: changed from dgoulet to mikeperry

dgoulet is on leave, assigning to Mike, because asn already has 3 tickets in 040-must.

comment:6 Changed 8 weeks ago by teor

Owner: mikeperry deleted

Un-assigning 040-must bugs from Mike, because he's overloaded.
We'll work out what to do with these tickets at the meeting.

comment:7 Changed 8 weeks ago by asn

Points: 3
Sponsor: Sponsor27-must

comment:8 Changed 8 weeks ago by nickm

Keywords: 040-must removed

comment:9 Changed 8 weeks ago by teor

Keywords: 040-backport added

This is not a release blocker, we can do it later and backport

comment:10 Changed 8 weeks ago by teor

Milestone: Tor: 0.4.0.x-finalTor: 0.4.1.x-final

comment:11 Changed 8 weeks ago by teor

Keywords: 029-backport-maybe 034-backport-maybe added

comment:12 Changed 8 weeks ago by pili

Parent ID: #29995

comment:13 Changed 7 weeks ago by mikeperry

Status: assignedneeds_review

Ok I did the simplest narrow fix, based on what dgoulet said in the description:
https://github.com/torproject/tor/pull/931

Someone who is deeply familiar with both v2 and v3 state should review this. It's simple, and I think it's ok, but I don't know both v2 and v3 code bases well enough to really be sure.

comment:14 Changed 7 weeks ago by teor

Owner: set to mikeperry
Status: needs_reviewassigned
Version: Tor: 0.3.2.1-alpha

comment:15 Changed 7 weeks ago by teor

Status: assignedneeds_review

comment:16 Changed 6 weeks ago by asn

Reviewer: asn

comment:17 Changed 5 weeks ago by asn

Status: needs_reviewneeds_revision

LGTM!

I have a slight revision in https://github.com/torproject/tor/pull/969, because I opted for not calling the new functions hs_circ_free() since *_free() functions have a very specific meaning in Tor (they free and NULL their input). Instead I hijacked hs_circ_cleanup() which seamed a better target.

If you like, let's merge_ready.

comment:18 Changed 4 weeks ago by gaba

Keywords: network-team-roadmap-2019-Q1Q2 added

Add keyword for tickets in the network team roadmap.

comment:19 Changed 4 weeks ago by mikeperry

Status: needs_revisionmerge_ready

Ok, dgoulet said it should be a different function, but I figure you guys can hash that out. I'm fine with asn's https://github.com/torproject/tor/pull/969.

comment:20 Changed 3 weeks ago by asn

Cc: dgoulet added

Hey David, what needs to happen here?

comment:21 Changed 3 weeks ago by dgoulet

Status: merge_readyneeds_revision

Commented on PR. The approach is not correct I believe if we are serious about the v2 and v3 code separation.

comment:22 Changed 3 weeks ago by mikeperry

<mikeperry> dgoulet: what do you think about what I did earlier in that case? (https://github.com/torproject/tor/pull/931/commits/bee7979e87d51735d2d5f746d69761d89d7dd928)
<dgoulet> mikeperry: yeah something like that but it needs to call hs_circ_cleanup() for v3 only and the v2 stuff needs to ideally not be done there... so I think the place it needs to go is hs_common.c and then call the v2 or v3 specialized calls

Hrmm, I am not sure how to differentiate the v2 vs v3 calls+data. What is the convention for that?

comment:23 Changed 3 weeks ago by mikeperry

Parent ID: #29995#28634

This is needed for #28634+#28780 for 0.4.1.

comment:24 in reply to:  22 Changed 2 weeks ago by dgoulet

Replying to mikeperry:

Hrmm, I am not sure how to differentiate the v2 vs v3 calls+data. What is the convention for that?

Within the circuit, v2 has a rend_data and v3 has hs_ident. One circuit can _not_ have both. If none exist also, it is not an HS circuit.

comment:25 Changed 2 weeks ago by asn

further clarification by dgoulet:

11:27 <+dgoulet> asn: anyway, my main point about #29043 is that if the cleanup function we need needs to be v2 and v3 , we should do one high level in hs_common.h and             then specialized in hs/ and rend/ ... but simply not use hs_circuit.c for v2

comment:26 Changed 2 weeks ago by dgoulet

Owner: changed from mikeperry to dgoulet
Status: needs_revisionassigned

Taking over as part of s27 since we believe #28970 and #27471 will get fixed with this.

comment:27 Changed 10 days ago by nickm

Keywords: 041-should added

comment:28 Changed 9 days ago by mikeperry

dgoulet - So far we have not experienced any obvious reliability issues due to this bug while testing #28634 (which uses #28780, which changes client-side INTRO and REND circ purposes to PADDING circs).

I think this means that the hs_circuitmap reliability issues do not apply to client-side intros?

My guess is it is because clients only seem to use this map for the 128bit rend cookie, and not any permanent mappings that could be invalid later. In which case, this is "just" a client-side memory leak of this state with #28780, and not a reliability problem?

Last edited 9 days ago by mikeperry (previous) (diff)
Note: See TracTickets for help on using tickets.