Opened 11 months ago

Last modified 6 weeks ago

#22781 needs_revision enhancement

hs: Unify link specifier API/ABI

Reported by: dgoulet Owned by: dgoulet
Priority: Very High Milestone: Tor: unspecified
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-cell, tor-relay, ipv6, 034-triage-20180328, 034-removed-20180328
Cc: Actual Points:
Parent ID: #24403 Points: 1
Reviewer: teor Sponsor: SponsorV-can

Description

In hs_descriptor.{c|h}, link specifiers are used to encode the introduction points in the descriptor. The decoded version is represented by the hs_desc_link_specifier_t object which is used to build a trunnel object that is then base64 encoded and put in the descriptor.

With the service and client implementation of prop224, link specifiers are also used in the INTRODUCE1 cell. Client will encode them in the cell and service decodes them (INTRODUCE2 cell).

This ticket is to unify the link specifier API/ABI used between cell and descriptor for both encoding and decoding in order to avoid code duplication.

Furthermore, future work could expend this to the EXTEND2 cell that also requires link specifiers (extend_cell_from_extend2_cell_body() and extend_cell_format().

Child Tickets

Change History (18)

comment:1 Changed 11 months ago by dgoulet

Parent ID: #21888
Status: newneeds_review

This is important groundwork for #17242 and #20657 which both uses link specifiers so they are using this unified API.

Branch is in ticket22781_032_01.
https://gitlab.com/dgoulet/tor/merge_requests/32

Last edited 11 months ago by dgoulet (previous) (diff)

comment:2 Changed 11 months ago by dgoulet

Status: needs_reviewneeds_revision

nvm the above, this needs some changes.

comment:3 Changed 10 months ago by dgoulet

Keywords: prop224-extra added
Parent ID: #21888
Sponsor: SponsorR-mustSponsorR-can

This is not prop224 groundwork afterall.

comment:4 Changed 9 months ago by dgoulet

Keywords: tor-cell tor-relay added; prop224 tor-hs prop224-extra removed
Milestone: Tor: 0.3.2.x-finalTor: 0.3.3.x-final
Sponsor: SponsorR-can

comment:5 Changed 6 months ago by teor

Keywords: ipv6 added
Parent ID: #24403
Sponsor: SponsorV-can

This will help with IPv6 ORPort reachability checks in #24403.

comment:6 Changed 5 months ago by dgoulet

Status: needs_revisionneeds_information

I want to make sure we nail down an interface that we like/want before going further.

  1. EXTEND cell: The extend_cell_t object contains basically all the possible ways to extend. Through extend_cell_format(), we take what's in it and create link_specifier_t (trunnel).

The relationship is:

  • extend_cell_format(): extend_cell_t -> link_specifier_t
  • extend_cell_parse(): link_specifier_t -> extend_cell_t (Note that this is for EXTEND2).
  1. HSv3: We have hs_desc_link_specifier_t which represent only one possible way to extend (IPv4, IPv6, Legacy ID or ed25519).

Relationship is:

  • encode_link_specifiers(): hs_desc_link_specifier_t -> link_specifier_t
  • decode_link_specifiers(): link_specifier_t -> hs_desc_link_specifier_t

That is what we have right now using link specifiers interface (trunnel) and some intermediate object. I think a good move forward would be so abstract the high level link specifier object to something generic that every subsystem could use. In other words, replace hs_desc_link_specifier_t and extend_cell_t to be the same object.

They differ right now because the HS one contains *one* specific link specifier and the extend cell object contains *all* possible link specifiers. In the spirit of reducing memory allocation, I think it is reasonable to have one generic object containing all possible link specifiers. Something like:

tor_addr_port_t addrport_v4;
tor_addr_port_t addrport_v6;
uint8_t legacy_id[DIGEST_LEN];
ed25519_public_key_t ed_id;

We mandate that v4 and legacy_id be present which means we could either check for "mem zero" on IPv6 and ed_id to decide to include them or we add a flag saying "v6 defined" to maybe be more efficient. Parsing cells is certainly part of our fast path in my opinion so any optimization is a win. Downside here is if we ever want to include more than one type of specifiers in a cell/descriptor that is for example 2 different IPv4?

My original branch (ticket22781_032_01) has already some work started there with lspec_t being a generic higher level object that encodes to a link_specifier_t and vice versa. However, it would need to contain all the possible links.

Am I on a path that we like?

comment:7 in reply to:  6 ; Changed 5 months ago by teor

Replying to dgoulet:

I want to make sure we nail down an interface that we like/want before going further.

  1. EXTEND cell: The extend_cell_t object contains basically all the possible ways to extend. Through extend_cell_format(), we take what's in it and create link_specifier_t (trunnel).

The relationship is:

  • extend_cell_format(): extend_cell_t -> link_specifier_t
  • extend_cell_parse(): link_specifier_t -> extend_cell_t (Note that this is for EXTEND2).

What about CREATE[2] cells?
extend_info_t is used for them, too.

  1. HSv3: We have hs_desc_link_specifier_t which represent only one possible way to extend (IPv4, IPv6, Legacy ID or ed25519).

Relationship is:

  • encode_link_specifiers(): hs_desc_link_specifier_t -> link_specifier_t
  • decode_link_specifiers(): link_specifier_t -> hs_desc_link_specifier_t

That is what we have right now using link specifiers interface (trunnel) and some intermediate object. I think a good move forward would be so abstract the high level link specifier object to something generic that every subsystem could use. In other words, replace hs_desc_link_specifier_t and extend_cell_t to be the same object.

They differ right now because the HS one contains *one* specific link specifier and the extend cell object contains *all* possible link specifiers. In the spirit of reducing memory allocation, I think it is reasonable to have one generic object containing all possible link specifiers. Something like:

tor_addr_port_t addrport_v4;
tor_addr_port_t addrport_v6;
uint8_t legacy_id[DIGEST_LEN];
ed25519_public_key_t ed_id;

We mandate that v4 and legacy_id be present which means we could either check for "mem zero" on IPv6 and ed_id to decide to include them or we add a flag saying "v6 defined" to maybe be more efficient. Parsing cells is certainly part of our fast path in my opinion so any optimization is a win.

I'm all for avoiding premature optimisations.
Let's define accessor functions that do the zero-checks and then profile CPU and RAM.
Then we can add flags, and profile again.

Downside here is if we ever want to include more than one type of specifiers in a cell/descriptor that is for example 2 different IPv4?

The Prop224 spec says that we should pass on all link specifiers, even unknown link specifier types.
So I think we need a final member extra_spec_list that's a smartlist pointer.
On the fast path, it doesn't get used, because it's NULL.

This could handle extra addresses, too, if we ever did that.

My original branch (ticket22781_032_01) has already some work started there with lspec_t being a generic higher level object that encodes to a link_specifier_t and vice versa. However, it would need to contain all the possible links.

Am I on a path that we like?

I like this idea.

I think this makes it easier to push all the details of choosing addresses deep down the stack. Then we can choose the remote address just before we go to connect to it. It makes a whole lot of code much easier, and much more generic. And we can finally implement tickets like #6772, and try another ORPort if the first one fails.

comment:8 Changed 5 months ago by teor

Reviewer: teor
Status: needs_informationneeds_revision

comment:9 in reply to:  7 ; Changed 5 months ago by dgoulet

Replying to teor:

What about CREATE[2] cells?
extend_info_t is used for them, too.

Right the goal would be to put the "link specifier" generic object in extend_cell_t leaving also the create_cell_t in there.

I'm all for avoiding premature optimisations.
Let's define accessor functions that do the zero-checks and then profile CPU and RAM.
Then we can add flags, and profile again.

Sounds good. Simple first!

The Prop224 spec says that we should pass on all link specifiers, even unknown link specifier types.
So I think we need a final member extra_spec_list that's a smartlist pointer.
On the fast path, it doesn't get used, because it's NULL.

This could handle extra addresses, too, if we ever did that.

Hmmm well the generic object I'm talking about is the one we decode to and encode from link_specifier_t which means that if you don't recognize one, you ignore. Thus the object doesn't really need to keep "extra unknown specifier(s)" because you can really decode and encode what you don't know.

This is more a matter of ignoring unknown link spec type.

My original branch (ticket22781_032_01) has already some work started there with lspec_t being a generic higher level object that encodes to a link_specifier_t and vice versa. However, it would need to contain all the possible links.

Am I on a path that we like?

I like this idea.

I think this makes it easier to push all the details of choosing addresses deep down the stack. Then we can choose the remote address just before we go to connect to it. It makes a whole lot of code much easier, and much more generic. And we can finally implement tickets like #6772, and try another ORPort if the first one fails.

Great! I'll get to it asap! Thanks for the feedback teor!

comment:10 Changed 5 months ago by dgoulet

I've picked up the branch and modified it with what we discussed. See branch ticket22781_033_01.

It still needs a bit more unit test but at least we got the extend cell format unit test testing a lot of it.

One of the commit make the extend_cell_t use the new link specifier object. It could be really improved in terms of refactoring but for now I went for an "inplace" change and then with future commit we can refactor this a bit nicer.

Also, you'll notice that I've explicitly made that we can't put an IPv6 link spec in an EXTEND cell for now even though with this patch, we get it for free.

Last edited 5 months ago by dgoulet (previous) (diff)

comment:11 in reply to:  9 ; Changed 5 months ago by teor

Replying to dgoulet:

Replying to teor:

The Prop224 spec says that we should pass on all link specifiers, even unknown link specifier types.
So I think we need a final member extra_spec_list that's a smartlist pointer.
On the fast path, it doesn't get used, because it's NULL.

This could handle extra addresses, too, if we ever did that.

Hmmm well the generic object I'm talking about is the one we decode to and encode from link_specifier_t which means that if you don't recognize one, you ignore. Thus the object doesn't really need to keep "extra unknown specifier(s)" because you can really decode and encode what you don't know.

As long as you know the length, you can transmit unknown link specifier types without parsing them.

This is more a matter of ignoring unknown link spec type.

Hang on, that's not what the spec says:

The hidden service SHOULD NOT reject any LSTYPE fields which it
doesn't recognize; instead, it should use them verbatim in its EXTEND
request to the rendezvous point.

https://gitweb.torproject.org/torspec.git/tree/proposals/224-rend-spec-ng.txt#n1689

But the spec is inconsistent: it specifies this for service to rend, but not client to intro.
I think we should do it for both, because it means that we automatically upgrade when a new link specifier comes along.

Also, we should add something about how to deal with link specifiers that are too long:

Any link specifier that would make the total length exceed the length of an EXTEND cell MUST be ignored. (This is important for HS descriptor to intro EXTEND, because the descriptor link specifiers aren't limited to the size of a cell.)

Or, we can just ignore unknown link specifiers, but that introduces a version distinguisher, and makes upgrades slower.

comment:12 in reply to:  11 Changed 5 months ago by dgoulet

Replying to teor:
[snip]

But the spec is inconsistent: it specifies this for service to rend, but not client to intro.

Intro do not parse link specifiers from the INTRO cell, that part is encrypted for the service else they would learn the RP and that is no good :).

So HS service is really the only place I see that tor will take a "blob" of link specifiers even thought it doesn't know about it and will extend to the RP.

Am I correct?

comment:13 Changed 5 months ago by teor

The client reads link specifiers from the descriptor and uses them to extend to the intro:

https://gitweb.torproject.org/torspec.git/tree/proposals/224-rend-spec-ng.txt#n1322

comment:14 Changed 4 months ago by dgoulet

Priority: MediumVery High

comment:15 Changed 4 months ago by teor

Milestone: Tor: 0.3.3.x-finalTor: 0.3.4.x-final

The 0.3.3 freeze deadline has passed, all these children of #24403 belong in 0.3.4

comment:16 Changed 8 weeks ago by nickm

Keywords: 034-triage-20180328 added

comment:17 Changed 8 weeks ago by nickm

Keywords: 034-removed-20180328 added

Per our triage process, these tickets are pending removal from 0.3.4.

comment:18 Changed 6 weeks ago by nickm

Milestone: Tor: 0.3.4.x-finalTor: unspecified

These needs_revision, tickets, tagged with 034-removed-*, are no longer in-scope for 0.3.4. We can reconsider any of them, if somebody does the necessary revision.

Note: See TracTickets for help on using tickets.