Opened 2 years ago

Closed 2 years ago

Last modified 17 months ago

#24742 closed enhancement (implemented)

Add fallback list spec to torspec

Reported by: teor Owned by: teor
Priority: Medium Milestone: Tor: 0.3.3.x-final
Component: Core Tor/Fallback Scripts Version:
Severity: Normal Keywords: fallback, torspec, review-group-28
Cc: Actual Points: 1.0
Parent ID: #22271 Points: 0.5
Reviewer: atagar Sponsor:

Description

Please see my torspec branch fallback-format-2 on github, which adds the fallback-spec.txt specification.

This was requested by irl on tor-dev@, so that metrics-lib could have some assurance of the file format. I'm sure it will help stem, too.

Child Tickets

Change History (19)

comment:1 Changed 2 years ago by teor

Status: assignedneeds_review

comment:2 Changed 2 years ago by atagar

Hi Tim, please provide a link for your github branch. Here's an example of one of my pull requests...

https://trac.torproject.org/projects/tor/ticket/24147

Nick and a few others don't provide links either but in their case they're personal branches on gitweb.torproject.org so easy to look up. If you'd like to use github that's fine, but please always provide a link.

comment:3 Changed 2 years ago by atagar

Status: needs_reviewneeds_information

comment:4 Changed 2 years ago by teor

Status: needs_informationneeds_review

comment:5 Changed 2 years ago by atagar

Thanks Tim. For other's reference here's the link.

Very well written, nice work! Only a few minor thoughts...

  • In the 'Fallback Summary' lets explicitly say 'The separator MUST not appear within the user readable comments.' Since parsers will be using it to determine where it ends that would of course be a bad thing. :)
  • Concerning the headers I just had an idea: how about a revision number? Stem uses tor's git commit to determine "Have I already read this copy of the fallbacks file?". If the file instead had an incrementing number I could use that instead ("Oh hey, I've already parsed revision 64. Nothing to do.").

Not necessary in the least. Just an idea.

  • Not certain I know what all "A fallback entry consists of a C string constant" entails. If you simply mean that the fields are quoted then lets say that instead.
  • For the ipv6_address lets specify that the address is fully expanded (ie, nothing like '::').
  • Concerning nickname and extrainfo it's not jumping out to me to expect those to be C-style comments rather than quoted.
  • Total aesthetic thing but I'd include more equal signs in the divider, but do whatever you think makes the file the nicest to read. :)
  • On my first read I missed that you were listing fields rather than lines. Every other spec we have this would mean to expect the orport line to be separate from id line, etc.

Mind if we instead do a per-line spec instead? That is to say tell users they can expect...

DQUOTE AddressIPv4 ":" DirPort SP "orport=" OrPort SP "id=" Fingerprint DQUOTE NL
[ Exactly once ]

DQUOTE SP "ipv6=[" AddressIPv6 "]:" Port DQUOTE NL
[ Any number of times ]

... etc...

comment:6 in reply to:  5 Changed 2 years ago by teor

Actual Points: 1.0
Reviewer: atagar

The revised spec is:

Branch fallback-format-2 on https://github.com/teor2345/torspec.git

The revised code is:

Branch fallback-format-2-v2 on https://github.com/teor2345/tor.git

The revised example file is:

https://trac.torproject.org/projects/tor/attachment/ticket/22759/fallback_dirs_new_format_version.4.inc

In particular, I rewrote the spec to be line-based, and to reference definitions in the Onionoo spec or dir-spec.txt as appropriate.

Replying to atagar:

Thanks Tim. For other's reference here's the link.

Very well written, nice work! Only a few minor thoughts...

  • In the 'Fallback Summary' lets explicitly say 'The separator MUST not appear within the user readable comments.' Since parsers will be using it to determine where it ends that would of course be a bad thing. :)

I made this much clearer:
https://github.com/teor2345/torspec/blob/fallback-format-2/fallback-spec.txt#L189

In particular:

   The fallback summary section MUST NOT be a valid fallback entry.

   The fallback summary MUST end with a section separator:

     separator WS* NL

   There MUST NOT be any section separations in the fallback summary
   section, other than the terminating section separator.
  • Concerning the headers I just had an idea: how about a revision number? Stem uses tor's git commit to determine "Have I already read this copy of the fallbacks file?". If the file instead had an incrementing number I could use that instead ("Oh hey, I've already parsed revision 64. Nothing to do.").

Not necessary in the least. Just an idea.

The file generation does not parse the previous file, so a revision number would be difficult.

But a timestamp should allow you to work out if the file has changed:

https://github.com/teor2345/torspec/blob/fallback-format-2/fallback-spec.txt#L152

  • Not certain I know what all "A fallback entry consists of a C string constant" entails. If you simply mean that the fields are quoted then lets say that instead.

https://github.com/teor2345/torspec/blob/fallback-format-2/fallback-spec.txt#L217

   DQUOTE dir_address WS "orport=" ipv4_or_port WS
     "id=" fingerprint DQUOTE WS* NL
  • For the ipv6_address lets specify that the address is fully expanded (ie, nothing like '::').

The script just outputs whatever Onionoo provides, so we're stuck with that,
unless we change the code. I've made this clearer:

https://github.com/teor2345/torspec/blob/fallback-format-2/fallback-spec.txt#L93

       The ipv6_or_address is an IPv6 address and port from the Onionoo
       or_addresses list. The address MAY be in the canonical RFC 5952
       IPv6 address format.
  • Concerning nickname and extrainfo it's not jumping out to me to expect those to be C-style comments rather than quoted.

Yes, the previous description was very lax. And the likely outcome of that was N different parser implementations. (It's already hard enough to define a format that's C code, that parses as torrc lines, *and* includes some other structure that the compiler will ignore.)

Here's what they look like now:

https://github.com/teor2345/torspec/blob/fallback-format-2/fallback-spec.txt#L279

     "/*" WS "nickname=" nickname WS "*/" WS* NL

       [Zero or more times.]

       The nickname for this FallbackDir, as defined by Onionoo. The
       transitional 2.0.0 format has zero-length nicknames.
  • Total aesthetic thing but I'd include more equal signs in the divider, but do whatever you think makes the file the nicest to read. :)

I spent several minutes doing internal bikeshedding on this.
And then I resolved not to spend any more time on it.

I think this depends on the user's editor, screen, and visual aesthetic.

  • On my first read I missed that you were listing fields rather than lines. Every other spec we have this would mean to expect the orport line to be separate from id line, etc.

Mind if we instead do a per-line spec instead? That is to say tell users they can expect...

DQUOTE AddressIPv4 ":" DirPort SP "orport=" OrPort SP "id=" Fingerprint DQUOTE NL
[ Exactly once ]

DQUOTE SP "ipv6=[" AddressIPv6 "]:" Port DQUOTE NL
[ Any number of times ]

... etc...

It's now a per-line spec, see above for examples.

Once you're happy, please flip this to merge_ready, and nickm will look at it when he's back.

(I'll send an update to tor-dev@, and give irl a chance to weigh in. But we can always do revisions later.)

comment:7 Changed 2 years ago by atagar

Status: needs_reviewmerge_ready

Thanks Tim, looks great to me! I can merge spec additions like this if you'd like, but since you indicated wanting Nick's review too leaving it be.


The script just outputs whatever Onionoo provides, so we're stuck with that

Oh, I didn't realize fallback generation used Onionoo. Why is that? I assume you use the additional data Onionoo provides (geoip and AS)? Otherwise seems simpler to just run through descriptors from the dirauths.


WS (whitespace)

WS is defined in the dir-spec as...

WS = (SP | TAB)+

Generally the dir-spec and control-spec only use SP. Are you sure you want parsers to look for tabs too?

comment:8 in reply to:  7 Changed 2 years ago by teor

Replying to atagar:

Thanks Tim, looks great to me! I can merge spec additions like this if you'd like, but since you indicated wanting Nick's review too leaving it be.


The script just outputs whatever Onionoo provides, so we're stuck with that

Oh, I didn't realize fallback generation used Onionoo. Why is that? I assume you use the additional data Onionoo provides (geoip and AS)?

uptime:

  • flags

details:

  • last_changed_address_or_port
  • recommended_version
  • effective_family
  • some information that is also in the consensus
  • some information that is also in descriptors

WS (whitespace)

WS is defined in the dir-spec as...

WS = (SP | TAB)+

Generally the dir-spec and control-spec only use SP. Are you sure you want parsers to look for tabs too?

I changed the spec so it uses SP+ or SP* as appropriate.

comment:9 Changed 2 years ago by atagar

Ahh! Gotcha, thanks. Flags and recommended version are in the consensus, but effective_family and last_changed_address_or_port definitely aren't.

comment:10 Changed 2 years ago by teor

Flag *uptime* (history) is not in the consensus.

comment:11 Changed 2 years ago by teor

I squashed the 4 commits in the original branch into fallback-format-2-v2 on https://github.com/teor2345/torspec.git

comment:12 Changed 2 years ago by nickm

Keywords: review-group-28 added

comment:13 Changed 2 years ago by pastly

     "/*" SP+ "nickname=" nickname SP+ "*/" SP* NL

       [Zero or more times.]

       The nickname for this FallbackDir, as defined by Onionoo. The
       transitional 2.0.0 format has zero-length nicknames.

Is that accurate? I see nicknames in the output of fallback-format-2-v4.

A.1. Sample Data

Make sure these are up to date.

comment:14 in reply to:  13 Changed 2 years ago by teor

Replying to pastly:

     "/*" SP+ "nickname=" nickname SP+ "*/" SP* NL

       [Zero or more times.]

       The nickname for this FallbackDir, as defined by Onionoo. The
       transitional 2.0.0 format has zero-length nicknames.

Is that accurate? I see nicknames in the output of fallback-format-2-v4.

You're right, this is confusing.

It's in the output of the script in fallback-format-2-v4.
But it's not in the src/or/fallback_dirs.inc in fallback-format-2-v4.

I pushed a commit that explains what we do when the nickname or extrainfo is missing. I also explained which data is missing from the src/or/fallback_dirs.inc in fallback-format-2-v4.

A.1. Sample Data

Make sure these are up to date.

Yes, I've updated the links.
I think one was different in a significant way, and the rest were not.

comment:15 Changed 2 years ago by teor

Parent ID: #24725#22271

Re-parenting this, so we don't lose it.

comment:16 Changed 2 years ago by teor

Status: merge_readyneeds_information

The occurrences lines didn't match the generated file, so I added a commit to fallback-format-2-v2 on ​https://github.com/teor2345/torspec.git

I squashed this branch into fallback-format-2-v3.

I want to wait on the outcome of #24818 before we merge this, because if we decide to make the authorities into a list, we might want to change the name of the file.

comment:17 Changed 2 years ago by teor

Status: needs_informationneeds_review

The file is now dir-list-spec.txt, and the branch is now fallback-format-2-v4.

I made the descriptions in the spec more generic in preparation for #24818, but the content of the spec is still the same.

comment:18 Changed 2 years ago by nickm

Resolution: implemented
Status: needs_reviewclosed

merged to torspec! Un-parented #24818 to close.

comment:19 Changed 17 months ago by teor

Cc: teor@… removed

Remove useless Cc

Note: See TracTickets for help on using tickets.