#28615 closed enhancement (implemented)

Additional @type annotation

Reported by: atagar Owned by: metrics-team
Priority: Medium Milestone:
Component: Metrics/Library Version:
Severity: Normal Keywords:
Cc: teor, juga Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Hi Karsten, would you mind if we add a few new @type annotations? Stem is getting the ability to parse detached signatures (#28495), and sometimes is called to parse individual router status entries.

The way I've designed Stem expects that all descriptor types have a type annotation that users can provide to say 'please parse this as X'. As such I need to either make up my own annotations or (better!) ask for us to expand our official set.

In particular I'm hoping for...

@type detached-signature 1.0

  Detached signature as per section 3.10 of the dir-spec,
  and downloadable for five minutes each hour via the
  '/tor/status-vote/next/consensus-signatures' resource.

@type router-status-entry-2 1.0


  Individual router status entry from a v2 network status
  document.

@type router-status-entry-3 1.0


  Individual router status entry from a v3 network status
  document.

@type router-status-entry-micro-3 1.0


  Individual router status entry from a v3 microdescriptor
  network status document.

Thanks!

Child Tickets

Change History (20)

comment:1 Changed 13 months ago by irl

downloadable for five minutes each hour

It is actually for DistSeconds every freshness period. This just happens to be 5 minutes at the moment but could change.

Other than that I think these are all important and we should add them.

comment:2 Changed 13 months ago by karsten

Adding a @type annotation for detached signatures sounds reasonable.

But I'm less clear about the other three. A status entry is not a descriptor. It is part of another descriptor. In Metrics Library we have NetworkStatusEntry for status entry, but it's explicitly marked as not being a descriptor type of its own.

Wouldn't it be sufficient to pass the @type annotation of the network status containing the status entry to Stem?

Asked differently, if we make the status entry a descriptor type of its own, why would we not do the same with dir-source entries (DirSourceEntry in Metrics Library), directory-signature entries (DirectorySignature in Metrics Library)? (I'm not suggesting we do this, but what would we tell somebody suggesting this after defining descriptor types for the various status entries?)

comment:3 Changed 13 months ago by atagar

Adding a @type annotation for detached signatures sounds reasonable.

Great! Thanks Karsten. Once the @type docs are updated please let me know and I'll make the related tweaks on stem's end.

But I'm less clear about the other three

You're completely right that router status entries are simply entities in a network status document (@type network-status-*). The trouble is that unlike other descriptor types router status entries are often provided on their own. For example, via the control port...

>>> GETINFO ns/name/caersidi
250+ns/name/caersidi=
r caersidi O7NMYwctnRDoNu5ClocT97kyX2Y 1cL1nVzDsMuGliTcNUnjIc6MAEE 2018-11-26 17:23:54 208.113.135.162 1443 1444
s Fast Guard HSDir Running Stable V2Dir Valid
w Bandwidth=4820
.
250 OK

As such stem users sometimes have a desire to parse individual router status entries despite the fact that they're not technically valid descriptors on their own.

why would we not do the same with dir-source entries

That's an interesting point. Is there anything that vends dir-source entries on their own?

comment:4 Changed 11 months ago by atagar

Hi Karsten, I also need a @type for bandwidth files. Stem is learning to parse these since they'll soon be available via DirPorts...

https://trac.torproject.org/projects/tor/ticket/29056
https://gitweb.torproject.org/torspec.git/tree/bandwidth-file-spec.txt

Mind if we move this forward?

comment:5 in reply to:  4 Changed 11 months ago by karsten

Replying to atagar:

Hi Karsten, I also need a @type for bandwidth files. Stem is learning to parse these since they'll soon be available via DirPorts...

https://trac.torproject.org/projects/tor/ticket/29056
https://gitweb.torproject.org/torspec.git/tree/bandwidth-file-spec.txt

What do you suggest as @type annotation for bandwidth files?

Mind if we move this forward?

So, regarding the four suggested @type annotations from above (in slightly different order):

  • @type detached-signature 1.0 seems plausible to me. Let's add it.
  • @type router-status-entry-3 1.0 might be okay, too, even though it's not a descriptor of its own, but because it's defined in control-spec as response to a command there. We might want to document whether descriptor (parts) can be obtained via dir-spec, control-spec, or both. Oh, and I wonder if @type network-status-entry-3 1.0 would be more consistent with the consensus and vote equivalents, what do you think?
  • @type router-status-entry-micro-3 1.0 might be okay for the same reasons as above, though possibly renamed to @type network-status-microdesc-entry-3 1.0. One question though: which one do you get when running that control port command above? The unflavored or the microdesc-flavored entry?
  • @type router-status-entry-2 1.0, see above, except I'm unclear where you get this from in control-spec. And again, the slightly more consistent name could be @type network-status-entry-2 1.0.

All these are questions and suggestions for discussion here.

comment:6 Changed 11 months ago by irl

One thing that comes to mind is NASA's ontology of units. They have a number of units defined that wouldn't make any sense on their own, but they are found in intermediate steps of solving equations. If we begin moving to big data stream/batch processing with databases and object stores then we are likely to find ourselves with individual entries that are stored on disk as objects or in databases independently of the network status document they came from.

I believe for bandwidth files they should be "bandwidth-list" to use the same terminology as the spec. The spec filename is "bandwidth file" but the actual content of it calls them "lists".

comment:7 Changed 11 months ago by atagar

Cc: teor juga added

What do you suggest as @type annotation for bandwidth files?

Perfect question, and something I intentionally neglected to mention because I don't know for certain. The spec uses the terms 'bandwidth file' or 'bandwidth list', and in Stem I opted to call the class a BandwidthMetric. Adding juga and teor to this ticket to see if they have an opinion on which they want...

@type bandwidth 1.2
@type bandwidth-file 1.2
@type bandwidth-list 1.2
@type bandwidth-metric 1.2

juga, teor: any preference on which of the above you like best?

For what it's worth a couple other things worth noting...

  • juga mentioned adding these to CollecTor, so the addition of a @type annotation of it will probably be of particular interest to you soon as well.
  • Bandwidth files already have multiple versions. According to #29056 these documents have versions 1.0, 1.1, and 1.2 (not to be confused with software versions, which are moot for this ticket and include torflow, sbws 0.1.0, sbws 1.0.0, and sbws 1.0.3).

@type detached-signature 1.0 seems plausible to me. Let's add it.

Great!

Oh, and I wonder if @type network-status-entry-3 1.0 would be more consistent with the consensus and vote equivalents, what do you think?

Sure, fine with me. I don't mind what the name is and '@type network-status-entry-3' sounds fine.

The unflavored or the microdesc-flavored entry?

Good question, and one that previously caused quite a bit of confusion. My understanding is that the answer is unflavored...

https://gitweb.torproject.org/stem.git/commit/?id=136537c
https://trac.torproject.org/projects/tor/ticket/24110

comment:8 in reply to:  7 ; Changed 11 months ago by juga

Replying to atagar:

What do you suggest as @type annotation for bandwidth files?

Perfect question, and something I intentionally neglected to mention because I don't know for certain. The spec uses the terms 'bandwidth file' or 'bandwidth list', and in Stem I opted to call the class a BandwidthMetric. Adding juga and teor to this ticket to see if they have an opinion on which they want...

@type bandwidth 1.2
@type bandwidth-file 1.2
@type bandwidth-list 1.2
@type bandwidth-metric 1.2

juga, teor: any preference on which of the above you like best?

I think it should be bandwidth-file.
Actually re-reading the spec, i think it's a bug that we use Bandwidth List to refer to the whole content of the file, Bandwidth List should be used only for the body, ie the list of relays' measurements.

comment:9 in reply to:  8 Changed 11 months ago by teor

Replying to juga:

Replying to atagar:

What do you suggest as @type annotation for bandwidth files?

Perfect question, and something I intentionally neglected to mention because I don't know for certain. The spec uses the terms 'bandwidth file' or 'bandwidth list', and in Stem I opted to call the class a BandwidthMetric. Adding juga and teor to this ticket to see if they have an opinion on which they want...

@type bandwidth 1.2
@type bandwidth-file 1.2
@type bandwidth-list 1.2
@type bandwidth-metric 1.2

juga, teor: any preference on which of the above you like best?

I think it should be bandwidth-file.

+1, to match https://gitweb.torproject.org/torspec.git/tree/bandwidth-file-spec.txt

Actually re-reading the spec, i think it's a bug that we use Bandwidth List to refer to the whole content of the file, Bandwidth List should be used only for the body, ie the list of relays' measurements.

+1, it would be nice to fix, but I think the spec still makes sense

comment:10 in reply to:  7 Changed 11 months ago by teor

Replying to atagar:

The unflavored or the microdesc-flavored entry?

Good question, and one that previously caused quite a bit of confusion. My understanding is that the answer is unflavored...

https://gitweb.torproject.org/stem.git/commit/?id=136537c
https://trac.torproject.org/projects/tor/ticket/24110

"it always returns NS-flavored r lines, except with truncated descriptor digests if tor is using microdescriptors"
https://trac.torproject.org/projects/tor/ticket/24110#comment:10

Also, some parts of the standard ns (unflavoured) entry are missing, like the "w" line.

So I think the correct answer is actually "control-port" flavoured.
If you try to pretend that they are ns-flavoured, you will eventually be sad.

comment:11 Changed 11 months ago by atagar

Thanks juga, thanks teor. Spec standardization of the name out for review: #29137

comment:12 in reply to:  7 Changed 11 months ago by juga

Replying to atagar:

and in Stem I opted to call the class a BandwidthMetric.

Hmm, i would call it BandwidthFile or BandwidthDocument in stem (though technically all the other documents are also documents and files).
See https://trac.torproject.org/projects/tor/ticket/29056#comment:13 for a wip refactoring of current sbws bandwidth file parser to make it compatible with stem Descriptor.

Edit: fix typo, add word "call"

Last edited 11 months ago by juga (previous) (diff)

comment:13 Changed 11 months ago by karsten

Trying to summarize where we are:

  • We're going to add a new @type detached-signature 1.0 for: Detached signature as per section 3.10 of the dir-spec, and downloadable for DistSeconds every consensus freshness period (usually five minutes each hour) via the /tor/status-vote/next/consensus-signatures resource.
  • We define @type network-status-entry-3 1.0 for: Individual router status entry obtained from the control port based on a v3 network status document. Based on unflavored network status consensus entries, except with truncated descriptor digests if tor is using microdescriptors and without w lines.
  • We drop the plan to add two more types for network/router status entries discussed earlier, because the controller wouldn't produce them anyway.
  • Finally, we need @type bandwidth-file 1.2 for: Bandwidth file produced by sbws for use by directory authorities acting as bandwidth authorities, and downloadable via the /tor/... (add URL) resource.

Does this make sense? atagar, would you mind starting a patch?

comment:14 Changed 11 months ago by atagar

except with truncated descriptor digests if tor is using microdescriptors and without w lines.

Is this somthing we want to codify in our spec? Ticket #24110 is still open, and in re-reading that it sounds to me like...

  • Router status entries vended though the control port have 'r' lines with a microdescriptor rather than a server descriptor digest.
  • It sounds like we're unaware of anything that referrences these erronious digests, so fixing is probably fine.

Am I wrong? Is referrencing the microdescriptor digest either desirable or necessary?

Sorry if I'm missing something!

atagar, would you mind starting a patch?

@type detached-signature 1.0

  Detached signature as per section 3.10 of the dir-spec,
  and downloadable for DistSeconds every consensus freshness
  period (usually five minutes each hour) via the
  '/tor/status-vote/next/consensus-signatures' resource.

@type network-status-entry-3 1.0

  Individual router status entry from an unflavored v3
  network status document. These are available from
  Tor's control port 'GETINFO ns/*' commands and NS
  events.

@type bandwidth-file 1.2

  Bandwidth authority metrics as defined in the
  bandwidth-file-spec [1]. These are available from
  a DirPort's '/tor/status-vote/next/bandwidth' url
  and CollecTor. [3]


[1] https://gitweb.torproject.org/torspec.git/tree/bandwidth-file-spec.txt
[2] Both are pending https://trac.torproject.org/projects/tor/ticket/21377

comment:15 in reply to:  14 Changed 11 months ago by teor

Replying to atagar:

except with truncated descriptor digests if tor is using microdescriptors and without w lines.

And maybe missing some other lines, and maybe with some other format quirks.
The missing "w" line was an example.

Is this somthing we want to codify in our spec? Ticket #24110 is still open, and in re-reading that it sounds to me like...

  • Router status entries vended though the control port have 'r' lines with a microdescriptor rather than a server descriptor digest.
  • It sounds like we're unaware of anything that referrences these erronious digests, so fixing is probably fine.

Am I wrong? Is referrencing the microdescriptor digest either desirable or necessary?

Sorry if I'm missing something!

We don't know if any code depends on the digest here, but if it does, it's terribly fragile, because it will break when UseMicrodescriptors changes.

We could look at this function at our hackfest next week. It will need a refactor before we make any changes. I'll put it on our list.

atagar, would you mind starting a patch?

@type detached-signature 1.0

  Detached signature as per section 3.10 of the dir-spec,
  and downloadable for DistSeconds every consensus freshness
  period (usually five minutes each hour) via the
  '/tor/status-vote/next/consensus-signatures' resource.

@type network-status-entry-3 1.0

  Individual router status entry from an unflavored v3
  network status document. These are available from
  Tor's control port 'GETINFO ns/*' commands and NS
  events.

@type bandwidth-file 1.2

  Bandwidth authority metrics as defined in the
  bandwidth-file-spec [1]. These are available from
  a DirPort's '/tor/status-vote/next/bandwidth' url
  and CollecTor. [3]


[1] https://gitweb.torproject.org/torspec.git/tree/bandwidth-file-spec.txt
[2] Both are pending https://trac.torproject.org/projects/tor/ticket/21377

5/6 bandwidth authorities are still using version 1.0 of the bandwidth file format, because they run torflow.

comment:16 Changed 11 months ago by atagar

The missing "w" line was an example.

'w' lines are not mandatory, so omitting them does not malform router status entries (unlike the digest, which arguably does).

Lacking those lines sounds like a weird control protocol quirk which should be noted in its spec, but is unimportant to the annotation.

We could look at this function at our hackfest next week. It will need a refactor before we make any changes. I'll put it on our list.

Thanks!

5/6 bandwidth authorities are still using version 1.0 of the bandwidth file format, because they run torflow.

Gotcha. I'll leave it up to Karsten how we want to present it on the metrics site. Adding the '@type bandwidth-file' annotation is the important bit to me. I'll leave it up to others if we want CollecTor to describe the differences between versions 1.0, 1.1, and 1.2.

comment:17 Changed 11 months ago by karsten

Status: newneeds_review

Please somebody review commit c778a5e in my task-28615 branch. Note the change to @type detached-signature-3 for extra consistency.

comment:18 Changed 11 months ago by atagar

Thanks Karsten! Looks good to me.

comment:19 Changed 11 months ago by karsten

Status: needs_reviewneeds_information

Great, thanks for checking. Merged and deployed. Anything else left to do here?

comment:20 Changed 11 months ago by atagar

Resolution: implemented
Status: needs_informationclosed

Don't think so. Thanks Karsten!

Note: See TracTickets for help on using tickets.