Hi Karsten, would you mind if we add a few new @type annotations? Stem is getting the ability to parse detached signatures (#28495 (closed)), and sometimes is called to parse individual router status entries.
The way I've designed Stem expects that all descriptor types have a type annotation that users can provide to say 'please parse this as X'. As such I need to either make up my own annotations or (better!) ask for us to expand our official set.
In particular I'm hoping for...
@type detached-signature 1.0 Detached signature as per section 3.10 of the dir-spec, and downloadable for five minutes each hour via the '/tor/status-vote/next/consensus-signatures' resource.@type router-status-entry-2 1.0 Individual router status entry from a v2 network status document.@type router-status-entry-3 1.0 Individual router status entry from a v3 network status document.@type router-status-entry-micro-3 1.0 Individual router status entry from a v3 microdescriptor network status document.
Thanks!
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items 0
Show closed items
No child items are currently assigned. Use child items to break down this issue into smaller parts.
Linked items 0
Link issues together to show that they're related.
Learn more.
Adding a @type annotation for detached signatures sounds reasonable.
But I'm less clear about the other three. A status entry is not a descriptor. It is part of another descriptor. In Metrics Library we have NetworkStatusEntry for status entry, but it's explicitly marked as not being a descriptor type of its own.
Wouldn't it be sufficient to pass the @type annotation of the network status containing the status entry to Stem?
Asked differently, if we make the status entry a descriptor type of its own, why would we not do the same with dir-source entries (DirSourceEntry in Metrics Library), directory-signature entries (DirectorySignature in Metrics Library)? (I'm not suggesting we do this, but what would we tell somebody suggesting this after defining descriptor types for the various status entries?)
Adding a @type annotation for detached signatures sounds reasonable.
Great! Thanks Karsten. Once the @type docs are updated please let me know and I'll make the related tweaks on stem's end.
But I'm less clear about the other three
You're completely right that router status entries are simply entities in a network status document (@type network-status-*). The trouble is that unlike other descriptor types router status entries are often provided on their own. For example, via the control port...
>>> GETINFO ns/name/caersidi250+ns/name/caersidi=r caersidi O7NMYwctnRDoNu5ClocT97kyX2Y 1cL1nVzDsMuGliTcNUnjIc6MAEE 2018-11-26 17:23:54 208.113.135.162 1443 1444s Fast Guard HSDir Running Stable V2Dir Validw Bandwidth=4820.250 OK
As such stem users sometimes have a desire to parse individual router status entries despite the fact that they're not technically valid descriptors on their own.
why would we not do the same with dir-source entries
That's an interesting point. Is there anything that vends dir-source entries on their own?
What do you suggest as @type annotation for bandwidth files?
Mind if we move this forward?
So, regarding the four suggested @type annotations from above (in slightly different order):
@type detached-signature 1.0 seems plausible to me. Let's add it.
@type router-status-entry-3 1.0 might be okay, too, even though it's not a descriptor of its own, but because it's defined in control-spec as response to a command there. We might want to document whether descriptor (parts) can be obtained via dir-spec, control-spec, or both. Oh, and I wonder if @type network-status-entry-3 1.0 would be more consistent with the consensus and vote equivalents, what do you think?
@type router-status-entry-micro-3 1.0 might be okay for the same reasons as above, though possibly renamed to @type network-status-microdesc-entry-3 1.0. One question though: which one do you get when running that control port command above? The unflavored or the microdesc-flavored entry?
@type router-status-entry-2 1.0, see above, except I'm unclear where you get this from in control-spec. And again, the slightly more consistent name could be @type network-status-entry-2 1.0.
All these are questions and suggestions for discussion here.
One thing that comes to mind is NASA's ontology of units. They have a number of units defined that wouldn't make any sense on their own, but they are found in intermediate steps of solving equations. If we begin moving to big data stream/batch processing with databases and object stores then we are likely to find ourselves with individual entries that are stored on disk as objects or in databases independently of the network status document they came from.
I believe for bandwidth files they should be "bandwidth-list" to use the same terminology as the spec. The spec filename is "bandwidth file" but the actual content of it calls them "lists".
What do you suggest as @type annotation for bandwidth files?
Perfect question, and something I intentionally neglected to mention because I don't know for certain. The spec uses the terms 'bandwidth file' or 'bandwidth list', and in Stem I opted to call the class a BandwidthMetric. Adding juga and teor to this ticket to see if they have an opinion on which they want...
juga, teor: any preference on which of the above you like best?
For what it's worth a couple other things worth noting...
juga mentioned adding these to CollecTor, so the addition of a @type annotation of it will probably be of particular interest to you soon as well.
Bandwidth files already have multiple versions. According to #29056 (closed) these documents have versions 1.0, 1.1, and 1.2 (not to be confused with software versions, which are moot for this ticket and include torflow, sbws 0.1.0, sbws 1.0.0, and sbws 1.0.3).
@type detached-signature 1.0 seems plausible to me. Let's add it.
Great!
Oh, and I wonder if @type network-status-entry-3 1.0 would be more consistent with the consensus and vote equivalents, what do you think?
Sure, fine with me. I don't mind what the name is and '@type network-status-entry-3' sounds fine.
The unflavored or the microdesc-flavored entry?
Good question, and one that previously caused quite a bit of confusion. My understanding is that the answer is unflavored...
What do you suggest as @type annotation for bandwidth files?
Perfect question, and something I intentionally neglected to mention because I don't know for certain. The spec uses the terms 'bandwidth file' or 'bandwidth list', and in Stem I opted to call the class a BandwidthMetric. Adding juga and teor to this ticket to see if they have an opinion on which they want...
juga, teor: any preference on which of the above you like best?
I think it should be bandwidth-file.
Actually re-reading the spec, i think it's a bug that we use Bandwidth List to refer to the whole content of the file, Bandwidth List should be used only for the body, ie the list of relays' measurements.
What do you suggest as @type annotation for bandwidth files?
Perfect question, and something I intentionally neglected to mention because I don't know for certain. The spec uses the terms 'bandwidth file' or 'bandwidth list', and in Stem I opted to call the class a BandwidthMetric. Adding juga and teor to this ticket to see if they have an opinion on which they want...
Actually re-reading the spec, i think it's a bug that we use Bandwidth List to refer to the whole content of the file, Bandwidth List should be used only for the body, ie the list of relays' measurements.
+1, it would be nice to fix, but I think the spec still makes sense
and in Stem I opted to call the class a BandwidthMetric.
Hmm, i would call it BandwidthFile or BandwidthDocument in stem (though technically all the other documents are also documents and files).
See https://trac.torproject.org/projects/tor/ticket/29056#comment:13 for a wip refactoring of current sbws bandwidth file parser to make it compatible with stem Descriptor.
We're going to add a new @type detached-signature 1.0 for: Detached signature as per section 3.10 of the dir-spec, and downloadable for DistSeconds every consensus freshness period (usually five minutes each hour) via the /tor/status-vote/next/consensus-signatures resource.
We define @type network-status-entry-3 1.0 for: Individual router status entry obtained from the control port based on a v3 network status document. Based on unflavored network status consensus entries, except with truncated descriptor digests if tor is using microdescriptors and without w lines.
We drop the plan to add two more types for network/router status entries discussed earlier, because the controller wouldn't produce them anyway.
Finally, we need @type bandwidth-file 1.2 for: Bandwidth file produced by sbws for use by directory authorities acting as bandwidth authorities, and downloadable via the /tor/... (add URL) resource.
Does this make sense? atagar, would you mind starting a patch?
except with truncated descriptor digests if tor is using microdescriptors and without w lines.
Is this somthing we want to codify in our spec? Ticket #24110 (moved) is still open, and in re-reading that it sounds to me like...
Router status entries vended though the control port have 'r' lines with a microdescriptor rather than a server descriptor digest.
It sounds like we're unaware of anything that referrences these erronious digests, so fixing is probably fine.
Am I wrong? Is referrencing the microdescriptor digest either desirable or necessary?
Sorry if I'm missing something!
atagar, would you mind starting a patch?
@type detached-signature 1.0 Detached signature as per section 3.10 of the dir-spec, and downloadable for DistSeconds every consensus freshness period (usually five minutes each hour) via the '/tor/status-vote/next/consensus-signatures' resource.@type network-status-entry-3 1.0 Individual router status entry from an unflavored v3 network status document. These are available from Tor's control port 'GETINFO ns/*' commands and NS events.@type bandwidth-file 1.2 Bandwidth authority metrics as defined in the bandwidth-file-spec [1]. These are available from a DirPort's '/tor/status-vote/next/bandwidth' url and CollecTor. [3][1] https://gitweb.torproject.org/torspec.git/tree/bandwidth-file-spec.txt[2] Both are pending https://trac.torproject.org/projects/tor/ticket/21377
except with truncated descriptor digests if tor is using microdescriptors and without w lines.
And maybe missing some other lines, and maybe with some other format quirks.
The missing "w" line was an example.
Is this somthing we want to codify in our spec? Ticket #24110 (moved) is still open, and in re-reading that it sounds to me like...
Router status entries vended though the control port have 'r' lines with a microdescriptor rather than a server descriptor digest.
It sounds like we're unaware of anything that referrences these erronious digests, so fixing is probably fine.
Am I wrong? Is referrencing the microdescriptor digest either desirable or necessary?
Sorry if I'm missing something!
We don't know if any code depends on the digest here, but if it does, it's terribly fragile, because it will break when UseMicrodescriptors changes.
We could look at this function at our hackfest next week. It will need a refactor before we make any changes. I'll put it on our list.
atagar, would you mind starting a patch?
{{{
@type detached-signature 1.0
Detached signature as per section 3.10 of the dir-spec,
and downloadable for DistSeconds every consensus freshness
period (usually five minutes each hour) via the
'/tor/status-vote/next/consensus-signatures' resource.
@type network-status-entry-3 1.0
Individual router status entry from an unflavored v3
network status document. These are available from
Tor's control port 'GETINFO ns/*' commands and NS
events.
@type bandwidth-file 1.2
Bandwidth authority metrics as defined in the
bandwidth-file-spec [1]. These are available from
a DirPort's '/tor/status-vote/next/bandwidth' url
and CollecTor. [3]
'w' lines are not mandatory, so omitting them does not malform router status entries (unlike the digest, which arguably does).
Lacking those lines sounds like a weird control protocol quirk which should be noted in its spec, but is unimportant to the annotation.
We could look at this function at our hackfest next week. It will need a refactor before we make any changes. I'll put it on our list.
Thanks!
5/6 bandwidth authorities are still using version 1.0 of the bandwidth file format, because they run torflow.
Gotcha. I'll leave it up to Karsten how we want to present it on the metrics site. Adding the '@type bandwidth-file' annotation is the important bit to me. I'll leave it up to others if we want CollecTor to describe the differences between versions 1.0, 1.1, and 1.2.