Opened 5 years ago

Closed 4 years ago

#7987 closed defect (implemented)

Descriptor types missing network-status-microdesc-consensus-3

Reported by: atagar
Owned by:
Priority: Medium
Milestone:
Component: Metrics/Metrics website
Version:
Severity:
Keywords:
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Hi Karsten. While looking over stem's code I just realized that the @type annotation for "network-status-microdesc-consensus-3" isn't on...

https://metrics.torproject.org/formats.html#descriptortypes

This comes from one of Ravi's commits...

https://gitweb.torproject.org/stem.git/commit/8ad310114b1ea7b743a868a8b70832eea5b8f3e2

... so I'm not positive where it came from, but should that @type annotation be on the page?

Thanks! -Damian
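
For readers unfamiliar with the annotation, here is a minimal sketch of reading one. It assumes the first line of a file has the layout `@type <name> <major>.<minor>`, which matches the examples quoted in this ticket. `TypeAnnotation` and `parse_type_annotation` are hypothetical names for illustration, not part of stem or metrics-lib.

```python
from typing import NamedTuple, Optional


class TypeAnnotation(NamedTuple):
    """Hypothetical container for a parsed @type line."""
    name: str
    major: int
    minor: int


def parse_type_annotation(first_line: str) -> Optional[TypeAnnotation]:
    """Parse a line such as '@type torperf 1.0'; return None if it is not one."""
    parts = first_line.strip().split()
    if len(parts) != 3 or parts[0] != "@type":
        return None

    # The version is assumed to be '<major>.<minor>' with numeric parts.
    major, sep, minor = parts[2].partition(".")
    if not sep or not major.isdigit() or not minor.isdigit():
        return None

    return TypeAnnotation(parts[1], int(major), int(minor))
```

A caller would hand this the first line of a descriptor file and fall back to content-based detection when it returns None.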

Change History (9)

comment:1 follow-up: Changed 5 years ago by atagar

A side question: it looks like the @type annotations that stem presently doesn't support are...

directory 1.0
dir-key-certificate-3 1.0
torperf 1.0
bridge-pool-assignment 1.0
tordnsel 1.0

Stem presently parses KeyCertificates (in order to support network status documents)...
https://gitweb.torproject.org/stem.git/blob/HEAD:/stem/descriptor/networkstatus.py#l1132

Why does dir-key-certificate-3 have a separate @type annotation? What tarballs could I find instances in?

Would it be a high-ish priority for any of these other types to get support or shall we wait until there's a need?

comment:2 in reply to: ↑ description Changed 5 years ago by karsten

Replying to atagar:

> Hi Karsten. While looking over stem's code I just realized that the @type annotation for "network-status-microdesc-consensus-3" isn't on...
>
> https://metrics.torproject.org/formats.html#descriptortypes

We don't archive microdesc consensuses, so you won't find any files containing them on the metrics website. That's why they aren't listed. The same applies to microdescs, too.

> ... so I'm not positive where it came from, but should that @type annotation be on the page?

A fine question. Does stem add that line to microdesc consensuses that it receives from Tor? What about microdescs?

comment:3 in reply to: ↑ 1 Changed 5 years ago by karsten

Replying to atagar:

> A side question: it looks like the @type annotations that stem presently doesn't support are...
>
> directory 1.0
> dir-key-certificate-3 1.0
> torperf 1.0
> bridge-pool-assignment 1.0
> tordnsel 1.0
>
> Stem presently parses KeyCertificates (in order to support network status documents)...
> https://gitweb.torproject.org/stem.git/blob/HEAD:/stem/descriptor/networkstatus.py#l1132
>
> Why does dir-key-certificate-3 have a separate @type annotation? What tarballs could I find instances in?

https://metrics.torproject.org/data/certs.tar.bz2

So, it seems they're already supported in Stem, just not using that specific @type annotation.

> Would it be a high-ish priority for any of these other types to get support or shall we wait until there's a need?

If I had to guess in what order need will arise, that would be: bridge-pool-assignment 1.0, tordnsel 1.0, torperf 1.0, directory 1.0. But it's hard to guess when that will be. bridge-pool-assignment might be relevant soon, the others maybe not.

comment:4 follow-up: Changed 5 years ago by atagar

> https://metrics.torproject.org/data/certs.tar.bz2
>
> So, it seems they're already supported in Stem, just not using that specific @type annotation.

Thanks! Added support for the annotation...

https://gitweb.torproject.org/stem.git/commitdiff/fa6ef6bca6e703e9b69a140ec5abe4c972122872

> We don't archive microdesc consensuses, so you won't find any files containing them on the metrics website. That's why they aren't listed. The same applies to microdescs, too.

Hmmm. Would you mind expanding the @type annotations to include things not on the metrics site?

I've been getting feedback from Aaron, and last night I overhauled stem's parse_file() function to make it more user friendly...

https://stem.torproject.org/api/descriptor/descriptor.html#stem.descriptor.__init__.parse_file

One of the changes that I made was to let users specify the descriptor_type, and I added a table of descriptor_type-to-class mappings.

I decided to use our @type annotations for the argument rather than making up something of my own because they provide a nice, canonical way of specifying descriptor formats. I'd rather not need to make up additional descriptor_types of my own to cover microdescriptors. :)
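
To illustrate the idea (this is not stem's actual table), a descriptor_type-to-class mapping could look roughly like the sketch below. The classes are empty placeholders, `class_for` is a hypothetical helper, and matching on the major version only is an assumption made for the example.

```python
# Placeholder classes standing in for real descriptor parsers (hypothetical).
class KeyCertificate:
    pass


class MicrodescConsensus:
    pass


# (annotation name, major version) -> class that parses it
DESCRIPTOR_CLASSES = {
    ("dir-key-certificate-3", 1): KeyCertificate,
    ("network-status-microdesc-consensus-3", 1): MicrodescConsensus,
}


def class_for(descriptor_type: str) -> type:
    """Look up the parser class for a string like 'dir-key-certificate-3 1.0'."""
    name, _, version = descriptor_type.strip().partition(" ")
    major = version.partition(".")[0]

    if not major.isdigit():
        raise ValueError("malformed descriptor type: %r" % descriptor_type)

    try:
        return DESCRIPTOR_CLASSES[(name, int(major))]
    except KeyError:
        raise ValueError("unsupported descriptor type: %r" % descriptor_type) from None
```

With a table like this, an unsupported type such as "torperf 1.0" fails with a clear error instead of being guessed at.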

> A fine question. Does stem add that line to microdesc consensuses that it receives from Tor? What about microdescs?

Nope. Pinged Ravi on IRC to ask if he remembers where this came from.

> If I had to guess in what order need will arise, that would be: bridge-pool-assignment 1.0, tordnsel 1.0, torperf 1.0, directory 1.0. But it's hard to guess when that will be. bridge-pool-assignment might be relevant soon, the others maybe not.

Ok. Let me know if/when a need arises.

You earlier made a list of tasks we needed to make stem the primary descriptor parsing library (ideally so we don't need to continue maintaining metrics-lib as well). Thoughts on the next step?

https://trac.torproject.org/projects/tor/wiki/doc/stem#Projects

comment:5 Changed 5 years ago by neena

I'm a little hazy about the descriptor parsing work I did. I'm not sure where it came from, but Googling "network-status-microdesc-consensus-3" turns up nothing except references to this bit of code in Stem. It feels like I made it up. I can't imagine why I would do that. Maybe I was guessing? I don't know.

In any case, it should be removed, I think.

comment:6 in reply to: ↑ 4 Changed 4 years ago by karsten

Replying to atagar:

> Hmmm. Would you mind expanding the @type annotations to include things not on the metrics site?

Not at all. Which annotations should we add?

> You earlier made a list of tasks we needed to make stem the primary descriptor parsing library (ideally so we don't need to continue maintaining metrics-lib as well). Thoughts on the next step?
>
> https://trac.torproject.org/projects/tor/wiki/doc/stem#Projects

Quoting the not yet done steps here:

> Parse microdescriptors - Smaller replacement for server descriptors.

Microdescriptors are not relevant for metrics.

> Parse bridge pool assignments - Published by BridgeDB and sanitized by metrics.

This may become relevant once we make progress on the metrics-web replacement. This might be in the next few weeks.

> Parse exit list entry - Published by DNSEL or TorBEL to indicate what ip address exit relay X had at timestamp Y.

Same as bridge pool assignments.

> Parse Torperf output - Performance data measured by making periodic requests over the Tor network. We'll want to implement #3036 first.

This one should be postponed until after the Torperf rewrite. Hopefully, we'll have a new Torperf data format in a month from now.

> Port Onionoo - See #6452 - One of the chief users of metrics-lib, Onionoo is the data provider for Atlas. There's a design document which might be a good starting point.

Sathya and I recently agreed to postpone this and resume Onionoo development in Java. Porting Onionoo isn't as simple as it seems. We'll need somebody with two months of free time for this. We should revisit this in 6 months from now.

> Remote descriptor fetching - Ability to fetch descriptors via an authority or directory mirror's DirPort. This involves some tricky, but important performance optimizations like making requests to directories in parallel, requesting up to 96 descriptors in a single HTTP GET, using .z compression, etc.
>
> Port DocTor - Fetches current consensuses and votes and outputs consensus problems.

These two would be nice. This is not related to the other stuff we're currently working on. If I have to do this myself, that's going to take months. But this could be done by an interested volunteer. Happy to guide that volunteer and review code.
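
For an interested volunteer, the batching, parallelism, and compression parts of the remote fetching task might be sketched roughly as follows. This is only a sketch under stated assumptions, not stem or metrics-lib code: the `/tor/server/fp/<F1>+<F2>.z` URL layout is my reading of the directory protocol and should be checked against dir-spec, `.z` responses are assumed to be zlib-compressed, and `batched_urls`, `fetch_all`, and the injected `http_get` callable are hypothetical names.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

MAX_PER_REQUEST = 96  # per-request limit quoted in the task description above


def batched_urls(dirport: str, fingerprints: List[str]) -> List[str]:
    """Build one compressed-resource URL per batch of at most 96 fingerprints."""
    urls = []
    for start in range(0, len(fingerprints), MAX_PER_REQUEST):
        batch = fingerprints[start:start + MAX_PER_REQUEST]
        # Assumed URL layout: fingerprints joined with '+', '.z' for compression.
        urls.append("http://%s/tor/server/fp/%s.z" % (dirport, "+".join(batch)))
    return urls


def fetch_all(urls: List[str], http_get: Callable[[str], bytes],
              workers: int = 4) -> List[str]:
    """Fetch all URLs in parallel and decompress each response body."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        bodies = list(pool.map(http_get, urls))  # map() keeps the input order
    return [zlib.decompress(body).decode("utf-8") for body in bodies]
```

Passing `http_get` in as a parameter keeps the sketch testable without network access; a real client would also need timeouts, retries, and a choice of which directories to query.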

> Relay class - Convenience class for commonly requested relay attributes, which lazily loads a relay's server descriptor, microdescriptor, and network status entity as needed. This is more a todo item for the Controller class since it requires a control socket.

Okay, unrelated to metrics, it seems.

> Port the statistics-aggregating portion of metrics-web - Provides data for graphs on the metrics website. Once we have that we only need a new metrics website before we can kill metrics-web.

I'm currently working on this together with a volunteer. Soon there will be code.

> Port ExoneraTor - A website that tells you whether some IP address was a Tor relay. We could probably re-use the existing database schema with minor tweaks.
>
> Searchable Tor descriptor and metrics data archive - Once we have that, we can turn off relay search and maybe even ExoneraTor.

I think I'd prefer a new searchable data archive that makes ExoneraTor obsolete over a simple rewrite of ExoneraTor.

> Replace metrics-lib - Part of the goal of this project is to deprecate metrics-lib in favor of stem so we have one fewer service to maintain. When the above is done we should be very, very close - last step is to double check that we haven't missed anything. Realistically, we're talking about years here, not months.

Ack.

comment:7 follow-up: Changed 4 years ago by atagar

> Not at all. Which annotations should we add?

Presently the only things that I'm spotting are microdescriptors. "@type network-status-microdesc-consensus-3 1.0" or something similar for microdescriptor network status documents would be nice. :)

> This may become relevant once we make progress on the metrics-web replacement. This might be in the next few weeks.

Ok. Ping me when it does.

> Sathya and I recently agreed to postpone this and resume Onionoo development in Java. Porting Onionoo isn't as simple as it seems. We'll need somebody with two months of free time for this. We should revisit this in 6 months from now.

If only I had more time. This looks like a really fun one.

Speaking of interesting projects, we should think of some good ones for GSoC later...

comment:8 in reply to: ↑ 7 Changed 4 years ago by karsten

Replying to atagar:

> > Not at all. Which annotations should we add?
>
> Presently the only things that I'm spotting are microdescriptors. "@type network-status-microdesc-consensus-3 1.0" or something similar for microdescriptor network status documents would be nice. :)

Done.

> > This may become relevant once we make progress on the metrics-web replacement. This might be in the next few weeks.
>
> Ok. Ping me when it does.

Will do, thanks!

> > Sathya and I recently agreed to postpone this and resume Onionoo development in Java. Porting Onionoo isn't as simple as it seems. We'll need somebody with two months of free time for this. We should revisit this in 6 months from now.
>
> If only I had more time. This looks like a really fun one.
>
> Speaking of interesting projects, we should think of some good ones for GSoC later...

Onionoo is fun, but it's quite easy to underestimate the effort of rewriting it in Python. The problem is that we need a full replacement that is as reliable as the Java Onionoo, or it won't be of any use. That's different from other GSoC projects where an 80% version of what was originally planned can be quite useful. I think I wouldn't recommend Onionoo as a GSoC project.

This is getting off-topic. Should we close the ticket and move the GSoC discussion somewhere else? :)

comment:9 Changed 4 years ago by atagar

  • Resolution set to implemented
  • Status changed from new to closed

> This is getting off-topic. Should we close the ticket and move the GSoC discussion somewhere else? :)

Agreed, this would be a fine thing to discuss in the March meeting. Resolving.
