Opened 5 years ago

Closed 4 years ago

#13135 closed enhancement (implemented)

Support base64-encoded fingerprints in `search` parameter

Reported by: Sebastian Owned by:
Priority: Medium Milestone:
Component: Metrics/Onionoo Version:
Severity: Keywords:
Cc: iwakeh Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

When searching for relays, it'd be nice if we could also search (and list) the base64-encoded fingerprint, extrainfo-hash etc, so that it is easy to find the relays inside Tor's data dir.

Child Tickets

Attachments (1)

ResourceServletTest.java (1.7 KB) - added by iwakeh 4 years ago.

Download all attachments as: .zip

Change History (19)

comment:1 in reply to:  description Changed 5 years ago by karsten

Type: defectenhancement

Thanks for the suggestion!

Searching by and listing base64-encoded relay fingerprints is doable. I'm not entirely sure how + and / should be escaped in the search parameter, but I'll ask Onionoo client developers to comment on this. The listing part is simply a new field.

What do you mean by "extrainfo-hash"? Would you really want to learn about the extra-info descriptor digest here? We don't even include the server descriptor digest. What would you do with the extra-info descriptor digest?

And what other things does the "etc" stand for?

comment:2 Changed 5 years ago by Sebastian

Basically, any field that is referenced in the consensus I would like to search for, be it descriptor or ei-descriptor. Ideally, this would work for microdescriptors too.

comment:3 Changed 5 years ago by karsten

Suggested changes:

  1. The search parameter supports searching by (beginning of) base64-encoded relay fingerprint. This needs input from Onionoo client developers for escaping + and /. Technically, this constitutes a major protocol change that requires at least one month notice before deployment to give client developers the chance to implement filters if their searches are not supposed to return matching base64-encoded fingerprints.
  1. (This is actually a suggestion to not change anything in Onionoo:) Regarding listing base64-encoded fingerprint, I'm considering not to add a field for that but asking client developers to re-encode the hex fingerprint if they want to display the base64-encoded one. Let's save bandwidth by avoiding to transmit redundant data. Unless binary encoding is something that clients cannot or shouldn't be required to do?
  1. details documents contain new fields server_digest and extra_digest with hex representations of server descriptor and extra-info descriptor digests. This is a minor protocol change that I can implement in the next day or two and deploy by the end of the week if we agree it's a good idea. Atlas and friends can then display the new fields if they wish to. They could even include links to the descriptors as served by gabelmoo, if you want to.
  1. details documents contain a new field micro_digest with the microdescriptor digest encoded in base64 with trailing =s omitted. This requires parsing microdesc consensuses which we're currently not doing, so we should run some performance tests before deploying this on the server. Don't expect this feature this month, I'd say.

comment:4 in reply to:  3 ; Changed 4 years ago by phw

Replying to karsten:

  1. The search parameter supports searching by (beginning of) base64-encoded relay fingerprint. This needs input from Onionoo client developers for escaping + and /. Technically, this constitutes a major protocol change that requires at least one month notice before deployment to give client developers the chance to implement filters if their searches are not supposed to return matching base64-encoded fingerprints.

If a relay would have a nickname identical to another relay's base64-encoded fingerprint, Onionoo would return both relays, right?

  1. (This is actually a suggestion to not change anything in Onionoo:) Regarding listing base64-encoded fingerprint, I'm considering not to add a field for that but asking client developers to re-encode the hex fingerprint if they want to display the base64-encoded one. Let's save bandwidth by avoiding to transmit redundant data. Unless binary encoding is something that clients cannot or shouldn't be required to do?

That sounds like a good option to me but would it work with the upcoming JavaScript-free version of Globe?

comment:5 in reply to:  4 Changed 4 years ago by karsten

Replying to phw:

Replying to karsten:

  1. The search parameter supports searching by (beginning of) base64-encoded relay fingerprint. This needs input from Onionoo client developers for escaping + and /. Technically, this constitutes a major protocol change that requires at least one month notice before deployment to give client developers the chance to implement filters if their searches are not supposed to return matching base64-encoded fingerprints.

If a relay would have a nickname identical to another relay's base64-encoded fingerprint, Onionoo would return both relays, right?

It would, yes.

  1. (This is actually a suggestion to not change anything in Onionoo:) Regarding listing base64-encoded fingerprint, I'm considering not to add a field for that but asking client developers to re-encode the hex fingerprint if they want to display the base64-encoded one. Let's save bandwidth by avoiding to transmit redundant data. Unless binary encoding is something that clients cannot or shouldn't be required to do?

That sounds like a good option to me but would it work with the upcoming JavaScript-free version of Globe?

Why wouldn't it work? The JavaScript-free version of Globe would simply put together details pages on the Globe server, rather than letting the user's browser do that.

Thanks for the feedback!

comment:6 Changed 4 years ago by karsten

Cc: iwakeh added
Status: newneeds_review

I implemented searching for base64-encoded fingerprints in my task-13135 branch. That branch passes unit tests, but I didn't test it in the Vagrant testing instance yet. We'll also have to announce this protocol change one month prior to deploying it. Please review.

comment:7 Changed 4 years ago by iwakeh

The branch looks fine. (No problems when running it locally, but I had to start with fresh data.)

In SummaryDocument (catch-statements around lines 52 and 65)
I'd rather log the DecoderException instead of totally ignoring it.
The DigestUtils or Base64 class could malfunction, or some other problem could happen. In such a case, there would be a hint in the log.

And the version in ResponseBuilder needs to be increased before the release.

The first time for using 'next_major_version_scheduled' :-)

comment:8 in reply to:  7 ; Changed 4 years ago by karsten

Replying to iwakeh:

The branch looks fine. (No problems when running it locally, but I had to start with fresh data.)

In theory, starting with fresh data shouldn't have been necessary. Redeploying the .war or maybe even restarting Tomcat should be sufficient. Hmm.

In SummaryDocument (catch-statements around lines 52 and 65)
I'd rather log the DecoderException instead of totally ignoring it.
The DigestUtils or Base64 class could malfunction, or some other problem could happen. In such a case, there would be a hint in the log.

True, we should log those things.

And the version in ResponseBuilder needs to be increased before the release.

True.

The first time for using 'next_major_version_scheduled' :-)

Indeed.

However, there's still a problem with my branch: Tomcat interprets + in parameter values as space. When I fetch the following URL either with curl or Firefox, the servlet sees the following search terms.

http://localhost:8080/onionoo/summary?search=S+1drLUSi/Sq+vGxcQlIjauTLDg

Search terms: [S, 1drLUSi/Sq, vGxcQlIjauTLDg]

Maybe embedded Jetty doesn't do that? Or maybe there's a fix to tell Tomcat not to interpret + as space? Or maybe we should tell Onionoo clients to escape/substitute the + character in base64 searches (ugh)?

comment:9 in reply to:  8 ; Changed 4 years ago by iwakeh

Replying to karsten:

Replying to iwakeh:

The branch looks fine. (No problems when running it locally, but I had to start with fresh data.)

In theory, starting with fresh data shouldn't have been necessary. Redeploying the .war or maybe even > restarting Tomcat should be sufficient. Hmm.

I had to start with fresh data, b/c it was erased during an extended 'clean-up' ;-)

However, there's still a problem with my branch: Tomcat interprets + in parameter values as space. When I fetch the following URL either with curl or Firefox, the servlet sees the following search terms.

http://localhost:8080/onionoo/summary?search=S+1drLUSi/Sq+vGxcQlIjauTLDg

Search terms: [S, 1drLUSi/Sq, vGxcQlIjauTLDg]

Maybe embedded Jetty doesn't do that? Or maybe there's a fix to tell Tomcat not to interpret + as
space? Or maybe we should tell Onionoo clients to escape/substitute the + character in base64
searches (ugh)?

Oh, I missed that.
Well, getQueryString() from HttpServlet will return search=S+1drLUSi/Sq+vGxcQlIjauTLDg
So, in case of a search rather parse the search parameters manually?
This would then also work for all other containers, as HttpServlet is part of the api.

comment:10 in reply to:  9 Changed 4 years ago by iwakeh

Last edited 4 years ago by iwakeh (previous) (diff)

comment:11 Changed 4 years ago by karsten

I took your advice and rewrote the servlet to use getQueryString for parsing the search parameter. Mind reviewing my branch once more, in particular the regular expression?

comment:12 in reply to:  11 Changed 4 years ago by iwakeh

I attached another 'ResourceServletTest' class, which should go into
the org.torproject.onionoo.server test package.
(Feel free to rename and rearrange anything. And, add more tests.)

To make the tests work parseSearchParameters() should be public static.

getQueryString() returns the entire string after the path, i.e.
I think, http://localhost:8080/summary?search=something&limit=4
will lead to search=something&limit=4.

(btw: This test is an example for the tests that we can add more easily when the api change is introduced, e.g. test classes for all the Validators there.)

Last edited 4 years ago by iwakeh (previous) (diff)

Changed 4 years ago by iwakeh

Attachment: ResourceServletTest.java added

comment:13 Changed 4 years ago by karsten

I updated the behavior of getQueryString() in the unit tests and added tests based on yours. (I'm trying to limit unit tests to a single test case only, which is why I rewrote your test methods.) Feel free to suggest more tweaks.

I'm also going to set next_major_version_scheduled to November 15, 2014 and announce the next protocol version on onionoo-announce@ today.

Thanks!

comment:14 Changed 4 years ago by karsten

Sebastian, while re-reading the ticket to see what's left to do, I figured that you didn't only ask for the various descriptor digests to be listed, but you also want to search for them, right? That's also doable, but slightly harder to implement.

I also realized that you don't just want the digests of the most recent descriptors that are known to Onionoo (which might be whatever metrics-db fetches from gabelmoo and moria1 at :05 of the hour, not what's contained in the latest consensus or microdesc consensus), but a set of digests of descriptors published in the last X (6? 12? 24? 48?) hours.

So, before I go build something, I'd like to hear what exact use cases you have in mind. Can you say which of the following use cases would be useful to have, and for whom and in which situation? (I guess I'm asking for user stories here...)

  • Search for (partial) base64-encoded fingerprint (already implemented, will be deployed on November 15)
  • Display base64-encoded fingerprint (should be implemented in Atlas/Globe by converting the hex fingerprint to base64)
  • Search for hex-encoded server descriptor digest
  • Search for base64-encoded server descriptor digest
  • Display hex-encoded or base64-encoded (whichever is more useful) digests of all server descriptors published in last X hours
  • Search for hex-encoded extra-info descriptor digest
  • Search for base64-encoded extra-info descriptor digest
  • Display hex-encoded or base64-encoded (whichever is more useful) digests of all extra-info descriptors published in last X hours
  • Search for hex-encoded microdescriptor digest
  • Search for base64-encoded microdescriptor digest
  • Display hex-encoded or base64-encoded (whichever is more useful) digests of all microdescriptors generated from descriptors published in last X hours

comment:15 Changed 4 years ago by Sebastian

One big usecase for the partial searches is spotting some hidden service ring cheating sybils.

Why I requested the base64 format is because I want to plug whatever I see in a consensus in Tor into the tool to learn more about it. The same goes for the reverse direction - by having them displayed on the website, I can easily search for them on disk. I'm not sure if I ever asked for hex-encoded digests, only fingerprints I believe.

comment:16 in reply to:  15 Changed 4 years ago by karsten

Replying to Sebastian:

One big usecase for the partial searches is spotting some hidden service ring cheating sybils.

Partial fingerprint searches are already supported for hex-encoded fingerprints and will be supported for base64-encoded fingerprints, too. Sounds like we're doing good here.

Why I requested the base64 format is because I want to plug whatever I see in a consensus in Tor into the tool to learn more about it. The same goes for the reverse direction - by having them displayed on the website, I can easily search for them on disk. I'm not sure if I ever asked for hex-encoded digests, only fingerprints I believe.

The search part will be supported by Onionoo soon, and the display part is something you'll have to request from Atlas/Globe people in separate tickets. That's for relay fingerprints.

Does that mean descriptor digests are not relevant for your use cases? That includes server descriptors, extra-info descriptors, and microdescriptors. There was some discussion of these above, but if fingerprints are all you care about, I'm happy not to include a feature that wouldn't be used anyway.

comment:17 Changed 4 years ago by karsten

Summary: Support various fingerprint formatsSupport base64-encoded fingerprints in `search` parameter

Sebastian and I discussed this more on IRC. The conclusion is in new tickets #13422, #13424, and #13425. I'd say let's turn this ticket into the ticket that adds base64-encoded fingerprints to the search parameter and discuss the remaining ideas in the other tickets. Leaving this ticket open until the branch is merged and deployed, which will be in a month from now.

comment:18 Changed 4 years ago by karsten

Resolution: implemented
Status: needs_reviewclosed

Merged to master and deployed. Closing. Thanks everyone!

Note: See TracTickets for help on using tickets.