Opened 8 months ago

Last modified 5 months ago

#21366 new enhancement

support whitespace in search term (as does onionoo)

Reported by: cypherpunks
Owned by: irl
Priority: Medium
Milestone:
Component: Metrics/Atlas
Version:
Severity: Normal
Keywords:
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Change History (16)

comment:1 in reply to:  description Changed 8 months ago by cypherpunks

Replying to cypherpunks:

does not work:
https://atlas.torproject.org/#search/contact:Neel%20Chauhan

Atlas uses https://onionoo.torproject.org/summary?search=contact:Neel%20Chauhan with that query which does result in two empty arrays.

comment:2 Changed 8 months ago by cypherpunks

So this is an onionoo bug?

Should semantics for
https://onionoo.torproject.org/summary?search=contact:Neel%20Chauhan
and
https://onionoo.torproject.org/summary?contact=Neel%20Chauhan
match? (result in the same search result)

comment:3 Changed 8 months ago by cypherpunks

Component: changed from Metrics/Atlas to Metrics/Onionoo
Owner: changed from irl to metrics-team

comment:4 Changed 8 months ago by karsten

Well, I wouldn't say that this is an Onionoo bug. Onionoo interprets spaces in the search parameter as separators between search terms, and search terms can optionally be prefixed with a qualifier.

In the first example above, search=contact:Neel%20Chauhan is interpreted as "contains Neel in the contact line and contains Chauhan in the usual fields we search for".

You could approximate the second example by using search=contact:Neel%20contact:Chauhan, but that will also return all relays that have those two strings somewhere in the contact line, rather than just Neel Chauhan.

What we could do is support quotes in qualified search terms to include spaces. That would be search=contact:"Neel%20Chauhan". But how intuitive is that? And would it produce new problems that I don't think of right now? Or are there any alternatives that are more intuitive? Hmmmmm.
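To make the interpretation described in this comment concrete, here is a minimal Python sketch. It is not Onionoo's code: the function name split_search_terms is hypothetical, and treating any colon as a qualifier separator is a simplifying assumption.

```python
def split_search_terms(search):
    """Split a search value on spaces into (qualifier, value) pairs.

    Hypothetical helper for illustration only. The qualifier is None
    for unqualified terms. Assumption: any "name:value" term is a
    qualified term; the real service may be stricter.
    """
    terms = []
    for term in search.split():
        qualifier, sep, value = term.partition(":")
        if sep and qualifier and value:
            terms.append((qualifier, value))
        else:
            terms.append((None, term))
    return terms


# The decoded value of search=contact:Neel%20Chauhan
print(split_search_terms("contact:Neel Chauhan"))
# [('contact', 'Neel'), (None, 'Chauhan')]
```

Only "Neel" is tied to the contact qualifier; "Chauhan" becomes an ordinary unqualified search term, which matches the behavior described above.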

comment:5 Changed 8 months ago by cypherpunks

Component: changed from Metrics/Onionoo to Metrics/Atlas
Owner: changed from metrics-team to irl

So then the easy fix is to tell atlas to use

https://onionoo.torproject.org/summary?contact=...
instead of
https://onionoo.torproject.org/summary?search=contact:...

when the search term starts with "contact:"?

comment:6 Changed 8 months ago by karsten

Well, no. What if a user wants to search by contact and by another search part, say, nickname? Or by contact and another qualified search term like AS number? We cannot merge everything following the first qualifier into a single qualified search term. And we can also not define the "contact:" qualifier to come last, because what if we add another qualified search term in the future that permits spaces? No, this won't work.

comment:7 Changed 8 months ago by cypherpunks

Oh this is bad. So atlas will never support whitespace?

What if we introduce a special new atlas-level qualifier with which the user says "I want to search for contact, yes contact only"?

Let's say "contactonly:foo bar", and that gets mapped to
https://onionoo.torproject.org/summary?contact=...

What do you think?
(btw: thanks for the quick feedback so far!)

comment:8 Changed 8 months ago by karsten

Well, the idea of using (double) quotes for qualified search terms containing spaces would work. I'm just not sure how intuitive that would be. And somebody would have to build it. :)

Another option would be that Atlas provides an extended search of some kind where it has inputs for other parameters than Onionoo's search parameter. For example, there could be a contact input field, and Atlas would simply pass anything in that field to Onionoo's contact parameter, including contained spaces. That would still leave qualified search term as shortcut for pro users with almost the same functionality (except for this case and maybe a few others). But the more advanced users would go to that extended search and see what they can search for, rather than having to remember what qualified search terms exist. (Somebody would have to build this as well.)
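A rough sketch of what such an extended search could do with a contact input field, written in Python for brevity (Atlas itself is a web application, so this is only an illustration). The helper name summary_url is hypothetical; the contact parameter and the summary URL are the ones mentioned earlier in this ticket.

```python
from urllib.parse import quote, urlencode

ONIONOO_SUMMARY = "https://onionoo.torproject.org/summary"


def summary_url(**params):
    """Build an Onionoo summary URL from keyword parameters.

    Hypothetical helper for illustration. quote_via=quote encodes
    spaces as %20 instead of '+', so the whole field value, spaces
    included, is passed on as one parameter value.
    """
    return ONIONOO_SUMMARY + "?" + urlencode(params, quote_via=quote)


print(summary_url(contact="Neel Chauhan"))
# https://onionoo.torproject.org/summary?contact=Neel%20Chauhan
```

The point of the design is that the field value never goes through the space-separated search syntax, so no quoting rules are needed on the Atlas side.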

comment:9 in reply to:  8 Changed 8 months ago by cypherpunks

Replying to karsten:

Well, the idea of using (double) quotes for qualified search terms containing spaces would work. I'm just not sure how intuitive that would be. And somebody would have to build it. :)

I assume this would require an onionoo change (I guess this is less likely to be implemented).

Another option would be that Atlas provides an extended search of some kind where it has inputs for other parameters than Onionoo's search parameter. For example, there could be a contact input field, and Atlas would simply pass anything in that field to Onionoo's contact parameter

OK, so the option with the least amount of effort would be the contactonly:...
option to tell atlas to ask
https://onionoo.torproject.org/summary?contact=...
instead of
https://onionoo.torproject.org/summary?search=contact:...

since it does not require any atlas UI changes.

I have no opinion on how this is actually done as long as the use case is possible.

Either way the implementer decides - as usual, I hope there will be one.

It is somewhat sad that you can use atlas to make a powerful search over many fields, but you cannot make a simple search on a single field if your search term contains whitespace.

comment:10 Changed 8 months ago by karsten

The contactonly: suggestion is a hack. We shouldn't go that route.

The stop-gap solution would be that you prefix each contact part with contact:, as in: https://atlas.torproject.org/#search/contact:Neel%20contact:Chauhan.

One possible real solution would be to extend Onionoo to accept quoted strings. Needs more discussion and somebody to write it.
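For discussion, a minimal Python sketch of how quoted strings in qualified search terms could be split. This is only one possible approach and not an Onionoo design: the function name split_quoted_search is hypothetical, and using shell-like quoting rules via the standard library's shlex module is an assumption.

```python
import shlex


def split_quoted_search(search):
    """Split a search value into (qualifier, value) pairs, honoring
    double quotes so that a qualified value may contain spaces.

    Hypothetical helper for illustration only. shlex.split raises
    ValueError on unbalanced quotes, which a real implementation
    would have to handle.
    """
    terms = []
    for term in shlex.split(search):
        qualifier, sep, value = term.partition(":")
        if sep and qualifier and value:
            terms.append((qualifier, value))
        else:
            terms.append((None, term))
    return terms


print(split_quoted_search('contact:"Neel Chauhan" moria'))
# [('contact', 'Neel Chauhan'), (None, 'moria')]
```

Unquoted input keeps its current meaning, so existing searches such as contact:Neel Chauhan would still split into two terms.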

comment:11 in reply to:  10 Changed 8 months ago by cypherpunks

Great to see that others (teor) also have a use case here.

Replying to karsten:

The contactonly: suggestion is a hack. We shouldn't go that route.

I agree that would be a hack, but a hack is still better than no solution at all - for me.
(I'm not in favor of that hack if there is someone implementing a proper solution.)

If anyone implements something proper, let me add something that I didn't mention until now because it reduces the likelihood of any solution at all.

I'd actually like to search for perfect matches only.

Search term:
"Neel Chauhan"

should not match on

Chauhan Neel
or
Neel Chauhan 123
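The difference between the requested exact match and the "contains every word" approximation can be sketched in Python as follows. Both function names are hypothetical, and the case-insensitive comparison is an assumption made for the example.

```python
def contains_all_words(contact, words):
    """Approximation: every word appears somewhere in the contact line.

    Hypothetical helper; the case-insensitive comparison is an assumption.
    """
    lowered = contact.lower()
    return all(word.lower() in lowered for word in words)


def is_exact_match(contact, wanted):
    """Requested behavior: the whole contact line equals the search term."""
    return contact.lower() == wanted.lower()


for contact in ["Neel Chauhan", "Chauhan Neel", "Neel Chauhan 123"]:
    print(contact,
          contains_all_words(contact, ["Neel", "Chauhan"]),
          is_exact_match(contact, "Neel Chauhan"))
# Neel Chauhan True True
# Chauhan Neel True False
# Neel Chauhan 123 True False
```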

comment:12 Changed 8 months ago by teor

In #21373, I had no idea that atlas had qualifiers, or I had forgotten.

So my use case would be resolved by making "contact" part of "the usual fields we search for".
Or I can work around it by searching once for the name I remember, then again using "contact:".

(I really don't care much about names with spaces, or getting the query exactly right: I am quite capable of refining my search using an appropriately unique string. Is there a specific use case that requires a quoted string? Perhaps a programmatic search by contact in an application?)

comment:13 in reply to:  12 Changed 8 months ago by cypherpunks

Replying to teor:

Is there a specific use case that requires a quoted string? Perhaps a programmatic search by contact in an application?

The use case is: creating atlas URLs that use the contact as an identifier,
to list all relays with a given contact (described in #21368).

I would use that functionality to make the contact column in this table a URL:
https://raw.githubusercontent.com/ornetstats/stats/master/o/potentially_dangerous_relaygroups.txt

comment:14 Changed 5 months ago by nusenu

https://lists.torproject.org/pipermail/metrics-team/2017-April/000323.html:

(trying to find a stop-gap solution for
https://trac.torproject.org/projects/tor/ticket/21366)

from onionoo.tpo:

search

Return only (1) relays with the parameter value matching (part of a)
nickname, (possibly $-prefixed) [...]
If multiple search terms are given, separated by spaces, the
intersection of all relays and bridges matching all search terms
will be returned.

Karsten wrote
(https://trac.torproject.org/projects/tor/ticket/21366#comment:4)

You could approximate the second example by using
search=contact:Neel%20contact:Chauhan, but that will also return all
relays that have those two strings somewhere in the contact line,
rather than just Neel Chauhan.

So I would assume that
https://atlas.torproject.org/#search/contact:Neel%20contact:Chauhan
(backend:
https://onionoo.torproject.org/summary?search=contact:Neel%20contact:Chauhan)

should only return relays where the contact field contains "Neel"
and "Chauhan" but it also returns relays that have only "Neel" (and
no "Chauhan"), so I would deduce that search terms are OR connected.

example contact result:
"<neel AT rdkr DOT uk> 0xBBC1514B34CFB0F10231280F2FC36F0EF7887127"

If search terms are OR connected (not the "intersection")
then I would simply list all the fingerprints to get a list of all
relevant relays, but that does not work either (no results)
example (2 fingerprints separated by a single space):
https://atlas.torproject.org/#search/D5B8C38539C509380767D4DE20DE84CF84EE8299%201602E42D1DE3C7B3EF042F357F906DE55FA6C7C6
Also tried:
"lookup:D5B8C38539C509380767D4DE20DE84CF84EE8299%20lookup:1602E42D1DE3C7B3EF042F357F906DE55FA6C7C6"

In this search, the search terms are AND connected:

https://onionoo.torproject.org/summary?search=contact:Neel%20D46175487C3

So I'm not sure

  • if the current behavior works as documented and intended
  • how to get a stop-gap solution without false-positive/false-negative search results

Is this a bug or am I misunderstanding something?
(or does the AND/OR mode depend on whether search qualifiers are used?)


https://lists.torproject.org/pipermail/metrics-team/2017-April/000324.html:
Hi nusenu,

(trying to find a stop-gap solution for
https://trac.torproject.org/projects/tor/ticket/21366)

We should probably discuss this on the ticket, not here. Quick response
below though.

from onionoo.tpo:

search

Return only (1) relays with the parameter value matching (part of a)
nickname, (possibly $-prefixed) [...]
If multiple search terms are given, separated by spaces, the
intersection of all relays and bridges matching all search terms
will be returned.

Karsten wrote
(https://trac.torproject.org/projects/tor/ticket/21366#comment:4)

You could approximate the second example by using
search=contact:Neel%20contact:Chauhan, but that will also return all
relays that have those two strings somewhere in the contact line,
rather than just Neel Chauhan.

So I would assume that
https://atlas.torproject.org/#search/contact:Neel%20contact:Chauhan
(backend:
https://onionoo.torproject.org/summary?search=contact:Neel%20contact:Chauhan)

should only return relays where the contact field contains "Neel"
and "Chauhan" but it also returns relays that have only "Neel" (and
no "Chauhan"), so I would deduce that search terms are OR connected.

example contact result:
"<neel AT rdkr DOT uk> 0xBBC1514B34CFB0F10231280F2FC36F0EF7887127"

Hmm, looks like my suggestion was misleading. The two qualified search
terms are not OR-connected, but the second search term is simply
discarded. Try swapping the two and see how that changes the result.

The spec says: "If the same parameter is specified more than once, only
the first parameter value is considered."

Now, the search term is only given once, but the qualified search terms
are treated as if the user passed values to keys matching search
qualifiers. And the second contact parameter is simply dropped.

This is expected behavior that we might be able to document better.

All the best,
Karsten
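The behavior Karsten describes in this reply can be modeled with a short Python sketch. It is an illustration of the "only the first parameter value is considered" rule as applied to qualified search terms, not Onionoo's actual code; the function name effective_qualifiers is hypothetical.

```python
def effective_qualifiers(search):
    """Return the qualified search terms that take effect.

    Hypothetical model for illustration: each qualifier is treated
    like a parameter key, and only its first value is kept.
    """
    effective = {}
    for term in search.split():
        qualifier, sep, value = term.partition(":")
        if sep and qualifier and value:
            # setdefault keeps the first value and ignores repeats
            effective.setdefault(qualifier, value)
    return effective


print(effective_qualifiers("contact:Neel contact:Chauhan"))
# {'contact': 'Neel'}
print(effective_qualifiers("contact:Chauhan contact:Neel"))
# {'contact': 'Chauhan'}
```

Under this model, swapping the two terms changes which one survives, which is consistent with results that contain "Neel" but not "Chauhan".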

comment:15 Changed 5 months ago by nusenu

Since the stop-gap solution does not work, is there any other way how I can build a single atlas URL to find a list of relays with a given contact string (I also know their fingerprint)?

comment:16 in reply to:  15 Changed 5 months ago by cypherpunks

Type: changed from defect to enhancement

I've created #22063 so the documentation task in comment:14 isn't forgotten (although it would become obsolete with #22064).

Replying to nusenu:

Since the stop-gap solution does not work, is there any other way how I can build a single atlas URL to find a list of relays with a given contact string (I also know their fingerprint)?

FWICT there doesn't seem to be a way to do this currently. The search parameter intersects the results from the given search terms and the qualified search terms do not permit multiple search terms separated by spaces.

I like the suggestion in comment:8 about adding an advanced search to Atlas. It would better communicate the parameters that Onionoo (and therefore Atlas) supports. I also opened #22064 to deprecate the qualified search terms in the search parameter because I believe they weakly duplicate existing functionality.