Opened 9 months ago

Last modified 6 months ago

#23829 assigned enhancement

Add support for search term negation

Reported by: cypherpunks
Owned by: metrics-team
Priority: Medium
Milestone:
Component: Metrics/Onionoo
Version:
Severity: Normal
Keywords:
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Currently you can search for things like "give me all exits in FI running 0.3.1":

https://atlas.torproject.org/#search/country:fi%20flag:exit%20version:0.3.1

But you cannot search for "give me all relays by the operator with contact foo that do not have the guard flag".

Change History (9)

comment:1 Changed 9 months ago by karsten

Status: new → needs_review

Agreed, that sounds potentially useful.

We'll have to support that for all parameters that we think this would be useful for. I just wrote a patch for the "flag" parameter in my task-23829 branch that needs careful review. The specification for that parameter would need to change to something like the following (with additions written in italics): "Return only relays and bridges which have the given relay flag assigned by the directory authorities. If the flag parameter is prefixed with "!", only those relays and bridges are returned that do not have the given relay flag. Note that if the flag parameter is specified more than once, only the first parameter value will be considered. Filtering by flag is case-insensitive."
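A minimal sketch of how the proposed "!" prefix could be interpreted for the "flag" parameter. The function name `filter_by_flag` and the relay dictionaries are assumptions made for illustration; this is not the code from the task-23829 branch.

```python
def filter_by_flag(relays, flag_param):
    """Keep relays that have (or, with a leading "!", lack) the given flag.

    `relays` is a hypothetical list of dicts with a "flags" list.
    Matching is case-insensitive, as in the proposed specification text.
    """
    negate = flag_param.startswith("!")
    wanted = (flag_param[1:] if negate else flag_param).lower()
    result = []
    for relay in relays:
        has_flag = wanted in (f.lower() for f in relay.get("flags", []))
        # Keep the relay when "has the flag" and "negation requested" differ.
        if has_flag != negate:
            result.append(relay)
    return result


relays = [
    {"nickname": "alpha", "flags": ["Running", "Guard"]},
    {"nickname": "beta", "flags": ["Running", "Exit"]},
]
non_guards = [r["nickname"] for r in filter_by_flag(relays, "!guard")]
guards = [r["nickname"] for r in filter_by_flag(relays, "GUARD")]
print(non_guards)  # ['beta']
print(guards)      # ['alpha']
```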

If this looks potentially useful, please also go through the other available parameters and say which of them would also be good to extend like the "flag" parameter. Please include possible use cases.

Note: This change requires bumping the protocol version to the next minor version.

comment:2 Changed 9 months ago by cypherpunks

Thank you for working on this so fast!

The above description does not clarify if I would write the search term for all relays NOT having the guard flag AND NOT having the exit flag as

flag:!guard,exit

or

flag:!guard,!exit

Regarding this feature for other parameters: I would find it useful for the following parameters:

  • as
  • country
  • contact (least useful because we can not search for contacts with whitespace #21366)
  • family
  • version

Examples:

  • Show me all exits that are in France but NOT in OVH AS.
  • Show me all relays located in the OVH AS that are NOT in France.
  • Are there any relays in AS number X that do not have contact Y?
  • Are there relays outside my family using my contact string? (!family)
  • Are there any relays with nickname X not running version Y? (!version)
  • Are there any relays in AS X not having nickname Y? (no parameter)
  • Are there any relays with nickname X outside of IP block a.b.c.? (no parameter)
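For illustration, the first example could translate into a request like the one built below. This is only a sketch: the "!" values are the syntax proposed in this ticket and are not something Onionoo is known to accept, and AS64496 is a placeholder number from the range reserved for documentation, not OVH's real AS number.

```python
from urllib.parse import urlencode

ONIONOO_DETAILS = "https://onionoo.torproject.org/details"


def build_query(params):
    """Build a details URL from (name, value) pairs, percent-encoding values."""
    return ONIONOO_DETAILS + "?" + urlencode(params)


# "Show me all exits that are in France but NOT in a given AS."
url = build_query([("flag", "exit"), ("country", "fr"), ("as", "!AS64496")])
print(url)
# https://onionoo.torproject.org/details?flag=exit&country=fr&as=%21AS64496
```

Note that "!" is percent-encoded as %21 by urlencode, so a server-side implementation would see the decoded "!" prefix.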
Last edited 9 months ago by cypherpunks

comment:3 Changed 9 months ago by karsten

So, regarding your question about relays having neither Guard nor Exit flag, there's currently no way to write such a query. The flag parameter currently does not accept a comma-separated list of flags. We could extend it towards doing so. That would be a new ticket, though. Mind opening one if you think that would be a useful feature?

Regarding the other parameters that you suggest that should support negation, that list sounds reasonable. What it does not mention is the "search" parameter itself, which means unqualified search terms for which you give use cases further down below.

Before I go write more code, can you answer the following usability questions (numbered for easier reference, not to indicate priority)?

  1. Is ! the best character we can find to indicate negation? Or should we instead pick -? Or something else?
  2. We'll have to extend the various parameters to support ! as part of the parameter value as in search=flag:!exit, and we'll have to allow unqualified search terms starting with ! as in search=!default. But should we also allow qualified search terms starting with ! as in search=!flag:exit which would be equivalent to search=flag:!exit? Note that if we do, search=!flag:!exit would be a valid parameter, as would search=!flag:exit,guard or search=!flag:!exit,guard if we extend the "flag" parameter as mentioned in my first paragraph. It would be up to the user to interpret what that might possibly mean. But maybe they're to blame if they write such a complex query rather than us for accepting it. ;)
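To make question 2 concrete, here is a small sketch of a parser that accepts "!" both before a qualified term and inside its value. The function `parse_term` is hypothetical and ignores details such as comma-separated values; it only shows how two negations would cancel out.

```python
def parse_term(term):
    """Split one search term into (key, value, negated).

    key is None for unqualified terms such as "default".
    """
    negated = False
    if term.startswith("!"):
        negated = not negated
        term = term[1:]
    key, sep, value = term.partition(":")
    if not sep:
        # No colon: an unqualified term like "default" or "!default".
        return None, key, negated
    if value.startswith("!"):
        # A second "!" in the value flips the negation again.
        negated = not negated
        value = value[1:]
    return key, value, negated


print(parse_term("flag:!exit"))   # ('flag', 'exit', True)
print(parse_term("!flag:exit"))   # ('flag', 'exit', True)
print(parse_term("!flag:!exit"))  # ('flag', 'exit', False)
print(parse_term("!default"))     # (None, 'default', True)
```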

As you may guess there might be more questions coming up as we discuss this extension.

Thanks for helping make Onionoo better!

comment:4 in reply to: 3; Changed 9 months ago by nusenu

Replying to karsten:

> Mind opening one if you think that would be a useful feature?

#23914

> Regarding the other parameters that you suggest that should support negation, that list sounds reasonable. What it does not mention is the "search" parameter itself, which means unqualified search terms for which you give use cases further down below.

Since I realized that nickname and IP are mutually exclusive (nickname can not contain dots, IPs can not contain chars) it makes sense to add them as well even without specific IP: nickname: parameters.

> Before I go write more code, can you answer the following usability questions (numbered for easier reference, not to indicate priority)?
>
>   1. Is ! the best character we can find to indicate negation? Or should we instead pick -? Or something else?

! is IMHO the most intuitive and most common character for this use-case. This would be my first try before reading any documentation.

>   2. We'll have to extend the various parameters to support ! as part of the parameter value as in search=flag:!exit, and we'll have to allow unqualified search terms starting with ! as in search=!default. But should we also allow qualified search terms starting with ! as in search=!flag:exit which would be equivalent to search=flag:!exit? Note that if we do, search=!flag:!exit would be a valid parameter, as would search=!flag:exit,guard or search=!flag:!exit,guard if we extend the "flag" parameter as mentioned in my first paragraph. It would be up to the user to interpret what that might possibly mean. But maybe they're to blame if they write such a complex query rather than us for accepting it. ;)

All use-cases can be formed with the ! sign being used in the value part only, right?

!flag:!guard,exit == flag:guard,!exit ?

comment:5 in reply to: 4; Changed 9 months ago by karsten

Replying to nusenu:

> Replying to karsten:
>
> > Mind opening one if you think that would be a useful feature?
>
> #23914

Thanks!

> > Regarding the other parameters that you suggest that should support negation, that list sounds reasonable. What it does not mention is the "search" parameter itself, which means unqualified search terms for which you give use cases further down below.
>
> Since I realized that nickname and IP are mutually exclusive (nickname can not contain dots, IPs can not contain chars) it makes sense to add them as well even without specific IP: nickname: parameters.

Ah! Well, you're right that nicknames and IPs are mutually exclusive. But other unqualified search terms are not. For example, aaaa could be the beginning of a hex-encoded fingerprint, any 4 hex character block in the middle of a space-separated fingerprint, the beginning of a base64-encoded fingerprint, the beginning of an IPv6 address, the beginning of a nickname, and maybe even something else I didn't think of right now. It would probably be very confusing to look at the results of search=!aaaa and guess which relays were excluded and why. Hmm.
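The ambiguity can be illustrated with a rough sketch that lists which kinds of value an unqualified term could plausibly be. The patterns below are deliberately simplified assumptions (for example, 40 hex characters for a fingerprint, 27 base64 characters, 1 to 19 alphanumeric characters for a nickname) and are not Onionoo's actual matching rules.

```python
import re


def possible_interpretations(term):
    """Return the kinds of relay attribute an unqualified term might match.

    Simplified heuristics for illustration only.
    """
    kinds = []
    if re.fullmatch(r"[0-9A-Fa-f]{1,40}", term):
        kinds.append("hex fingerprint (prefix or 4-character block)")
    if re.fullmatch(r"[A-Za-z0-9+/]{1,27}", term):
        kinds.append("base64 fingerprint prefix")
    if re.fullmatch(r"[0-9A-Fa-f:]+", term):
        kinds.append("IPv6 address prefix")
    if re.fullmatch(r"[0-9.]+", term):
        kinds.append("IPv4 address prefix")
    if re.fullmatch(r"[A-Za-z0-9]{1,19}", term):
        kinds.append("nickname")
    return kinds


for kind in possible_interpretations("aaaa"):
    print(kind)
```

With these assumed patterns, "aaaa" matches four different interpretations, which is why negating it would be hard to explain to a user, while a term like "192.0.2." only matches the IPv4 pattern.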

> > Before I go write more code, can you answer the following usability questions (numbered for easier reference, not to indicate priority)?
> >
> >   1. Is ! the best character we can find to indicate negation? Or should we instead pick -? Or something else?
>
> ! is IMHO the most intuitive and most common character for this use-case. This would be my first try before reading any documentation.

Sounds good. I'll leave this question open here for a few more days to hear if somebody strongly disagrees. Otherwise it's going to be !.

> >   2. We'll have to extend the various parameters to support ! as part of the parameter value as in search=flag:!exit, and we'll have to allow unqualified search terms starting with ! as in search=!default. But should we also allow qualified search terms starting with ! as in search=!flag:exit which would be equivalent to search=flag:!exit? Note that if we do, search=!flag:!exit would be a valid parameter, as would search=!flag:exit,guard or search=!flag:!exit,guard if we extend the "flag" parameter as mentioned in my first paragraph. It would be up to the user to interpret what that might possibly mean. But maybe they're to blame if they write such a complex query rather than us for accepting it. ;)
>
> All use-cases can be formed with the ! sign being used in the value part only, right?

Yes.

> !flag:!guard,exit == flag:guard,!exit ?

Unfortunately, no:

NOT (NOT guard AND exit) == guard OR NOT exit != guard AND NOT exit
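The De Morgan step above can be checked mechanically with a truth table. This is just a sanity check of the boolean algebra, assuming a comma means AND; the function names are made up for the example.

```python
from itertools import product


def negated_outer(guard, exit_):
    """!flag:!guard,exit read as NOT (NOT guard AND exit)."""
    return not ((not guard) and exit_)


def de_morgan_form(guard, exit_):
    """The equivalent form: guard OR NOT exit."""
    return guard or (not exit_)


def value_only(guard, exit_):
    """flag:guard,!exit read as guard AND NOT exit."""
    return guard and (not exit_)


rows = list(product([False, True], repeat=2))
equivalent = all(negated_outer(g, e) == de_morgan_form(g, e) for g, e in rows)
differing = [(g, e) for g, e in rows if negated_outer(g, e) != value_only(g, e)]

print(equivalent)  # True
print(differing)   # [(False, False), (True, True)]
```

The two queries disagree for relays that have neither flag and for relays that have both.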

I'm inclined to take back the suggestion to negate search terms. If we limit negation to parameters like "flag", "as", and so on, that might be more intuitive. And we could always extend it to the "search" parameter later on in a backward-compatible way if we change our minds. But I think we should start without the "search" parameter, except for values of qualified search terms, of course.

Alright, I think I can work with this and write some more code. That won't happen today, but I'll try to do this next week.

comment:6 Changed 9 months ago by karsten

Owner: changed from metrics-team to karsten
Status: needs_review → accepted

comment:7 in reply to: 5; Changed 9 months ago by cypherpunks

> Ah! Well, you're right that nicknames and IPs are mutually exclusive. But other unqualified search terms are not. For example, aaaa could be the beginning of a hex-encoded fingerprint, any 4 hex character block in the middle of a space-separated fingerprint, the beginning of a base64-encoded fingerprint, the beginning of an IPv6 address, the beginning of a nickname, and maybe even something else I didn't think of right now. It would probably be very confusing to look at the results of search=!aaaa and guess which relays were excluded and why. Hmm.

The unqualified search term can be any of:

  • nickname
  • fingerprint
  • base64 fingerprint
  • IP address
  • hashed fingerprint

It is probably not valuable to implement negation for unqualified search terms, but I would still see a use case in being able to say "not in IP block a.b.(c.)", which implies that we might want a new ip URL parameter?

> > All use-cases can be formed with the ! sign being used in the value part only, right?
>
> Yes.

OK, then it is less confusing to accept ! in the value part only.

> I'm inclined to take back the suggestion to negate search terms. If we limit negation to parameters like "flag", "as", and so on, that might be more intuitive. And we could always extend it to the "search" parameter later on in a backward-compatible way if we change our minds. But I think we should start without the "search" parameter, except for values of qualified search terms, of course.

Agreed.

> Alright, I think I can work with this and write some more code. That won't happen today, but I'll try to do this next week.

thank you!

Last edited 9 months ago by cypherpunks

comment:8 in reply to: 7; Changed 9 months ago by karsten

Replying to cypherpunks:

> It is probably not valuable to implement negation for unqualified search terms, but I would still see a use case in being able to say "not in IP block a.b.(c.)", which implies that we might want a new ip URL parameter?

Yes, or maybe "address" to be more consistent with the field names "or_addresses", "exit_addresses", and "dir_address".
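As a rough sketch of what a negated "address" filter might do, the snippet below drops relays that have any address inside a given block. The parameter itself does not exist yet; the function name, the "addresses" key (bare IP strings without ports), and the use of CIDR notation in place of a prefix like a.b.c. are all assumptions for illustration.

```python
import ipaddress


def not_in_block(relays, block):
    """Keep relays that have no address inside the given CIDR block."""
    network = ipaddress.ip_network(block)
    kept = []
    for relay in relays:
        addresses = [ipaddress.ip_address(a) for a in relay["addresses"]]
        # An IPv6 address is never "in" an IPv4 network, and vice versa.
        if not any(a in network for a in addresses):
            kept.append(relay)
    return kept


relays = [
    {"nickname": "alpha", "addresses": ["192.0.2.10"]},
    {"nickname": "beta", "addresses": ["198.51.100.7", "2001:db8::1"]},
]
outside = [r["nickname"] for r in not_in_block(relays, "192.0.2.0/24")]
print(outside)  # ['beta']
```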

We should also consider adding a "nickname" parameter.

And we should extend the "fingerprint" parameter to provide the same functionality as passing a fingerprint or fingerprint part to the "search" parameter. At the same time, or shortly after, we can deprecate the "lookup" parameter. That latter part requires a major protocol version bump, the other parts do not.

Do you mind opening tickets for these if you agree that these would be useful to have? Happy to discuss these new/changed parameters more, but preferably on new tickets.

Agreeing with the rest, this ticket is ready for implementation.

comment:9 Changed 6 months ago by karsten

Owner: changed from karsten to metrics-team
Status: accepted → assigned

This still seems like a worthwhile enhancement, but I'm not going to find the time to do it anytime soon. Re-assigning to metrics-team for now.
