Opened 3 years ago

Last modified 3 years ago

#22512 new enhancement

Add enums for keywords used in exit lists, Torperf measurement results, bridge pool assignments, and soon sanitized web logs

Reported by: karsten
Owned by: metrics-team
Priority: Medium
Milestone:
Component: Metrics/Library
Version:
Severity: Normal
Keywords: metrics-2018
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description (last modified by iwakeh)

Todo derived from the discussion in comments 1 to 4:

Add keywords for all descriptors to the Key enum.
Use a one-letter prefix for keywords from non-Tor data sources.
First step: determine the best letter for each non-Tor source.

Original summary: We recently introduced the Key enum with keywords contained in relay descriptors, sanitized bridge descriptors, and sanitized bridge pool assignments. We did not include keywords in exit lists and Torperf measurement results, and in retrospect we should have excluded sanitized bridge pool assignments there.

The reason why each data source should have its own enum is that naming conventions might vary in terms of upper/lower case and word separators. For example, Tor descriptors use lower-case-keywords, exit lists contain CamelCase, and Torperf/OnionPerf use ALL_UPPER_CASE_WITH_UNDERSCORES. There could be conflicts for keywords like source vs. SOURCE.
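The conflict can be illustrated with a minimal sketch. It assumes, purely for illustration, that enum constant names are derived mechanically from keywords; toConstantName is a hypothetical helper and not part of the library. Under that assumption the keywords source and SOURCE from the example above end up with the same constant name:

```java
import java.util.Locale;

public class KeywordNameClash {

  /** Hypothetical helper: derives an upper-case constant name from a keyword. */
  static String toConstantName(String keyword) {
    String withUnderscores = keyword
        // Split CamelCase words: a lower-case letter or digit followed by an upper-case letter.
        .replaceAll("([a-z0-9])([A-Z])", "$1_$2")
        // Tor descriptors separate words with hyphens.
        .replace('-', '_');
    return withUnderscores.toUpperCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    // Three naming styles, one per kind of data source.
    String[] keywords = {"network-status-version", "ExitNode", "source", "SOURCE"};
    for (String keyword : keywords) {
      System.out.println(keyword + " -> " + toConstantName(keyword));
    }
  }
}
```

The last two keywords both map to SOURCE, which a single enum cannot declare twice.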

Change History (8)

comment:1 Changed 3 years ago by iwakeh

Right, all descriptor parsing should make use of the Key keywords.

Some things to consider:
There is no inheritance between enums. So, the ease of use in all descriptor parsing code would be lost with different Key enums. When using only the general Enum<T>, the .keyword field and other features of Key are lost.
This argues for adding more keywords to the one Key enum instead of creating separate enums.
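A minimal sketch of this point, using a simplified stand-in for Key (the real enum has more constants and features; startsWithKeyword and describe are hypothetical helpers): code typed against Key can read the keyword, while code typed against the general Enum<T> only sees what java.lang.Enum offers.

```java
public class SingleKeyEnumSketch {

  /** Simplified stand-in for the Key enum; only two constants for illustration. */
  enum Key {
    NETWORK_STATUS_VERSION("network-status-version"),
    PUBLISHED("published");

    final String keyword;

    Key(String keyword) {
      this.keyword = keyword;
    }
  }

  /** Parsing code typed against Key can use the keyword field directly. */
  static boolean startsWithKeyword(String line, Key key) {
    return line.startsWith(key.keyword + " ");
  }

  /** Code typed against the general Enum<T> only has name(), ordinal(), and so on. */
  static <T extends Enum<T>> String describe(T constant) {
    // constant.keyword would not compile here: Enum<T> declares no such field.
    return constant.name();
  }

  public static void main(String[] args) {
    System.out.println(startsWithKeyword("published 2017-06-06 12:00:00", Key.PUBLISHED));
    System.out.println(describe(Key.NETWORK_STATUS_VERSION));
  }
}
```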

There are no explicit naming conventions yet, only implicit ones. The possible naming problems presuppose certain rules that were never actually introduced.
A working set of naming rules could be started with the aim of using only one Key enum. If Key names turn out to be similar, as in the 'source' example above, a prefix could be added. It should be ok not to have a strict translation rule from the actual keyword to the enum name; it only needs to stay heuristically close.

comment:2 Changed 3 years ago by karsten

Ah, good point. How about we add a single-letter source prefix as in T_NETWORK_STATUS_VERSION("network-status-version") where T_ would stand for "Tor descriptor"?

comment:3 Changed 3 years ago by iwakeh

Single letter is fine.
As most Tor keywords are defined already and I assume there are more Tor-related ones than non-Tor, I'd prefix non-Tor descriptors. This would also accommodate keywords from different non-Tor data sources that overlap, because each non-Tor source would get its own prefix.
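A sketch of what this could look like in a single enum. The prefix letters are not decided yet, so P_ for Torperf/OnionPerf and E_ for exit lists are placeholders only, and the constant set is illustrative rather than the actual contents of Key:

```java
public class PrefixedKeySketch {

  /** Illustrative single enum: Tor keywords unprefixed, non-Tor keywords prefixed per source. */
  enum Key {
    // Tor descriptor keywords keep their existing, unprefixed names.
    NETWORK_STATUS_VERSION("network-status-version"),
    SOURCE("source"), // the 'source' example from the description

    // Placeholder prefix P_ for Torperf/OnionPerf keywords (letter not decided).
    P_SOURCE("SOURCE"),

    // Placeholder prefix E_ for exit list keywords (letter not decided).
    E_EXIT_NODE("ExitNode");

    final String keyword;

    Key(String keyword) {
      this.keyword = keyword;
    }
  }

  public static void main(String[] args) {
    for (Key key : Key.values()) {
      System.out.println(key.name() + " -> " + key.keyword);
    }
  }
}
```

With the prefix, SOURCE and P_SOURCE are distinct constants, and each still carries the keyword exactly as it appears in its data source.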

comment:4 Changed 3 years ago by karsten

Works for me. In any case, prefixing Tor keywords can be done really easily, and it only affects the implementation, not the interface. If we ever decide to also prefix Tor keywords, we can just do it.

comment:5 Changed 3 years ago by iwakeh

Description: modified (diff)

True! I added the new todos to the summary for easier reference.

comment:6 Changed 3 years ago by karsten

Keywords: metrics-2018 added

comment:7 Changed 3 years ago by karsten

Keywords: metrics-2017 added; metrics-2018 removed

comment:8 Changed 3 years ago by iwakeh

Keywords: metrics-2018 added; metrics-2017 removed

Will be completed in 2018.
