I'd like us to flesh out how we'd like to improve the interface hierarchy, and I'll start by going through the ideas above.
Comment 10 of #19398 (moved) says: "There shouldn't be two interfaces declaring the very same method; there should be a more structured hierarchy for the interfaces, too." I think I agree in theory, though I'd want to discuss how strictly we want to implement this principle. Here's a contrasting principle: let's avoid introducing interfaces that have the sole purpose of deduplicating methods. I think we need to find a middle ground here.
The linked comment in NetworkStatusImpl seems related to possible improvements to the interface hierarchy, though I'd prefer if we address that issue after improving the interface hierarchy.
Comment 4 above, splitting ParseHelper into interface and implementation seems unrelated, unless we're planning to make the interface part of the metrics-lib API. Right now it's solely an implementation helper to avoid duplicating code.
#17861 (moved) is indeed something where we could improve the interface hierarchy, by having a separate interface for each descriptor type. But I'm not sure how to best implement such a new interface. See the heavy overlap between RelayNetworkStatusVote and RelayNetworkStatusConsensus. A new RelayNetworkStatusMicrodescConsensus interface following the current code practice would basically copy over the entire consensus interface and change a method here and there. We should avoid that.
In addition to those items, I came up with a few more ideas:
When we added RelayServerDescriptor, RelayExtraInfoDescriptor, BridgeServerDescriptor, and BridgeExtraInfoDescriptor, we left all methods in the superinterfaces ServerDescriptor and ExtraInfoDescriptor. We should consider moving methods that are specific to relays or bridges to the subinterfaces.
We're still using a single NetworkStatusEntry for entries in all the different network statuses. We should consider using an interface hierarchy there, too, to only provide relevant methods depending on the network status at hand. I started working on a patch, but I'd like to keep this discussion on the conceptual level for now.
And here are some more concrete suggestions for improving the interface hierarchy, somewhat overlapping with the ideas above:
Introduce a NetworkStatus interface to capture all common parts of network statuses in directory protocol versions 2 and 3, including all RelayNetworkStatus* and BridgeNetworkStatus.
Introduce another interface called NetworkStatusVote for common parts of network statuses in directory protocol version 3, following the common part in URLs like http://<hostname>/tor/status-vote/next/authority.z. Though there's the risk that this interface will be confused with RelayNetworkStatusVote. Another possible interface name could be RelayNetworkStatusVersion3, though I'm not a big fan of including numbers in type names.
Introduce a new interface NetworkStatus.Entry just like ExitList.Entry and a set of subinterfaces like RelayNetworkStatusConsensus.Entry to address issue 6 above. Requires some heavy generics lifting.
Use RelayNetworkStatusConsensus for the common parts in consensuses of any flavor and introduce RelayNetworkStatusNsConsensus for unflavored/ns-flavored and RelayNetworkStatusMicrodescConsensus for microdesc-flavored consensuses. That's basically what we did with adding new subinterfaces to ServerDescriptor.
What else can/should we do? Or which parts should we rather not do?
Trac: Owner: karsten to metrics-team Status: new to assigned
Also from #21932 (moved) comments 14/15, listed for discussion:
avoid constant duplication, think about adding or not adding certain constants to the api (refers to DescriptorImpl.NL and ExitList.EOL)
add overloaded methods for newScanner, b/c the usual delimiter is "\n"
DescriptorImpl.setDigestXXX allow empty or null argument. This should have a check even though currently the calling methods make sure the argument is not null or empty, but when working on other tasks in the future that might not be apparent anymore and would get lost w/o a check and accompanying test.
Think about improving TorperfResultImpl and ExitListImpl (also in regard to delimiter use, see question 4, comment 14, #21932 (moved)).
Metrics-lib parses Tor protocol related descriptors and descriptors from other sources. Shouldn't this distinction be reflected in the interface hierarchy?
Have Descriptor on top and introduce an ExternalDescriptor interface for non-Tor protocol related data sources? Maybe, even have TorDescriptor too?
Descriptor / \ TorDescriptor ExternalDescriptor
This would provide more freedom when for example a switch to an external parsing framework (like ANTLR) is made for the Tor related descriptors. In addition, non-Tor descriptors could be modeled more easily to the soon to come Metrics-data standards and even have their own parsing/generator.
Hmm. The use cases you describe make sense for adding such marker interfaces sound plausible. But it also seems easy to introduce these interfaces and potentially difficult to deprecate and remove them if we don't want them anymore. Should we wait with adding them until we need them?
Thanks for the input!
I'm trying to consider as many topics as possible while working on this with the aim to only introduce minimal changes that provide the most benefits.
We decided today not to let this ticket (and #17861 (moved)) delay the 1.9.0 release, and 2.0.0 won't be a good target milestone either. Taking this out of planned milestones for now.
Let's not close this yet. It's true that #24028 (moved) is what we should focus on now, but that doesn't mean that all the ideas have become invalid/irrelevant. Re-opening.
Trac: Status: closed to reopened Resolution: invalid toN/A
Let's not close this yet. It's true that #24028 (moved) is what we should focus on now, but that doesn't mean that all the ideas have become invalid/irrelevant. Re-opening.
Oh, the closing action just meant to state that the discussion moved on. This ticket is explicitely referenced in #24028 (moved) as a resource. But I really want to avoid to open two discussions of one topic. Moving to the new ticket seems better because there is a better structure how to go about things than the summary of (surely valid) ideas here.
I'd rather close this again for this reason, which doesn't mean the ideas here were obsolete.