Write a specification for BridgeDB's metrics

added actualpoints::0.7 component::circumvention/bridgedb owner::phw points::0.5 priority::medium resolution::implemented reviewer::cohosh severity::normal sponsor::30-can status::closed type::task labels

I wrote a patch that adds BridgeDB's metrics format to the torspec repository.

Note that the patch differs from our current implementation in its version number. Our current implementation uses 1.0 while the spec uses 1. I would like to update the implementation because a single number seems simpler.
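To illustrate the version change described above, here is a minimal sketch of rewriting a version line from 1.0 to 1 in the text of a metrics file. It assumes the line is keyed `bridgedb-metrics-version`; that key name and the helper `normalize_version` are hypothetical, since this ticket only gives the two version values.

```python
import re

# Assumption: the version line looks like "bridgedb-metrics-version 1.0".
# The key name is hypothetical; the ticket only states the values 1.0 and 1.
VERSION_LINE = re.compile(r"^(bridgedb-metrics-version) 1\.0$", re.MULTILINE)


def normalize_version(text: str) -> str:
    """Rewrite a 'version 1.0' line to 'version 1'; leave all other lines untouched."""
    return VERSION_LINE.sub(r"\1 1", text)
```

Because the pattern is anchored to the whole line, running it twice on the same text changes nothing the second time, which matters if some files were already converted.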

Trac:
Reviewer: N/A to cohosh
Status: assigned to needs_review
Cc: phw to phw, karsten

I left a few comments on the commit.

Also note that the snowflake metrics spec exists in the snowflake repository and not in torspec (as suggested in https://trac.torproject.org/projects/tor/ticket/28942#comment:65). I know there's already a spec for bridgedb in torspec, so appending this to that makes sense to me. Just noting the difference here.

Trac:
Status: needs_review to needs_revision

Replying to cohosh:

I left a few comments on the commit.

Thanks! I addressed your feedback here.

Also note that the snowflake metrics spec exists in the snowflake repository and not in torspec (as suggested in https://trac.torproject.org/projects/tor/ticket/28942#comment:65). I know there's already a spec for bridgedb in torspec, so appending this to that makes sense to me. Just noting the difference here.

Good point. BridgeDB's specification is outdated to the point where it may do more harm than good, and anybody interested in the metrics spec is more likely to search for it in BridgeDB's repo. I think we should move it. Here's a patch that adds the spec to BridgeDB's repo and updates the code to be consistent with the spec. (The spec includes the revisions, so you don't have to re-review it.)

Trac:
Status: needs_revision to needs_review

These revisions look good to me!

Trac:
Status: needs_review to merge_ready

Replying to cohosh:

These revisions look good to me!

Thanks. I'll wait a bit before merging because karsten mentioned he'll take a look too.

Sorry for the delay. I'll take a look today!

Just one question: can we somehow make these changes to existing bridgedb-metrics.log* files, either on polyanthum or on colchicifolium? It would be sad to lose a month of files just because of cosmetic changes to the spec. If so, do you prefer editing files yourself, or shall I do that?

Replying to karsten:

Just one question: can we somehow make these changes to existing bridgedb-metrics.log* files, either on polyanthum or on colchicifolium? It would be sad to lose a month of files just because of cosmetic changes to the spec. If so, do you prefer editing files yourself, or shall I do that?

I will modify all the files that are currently on polyanthum but you'll have to modify the ones that were already synced to colchicifolium.

Other than that, does the spec look good to you?

Replying to phw:

Replying to karsten:

Just one question: can we somehow make these changes to existing bridgedb-metrics.log* files, either on polyanthum or on colchicifolium? It would be sad to lose a month of files just because of cosmetic changes to the spec. If so, do you prefer editing files yourself, or shall I do that?

I will modify all the files that are currently on polyanthum but you'll have to modify the ones that were already synced to colchicifolium.

Ok, this is done. The updated files should soon be synced to colchicifolium.

Here's an implementation quirk that I just realised: when I restart BridgeDB (e.g., to update to the latest version), it does not write its unfinished metrics to disk, which means that we lose up to 24 hours' worth of metrics after each restart. I filed #31936 (moved) for this issue.

Replying to phw:

Replying to karsten:

Just one question: can we somehow make these changes to existing bridgedb-metrics.log* files, either on polyanthum or on colchicifolium? It would be sad to lose a month of files just because of cosmetic changes to the spec. If so, do you prefer editing files yourself, or shall I do that?

I will modify all the files that are currently on polyanthum but you'll have to modify the ones that were already synced to colchicifolium.

Looks like all files on colchicifolium are already updated. Nothing left to do for me, unless I'm overlooking something.

Other than that, does the spec look good to you?

Sure, it looks fine to me. My main concern was that we'd have to update existing files and code, but it's much easier to do this now than later, so I'm okay with that.

Can I update the code to this new spec now, or do you expect to make any further changes to version 1?

Replying to phw:

Here's an implementation quirk that I just realised: when I restart BridgeDB (e.g., to update to the latest version), it does not write its unfinished metrics to disk, which means that we lose up to 24 hours' worth of metrics after each restart. I filed #31936 (moved) for this issue.

Relays have the same issue. The downside of writing unfinished metrics to disk is that having non-sanitized metrics on disk can be a security problem. This is why relays keep non-sanitized statistics in memory, then sanitize them, and only then write them to disk. Of course it's unfortunate to lose up to 24 hours' worth of metrics because of a restart, but for relays this hasn't caused major trouble in the past. Maybe this is different with BridgeDB, though.

Anyway, this is unrelated to making BridgeDB stats/metrics available, right? If so, I'd continue with adding these stats/metrics to metrics-lib and CollecTor.

Replying to karsten:

Can I update the code to this new spec now, or do you expect to make any further changes to version 1?

There will be no further changes to version 1, so please update the code.

Replying to karsten:

Replying to phw:

Here's an implementation quirk that I just realised: when I restart BridgeDB (e.g., to update to the latest version), it does not write its unfinished metrics to disk, which means that we lose up to 24 hours' worth of metrics after each restart. I filed #31936 (moved) for this issue.

Relays have the same issue. The downside of writing unfinished metrics to disk is that having non-sanitized metrics on disk can be a security problem. This is why relays keep non-sanitized statistics in memory, then sanitize them, and only then write them to disk. Of course it's unfortunate to lose up to 24 hours' worth of metrics because of a restart, but for relays this hasn't caused major trouble in the past. Maybe this is different with BridgeDB, though.

Thanks for the context, that's helpful to know.

Anyway, this is unrelated to making BridgeDB stats/metrics available, right? If so, I'd continue with adding these stats/metrics to metrics-lib and CollecTor.

Yes, it's unrelated.

I merged the patch in 0751ad7.

Trac:
Status: merge_ready to closed
Resolution: N/A to implemented

Trac:
Actualpoints: N/A to 0.7

closed

changed time estimate to 4h

added 5h 36m of time spent

mentioned in issue #31936 (moved)

moved to tpo/anti-censorship/bridgedb#31780 (closed)
