While talking about making a PT infographic and interesting stats to point out, asn pointed out that there are no publicly available BridgeDB stats.
Such stats would be useful on several fronts, for example to see the balance between usage and available resources.
This ticket is about adding a CollecTor module that will archive the stats exposed in #9316 (moved). The actual stats and format should be discussed in #9316 (moved) and may benefit from the discussion in #29315 (moved).
This looks like a duplicate of #9316 (moved). Should we close it?
Or is this ticket supposed to cover the publish part of the statistics exported in #9316 (moved)? If so, we should move it to Metrics/CollecTor and set it to needs_information until there's something to publish.
Trac:
Summary: Publish BridgeDB stats to Add a BridgeDB module
Keywords: metrics deleted, metrics-roadmap-2019-q2 added
Type: defect to enhancement
Parent: #9316 (moved) to N/A
Cc: asn, metrics-team, gaba, cohosh to asn, metrics-team, gaba, cohosh, ahf, dgoulet, phw
Description: appended the paragraph "This ticket is about adding a CollecTor module that will archive the stats exposed in #9316 (moved). The actual stats and format should be discussed in #9316 (moved) and may benefit from the discussion in #29315 (moved)."
Points: 1 to 8
Status: needs_information to new
#9316 (moved) was resolved recently, which unblocks this ticket, AIUI.
Yes. How would you like me to expose the metrics file on BridgeDB's host? Should it be available over HTTPS? Or do you want me to rsync it to another host?
Is there anything sensitive in the file that would have to be sanitized on the CollecTor host? If so, we should rsync it over ssh to colchicifolium. But if not, the preferred way would be to expose it on the BridgeDB host, so that others can fetch it, too.
Here's another question, similar to the one about Snowflake stats: Would it be possible to expose more than just the latest BridgeDB statistics? Something like 7 or 14 days, or if it's not much data, everything until it gets too big?
There's nothing sensitive in the file. We're doing the sanitisation ourselves and have published the data before. I'll look into exposing the files via our Apache server.
As for exposing more than just the latest BridgeDB statistics: yes, that's definitely feasible. One week's worth of data shouldn't be more than ~100 KB. I'll look into logrotate, so we can expose multiple weeks' worth of data at any given time.
Change of plan: Can we instead rsync BridgeDB's metrics to colchicifolium? Weasel isn't a fan of the idea of exposing BridgeDB's metrics on polyanthum. If CollecTor is archiving the metrics anyway, we might as well just sync them to colchicifolium.
If you are ok with this, I just need a directory to sync the metrics to.
Sure, that works, too. How about /srv/collector.torproject.org/collector/in/bridgedb-stats/? I'll have to edit a script on the receiving side, but feel free to set up the rsync on the sending side whenever you're ready.
Thanks. I updated the script on polyanthum's side. It will sync all available bridgedb-metrics.log files, including the rotated ones. The format of rotated files is the same as for assignments.log: bridgedb-metrics.log-YYYYMMDD.gz, e.g., bridgedb-metrics.log-20190905.gz. The file bridgedb-metrics.log is written once per day and also rotated once a day. For now, I configured logrotate to retain 30 rotated files, mostly as a precaution, so we don't lose data in case we run into trouble.
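To make the naming scheme above concrete, here is a minimal Python sketch of how a consumer of the synced directory might recognise rotated files and order them by date. It relies only on the bridgedb-metrics.log-YYYYMMDD.gz pattern quoted above; the function names rotation_date and sort_rotated are hypothetical and are not taken from BridgeDB or CollecTor.

```python
import re
from datetime import date, datetime

# Naming scheme quoted in this comment: bridgedb-metrics.log-YYYYMMDD.gz
ROTATED_RE = re.compile(r"^bridgedb-metrics\.log-(\d{8})\.gz$")


def rotation_date(filename):
    """Return the date encoded in a rotated file name, or None if it does not match."""
    match = ROTATED_RE.match(filename)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a valid calendar date.
        return None


def sort_rotated(filenames):
    """Return the rotated file names oldest first; other names are ignored."""
    dated = [(rotation_date(name), name) for name in filenames]
    return [name for day, name in sorted(pair for pair in dated if pair[0] is not None)]
```

For example, rotation_date("bridgedb-metrics.log-20190905.gz") yields the date 2019-09-05, while the unrotated bridgedb-metrics.log is skipped.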
All the changes I made are in the task/19332 branch of my bridgedb-admin repository. Here's a patch. Cecylia, can you please review these changes when you get a chance?
Trac: Reviewer: N/A to cohosh Status: needs_information to needs_review
Since all logs (including previously rotated ones) are synced each time with rsync, is there a way to detect if old logs have been corrupted and are overwriting the previously synced logs? Not sure how we want to handle a case where logs that have previously been synced have changed for some reason, or what the easiest way to deal with this is.
Okay, please let me know when I need to do something on colchicifolium's side.
Thanks for thinking about such problems beforehand. In this case I think it's fine to just rsync what's on the BridgeDB host to colchicifolium. We can still decide on colchicifolium to not overwrite previously imported statistics, which I think is what we do with all other files. I'd say let's give it a try, and we can change this later if this turns out to be an issue.
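The "do not overwrite previously imported statistics" idea could look roughly like the following Python sketch. This is not CollecTor's actual import code. The function name import_new_files, the directory arguments, and the use of SHA-256 to flag changed files are all assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def import_new_files(sync_dir, archive_dir):
    """Copy rotated logs that are not archived yet; never overwrite archived ones.

    Returns (imported, changed): names copied in this run, and names whose
    synced copy differs from the archived copy (reported, left untouched).
    """
    sync_dir, archive_dir = Path(sync_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    imported, changed = [], []
    for src in sorted(sync_dir.glob("bridgedb-metrics.log-*.gz")):
        dst = archive_dir / src.name
        if not dst.exists():
            shutil.copy2(src, dst)
            imported.append(src.name)
        elif sha256_of(src) != sha256_of(dst):
            # A previously imported file changed on the sender's side.
            changed.append(src.name)
    return imported, changed
```

With this approach a corrupted or modified log on the BridgeDB host would show up in the second list instead of silently replacing the archived copy.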
I left the patch as it is according to Karsten's suggestion. I also reassigned the ticket to Karsten because BridgeDB's side is looking good now.
Trac: Owner: phw to karsten Status: merge_ready to assigned
1568738247:metrics:<36>Sep 17 16:37:27 collector-ssh-wrap[29644]: The SSH_ORIGINAL_COMMAND ('rsync --server -logDtpre.iLsfxC . /srv/collector.torproject.org/collector/in/bridgedb-stats/') is not on the whitelist
1568738247:metrics:rsync: connection unexpectedly closed (0 bytes received so far) [sender]
1568738247:metrics:rsync error: error in rsync protocol data stream (code 12) at io.c(235) [sender=3.1.2]
It looks like the same issue as the one we recently solved in #31515 (moved).
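For readers unfamiliar with this kind of failure, the sketch below shows in Python how an SSH forced-command wrapper might compare SSH_ORIGINAL_COMMAND (the environment variable OpenSSH sets when a forced command is configured) against a list of allowed commands. The real collector-ssh-wrap script is not shown in this ticket, so the exact-match rule, the ALLOWED_COMMANDS set, and the function names are assumptions.

```python
import os

# Hypothetical whitelist; the real collector-ssh-wrap configuration is not
# shown in this ticket. The entry is the command from the log lines above.
ALLOWED_COMMANDS = {
    "rsync --server -logDtpre.iLsfxC . "
    "/srv/collector.torproject.org/collector/in/bridgedb-stats/",
}


def is_allowed(command, whitelist=ALLOWED_COMMANDS):
    """Exact-match check: the full command string must be listed."""
    return command in whitelist


def check_environment(environ=os.environ):
    """Return (True, command) if the requested command may run, else (False, message)."""
    command = environ.get("SSH_ORIGINAL_COMMAND", "")
    if is_allowed(command):
        return True, command
    return False, "The SSH_ORIGINAL_COMMAND (%r) is not on the whitelist" % command
```

Under an exact-match rule like this, any change in the rsync flags or the target path on the sending side requires a matching update on the receiving side.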
Setting to merge_ready as per irl's statement during yesterday's meeting: "15:41:20 i think it's ok to merge it and i can review the test cases retroactively"
This has been merged and deployed, but I noticed a minor issue with paths and file names of written files. Please also review commit 14cff65 in my task-19332-2 branch with a fix. I'd like to release and deploy that together with #31204 (moved).