Opened 8 months ago

Closed 4 weeks ago

#29461 closed enhancement (fixed)

Add a Snowflake module

Reported by: irl
Owned by: metrics-team
Priority: Medium
Milestone:
Component: Metrics/CollecTor
Version:
Severity: Normal
Keywords: metrics-roadmap-august, anti-censorship-roadmap-september
Cc: ahf, cohosh
Actual Points:
Parent ID:
Points: 8
Reviewer: irl
Sponsor: Sponsor28

Description

Add a module to CollecTor to archive the statistics produced in #21315. The actual statistics and format should be discussed in that ticket. The discussion in #29315 may help to inform that.

Child Tickets

Ticket   Type         Status  Owner   Summary
#21315   enhancement  closed  cohosh  publish some realtime stats from the broker?

Attachments (1)

example_metrics.log (1.0 KB) - added by cohosh 2 months ago.
metrics log output from the last 2 measurement periods


Change History (43)

comment:1 Changed 4 months ago by gaba

Sponsor: Sponsor28

comment:2 Changed 3 months ago by gaba

Keywords: metrics-roadmap-august added; metrics-roadmap-2019-q2 removed

comment:3 Changed 3 months ago by gaba

Keywords: anti-censorship-roadmap-september added

comment:4 Changed 2 months ago by cohosh

I started implementing a handler to return metrics from snowflake-broker@bamsoftware.com/metrics in #31376. I have more questions about what exactly you need here. Is it only the metrics from the last full 24-hour measurement period?

Changed 2 months ago by cohosh

Attachment: example_metrics.log added

metrics log output from the last 2 measurement periods

comment:5 Changed 2 months ago by karsten

Where would I find the spec? I found this comment, which is fine for me to start writing some metrics-lib code. But we'll have to link to something more permanent once we start collecting these statistics; ideally tor-spec.git. Thanks!

comment:6 in reply to:  5 Changed 2 months ago by cohosh

Replying to karsten:

Where would I find the spec? I found this comment, which is fine for me to start writing some metrics-lib code. But we'll have to link to something more permanent once we start collecting these statistics; ideally tor-spec.git. Thanks!

I'll work on adding it to tor-spec.git. Right now the source code is the ground truth for the spec.

comment:7 Changed 2 months ago by karsten

One quick comment on the spec after writing the metrics-lib part: The "snowflake-stats-end" line should have multiplicity "[At start, exactly once.]" rather than "[At most once.]". Other than that, everything seemed reasonable from a parsing perspective. (I haven't started on the CollecTor part yet.)

To answer your earlier question above: having just the last 24 hours of data might be problematic if the metrics host goes down for, say, a weekend. Ideally, there would be at least a week of statistics available. Or maybe just accumulate new statistics forever, given the tiny amount of statistics per day.

comment:8 in reply to:  7 Changed 2 months ago by cohosh

Replying to karsten:

One quick comment on the spec after writing the metrics-lib part: The "snowflake-stats-end" line should have multiplicity "[At start, exactly once.]" rather than "[At most once.]". Other than that, everything seemed reasonable from a parsing perspective. (I haven't started on the CollecTor part yet.)

Thanks! Filed a PR for torspec in #31407 with this change.

To answer your earlier question above: having just the last 24 hours of data might be problematic if the metrics host goes down for, say, a weekend. Ideally, there would be at least a week of statistics available. Or maybe just accumulate new statistics forever, given the tiny amount of statistics per day.

Okay, great. I have an implementation of this in #31376 that will respond with all logged metrics data.

comment:9 Changed 2 months ago by karsten

Status: new → needs_review

Couple ideas after implementing the CollecTor and metrics-web parts:

  • As indicated before, I'm using snowflake-stats-end as descriptor type identifier, which means that future data format versions will have to keep that line as their first line. A better choice would have been to use something like snowflake-stats $version or similar. (If it's any relief, we're forced to use [0-9]{10} as descriptor type identifier for bandwidth files, so there would have been plenty of room to do worse.)
  • The current format only supports a single snowflake broker. Maybe this is acceptable for the snowflake design. But just in case that you'll one day want to add a second broker, you'll have to include some sort of broker identifier in the format.
  • The current format is not signed, which is somewhat related to not having a broker identifier in the format.
  • As a consequence of the above, CollecTor needs to make a decision whether it wants to archive a newly downloaded snowflake-stats snippet, if it already has another snippet with the same timestamp and different contents. Possible strategies for this specific case are to a) never overwrite, b) always overwrite, c) keep all versions by including a digest in the file name, d) maybe something else. I implemented a) for now.

I think we can start with what we have, without changing anything of the above. Of course, if you want to change something with regard to future maintenance effort, now's the time!

irl, please review metrics-lib commit c8f8321, CollecTor commit cd5a27d, and metrics-web commit 5958454. If there are more changes, I'll try to add those as fixup/squash commits.
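
For strategy (a), a minimal sketch of the never-overwrite decision could look like the following. The class name, method name, and file-naming scheme are made up for illustration; this is not the actual CollecTor code.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    /** Hypothetical sketch of strategy (a): never overwrite an archived snippet. */
    public class SnowflakeStatsArchiver {

      private final Path archiveDirectory;

      public SnowflakeStatsArchiver(Path archiveDirectory) {
        this.archiveDirectory = archiveDirectory;
      }

      /** Stores a downloaded snippet under a name derived from its
       *  snowflake-stats-end timestamp, unless that file already exists.
       *  Returns true if the snippet was written, false if it was skipped. */
      public boolean storeIfAbsent(String statsEndTimestamp, byte[] snippet)
          throws IOException {
        /* e.g. "2019-08-07 19:52:11" becomes "2019-08-07-19-52-11-snowflake-stats" */
        String fileName = statsEndTimestamp.replaceAll("[: ]", "-")
            + "-snowflake-stats";
        Path target = archiveDirectory.resolve(fileName);
        if (Files.exists(target)) {
          return false; /* strategy (a): keep the previously archived version */
        }
        Files.createDirectories(archiveDirectory);
        Files.write(target, snippet, StandardOpenOption.CREATE_NEW);
        return true;
      }
    }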

comment:10 Changed 2 months ago by karsten

Note to self: waiting on #31376 to include the correct, working URL as the default configuration option.

comment:11 Changed 2 months ago by irl

Reviewer: irl

I have seen this, and plan to do the review tomorrow morning.

comment:12 in reply to:  9 ; Changed 2 months ago by cohosh

Replying to karsten:
Thanks karsten!

  • As indicated before, I'm using snowflake-stats-end as descriptor type identifier, which means that future data format versions will have to keep that line as their first line. A better choice would have been to use something like snowflake-stats $version or similar. (If it's any relief, we're forced to use [0-9]{10} as descriptor type identifier for bandwidth files, so there would have been plenty of room to do worse.)

Oh this is a good point. I can make this change pretty easily now.

  • The current format only supports a single snowflake broker. Maybe this is acceptable for the snowflake design. But just in case that you'll one day want to add a second broker, you'll have to include some sort of broker identifier in the format.

This will definitely be the case for quite a while; I think a spec change (and a version bump of the spec) will be a good way to handle this. How difficult will it be to have data with different spec versions on your end?

  • The current format is not signed, which is somewhat related to not having a broker identifier in the format.
  • As a consequence of the above, CollecTor needs to make a decision whether it wants to archive a newly downloaded snowflake-stats snippet, if it already has another snippet with the same timestamp and different contents. Possible strategies for this specific case are to a) never overwrite, b) always overwrite, c) keep all versions by including a digest in the file name, d) maybe something else. I implemented a) for now.

I didn't think about signing; that would take a while to implement, I think. I agree that the best decision here is (a) for now.

I think we can start with what we have, without changing anything of the above. Of course, if you want to change something with regard to future maintenance effort, now's the time!

comment:13 Changed 2 months ago by irl

Status: needs_review → needs_revision

The SnowflakeStatsUrl doesn't look right at all. There's an ampersat in it?

'snowflakes' contains a single compressed tarball with snowflake statistics:

It is nice to have archives that become read-only after a while. Why are we not splitting this by month?

comment:14 in reply to:  13 ; Changed 2 months ago by karsten

Replying to irl:

The SnowflakeStatsUrl doesn't look right at all. There's an ampersat in it?

You mean the default URL in collector.properties? That's known; I need to put in the actual URL once that's available (#31376).

'snowflakes' contains a single compressed tarball with snowflake statistics:

It is nice to have archives that become read-only after a while. Why are we not splitting this by month?

The reason is that these statistics are tiny and that processing these files would be simpler with a single tarball. But changing this is trivial, if you think that doing so has more advantages.

comment:15 Changed 2 months ago by irl

I would like things to become immutable once we think they will not receive more data, as that makes it easier to manage archives in the longer term and to do things like monitor for bitrot.
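
Splitting by month is one straightforward way to get that: once a month is over, its tarball stops changing. A small sketch of deriving such a path follows; the snowflakes-YYYY-MM.tar.xz naming is an assumption for illustration, not the actual archive layout.

    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.time.LocalDate;
    import java.time.YearMonth;
    import java.time.format.DateTimeFormatter;

    /** Illustration only: month-based archive paths so old months become immutable. */
    public class MonthlyArchivePaths {

      private static final DateTimeFormatter MONTH =
          DateTimeFormatter.ofPattern("yyyy-MM");

      /** Returns e.g. archive/snowflakes/snowflakes-2019-08.tar.xz for a stats
       *  interval ending on the given date. */
      static Path monthlyTarball(Path archiveBase, LocalDate statsEnd) {
        String month = YearMonth.from(statsEnd).format(MONTH);
        return archiveBase.resolve("snowflakes")
            .resolve("snowflakes-" + month + ".tar.xz");
      }

      public static void main(String[] args) {
        /* prints archive/snowflakes/snowflakes-2019-08.tar.xz */
        System.out.println(monthlyTarball(Paths.get("archive"),
            LocalDate.of(2019, 8, 7)));
      }
    }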

comment:16 in reply to:  12 ; Changed 2 months ago by karsten

Replying to cohosh:

Replying to karsten:
Thanks karsten!

  • As indicated before, I'm using snowflake-stats-end as descriptor type identifier, which means that future data format versions will have to keep that line as their first line. A better choice would have been to use something like snowflake-stats $version or similar. (If it's any relief, we're forced to use [0-9]{10} as descriptor type identifier for bandwidth files, so there would have been plenty of room to do worse.)

Oh this is a good point. I can make this change pretty easily now.

Okay, in that case please make this change and let me know where to find the new spec and a new sample.

  • The current format only supports a single snowflake broker. Maybe this is acceptable for the snowflake design. But just in case that you'll one day want to add a second broker, you'll have to include some sort of broker identifier in the format.

This will definitely be the case for quite a while; I think a spec change (and a version bump of the spec) will be a good way to handle this. How difficult will it be to have data with different spec versions on your end?

Not difficult at all. Okay, let's keep this unchanged then.

  • The current format is not signed, which is somewhat related to not having a broker identifier in the format.
  • As a consequence of the above, CollecTor needs to make a decision whether it wants to archive a newly downloaded snowflake-stats snippet, if it already has another snippet with the same timestamp and different contents. Possible strategies for this specific case are to a) never overwrite, b) always overwrite, c) keep all versions by including a digest in the file name, d) maybe something else. I implemented a) for now.

I didn't think about signing; that would take a while to implement, I think. I agree that the best decision here is (a) for now.

I think we can start with what we have, without changing anything of the above. Of course, if you want to change something with regard to future maintenance effort, now's the time!

comment:17 in reply to:  15 ; Changed 2 months ago by karsten

Replying to irl:

I would like things to become immutable once we think they will not receive more data, as that makes it easier to manage archives in the longer term and to do things like monitor for bitrot.

Works for me. Will revise this.

comment:18 in reply to:  17 Changed 8 weeks ago by karsten

Replying to karsten:

Replying to irl:

I would like things to become immutable once we think they will not receive more data, as that makes it easier to manage archives in the longer term and to do things like monitor for bitrot.

Works for me. Will revise this.

Revised in ac00e57.

comment:19 in reply to:  16 ; Changed 8 weeks ago by karsten

Status: needs_revision → needs_information

Replying to karsten:

Replying to cohosh:

Replying to karsten:
Thanks karsten!

  • As indicated before, I'm using snowflake-stats-end as descriptor type identifier, which means that future data format versions will have to keep that line as their first line. A better choice would have been to use something like snowflake-stats $version or similar. (If it's any relief, we're forced to use [0-9]{10} as descriptor type identifier for bandwidth files, so there would have been plenty of room to do worse.)

Oh this is a good point. I can make this change pretty easily now.

Okay, in that case please make this change and let me know where to find the new spec and a new sample.

cohosh, are you still planning to revise the spec and code to include such a snowflake-stats $version line?

comment:20 in reply to:  14 ; Changed 8 weeks ago by karsten

Replying to karsten:

Replying to irl:

The SnowflakeStatsUrl doesn't look right at all. There's an ampersat in it?

You mean the default URL in collector.properties? That's known; I need to put in the actual URL once that's available (#31376).

cohosh, what URL did you come up with in that ticket?

comment:21 in reply to:  20 ; Changed 8 weeks ago by cohosh

Replying to karsten:

Replying to karsten:

Replying to irl:

The SnowflakeStatsUrl doesn't look right at all. There's an ampersat in it?

You mean the default URL in collector.properties? That's known; I need to put in the actual URL once that's available (#31376).

cohosh, what URL did you come up with in that ticket?

I just deployed this and you can now fetch it from https://snowflake-broker.bamsoftware.com/metrics

comment:22 in reply to:  19 ; Changed 8 weeks ago by cohosh

Replying to karsten:

Replying to karsten:

Replying to cohosh:

Replying to karsten:
Thanks karsten!

  • As indicated before, I'm using snowflake-stats-end as descriptor type identifier, which means that future data format versions will have to keep that line as their first line. A better choice would have been to use something like snowflake-stats $version or similar. (If it's any relief, we're forced to use [0-9]{10} as descriptor type identifier for bandwidth files, so there would have been plenty of room to do worse.)

Oh this is a good point. I can make this change pretty easily now.

Okay, in that case please make this change and let me know where to find the new spec and a new sample.

cohosh, are you still planning to revise the spec and code to include such a snowflake-stats $version line?

Yes, I just got back from vacation and I'm going to implement it this afternoon.

comment:23 in reply to:  21 Changed 8 weeks ago by cohosh

Replying to cohosh:

Replying to karsten:

Replying to karsten:

Replying to irl:

The SnowflakeStatsUrl doesn't look right at all. There's an ampersat in it?

You mean the default URL in collector.properties? That's known; I need to put in the actual URL once that's available (#31376).

cohosh, what URL did you come up with in that ticket?

I just deployed this and you can now fetch it from https://snowflake-broker.bamsoftware.com/metrics

Oops, actually we should use https://snowflake-broker.torproject.net/metrics
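
For reference, fetching that endpoint is a plain HTTPS GET that returns the logged stats as text. A stand-alone sketch, not the CollecTor downloader:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    /** Illustration only: fetch and print the broker's metrics page. */
    public class FetchSnowflakeStats {

      public static void main(String[] args) throws IOException {
        URL url = new URL("https://snowflake-broker.torproject.net/metrics");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");
        connection.setConnectTimeout(15_000);
        connection.setReadTimeout(30_000);
        try {
          if (connection.getResponseCode() != HttpURLConnection.HTTP_OK) {
            throw new IOException("Unexpected response: "
                + connection.getResponseCode());
          }
          try (BufferedReader reader = new BufferedReader(new InputStreamReader(
              connection.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
              /* e.g. "snowflake-stats-end 2019-08-07 19:52:11 (86400 s)" */
              System.out.println(line);
            }
          }
        } finally {
          connection.disconnect();
        }
      }
    }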

comment:24 in reply to:  22 ; Changed 8 weeks ago by cohosh

Replying to cohosh:

Replying to karsten:

cohosh, are you still planning to revise the spec and code to include such a snowflake-stats $version line?

Yes, I just got back from vacation and I'm going to implement it this afternoon.

Opened #31493 and implemented here. How does that look to you?

comment:25 in reply to:  24 ; Changed 7 weeks ago by karsten

Replying to cohosh:

Replying to cohosh:

Replying to karsten:

cohosh, are you still planning to revise the spec and code to include such a snowflake-stats $version line?

Yes, I just got back from vacation and I'm going to implement it this afternoon.

Opened #31493 and implemented here. How does that look to you?

Well, that would work, though I think that introducing a separate line for descriptor type and version might be clearer:

snowflake-stats 1.0
snowflake-stats-end 2019-08-07 19:52:11 (86400 s)
snowflake-ips VN=5,NL=26,AU=30,GT=2,NO=5,EG=3,NI=1,AT=22,[...]
snowflake-ips-total 937
snowflake-idle-count 660976
client-denied-count 0
client-snowflake-match-count 864

If this looks okay to you and if you make this change, do you think you could somehow edit the existing stats file on the snowflake broker, maybe using some sed/awk magic?

comment:26 in reply to:  25 ; Changed 7 weeks ago by cohosh

Replying to karsten:

Replying to cohosh:

Replying to cohosh:

Replying to karsten:

cohosh, are you still planning to revise the spec and code to include such a snowflake-stats $version line?

Yes, I just got back from vacation and I'm going to implement it this afternoon.

Opened #31493 and implemented here. How does that look to you?

Well, that would work, though I think that introducing a separate line for descriptor type and version might be clearer:

snowflake-stats 1.0
snowflake-stats-end 2019-08-07 19:52:11 (86400 s)
snowflake-ips VN=5,NL=26,AU=30,GT=2,NO=5,EG=3,NI=1,AT=22,[...]
snowflake-ips-total 937
snowflake-idle-count 660976
client-denied-count 0
client-snowflake-match-count 864

If this looks okay to you and if you make this change, do you think you could somehow edit the existing stats file on the snowflake broker, maybe using some sed/awk magic?

I can do that, yes. dcf just commented that it might be easier to do the version on a separate line (see https://trac.torproject.org/projects/tor/ticket/31493#comment:3). Do you have an opinion on that? We're looking for ease of use now and in the future if/when the version bumps.

comment:27 in reply to:  26 Changed 7 weeks ago by dcf

Replying to cohosh:

I can do that, yes. dcf just commented that it might be easier to do the version on a separate line (see https://trac.torproject.org/projects/tor/ticket/31493#comment:3). Do you have an opinion on that? We're looking for ease of use now and in the future if/when the version bumps.

I was picturing an @type snowflake-stats 1.0 annotation so that the format is the same as other CollecTor formats. Data Formats: "Each descriptor provided here contains an @type annotation using the format @type $descriptortype $major.$minor." See for example in zoossh.
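
For illustration only (this is neither zoossh nor metrics-lib code), such an annotation line can be recognized with a simple pattern:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    /** Illustration only: recognizing "@type $descriptortype $major.$minor". */
    public class TypeAnnotation {

      private static final Pattern TYPE_LINE =
          Pattern.compile("^@type (\\S+) (\\d+)\\.(\\d+)$");

      public static void main(String[] args) {
        Matcher m = TYPE_LINE.matcher("@type snowflake-stats 1.0");
        if (m.matches()) {
          /* prints descriptor=snowflake-stats major=1 minor=0 */
          System.out.printf("descriptor=%s major=%s minor=%s%n",
              m.group(1), m.group(2), m.group(3));
        }
      }
    }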

comment:28 Changed 7 weeks ago by cohosh

Okay, so just the version should use the @type format? I based the spec on how relays export stats, as I gathered from reading the source code: https://gitweb.torproject.org/tor.git/tree/src/feature/stats/geoip_stats.c#n994

Last edited 7 weeks ago by cohosh

comment:29 Changed 7 weeks ago by cohosh

Okay I've updated #31493 to use the following spec:

We export metrics in the following (version 1.0) format:

    "snowflake-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
        [At start, exactly once.]

        YYYY-MM-DD HH:MM:SS defines the end of the included measurement
        interval of length NSEC seconds (86400 seconds by default).

    "@type snowflake-stats 1.0"
        [Exactly once.]

        Defines the version of snowflake stats output being used
        (currently 1.0)

    "snowflake-ips" CC=NUM,CC=NUM,... NL
        [At most once.]

        List of mappings from two-letter country codes to the number of
        unique IP addresses of snowflake proxies that have polled.

[rest omitted for brevity]
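
To make the keyword/value structure concrete, here is a small stand-alone parsing sketch over a snippet in this format. It is only an illustration; metrics-lib's actual parser is stricter and validates each field.

    import java.util.LinkedHashMap;
    import java.util.Map;

    /** Illustration only: split a snowflake-stats snippet into keyword/value pairs. */
    public class SnowflakeStatsLines {

      static Map<String, String> parse(String snippet) {
        Map<String, String> keywords = new LinkedHashMap<>();
        for (String line : snippet.split("\n")) {
          if (line.isEmpty() || line.startsWith("@type ")) {
            continue; /* the annotation would be handled separately */
          }
          String[] parts = line.split(" ", 2);
          keywords.put(parts[0], parts.length > 1 ? parts[1] : "");
        }
        return keywords;
      }

      public static void main(String[] args) {
        String example = "snowflake-stats-end 2019-08-07 19:52:11 (86400 s)\n"
            + "snowflake-ips VN=5,NL=26,AU=30\n"
            + "snowflake-ips-total 937\n"
            + "client-snowflake-match-count 864\n";
        parse(example).forEach((k, v) -> System.out.println(k + " -> " + v));
      }
    }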

comment:30 Changed 7 weeks ago by karsten

Commented on #31493.

comment:31 Changed 7 weeks ago by karsten

So, it seems that we're not doing #31493. That means that the remaining tasks on my list are to update the default URL, do another test round, and hand the branch over to irl for review.

comment:32 Changed 7 weeks ago by karsten

Status: needs_information → needs_review

comment:33 Changed 6 weeks ago by notirl

Status: → needs_revision

Again the startProcessing() method in the Downloader is huge. Testing and review would be easier if this were broken up. This one is easier to follow though.

We should file a ticket to refactor HTTP GET request code into a util function.

I think we already had a ticket to refactor cleaning up directories after X days? If we don't, then we should file that ticket.

The create-tarballs script was updated for the monthly tarballs, but the PROTOCOL was not.

comment:34 Changed 6 weeks ago by irl

Status: needs_review → needs_revision

comment:35 in reply to:  33 Changed 6 weeks ago by karsten

Replying to notirl:

Again the startProcessing() method in the Downloader is huge. Testing and review would be easier if this were broken up. This one is easier to follow though.

Thanks for the review! I broke up the method, pulling out the HTTP GET and the writing to disk (rough sketch below).

We should file a ticket to refactor HTTP GET request code into a util function.

Agreed. Filed #31599 for this.

I think we already had a ticket to refactor cleaning up directories after X days? If we don't, then we should file that ticket.

Yes, that's #20546.

The create-tarballs script was updated for the monthly tarballs, but the PROTOCOL was not.

Oops, fixed.

Please take another look: commit eeb9f20 in my task-29461 branch.
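
Roughly, the split has the shape sketched below; the class and method names here are placeholders rather than the actual code in the task-29461 branch.

    import java.io.IOException;
    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.file.Files;
    import java.nio.file.Path;

    /** Placeholder names only; sketches the HTTP GET / write-to-disk split. */
    public class SnowflakeStatsDownloader {

      public void startProcessing(String statsUrl, Path outputDirectory)
          throws IOException {
        byte[] response = downloadFromBroker(statsUrl);
        writeToDisk(response, outputDirectory);
      }

      /** The HTTP GET, taken out of startProcessing(). */
      private byte[] downloadFromBroker(String statsUrl) throws IOException {
        HttpURLConnection connection =
            (HttpURLConnection) new URL(statsUrl).openConnection();
        try (InputStream in = connection.getInputStream()) {
          return in.readAllBytes();
        } finally {
          connection.disconnect();
        }
      }

      /** Writing the downloaded snippet(s) to disk, also taken out. */
      private void writeToDisk(byte[] response, Path outputDirectory)
          throws IOException {
        Files.createDirectories(outputDirectory);
        Files.write(outputDirectory.resolve("snowflake-stats"), response);
      }
    }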

comment:36 Changed 6 weeks ago by karsten

Status: needs_revision → needs_review

comment:37 Changed 6 weeks ago by irl

Status: needs_review → merge_ready

This looks good to me now.

comment:38 in reply to:  37 Changed 6 weeks ago by karsten

Replying to irl:

This looks good to me now.

Did you also review metrics-lib commit c8f8321?

comment:39 Changed 6 weeks ago by irl

Yes, I had thought that was already merged. I guess not.

comment:40 Changed 5 weeks ago by karsten

All changes are merged, released, and deployed. Next steps are that we

  • wait for descriptors to show up on CollecTor and Tor Metrics,
  • try out syncing from CollecTor host, and
  • update the CollecTor mirror to sync from the main CollecTor instance.

comment:41 in reply to:  40 Changed 4 weeks ago by karsten

Replying to karsten:

All changes are merged, released, and deployed. Next steps are that we

  • wait for descriptors to show up on CollecTor and Tor Metrics,

Done.

  • try out syncing from CollecTor host, and

Done.

  • update the CollecTor mirror to sync from the main CollecTor instance.

irl, do you want to do that, or shall I?

comment:42 Changed 4 weeks ago by karsten

Resolution: fixed
Status: merge_ready → closed

I just upgraded our CollecTor mirror and will see how that goes. Closing, because deployment shouldn't be part of this ticket anymore.
