How many snowflakes are there registered right now and happy to serve censored users?
Right now there's a big difference between 0 and 1, and it's not easy to figure out which it is.
Knowing this number would help me as a snowflake volunteer decide whether I am needed, and whether to do advocacy at this moment to get other people to be snowflakes.
Knowing this number would help the censored users too, because it would give them a sense of the health of the snowflake population, and also it can help them debug their "it's not working, I wonder if I can narrow down some possible problems" situations.
I would also be interested in stats about users and usage (including e.g. number of users being handled divided by number of snowflakes handling them), but I recognize anything involving the users is a more complicated topic, and we shouldn't do things that could put users at risk without sorting through what we ought to protect and how we can make sure it's being protected.
So, step one, tell me more about the snowflakes please. :)
One other concrete thing that I want: how many times are you giving snowflakes out? How many times did you stop giving a snowflake out because you've given it out so many times already? These questions tie into the address distribution algorithm question: it's not clear how to pick the right parameters in a vacuum, but we're not in a vacuum, so maybe we can gain some intuition by seeing how things play out in practice.
There should be at least 3 snowflakes on each broker at all times, because we're specifically running some fallback proxy-go instances. Obviously these are no good from a circumvention point of view, because they're on a static IP address--they're mainly there so that curious people who try the snowflake option in the alpha browser aren't immediately discouraged.
It sounds like we have a few things we want to achieve/learn from collected metrics:
Detect censorship events
Allow current or potential proxies to see if they are needed
Allow clients to see whether their connection issues are due to censorship or proxy availability
Help us figure out whether we should be doing something different in distributing proxies to clients
=== What We Have
We currently collect and "publish" information on:
how many snowflakes are currently available, along with their SIDs (available at the broker /debug handler). This is good for more detailed monitoring of censorship events: we already collect bridge usage metrics, and collecting broker usage metrics as well will narrow down where the censorship is happening.
country stats of domain-fronted client connections (logged, most recent snapshot at broker /debug)
the roundtrip time it takes for a client to connect to get a snowflake proxy answer (available at broker /debug)
the usual snowflake bridge statistics (at metrics.torproject.org)
=== What We Want
Some of the metrics mentioned above will be easier to implement than others. The best place to collect statistics is at the broker, but some of the data mentioned would require proxies to report metrics to the broker for collection. We have to be a bit careful with this since anyone can run a proxy. It will also impact the decisions we make for #29207 (moved).
I would also be interested in stats about users and usage (including e.g. number of users being handled divided by number of snowflakes handling them)
This is a bit tricky. The broker knows which proxies it hands out to users, but it doesn't know the state of the clients' connections to those proxies (e.g., when they have been closed). It's also worth noting that different "types" of proxies (standalone vs. browser-based) can handle different numbers of users at once. Perhaps a more useful metric would be for snowflake proxies to advertise to the broker how many available "slots/tokens" they have when they poll for clients. This could be added to the broker--proxy WebSocket protocol. It would also avoid collecting more data on clients, which is generally safer.
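As a rough illustration of the slots/tokens idea, a proxy's poll request could carry a capacity field. The Go sketch below is hypothetical: the message shape, field names, and values are ours, not the actual broker--proxy protocol.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ProxyPoll is a hypothetical poll message a proxy might send to the
// broker: the Slots field advertises how many more clients this proxy
// is willing to handle right now.
type ProxyPoll struct {
	SID   string `json:"sid"`   // proxy session identifier
	Slots int    `json:"slots"` // remaining client capacity
}

func main() {
	// A standalone proxy-go instance might advertise many slots,
	// while a browser-based proxy advertises just one.
	poll := ProxyPoll{SID: "example-sid", Slots: 10}
	msg, err := json.Marshal(poll)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(msg)) // {"sid":"example-sid","slots":10}
}
```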
how many times are you giving snowflakes out? How many times did you stop giving a snowflake out because you've given it out so many times already? These questions tie into the address distribution algorithm question
The above comment addresses this as well. The broker doesn't really decide whether or not it has given a snowflake out too many times. I think what's more important to deciding whether we are giving out proxies in a good way is to measure how "reliable" individual proxies have been in the past. This is related to setting up persistent identifiers (#29260 (moved)).
It might also be interesting to have some kind of proxy diversity metric (e.g., whether 90% of all connections are handled by the same proxy). We can get some idea with persistent identifiers (#29260 (moved)), but of course using a persistent identifier will always be optional. We can also do collection of geoip country stats of proxies.
I noticed that metrics-team is cc'ed, but from reading the comments it seems like you're still at design discussions like the slots/tokens idea. When is a good time for the metrics team to get involved? Should we wait until you have a better idea what you want? Or should we help with the bikeshedding right now? :)
(FWIW, I didn't understand arma's "big difference between 0 and 1" comment in the summary, and I'm not 100% certain whether SID stands for Snowflake IDentifier or Somethingelse I Don'tknow.)
I noticed that metrics-team is cc'ed, but from reading the comments it seems like you're still at design discussions like the slots/tokens idea. When is a good time for the metrics team to get involved? Should we wait until you have a better idea what you want? Or should we help with the bikeshedding right now? :)
I think we've still got a bit of work to do before we will know enough about where we want to go to include the metrics team. We have a few other tickets we need to cover first before we will even have the data we need at the broker for some of these metrics.
It sounds like we have a few things we want to achieve/learn from collected metrics:
Detect censorship events
Allow current or potential proxies to see if they are needed
Allow clients to see whether their connection issues are due to censorship or proxy availability
Help us figure out whether we should be doing something different in distributing proxies to clients
These all seem like good goals.
We currently collect and "publish" information on:
how many snowflakes are currently available, along with their SIDs (available at the broker /debug handler). This is good for more detailed monitoring of censorship events: we already collect bridge usage metrics, and collecting broker usage metrics as well will narrow down where the censorship is happening.
country stats of domain-fronted client connections (logged, most recent snapshot at broker /debug)
the roundtrip time it takes for a client to connect to get a snowflake proxy answer (available at broker /debug)
Should we already be archiving this data?
Some of the metrics mentioned above will be easier to implement than others. The best place to collect statistics is at the broker, but some of the data mentioned would require proxies to report metrics to the broker for collection. We have to be a bit careful with this since anyone can run a proxy. It will also impact the decisions we make for #29207 (moved).
We collect a lot of statistics at relays and bridges, which anyone can run. We are working on methods of improving robustness against these statistics being manipulated, but so far we have not detected anyone reporting abnormal values. It is good to have criteria for what values you would expect, based on the stats others report, so that anomalies can be detected. For example, we would expect bandwidth usage among relays to be proportional to consensus weight.
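For illustration, here is a toy sketch (in Go, with made-up function and parameter names) of the kind of proportionality check described above: flag relays whose share of reported bandwidth strays too far from their share of consensus weight.

```go
package metrics

// flagAnomalies returns the indices of relays whose share of total
// reported bandwidth deviates from their share of total consensus
// weight by more than tol. This is a toy version of the criterion
// above; it assumes both totals are nonzero.
func flagAnomalies(bandwidth, weight []float64, tol float64) []int {
	var totalBW, totalW float64
	for i := range bandwidth {
		totalBW += bandwidth[i]
		totalW += weight[i]
	}
	var flagged []int
	for i := range bandwidth {
		diff := bandwidth[i]/totalBW - weight[i]/totalW
		if diff > tol || diff < -tol {
			flagged = append(flagged, i)
		}
	}
	return flagged
}
```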
I would also be interested in stats about users and usage (including e.g. number of users being handled divided by number of snowflakes handling them)
This is a bit tricky. The broker knows which proxies it hands out to users, but it doesn't know the state of the clients' connections to those proxies (e.g., when they have been closed). It's also worth noting that different "types" of proxies (standalone vs. browser-based) can handle different numbers of users at once. Perhaps a more useful metric would be for snowflake proxies to advertise to the broker how many available "slots/tokens" they have when they poll for clients. This could be added to the broker--proxy WebSocket protocol. It would also avoid collecting more data on clients, which is generally safer.
This sounds like a reasonable approach. You might want to take a look at:
This will give you an idea of how we do this for other parts of Tor.
how many times are you giving snowflakes out? How many times did you stop giving a snowflake out because you've given it out so many times already? These questions tie into the address distribution algorithm question
Can this also be an indirect measurement of number of users?
The above comment addresses this as well. The broker doesn't really decide whether or not it has given a snowflake out too many times. I think what's more important to deciding whether we are giving out proxies in a good way is to measure how "reliable" individual proxies have been in the past. This is related to setting up persistent identifiers (#29260 (moved)).
For relays, directory authorities track the mean time between failures, and we track this in Tor Metrics too.
It might also be interesting to have some kind of proxy diversity metric (e.g., whether 90% of all connections are handled by the same proxy). We can get some idea with persistent identifiers (#29260 (moved)), but of course using a persistent identifier will always be optional. We can also do collection of geoip country stats of proxies.
We don't really have this metric for relays yet, so if you have ideas that would be applicable to relays too then that would be great. We know about country/AS distribution, but we haven't quantified the diversity using any particular formula.
Log all of the statistics in a reasonable format
This would ideally be a format that Tor Metrics is already handling. If it could be based on the Tor directory protocol meta-format (§1.2 dir-spec) then that would be great. We don't want to bring in dependencies for parsing yaml/toml/etc. if we can help it.
coordinate with the metrics team to get these metrics collected and visualized somewhere
Please also coordinate on what you want to collect, so we can consider if that information already comes from somewhere, if we already had a plan for it, and if it is safe or not.
In the interest of having a minimum viable product I think it makes sense to start with making sure that what we have now is safe and can be archived by Tor metrics. Some useful statistics (like how long proxies remain available and how many unique proxies we give out) will have to wait for #29260 (moved) to be implemented which in turn depends on a few different things.
So, the metrics we are currently (or will soon be) able to collect are:
Number of currently available snowflake proxies
GeoIP stats (and, with that, the total number) of snowflake proxies we actually hand out to clients
Figure out whether these metrics are safe and good
Get these metrics in a format that Tor Metrics can handle and display
So my questions for the metrics team are:
Does it make sense to use privacy-preserving ways of counting available proxies? (We are definitely not going to collect or export any client data at this point in time.)
I have a vague memory of a trac ticket, wiki page, or email that summarized something to the effect of "this is how to get Tor metrics to archive your data" but I can't find it. Other than following the Tor directory protocol meta-format, do you have any other advice on how to format our data?
Number of currently available snowflake proxies is not sensitive. We do not make any efforts to hide the numbers of relays or bridges, and so this can be an exact count. The question here is not the count resolution but the time resolution. (Sorry to answer your question with a question.)
If I'm an attacker, can I learn anything about a client if I can observe the client's traffic and the exact count of snowflakes? For example, what do I learn if a snowflake that a client is using disappears? I'm not sure what the snowflake protocol does in this case.
I'm not sure what you mean with the GeoIP stats. If these are stats regarding the locations of proxies, again exact counts would be fine and would be in line with what we do for relays and bridges at the moment. If this is for clients, we should aim to provide differential privacy. I fear that at the moment, we are not seeing enough users that we can safely report GeoIP stats (usefully) for clients at all. With relays and bridges, we round the counts up to the nearest multiple of 8.
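For reference, rounding a count up to the nearest multiple of 8 is a one-line computation; a minimal Go sketch (the function name is ours, not from any codebase):

```go
package metrics

// binCount rounds a count up to the nearest multiple of 8, matching
// the binning used for relay and bridge user counts:
// binCount(1) == 8, binCount(8) == 8, binCount(9) == 16.
func binCount(n int) int {
	return ((n + 7) / 8) * 8
}
```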
Round trip time of snowflake rendezvous sounds like a really useful metric for engineering work, but a dangerous one for safety. This would be a good candidate for PrivCount but without such a technique I wouldn't do this one. We currently measure performance of relays using active measurement, such that we are only analyzing our own traffic. We have extended that tool, OnionPerf, to also work for pluggable transports but it will do the end-to-end performance not just client->snowflake.
Can you lay out in detail exactly what metrics you'd want, what resolution data you want (both in counts and in time) and what you might consider an attacker could learn, assuming they are in a position to monitor, or are running, a point in the network?
Section 2.1.2 of dir-spec contains some examples of descriptions of metrics.
Replying to irl:
Thanks for this feedback; it was very helpful. What makes snowflake statistics a little more complex than bridge or relay stats is that, while stats about how many times a bridge was used don't closely reflect individual client usage, each proxy handles only a single client or a small, fixed number of clients (determined by the individual proxy), so there's a greater possibility of data leakage.
Number of currently available snowflake proxies is not sensitive. We do not make any efforts to hide the numbers of relays or bridges, and so this can be an exact count. The question here is not the count resolution but the time resolution. (Sorry to answer your question with a question.)
If I'm an attacker, can I learn anything about a client if I can observe the client's traffic and the exact count of snowflakes? For example, what do I learn if a snowflake that a client is using disappears? I'm not sure what the snowflake protocol does in this case.
Possibly, as stated above, it depends on what type of proxy you are and how it's set up. I think we're better off doing binning in this case. However, as stated below, if we collect at a granularity of every 24 hours this shouldn't leak client usage.
I'm not sure what you mean with the GeoIP stats. If these are stats regarding the locations of proxies, again exact counts would be fine and would be in line with what we do for relays and bridges at the moment. If this is for clients, we should aim to provide differential privacy. I fear that at the moment, we are not seeing enough users that we can safely report GeoIP stats (usefully) for clients at all. With relays and bridges, we round the counts up to the nearest multiple of 8.
We're absolutely not collecting geoip stats of clients. These are only of snowflake proxies. I originally thought to include geoip stats of proxies that are actually handed out but it's safer to do stats for available proxies since this shouldn't leak client usage if collected over a period of 24 hours.
Round trip time of snowflake rendezvous sounds like a really useful metric for engineering work, but a dangerous one for safety. This would be a good candidate for PrivCount but without such a technique I wouldn't do this one. We currently measure performance of relays using active measurement, such that we are only analyzing our own traffic. We have extended that tool, OnionPerf, to also work for pluggable transports but it will do the end-to-end performance not just client->snowflake.
That's fair; this is really only available for debugging purposes. We don't need to export it as a metric, and I'd argue that it should only be logged locally.
Can you lay out in detail exactly what metrics you'd want, what resolution data you want (both in counts and in time) and what you might consider an attacker could learn, assuming they are in a position to monitor, or are running, a point in the network?
It looks like bridge stats default to 24 hours; that seems reasonable for snowflake as well.
Section 2.1.2 of dir-spec contains some examples of descriptions of metrics.
To summarize, and be more precise about what we want to collect, I've put our proposed exported metrics in the Tor Directory Protocol Format (a serialization sketch follows the summary list below):

"snowflake-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
    [At most once.]
    YYYY-MM-DD HH:MM:SS defines the end of the included measurement interval of length NSEC seconds (86400 seconds by default).

"snowflake-ips" CC=NUM,CC=NUM,... NL
    [At most once.]
    List of mappings from two-letter country codes to the number of unique IP addresses of available snowflake proxies, rounded up to the nearest multiple of 8.

"snowflake-available-count" NUM NL
    [At most once.]
    A count of the number of unique IP addresses corresponding to currently available snowflake proxies, rounded up to the nearest multiple of 8.

"snowflake-usage-count" NUM NL
    [At most once.]
    A count of the number of snowflake proxies that have been handed out by the broker to clients, rounded up to the nearest multiple of 8.
So in short, we'd collect over a 24 hour period:
geoip stats of unique available snowflake proxies
approximated count of unique, available snowflake proxies
approximated count of the number of proxies handed to snowflake clients (which would also be the same as the total number of client requests).
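To make the proposed format concrete, here is a sketch of how the broker might serialize one measurement interval. It assumes the keyword/value line format proposed above; the Go function and its signature are illustrative, not code from the snowflake repository.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"time"
)

// writeSnowflakeStats renders one measurement interval in the
// keyword/value line format proposed above. countryCounts maps
// two-letter country codes to already-binned proxy counts.
func writeSnowflakeStats(end time.Time, interval time.Duration,
	countryCounts map[string]int, available, usage int) string {
	var b strings.Builder
	fmt.Fprintf(&b, "snowflake-stats-end %s (%d s)\n",
		end.UTC().Format("2006-01-02 15:04:05"), int(interval.Seconds()))
	// Emit country codes in sorted order so the output is stable.
	codes := make([]string, 0, len(countryCounts))
	for cc := range countryCounts {
		codes = append(codes, cc)
	}
	sort.Strings(codes)
	pairs := make([]string, 0, len(codes))
	for _, cc := range codes {
		pairs = append(pairs, fmt.Sprintf("%s=%d", cc, countryCounts[cc]))
	}
	fmt.Fprintf(&b, "snowflake-ips %s\n", strings.Join(pairs, ","))
	fmt.Fprintf(&b, "snowflake-available-count %d\n", available)
	fmt.Fprintf(&b, "snowflake-usage-count %d\n", usage)
	return b.String()
}

func main() {
	fmt.Print(writeSnowflakeStats(time.Now(), 24*time.Hour,
		map[string]int{"de": 8, "us": 16}, 24, 8))
}
```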
I've cc'd dcf and arlolra to see if they have thoughts on this.
I just thought of another thing that makes snowflake quite different from relays or bridges: the expected amount of time that each proxy will be online. It might make more sense to shorten the collection period from 24 hours in this case. Another thing we can do is have the broker export an additional metric, something like the average amount of time a unique proxy was available.
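One possible way to measure that at the broker, assuming all we have to go on are poll timestamps: track the first and most recent poll per proxy and average the spans. A hypothetical Go sketch (all identifiers are ours):

```go
package metrics

import "time"

// pollSpan tracks the first and latest poll seen from one proxy.
type pollSpan struct {
	first, last time.Time
}

// RecordPoll updates the span for a proxy identified by id (here an
// IP address; a persistent identifier would work too).
func RecordPoll(spans map[string]*pollSpan, id string, now time.Time) {
	if s, ok := spans[id]; ok {
		s.last = now
	} else {
		spans[id] = &pollSpan{first: now, last: now}
	}
}

// MeanAvailability returns the average observed lifetime of proxies.
func MeanAvailability(spans map[string]*pollSpan) time.Duration {
	if len(spans) == 0 {
		return 0
	}
	var total time.Duration
	for _, s := range spans {
		total += s.last.Sub(s.first)
	}
	return total / time.Duration(len(spans))
}
```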
Also noting that mapping unique proxies to IP addresses isn't quite how snowflake is supposed to work: we'll probably want some kind of persistent identifier later as mentioned in the comments above, but I think it will work fine for our purposes now.
Also noting that mapping unique proxies to IP addresses isn't quite how snowflake is supposed to work: we'll probably want some kind of persistent identifier later as mentioned in the comments above, but I think it will work fine for our purposes now.
Identifying by IP address is interesting by itself, though. If IP address is the expected basis of blocking, then it's interesting to consider a proxy's "identity" as IP address because that's the information available to a censor. A long-term identity independent of IP address is also interesting, but less informative for measuring blocking resistance IMO.
After looking at #30731 (moved), I want to change the proposed collected metrics to match the data shown in those graphs. I modified the R script in those tickets slightly to bin to the nearest multiple of 8, and the results are almost identical (due to how often proxies are polling). We could get away with even coarser binning, but I think the multiple-of-8 method does enough to disguise individual client traffic.
In addition to this, I think it would be interesting to collect geoip data on the available proxies. I don't want to bin the snowflake proxy counts at this time because we have too few proxies for a binned count to be useful; maybe we'd want to add binning later, but I don't think we need it from a client safety perspective. The overall proxy count won't leak client information because it only measures whether a proxy has polled at all in a 24 hour period, not whether it was given out (which would leak client usage data).
So I'll propose the following metrics (gathered at a granularity of every 24 hours):

"snowflake-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
    [At most once.]
    YYYY-MM-DD HH:MM:SS defines the end of the included measurement interval of length NSEC seconds (86400 seconds by default).

"snowflake-ips" CC=NUM,CC=NUM,... NL
    [At most once.]
    List of mappings from two-letter country codes to the number of unique IP addresses of snowflake proxies that have polled.

"snowflake-idle-count" NUM NL
    [At most once.]
    A count of the number of times a proxy has polled but received no client offer, rounded up to the nearest multiple of 8.

"client-denied-count" NUM NL
    [At most once.]
    A count of the number of times a client has requested a proxy from the broker but no proxies were available, rounded up to the nearest multiple of 8.

"client-snowflake-match-count" NUM NL
    [At most once.]
    A count of the number of times a client successfully received a proxy from the broker, rounded up to the nearest multiple of 8.
I'm going to start implementing these metrics and meanwhile put this in needs_review for the metrics team to look at.
Looks good to me! We're still waiting for the metrics team to review the statistics, right? I'll just remove myself as reviewer and keep the ticket in needs_review.
While we're waiting for review, I propose we deploy these changes to the broker and start collecting the metrics data locally so we can take a look at it.
Was there a reason for removing "snowflake-available-count"? This number is going to be the same as the sum of all country codes in "snowflake-ips", but it would probably be nice to have this in addition to be able to see at a glance.
I can follow your thought processes and I think that these metrics described in comment:19, and also snowflake-available-count from comment:14 would be OK to make public. Nothing is jumping out as particularly sensitive.
Is it possible to run two snowflake proxies from the same IP address? There does seem to be an implied limit of 1 proxy per IP address in your metrics descriptions. Maybe from a perspective of whether a bridge is burned or not, the fact that two processes may be running on the same IP doesn't matter because they would both be burned together?
Was there a reason for removing "snowflake-available-count"? This number is going to be the same as the sum of all country codes in "snowflake-ips", but it would probably be nice to have this in addition to be able to see at a glance.
I opted for snowflake-idle-count and snowflake-client-match-count instead, since I think this gives us the information we'd want to use snowflake-available-count for anyway. I'm not opposed to exporting another stat on the available snowflakes, I'll add the code for that back in shortly.
I can follow your thought processes and I think that these metrics described in comment:19, and also snowflake-available-count from comment:14 would be OK to make public. Nothing is jumping out as particularly sensitive.
Is it possible to run two snowflake proxies from the same IP address? There does seem to be an implied limit of 1 proxy per IP address in your metrics descriptions. Maybe from a perspective of whether a bridge is burned or not, the fact that two processes may be running on the same IP doesn't matter because they would both be burned together?
It is possible to run multiple snowflakes on a single IP. Only the country code stats (and the total available snowflakes, which I'll add back in) are unique by IP. The snowflake-idle-count and snowflake-client-match-count are not unique by IP and would reflect one IP address running multiple snowflakes. I think splitting the metrics in this way makes sense: the unique-by-IP ones tell us information that's useful for studying censorship or blocking by IP, and the ones that aren't unique by IP tell us useful information about load on the system.
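To make the split concrete, here is a minimal Go sketch of the two kinds of counters; the type and field names are illustrative, not the actual broker code. Unique-by-IP stats live in a set keyed by IP, while load stats are plain event counts.

```go
package metrics

// brokerStats separates counts that are unique by IP (useful for
// reasoning about blocking) from raw event counts (useful for
// reasoning about load). All values would be binned before export.
type brokerStats struct {
	proxyIPs   map[string]bool // unique proxy IPs seen this interval
	idleCount  int             // proxy polls that got no client offer
	matchCount int             // client requests matched with a proxy
}

func newBrokerStats() *brokerStats {
	return &brokerStats{proxyIPs: make(map[string]bool)}
}

// recordProxyPoll notes one poll from a proxy at the given IP.
// Multiple proxies behind one IP count once in proxyIPs, but every
// poll contributes to the load counters.
func (s *brokerStats) recordProxyPoll(ip string, matched bool) {
	s.proxyIPs[ip] = true
	if matched {
		s.matchCount++
	} else {
		s.idleCount++
	}
}
```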
I'm putting this back into needs_revision to add the total available snowflake stats. I'll get a code review on that once I complete it, and then I'm tempted to close out this ticket and open a new one for the next steps in hooking these metrics outputs to whatever the metrics team needs to publish these.
Was there a reason for removing "snowflake-available-count"? This number is going to be the same as the sum of all country codes in "snowflake-ips", but it would probably be nice to have this in addition to be able to see at a glance.
I opted for snowflake-idle-count and snowflake-client-match-count instead, since I think this gives us the information we'd want to use snowflake-available-count for anyway. I'm not opposed to exporting another stat on the available snowflakes, I'll add the code for that back in shortly.
I'm putting this back into needs_revision to add the total available snowflake stats. I'll get a code review on that once I complete it, and then I'm tempted to close out this ticket and open a new one for the next steps in hooking these metrics outputs to whatever the metrics team needs to publish these.
Having the total count is also a good way to make sure the GeoIP code isn't doing something strange. Looking at UpdateCountryStats I don't think there will be any issues here, because you've got good error checking and a fallback option. Separating things that depend on GeoIP databases from things that don't is sometimes a good idea though.