How many snowflakes are there registered right now and happy to serve censored users?
Right now there's a big difference between 0 and 1, and it's not easy to figure out which it is.
Knowing this number would help me as a snowflake volunteer decide whether I am needed, and whether to do advocacy at this moment to get other people to be snowflakes.
Knowing this number would help the censored users too, because it would give them a sense of the health of the snowflake population, and also it can help them debug their "it's not working, I wonder if I can narrow down some possible problems" situations.
I would also be interested in stats about users and usage (including e.g. number of users being handled divided by number of snowflakes handling them), but I recognize anything involving the users is a more complicated topic, and we shouldn't do things that could put users at risk without sorting through what we ought to protect and how we can make sure it's being protected.
So, step one, tell me more about the snowflakes please. :)
One other concrete thing that I want: how many times are you giving snowflakes out? How many times did you stop giving a snowflake out because you've given it out so many times already? These questions tie into the address distribution algorithm question: it's not clear how to pick the right parameters in a vacuum, but we're not in a vacuum, so maybe we can gain some intuition by seeing how things play out in practice.
There should be at least 3 snowflakes on each broker at all times, because we're specifically running some fallback proxy-go instances. Obviously these are no good from a circumvention point of view, because they're on a static IP address--they're mainly there so that curious people who try the snowflake option in the alpha browser aren't immediately discouraged.
It sounds like we have a few things we want to achieve/learn from collected metrics:
Detect censorship events
Allow current or potential proxies to see if they are needed
Allow clients to see whether their connection issues are due to censorship or proxy availability
Help us figure out whether we should be doing something different in distributing proxies to clients
=== What We Have
We currently collect and "publish" information on:
how many snowflakes are currently available, along with their SIDs (available at the broker /debug handler). This is good for more detailed monitoring of censorship events: we already collect bridge usage metrics, and collecting broker usage metrics as well will narrow down where the censorship is happening.
country stats of domain-fronted client connections (logged, most recent snapshot at broker /debug)
the roundtrip time it takes for a client to connect to get a snowflake proxy answer (available at broker /debug)
the usual snowflake bridge statistics (at metrics.torproject.org)
=== What We Want
Some of the metrics mentioned above will be easier to implement than others. The best place to collect statistics is at the broker, but some of the data mentioned would require proxies to report metrics to the broker for collection. We have to be a bit careful with this since anyone can run a proxy. It will also impact the decisions we make for #29207 (moved).
I would also be interested in stats about users and usage (including e.g. number of users being handled divided by number of snowflakes handling them)
This is a bit tricky. The broker knows which proxies it hands out to users, but it doesn't know the state of the clients' connections to those proxies (e.g., when they have been closed). It's also worth noting that different "types" of proxies (standalone vs. browser-based) can handle different numbers of users at once. Perhaps a more useful metric would be for snowflake proxies to advertise to the broker how many available "slots/tokens" they have when they poll for clients. This could be added to the broker--proxy WebSocket protocol. It would also avoid collecting more data on clients, which is generally safer.
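As a rough illustration of the slots/tokens idea, a proxy's poll request could carry a capacity field. The Go sketch below is hypothetical: the message shape, field names, and values are ours, not the actual broker--proxy protocol.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ProxyPoll is a hypothetical poll message a proxy might send to the
// broker: the Slots field advertises how many more clients this proxy
// is willing to handle right now.
type ProxyPoll struct {
	SID   string `json:"sid"`   // proxy session identifier
	Slots int    `json:"slots"` // remaining client capacity
}

func main() {
	// A standalone proxy-go instance might advertise many slots,
	// while a browser-based proxy advertises just one.
	poll := ProxyPoll{SID: "example-sid", Slots: 10}
	msg, err := json.Marshal(poll)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(msg)) // {"sid":"example-sid","slots":10}
}
```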
how many times are you giving snowflakes out? How many times did you stop giving a snowflake out because you've given it out so many times already? These questions tie into the address distribution algorithm question
The above comment addresses this as well. The broker doesn't really decide whether or not it has given a snowflake out too many times. I think what's more important to deciding whether we are giving out proxies in a good way is to measure how "reliable" individual proxies have been in the past. This is related to setting up persistent identifiers (#29260 (moved)).
It might also be interesting to have some kind of proxy diversity metric (e.g., whether 90% of all connections are handled by the same proxy). We can get some idea with persistent identifiers (#29260 (moved)), but of course using a persistent identifier will always be optional. We can also do collection of geoip country stats of proxies.
I noticed that metrics-team is cc'ed, but from reading the comments it seems like you're still at design discussions like the slots/tokens idea. When is a good time for the metrics team to get involved? Should we wait until you have a better idea what you want? Or should we help with the bikeshedding right now? :)
(FWIW, I didn't understand arma's "big difference between 0 and 1" comment in the summary, and I'm not 100% certain whether SID stands for Snowflake IDentifier or Somethingelse I Don'tknow.)
I noticed that metrics-team is cc'ed, but from reading the comments it seems like you're still at design discussions like the slots/tokens idea. When is a good time for the metrics team to get involved? Should we wait until you have a better idea what you want? Or should we help with the bikeshedding right now? :)
I think we've still got a bit of work to do before we will know enough about where we want to go to include the metrics team. We have a few other tickets we need to cover first before we will even have the data we need at the broker for some of these metrics.
It sounds like we have a few things we want to achieve/learn from collected metrics:
Detect censorship events
Allow current or potential proxies to see if they are needed
Allow clients to see whether their connection issues are due to censorship or proxy availability
Help us figure out whether we should be doing something different in distributing proxies to clients
These all seem like good goals.
We currently collect and "publish" information on:
how many snowflakes are currently available, along with their SIDs (available at the broker /debug handler). This is good for more detailed monitoring of censorship events: we already collect bridge usage metrics, and collecting broker usage metrics as well will narrow down where the censorship is happening.
country stats of domain-fronted client connections (logged, most recent snapshot at broker /debug)
the roundtrip time it takes for a client to connect to get a snowflake proxy answer (available at broker /debug)
Should we already be archiving this data?
Some of the metrics mentioned above will be easier to implement than others. The best place to collect statistics is at the broker, but some of the data mentioned would require proxies to report metrics to the broker for collection. We have to be a bit careful with this since anyone can run a proxy. It will also impact the decisions we make for #29207 (moved).
We collect a lot of statistics at relays and bridges, which anyone can run. We are working on methods of improving robustness against these statistics being manipulated, but so far we have not detected anyone reporting abnormal values. It is good to have criteria for what values you would expect, based on the stats others report, so that anomalies can be detected. For example, we would expect bandwidth usage among relays to be proportional to consensus weight.
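For illustration, here is a toy sketch (in Go, with made-up function and parameter names) of the kind of proportionality check described above: flag relays whose share of reported bandwidth strays too far from their share of consensus weight.

```go
package metrics

// flagAnomalies returns the indices of relays whose share of total
// reported bandwidth deviates from their share of total consensus
// weight by more than tol. This is a toy version of the criterion
// above; it assumes both totals are nonzero.
func flagAnomalies(bandwidth, weight []float64, tol float64) []int {
	var totalBW, totalW float64
	for i := range bandwidth {
		totalBW += bandwidth[i]
		totalW += weight[i]
	}
	var flagged []int
	for i := range bandwidth {
		diff := bandwidth[i]/totalBW - weight[i]/totalW
		if diff > tol || diff < -tol {
			flagged = append(flagged, i)
		}
	}
	return flagged
}
```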
I would also be interested in stats about users and usage (including e.g. number of users being handled divided by number of snowflakes handling them)
This is a bit tricky. The broker knows which proxies it hands out to users, but it doesn't know the state of the clients' connections to those proxies (e.g., when they have been closed). It's also worth noting that different "types" of proxies (standalone vs. browser-based) can handle different numbers of users at once. Perhaps a more useful metric would be for snowflake proxies to advertise to the broker how many available "slots/tokens" they have when they poll for clients. This could be added to the broker--proxy WebSocket protocol. It would also avoid collecting more data on clients, which is generally safer.
This sounds like a reasonable approach. You might want to take a look at:
This will give you an idea of how we do this for other parts of Tor.
how many times are you giving snowflakes out? How many times did you stop giving a snowflake out because you've given it out so many times already? These questions tie into the address distribution algorithm question
Can this also be an indirect measurement of number of users?
The above comment addresses this as well. The broker doesn't really decide whether or not it has given a snowflake out too many times. I think what's more important to deciding whether we are giving out proxies in a good way is to measure how "reliable" individual proxies have been in the past. This is related to setting up persistent identifiers (#29260 (moved)).
For relays, directory authorities track the mean time between failures, and we track this in Tor Metrics too.
It might also be interesting to have some kind of proxy diversity metric (e.g., whether 90% of all connections are handled by the same proxy). We can get some idea with persistent identifiers (#29260 (moved)), but of course using a persistent identifier will always be optional. We can also do collection of geoip country stats of proxies.
We don't really have this metric for relays yet, so if you have ideas that would be applicable to relays too then that would be great. We know about country/AS distribution, but we haven't quantified the diversity using any particular formula.
Log all of the statistics in a reasonable format
This would ideally be a format that Tor Metrics is already handling. If it could be based on the Tor directory protocol meta-format (§1.2 dir-spec) then that would be great. We don't want to bring in dependencies for parsing yaml/toml/etc. if we can help it.
coordinate with the metrics team to get these metrics collected and visualized somewhere
Please also coordinate on what you want to collect, so we can consider if that information already comes from somewhere, if we already had a plan for it, and if it is safe or not.
In the interest of having a minimum viable product I think it makes sense to start with making sure that what we have now is safe and can be archived by Tor metrics. Some useful statistics (like how long proxies remain available and how many unique proxies we give out) will have to wait for #29260 (moved) to be implemented which in turn depends on a few different things.
So, the metrics we are currently (or will soon be) able to collect are:
Number of currently available snowflake proxies
GeoIP stats (and, with that, the total number) of snowflake proxies we actually hand out to clients
Figure out whether these metrics are safe and good
Get these metrics in a format that Tor Metrics can handle and display
So my questions for the metrics team are:
Does it make sense to use privacy-preserving ways of counting available proxies? (We are definitely not going to collect or export any client data at this point in time.)
I have a vague memory of a trac ticket, wiki page, or email that summarized something to the effect of "this is how to get Tor metrics to archive your data" but I can't find it. Other than following the Tor directory protocol meta-format, do you have any other advice on how to format our data?
Number of currently available snowflake proxies is not sensitive. We do not make any efforts to hide the numbers of relays or bridges, and so this can be an exact count. The question here is not the count resolution but the time resolution. (Sorry to answer your question with a question.)
If I'm an attacker, can I learn anything about a client if I can observe the client's traffic and the exact count of snowflakes? For example, what do I learn if a snowflake that a client is using disappears? I'm not sure what the snowflake protocol does in this case.
I'm not sure what you mean with the GeoIP stats. If these are stats regarding the locations of proxies, again exact counts would be fine and would be in line with what we do for relays and bridges at the moment. If this is for clients, we should aim to provide differential privacy. I fear that at the moment, we are not seeing enough users that we can safely report GeoIP stats (usefully) for clients at all. With relays and bridges, we round the counts up to the nearest multiple of 8.
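For reference, rounding a count up to the nearest multiple of 8 is a one-line computation; a minimal Go sketch (the function name is ours, not from any codebase):

```go
package metrics

// binCount rounds a count up to the nearest multiple of 8, matching
// the binning used for relay and bridge user counts:
// binCount(1) == 8, binCount(8) == 8, binCount(9) == 16.
func binCount(n int) int {
	return ((n + 7) / 8) * 8
}
```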
Round trip time of snowflake rendezvous sounds like a really useful metric for engineering work, but a dangerous one for safety. This would be a good candidate for PrivCount but without such a technique I wouldn't do this one. We currently measure performance of relays using active measurement, such that we are only analyzing our own traffic. We have extended that tool, OnionPerf, to also work for pluggable transports but it will do the end-to-end performance not just client->snowflake.
Can you lay out in detail exactly what metrics you'd want, what resolution data you want (both in counts and in time) and what you might consider an attacker could learn, assuming they are in a position to monitor, or are running, a point in the network?
Section 2.1.2 of dir-spec contains some examples of descriptions of metrics.
Replying to irl:
Thanks for this feedback; it was very helpful. What makes snowflake statistics a little more complex than bridge or relay stats is that, while stats about how many times a bridge was used don't closely reflect individual client usage, each proxy handles only a single client or a small, fixed number of clients (determined by the individual proxy), so there's a greater possibility of data leakage.
Number of currently available snowflake proxies is not sensitive. We do not make any efforts to hide the numbers of relays or bridges, and so this can be an exact count. The question here is not the count resolution but the time resolution. (Sorry to answer your question with a question.)
If I'm an attacker, can I learn anything about a client if I can observe the client's traffic and the exact count of snowflakes? For example, what do I learn if a snowflake that a client is using disappears? I'm not sure what the snowflake protocol does in this case.
Possibly, as stated above, it depends on what type of proxy you are and how it's set up. I think we're better off doing binning in this case. However, as stated below, if we collect at a granularity of every 24 hours this shouldn't leak client usage.
I'm not sure what you mean with the GeoIP stats. If these are stats regarding the locations of proxies, again exact counts would be fine and would be in line with what we do for relays and bridges at the moment. If this is for clients, we should aim to provide differential privacy. I fear that at the moment, we are not seeing enough users that we can safely report GeoIP stats (usefully) for clients at all. With relays and bridges, we round the counts up to the nearest multiple of 8.
We're absolutely not collecting geoip stats of clients. These are only of snowflake proxies. I originally thought to include geoip stats of proxies that are actually handed out but it's safer to do stats for available proxies since this shouldn't leak client usage if collected over a period of 24 hours.
Round trip time of snowflake rendezvous sounds like a really useful metric for engineering work, but a dangerous one for safety. This would be a good candidate for PrivCount but without such a technique I wouldn't do this one. We currently measure performance of relays using active measurement, such that we are only analyzing our own traffic. We have extended that tool, OnionPerf, to also work for pluggable transports but it will do the end-to-end performance not just client->snowflake.
That's fair; this is really only available for debugging purposes. We don't need to export it as a metric, and I'd argue that it should only be logged locally.
Can you lay out in detail exactly what metrics you'd want, what resolution data you want (both in counts and in time) and what you might consider an attacker could learn, assuming they are in a position to monitor, or are running, a point in the network?
It looks like bridge stats default to 24 hours; that seems reasonable for snowflake as well.
Section 2.1.2 of dir-spec contains some examples of descriptions of metrics.
To summarize, and be more precise about what we want to collect, I've put our proposed exported metrics in the Tor Directory Protocol Format (a serialization sketch follows the summary list below):

"snowflake-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
    [At most once.]
    YYYY-MM-DD HH:MM:SS defines the end of the included measurement interval of length NSEC seconds (86400 seconds by default).

"snowflake-ips" CC=NUM,CC=NUM,... NL
    [At most once.]
    List of mappings from two-letter country codes to the number of unique IP addresses of available snowflake proxies, rounded up to the nearest multiple of 8.

"snowflake-available-count" NUM NL
    [At most once.]
    A count of the number of unique IP addresses corresponding to currently available snowflake proxies, rounded up to the nearest multiple of 8.

"snowflake-usage-count" NUM NL
    [At most once.]
    A count of the number of snowflake proxies that have been handed out by the broker to clients, rounded up to the nearest multiple of 8.
So in short, we'd collect over a 24 hour period:
geoip stats of unique available snowflake proxies
approximated count of unique, available snowflake proxies
approximated count of the number of proxies handed to snowflake clients (which would also be the same as the total number of client requests).
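To make the proposed format concrete, here is a sketch of how the broker might serialize one measurement interval. It assumes the keyword/value line format proposed above; the Go function and its signature are illustrative, not code from the snowflake repository.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"time"
)

// writeSnowflakeStats renders one measurement interval in the
// keyword/value line format proposed above. countryCounts maps
// two-letter country codes to already-binned proxy counts.
func writeSnowflakeStats(end time.Time, interval time.Duration,
	countryCounts map[string]int, available, usage int) string {
	var b strings.Builder
	fmt.Fprintf(&b, "snowflake-stats-end %s (%d s)\n",
		end.UTC().Format("2006-01-02 15:04:05"), int(interval.Seconds()))
	// Emit country codes in sorted order so the output is stable.
	codes := make([]string, 0, len(countryCounts))
	for cc := range countryCounts {
		codes = append(codes, cc)
	}
	sort.Strings(codes)
	pairs := make([]string, 0, len(codes))
	for _, cc := range codes {
		pairs = append(pairs, fmt.Sprintf("%s=%d", cc, countryCounts[cc]))
	}
	fmt.Fprintf(&b, "snowflake-ips %s\n", strings.Join(pairs, ","))
	fmt.Fprintf(&b, "snowflake-available-count %d\n", available)
	fmt.Fprintf(&b, "snowflake-usage-count %d\n", usage)
	return b.String()
}

func main() {
	fmt.Print(writeSnowflakeStats(time.Now(), 24*time.Hour,
		map[string]int{"de": 8, "us": 16}, 24, 8))
}
```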
I've cc'd dcf and arlolra to see if they have thoughts on this.
I just thought of another thing that makes snowflake quite different from relays or bridges: the expected amount of time that each proxy will be online. It might make more sense to shorten the collection period from 24 hours in this case. Another thing we can do is have the broker export an additional metric, something like the average amount of time a unique proxy was available.
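One possible way to measure that at the broker, assuming all we have to go on are poll timestamps: track the first and most recent poll per proxy and average the spans. A hypothetical Go sketch (all identifiers are ours):

```go
package metrics

import "time"

// pollSpan tracks the first and latest poll seen from one proxy.
type pollSpan struct {
	first, last time.Time
}

// RecordPoll updates the span for a proxy identified by id (here an
// IP address; a persistent identifier would work too).
func RecordPoll(spans map[string]*pollSpan, id string, now time.Time) {
	if s, ok := spans[id]; ok {
		s.last = now
	} else {
		spans[id] = &pollSpan{first: now, last: now}
	}
}

// MeanAvailability returns the average observed lifetime of proxies.
func MeanAvailability(spans map[string]*pollSpan) time.Duration {
	if len(spans) == 0 {
		return 0
	}
	var total time.Duration
	for _, s := range spans {
		total += s.last.Sub(s.first)
	}
	return total / time.Duration(len(spans))
}
```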
Also noting that mapping unique proxies to IP addresses isn't quite how snowflake is supposed to work: we'll probably want some kind of persistent identifier later as mentioned in the comments above, but I think it will work fine for our purposes now.
Also noting that mapping unique proxies to IP addresses isn't quite how snowflake is supposed to work: we'll probably want some kind of persistent identifier later as mentioned in the comments above, but I think it will work fine for our purposes now.
Identifying by IP address is interesting by itself, though. If IP address is the expected basis of blocking, then it's interesting to consider a proxy's "identity" as IP address because that's the information available to a censor. A long-term identity independent of IP address is also interesting, but less informative for measuring blocking resistance IMO.
After looking at #30731 (moved), I want to change the proposed collected metrics to match the data shown in those graphs. I modified the R script in those tickets slightly to bin to the nearest multiple of 8, and the results are almost identical (due to how often proxies are polling). We could get away with even coarser binning, but I think the multiple-of-8 method does enough to disguise individual client traffic.
In addition to this, I think it would be interesting to collect geoip data on the available proxies. I don't want to bin the snowflake proxy counts at this time because we have too few proxies for a binned count to be useful; maybe we'd want to add binning later, but I don't think we need it from a client safety perspective. The overall proxy count won't leak client information because it only measures whether a proxy has polled at all in a 24 hour period, not whether it was given out (which would leak client usage data).
So I'll propose the following metrics (gathered at a granularity of every 24 hours):

"snowflake-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
    [At most once.]
    YYYY-MM-DD HH:MM:SS defines the end of the included measurement interval of length NSEC seconds (86400 seconds by default).

"snowflake-ips" CC=NUM,CC=NUM,... NL
    [At most once.]
    List of mappings from two-letter country codes to the number of unique IP addresses of snowflake proxies that have polled.

"snowflake-idle-count" NUM NL
    [At most once.]
    A count of the number of times a proxy has polled but received no client offer, rounded up to the nearest multiple of 8.

"client-denied-count" NUM NL
    [At most once.]
    A count of the number of times a client has requested a proxy from the broker but no proxies were available, rounded up to the nearest multiple of 8.

"client-snowflake-match-count" NUM NL
    [At most once.]
    A count of the number of times a client successfully received a proxy from the broker, rounded up to the nearest multiple of 8.
I'm going to start implementing these metrics and meanwhile put this in needs_review for the metrics team to look at.
Looks good to me! We're still waiting for the metrics team to review the statistics, right? I'll just remove myself as reviewer and keep the ticket in needs_review.
While we're waiting for review, I propose we deploy these changes to the broker and start collecting the metrics data locally so we can take a look at it.
Was there a reason for removing "snowflake-available-count"? This number is going to be the same as the sum of all country codes in "snowflake-ips", but it would probably be nice to have this in addition to be able to see at a glance.
I can follow your thought processes and I think that these metrics described in comment:19, and also snowflake-available-count from comment:14 would be OK to make public. Nothing is jumping out as particularly sensitive.
Is it possible to run two snowflake proxies from the same IP address? There does seem to be an implied limit of 1 proxy per IP address in your metrics descriptions. Maybe from a perspective of whether a bridge is burned or not, the fact that two processes may be running on the same IP doesn't matter because they would both be burned together?
Was there a reason for removing "snowflake-available-count"? This number is going to be the same as the sum of all country codes in "snowflake-ips", but it would probably be nice to have this in addition to be able to see at a glance.
I opted for snowflake-idle-count and snowflake-client-match-count instead, since I think this gives us the information we'd want to use snowflake-available-count for anyway. I'm not opposed to exporting another stat on the available snowflakes, I'll add the code for that back in shortly.
I can follow your thought processes and I think that these metrics described in comment:19, and also snowflake-available-count from comment:14 would be OK to make public. Nothing is jumping out as particularly sensitive.
Is it possible to run two snowflake proxies from the same IP address? There does seem to be an implied limit of 1 proxy per IP address in your metrics descriptions. Maybe from a perspective of whether a bridge is burned or not, the fact that two processes may be running on the same IP doesn't matter because they would both be burned together?
It is possible to run multiple snowflakes on a single IP. Only the country code stats (and the total available snowflakes, which I'll add back in) are unique by IP. The snowflake-idle-count and snowflake-client-match-count are not unique by IP and would reflect one IP address running multiple snowflakes. I think splitting the metrics in this way makes sense: the unique-by-IP ones tell us information that's useful for studying censorship or blocking by IP, and the ones that aren't unique by IP tell us useful information about load on the system.
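To make the split concrete, here is a minimal Go sketch of the two kinds of counters; the type and field names are illustrative, not the actual broker code. Unique-by-IP stats live in a set keyed by IP, while load stats are plain event counts.

```go
package metrics

// brokerStats separates counts that are unique by IP (useful for
// reasoning about blocking) from raw event counts (useful for
// reasoning about load). All values would be binned before export.
type brokerStats struct {
	proxyIPs   map[string]bool // unique proxy IPs seen this interval
	idleCount  int             // proxy polls that got no client offer
	matchCount int             // client requests matched with a proxy
}

func newBrokerStats() *brokerStats {
	return &brokerStats{proxyIPs: make(map[string]bool)}
}

// recordProxyPoll notes one poll from a proxy at the given IP.
// Multiple proxies behind one IP count once in proxyIPs, but every
// poll contributes to the load counters.
func (s *brokerStats) recordProxyPoll(ip string, matched bool) {
	s.proxyIPs[ip] = true
	if matched {
		s.matchCount++
	} else {
		s.idleCount++
	}
}
```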
I'm putting this back into needs_revision to add the total available snowflake stats. I'll get a code review on that once I complete it, and then I'm tempted to close out this ticket and open a new one for the next steps in hooking these metrics outputs to whatever the metrics team needs to publish these.
Was there a reason for removing "snowflake-available-count"? This number is going to be the same as the sum of all country codes in "snowflake-ips", but it would probably be nice to have this in addition to be able to see at a glance.
I opted for snowflake-idle-count and snowflake-client-match-count instead, since I think this gives us the information we'd want to use snowflake-available-count for anyway. I'm not opposed to exporting another stat on the available snowflakes, I'll add the code for that back in shortly.
I'm putting this back into needs_revision to add the total available snowflake stats. I'll get a code review on that once I complete it, and then I'm tempted to close out this ticket and open a new one for the next steps in hooking these metrics outputs to whatever the metrics team needs to publish these.
Having the total count is also a good way to make sure the GeoIP code isn't doing something strange. Looking at UpdateCountryStats I don't think there will be any issues here, because you've got good error checking and a fallback option. Separating things that depend on GeoIP databases from things that don't is sometimes a good idea though.