Opened 6 years ago

Last modified 5 days ago

#9316 assigned task

BridgeDB should export statistics

Reported by: asn
Owned by: phw
Priority: Medium
Milestone:
Component: Circumvention/BridgeDB
Version:
Severity: Normal
Keywords: metrics, bridgedb, prometheus, ex-sponsor-19, anti-censorship-roadmap
Cc: metrics-team, phw
Actual Points:
Parent ID: #19332
Points: 3
Reviewer:
Sponsor: Sponsor30-must

Description

BridgeDB should export statistics on its usage. Stuff like distributor usage, number of clients served, etc.

Change History (25)

comment:1 Changed 6 years ago by sysrqb

Keywords: important added

comment:2 Changed 6 years ago by isis

Cc: isis@… added
Keywords: metrics bridges added; important removed
Owner: set to isis
Parent ID: #9199
Status: new → assigned

comment:3 Changed 6 years ago by isis

Related: #7525 Design a system for tracking bridge assignment metrics.

comment:4 Changed 6 years ago by isis

Status: assigned → needs_revision

Closing #9317 as a duplicate of this one, putting the information from that ticket on this one, and setting as 'needs revision' because I haven't tested or looked at this branch in a while.

Quoting #9317:

While writing bridgedb's logger, I made a context manager for storing a state dictionary. It is, so far, rather loosely defined, but it would allow us to gather free statistics on bridgedb. Essentially, you would use it like so:

from bridgedb import log as logging
logging.callWithContext(myfoocontext, {'addBridgeAssignment': foobridge})

It is also safely threadable, so it would be possible to use this to retrieve debugging information from threads, for instance for #5232.

The nice thing about this is that it is easily called from the logger (and will still handle log levels and all the other added features from #9199). The bad thing is that if it is not written very clearly, it could be difficult for other/new people reading the code to understand, especially if they are not familiar with Twisted.

Part of this was also discussed between myself and Karsten on tor-assistants@…, earlier this month, in the "BridgeDB data for metrics" thread.

comment:5 Changed 5 years ago by isis

Keywords: bridgedb added; bridges removed
Parent ID: #9199

comment:6 Changed 4 years ago by isis

Arma commented on !#4771 that we should also be tracking the "successfulness" of each distributor:

I would define success of a distribution strategy as a function of how many people are using the bridges that are given out by that strategy.

That means if a strategy never gives bridges to anybody, it would score low. And if it gives out a lot of bridges but they never get used because they got blocked, it would also score low.

If we wanted to get fancier, we would then have a per-country success value. And then we could compare distribution strategies for a given country.

The intuition comes from Damon's Proximax paper from long ago.

comment:7 Changed 19 months ago by teor

Severity: Normal

Set all open tickets without a severity to "Normal"

comment:8 Changed 5 months ago by gaba

Cc: isis@… removed
Owner: isis deleted
Points: 3
Sponsor: Sponsor19
Status: needs_revision → assigned

comment:9 Changed 5 months ago by karsten

Cc: metrics-team added

sysrqb and I discussed this topic in Mexico City. IIRC, we said that sysrqb would send me 24 hours of logs by encrypted email (the logs can easily be non-recent and heavily obfuscated), and that I would use those logs to suggest a possible statistics format on tor-dev@. sysrqb, want to send me those logs, so that I can move things forward as time permits?

comment:10 Changed 4 months ago by dgoulet

Owner: set to dgoulet

comment:11 Changed 4 months ago by irl

Parent ID: #19332

This is required to exist before metrics team can archive them in CollecTor.

comment:12 Changed 4 months ago by gaba

Milestone: Network Team 2019 Q1Q2

comment:13 Changed 4 months ago by gaba

Keywords: network-team-roadmap-2019-Q1Q2 added

comment:14 Changed 4 months ago by gaba

Milestone: Network Team 2019 Q1Q2

comment:15 Changed 2 months ago by phw

Cc: phw added

comment:16 Changed 2 months ago by phw

Here's a preliminary list of statistics that we may want, and why we want them. Needless to say, we need to figure out how to collect these statistics safely.

  • Approximate number of successful requests per distribution mechanism, per country, per bridge type.
    • This shows us the demand for bridges over time, and how much use BridgeDB sees.
    • It also teaches us what distribution mechanisms are the most useful (or at least popular).
  • Approximate number of denied requests per distribution mechanism, per country, per bridge type.
    • This may show us if people are interacting with BridgeDB unsuccessfully, despite good intentions.
    • It may also show us if somebody is trying to game the system.
    • Unfortunately, it's difficult to tell apart well-intentioned misuse from ill-intentioned misuse.
  • Approximate number of email requests per provider, per bridge type.
    • This would help us decide what email providers we should pay attention to.
    • This would also teach us what providers we could safely retire. For example, over at #28496, we are thinking about removing Yahoo. What fraction of requests would be affected by this?
  • Approximate number of HTTPS requests coming from proxies.
    • This may be an indicator of people trying to game the system.
  • Maybe the number of bridges per transport in BridgeDB (see #14453).

What am I forgetting?
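
A minimal sketch of how the "approximate number of requests per distribution mechanism, per country, per bridge type" could be counted. Everything here is an assumption for illustration: the class and function names are hypothetical and not part of BridgeDB, and the bin size of 10 with rounding up to the next multiple is just one possible way to make the numbers approximate.

```python
from collections import Counter

BIN_SIZE = 10  # assumed bin width; not taken from BridgeDB


def round_up_to_bin(count, bin_size=BIN_SIZE):
    """Round a non-negative count up to the next multiple of bin_size (0 stays 0)."""
    return ((count + bin_size - 1) // bin_size) * bin_size


class RequestStats:
    """Hypothetical in-memory counter keyed by the dimensions listed above."""

    def __init__(self):
        self._counts = Counter()

    def record(self, distributor, country, bridge_type, success):
        # Exact counts are kept only in memory and never exported directly.
        outcome = "success" if success else "fail"
        self._counts[(distributor, country, bridge_type, outcome)] += 1

    def export(self):
        # Only binned values leave this object.
        return {key: round_up_to_bin(n) for key, n in self._counts.items()}


stats = RequestStats()
for _ in range(3):
    stats.record("https", "us", "obfs4", success=True)
for _ in range(11):
    stats.record("email", "de", "vanilla", success=False)
print(stats.export())
```

Binning alone does not give a quantifiable privacy guarantee; it only hides small changes inside a bin.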

comment:17 Changed 2 months ago by phw

Keywords: prometheus added

I briefly discussed this with dgoulet and sysrqb. dgoulet suggested that we may want to export these statistics to our prometheus instance. The idea is to run an exporter on the BridgeDB host. This exporter would only expose the latest BridgeDB stats.
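
To make the exporter idea more concrete, here is a rough sketch that renders already-aggregated stats in the Prometheus text exposition format using only the standard library. The metric name `bridgedb_requests_binned` and the label names are made up for illustration, and a real exporter would more likely use a Prometheus client library than hand-rolled formatting.

```python
def escape_label_value(value):
    """Escape backslash, double quote, and newline as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(name, help_text, samples):
    """Render (labels_dict, value) samples as one gauge metric family."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label_value(str(val))}"'
            for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric and label names; only the latest stats are rendered.
print(render_metrics(
    "bridgedb_requests_binned",
    "Binned request count.",
    [({"distributor": "https", "country": "us"}, 10)],
))
```

The metric is typed as a gauge because binned values only reflect the latest snapshot and are not guaranteed to behave like a monotonic counter.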

comment:18 Changed 8 weeks ago by gaba

Keywords: network-team-roadmap-2019-Q1Q2 removed

comment:19 in reply to:  16 Changed 8 weeks ago by dcf

Replying to phw:

Here's a preliminary list of statistics that we may want, and why we want them. Needless to say, we need to figure out how to collect these statistics safely.

If it's possible, I would like to have a guess at what fraction of bridge requesters are bots. Proxy-distribution papers usually assume that an adversary controls some fraction of the users--it would be great to know what the fraction is in this case. For example Mahdian2010a "n users, k of whom [are] adversaries," Wang2013a "Let f denote the fraction of malicious users among all potential bridge users.... We expect a typical value of f between 1% and 5%...."

Here are some possible ways to identify bots:

  • IP address clustering--for example if BridgeDB considers all addresses in a /24 the same, find the most commonly occurring /20
  • auto-generated email addresses following a pattern
    • to start, you could make a histogram of the lengths of email addresses, and see if it's concentrated at a single point. or count the frequency of short prefixes and suffixes of email address local-parts, and see if there are any that appear overwhelmingly more often than others.
  • an anachronistic HTTP User-Agent (for example, Chrome from 2 years ago, when most real Chrome users auto-update)
  • inconsistent HTTP headers, for example Chrome or Firefox without Accept-Encoding: gzip

With some sort of bot-classification heuristic, then it would be good to analyze the statistics you mentioned already (e.g. fraction allowed/denied) for bot and non-bot requests.
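
Two of the heuristics above (address clustering and email local-part patterns) could be prototyped roughly like this. This is only a sketch: the function names are hypothetical, the inputs are assumed to be plain strings pulled from logs, and nothing here classifies a request as a bot on its own.

```python
import ipaddress
from collections import Counter


def most_common_network(ips, prefix_len=20):
    """Return (network, count) for the most frequent /prefix_len, or None."""
    nets = Counter(
        ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False) for ip in ips
    )
    return nets.most_common(1)[0] if nets else None


def local_part(address):
    """Everything before the last '@' of an email address."""
    return address.rsplit("@", 1)[0]


def local_part_length_histogram(addresses):
    """Histogram of local-part lengths; a single sharp peak may hint at a generator."""
    return Counter(len(local_part(a)) for a in addresses)


def prefix_counts(addresses, n=3):
    """Frequency of the first n characters of each local part."""
    return Counter(local_part(a)[:n] for a in addresses)


ips = ["10.0.1.1", "10.0.2.2", "10.0.15.9", "192.168.1.1"]
print(most_common_network(ips))

emails = ["abc123@example.com", "abc789@example.com", "bob@example.org"]
print(local_part_length_histogram(emails))
print(prefix_counts(emails))
```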

I would like to see a graph that shows how long it takes for a single bridge to be given to n different requesters. When BridgeDB starts distributing a bridge, how long does it take before 5 people know about it? Before 50 people know about it?
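
As a sketch of the data behind such a graph: given a log of hand-out events for one bridge, the time until the n-th distinct requester could be computed as below. The event format (a timestamp in seconds plus some pseudonymous requester identifier) is an assumption; how to obtain such identifiers safely is a separate question.

```python
def time_to_n_requesters(events, n):
    """Seconds from the first hand-out until the n-th distinct requester.

    events: iterable of (timestamp_seconds, requester_id) for a single bridge.
    Returns None if fewer than n distinct requesters ever received the bridge.
    """
    seen = set()
    first_ts = None
    for ts, requester in sorted(events, key=lambda event: event[0]):
        if first_ts is None:
            first_ts = ts
        if requester not in seen:
            seen.add(requester)
            if len(seen) == n:
                return ts - first_ts
    return None


# Hypothetical events for one bridge: (seconds since epoch, requester id).
events = [(0, "a"), (10, "b"), (5, "a"), (30, "c")]
for n in (1, 2, 3, 4):
    print(n, time_to_n_requesters(events, n))
```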

  • Approximate number of HTTPS requests coming from proxies.
    • This may be an indicator of people trying to game the system.

On this point, specifically I would want to know what fraction of requests have an X-Forwarded-For or Via header, and how many entries it contains. I mention this because not only can these headers indicate the use of a proxy, a client may also spoof them. And I seem to remember that BridgeDB may process X-Forwarded-For incorrectly, like it reads the entries in the wrong order when there are multiple of them.

For this analysis, you will have to be aware that requests via Moat always have at least one X-Forwarded-For (I believe), because Moat is implemented using an Apache ProxyPass reverse proxy and Apache adds that header.
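
To illustrate the ordering issue: each proxy appends the address it received the request from to the end of X-Forwarded-For, so only the rightmost entries (one per trusted proxy) can be relied on, and anything further left may have been supplied by the client. The helpers below are a hypothetical sketch of that reading and of the entry-count statistic; they are not BridgeDB's code, and the default of one trusted proxy is an assumption.

```python
from collections import Counter


def forwarded_for_entries(header_value):
    """Split an X-Forwarded-For value into its comma-separated entries."""
    if not header_value:
        return []
    return [entry.strip() for entry in header_value.split(",") if entry.strip()]


def client_address(header_value, trusted_proxies=1):
    """Address appended by the outermost trusted proxy, counted from the right."""
    entries = forwarded_for_entries(header_value)
    if trusted_proxies < 1 or len(entries) < trusted_proxies:
        return None
    return entries[-trusted_proxies]


def entry_count_histogram(header_values):
    """How many requests carried 0, 1, 2, ... X-Forwarded-For entries."""
    return Counter(len(forwarded_for_entries(v)) for v in header_values)


# "203.0.113.7" could be spoofed by the client; "198.51.100.2" is what the
# single trusted reverse proxy actually saw.
print(client_address("203.0.113.7, 198.51.100.2"))
print(entry_count_histogram(["198.51.100.2", "203.0.113.7, 198.51.100.2", None]))
```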

comment:20 Changed 7 weeks ago by phw

Owner: changed from dgoulet to phw

comment:21 Changed 7 weeks ago by phw

I posted a draft proposal for Tor's research safety board on our mailing list.

comment:22 Changed 12 days ago by gaba

Keywords: ex-sponsor-19 added

Adding the keyword to mark everything that didn't fit into the time for sponsor 19.

comment:23 Changed 11 days ago by phw

Sponsor: Sponsor19 → Sponsor30-must

Moving from Sponsor 19 to Sponsor 30.

comment:24 Changed 11 days ago by gaba

Keywords: anti-censorship-roadmap added

comment:25 Changed 5 days ago by phw

We just heard back from Tor's Research Safety Board. You can find the response below. The reviewer writes that our proposal wouldn't be an issue in a one-off setting but could be problematic in the long run. I think a reasonable way forward would be to implement the proposal, run it in a one-off setting for, say, a week, and then evaluate if we should change data collection. In the long run, we should also transition to PrivCount as the reviewer mentions.

Tor Research Safety Board Paper #20 Reviews and Comments
===========================================================================
Paper #20 Collecting BridgeDB usage statistics


Review #20A
===========================================================================
* Updated: 11 Jun 2019 6:02:53pm EDT

Overall merit
-------------
4. Accept

Reviewer expertise
------------------
3. Knowledgeable

Paper summary
-------------
The document proposes collecting a new set of usage statistics through data
available from the operation of BridgeDB. The statistics would be useful for
better prioritizing development tasks, to improve reaction time to bridge
enumeration attacks and blockages, to reduce failure rates, and to help promote
censorship circumvention research.

Comments for author
-------------------
If this was a short term study, I would say go for it, no questions asked. The
benefits are clear and I agree that they outweigh the risks.

However, I think it was implied (although not explicitly stated) that the new
statistics would be regularly collected and published on an ongoing basis. I
think there are more risks associated with such an ongoing collection as opposed
to a one-off or short term study, so we should carefully consider the trade-offs
between cost/effort of safer collection methods with the privacy benefits of
such methods.

The most concerning statistics to me are the per-country statistics and the
per-service (gmail, yahoo, etc.) statistics. I think it is clear from Sections 3
and 4 that you understand the risks associated with collecting these statistics:
a single user from an unpopular country could be identified because the 1-10
bucket suddenly changed from a 0 count to a 1 count. This issue might also exist
if unpopular email service providers are selected. This issue is already present
in Tor's per-country user statistics, and I believe there is a plan to
transition away from these statistics because of the safety concerns. The
bucketing proposal (round to the nearest 10) does provide some uncertainty, but
it's hard to reason about what protection it is providing.

In an ideal world, we would collect these statistics with a privacy-preserving
statistics collection tool. In fact, I think most if not all of these could be
collected with PrivCount (assuming it was extended to support the new event
types).

One useful thing about PrivCount is secure aggregation, meaning that if you have
multiple data collectors, you can securely count a total across all of them
without leaking individual inputs. In this case, it seems like there is only one
BridgeDB data source, so we would not benefit from PrivCount's secure
aggregation.

The other useful thing that PrivCount provides is differential privacy. This is
where you could get most of the benefit. Rather than rounding to 10 and not
knowing how much privacy that provides, you instead start by defining how much
privacy each statistic should achieve based on your operational environment
(these are called action bounds), and then PrivCount will add noise to the
statistics in a way that will guarantee differential privacy under those
constraints. If these constraints add too much noise for the resulting
statistics to be useful, then you have to consider if the measurement is too
privacy-invasive for the given actions you are trying to protect and therefore
you possibly shouldn't collect them.

Tor has PrivCount on the roadmap (I believe), so one option could be to
implement the non-PrivCount version now and eventually transition the statistics
to PrivCount. Another option would be to set up a PrivCount instance using the
open source tool rather than waiting for the PrivCount-in-Tor version to be
ready. In fact, if the data is collected at BridgeDB, then I'm not sure that
having PrivCount in Tor would help anyway (unless the BridgeDB runs Tor).

There has been some work to use PrivCount for measurement and also to explain
the process of defining action bounds. I think the most relevant is the IMC
paper:
    - https://torusage-imc2018.github.io
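
The contrast the review draws between rounding and calibrated noise can be illustrated with a small sketch. This is not PrivCount's implementation; it swaps in the textbook Laplace mechanism, and the `sensitivity` and `epsilon` parameters are illustrative assumptions that stand in only loosely for the action bounds and privacy budget the review describes.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_count(true_count, sensitivity, epsilon, rng=None):
    """Add Laplace noise with scale sensitivity / epsilon to a count.

    sensitivity: assumed maximum contribution of one protected action.
    epsilon: assumed privacy parameter; smaller values mean more noise.
    """
    rng = rng or random.SystemRandom()
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# A seeded generator is used here only to make the demonstration repeatable;
# it is not suitable for real releases.
rng = random.Random(0)
print(noisy_count(100, sensitivity=1, epsilon=1.0, rng=rng))
print(noisy_count(100, sensitivity=10, epsilon=1.0, rng=rng))
```

If the noise needed for a given bound swamps the counts for small countries or providers, that matches the reviewer's point that such statistics may not be worth collecting.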