Opened 6 years ago

Last modified 23 hours ago

#9316 assigned task

BridgeDB should export statistics

Reported by: asn
Owned by: dgoulet
Priority: Medium
Milestone:
Component: Obfuscation/BridgeDB
Version:
Severity: Normal
Keywords: metrics, bridgedb, network-team-roadmap-2019-Q1Q2, prometheus
Cc: metrics-team, phw
Actual Points:
Parent ID: #19332
Points: 3
Reviewer:
Sponsor: Sponsor19

Description

BridgeDB should export statistics on its usage. Stuff like distributor usage, number of clients served, etc.

Change History (17)

comment:1 Changed 6 years ago by sysrqb

Keywords: important added

comment:2 Changed 6 years ago by isis

Cc: isis@… added
Keywords: metrics bridges added; important removed
Owner: set to isis
Parent ID: #9199
Status: new → assigned

comment:3 Changed 6 years ago by isis

Related: #7525 Design a system for tracking bridge assignment metrics.

comment:4 Changed 6 years ago by isis

Status: assigned → needs_revision

Closing #9317 as a duplicate of this one, putting the information from that ticket on this one, and setting as 'needs revision' because I haven't tested or looked at this branch in a while.

Quoting #9317:

While writing bridgedb's logger, I made a context manager for storing a state dictionary. It is so far rather loosely defined, but it would allow us to gather free statistics on bridgedb. Essentially, you would use it like so:

from bridgedb import log as logging
logging.callWithContext(myfoocontext, {'addBridgeAssignment': foobridge})

It is also safely threadable, so it would be possible to use this to retrieve debugging information from threads, for instance for #5232.

The nice thing about this is that it is easily called from the logger (and will still handle log levels and all the other added features from #9199). The bad thing is that if it is not written very clearly, it could be difficult for other/new people reading the code to understand, especially if they are not familiar with Twisted.

Part of this was also discussed between myself and Karsten on tor-assistants@…, earlier this month, in the "BridgeDB data for metrics" thread.
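To illustrate the general idea of a per-thread state dictionary exposed through a context manager, here is a minimal standard-library sketch. It is not the `bridgedb.log` API from the branch mentioned above and it does not use Twisted; the names `stats_context`, `current_context`, and `record_assignment` are hypothetical and exist only for this example.

```python
import threading
from contextlib import contextmanager

# Each thread gets its own stack of context dictionaries (assumption:
# plain threads; this is not Twisted's context machinery).
_state = threading.local()


def current_context():
    """Return the merged context visible to the calling thread."""
    merged = {}
    for layer in getattr(_state, "stack", []):
        merged.update(layer)
    return merged


@contextmanager
def stats_context(**values):
    """Make `values` visible for the duration of the block, then drop them."""
    stack = getattr(_state, "stack", None)
    if stack is None:
        stack = _state.stack = []
    stack.append(values)
    try:
        yield current_context()
    finally:
        stack.pop()


def record_assignment(sink):
    # Hypothetical consumer: reads whatever the caller put in the context.
    sink.append(current_context().get("addBridgeAssignment"))
```

Code running inside `with stats_context(addBridgeAssignment=foobridge):` would see the value through `current_context()`, while code outside the block, or in another thread, would not.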

comment:5 Changed 5 years ago by isis

Keywords: bridgedb added; bridges removed
Parent ID: #9199

comment:6 Changed 4 years ago by isis

Arma commented on #4771 that we should also be tracking the "successfulness" of each distributor:

I would define success of a distribution strategy as a function of how many people are using the bridges that are given out by that strategy.

That means if a strategy never gives bridges to anybody, it would score low. And if it gives out a lot of bridges but they never get used because they got blocked, it would also score low.

If we wanted to get fancier, we would then have a per-country success value. And then we could compare distribution strategies for a given country.

The intuition comes from Damon's Proximax paper from long ago.
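One possible reading of this success metric, as a rough sketch: sum the estimated users of every bridge a distributor handed out, broken down by country. The input shapes (a list of `(distributor, bridge_id)` pairs and a per-bridge, per-country user estimate) and the function name are assumptions made for illustration; the ticket does not specify where such data would come from.

```python
from collections import defaultdict


def distributor_success(assignments, users_per_bridge):
    """Sum estimated users over the bridges each distributor handed out.

    assignments: iterable of (distributor, bridge_id) pairs (hypothetical).
    users_per_bridge: dict bridge_id -> {country: estimated users}
        (hypothetical; blocked or unused bridges simply have no users).
    Returns {distributor: {country: users}}.
    """
    scores = defaultdict(lambda: defaultdict(float))
    seen = set()
    for distributor, bridge_id in assignments:
        if (distributor, bridge_id) in seen:
            continue  # count each bridge once per distributor
        seen.add((distributor, bridge_id))
        scores[distributor]  # distributors whose bridges go unused still appear
        for country, users in users_per_bridge.get(bridge_id, {}).items():
            scores[distributor][country] += users
    return {d: dict(by_country) for d, by_country in scores.items()}
```

Under this reading, a distributor whose bridges are all blocked ends up with an empty per-country mapping (a score of zero), and a distributor that never hands out a bridge does not appear at all, which matches the two low-scoring cases described above.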

comment:7 Changed 17 months ago by teor

Severity: Normal

Set all open tickets without a severity to "Normal"

comment:8 Changed 3 months ago by gaba

Cc: isis@… removed
Owner: isis deleted
Points: 3
Sponsor: Sponsor19
Status: needs_revision → assigned

comment:9 Changed 3 months ago by karsten

Cc: metrics-team added

sysrqb and I discussed this topic in Mexico City. IIRC, we said that sysrqb would send me 24 hours of logs, which can easily be non-recent and heavily obfuscated and sent via encrypted email, and that I would use those logs to suggest a possible statistics format on tor-dev@. sysrqb, want to send me those logs, so that I can move things forward as time permits?

comment:10 Changed 2 months ago by dgoulet

Owner: set to dgoulet

comment:11 Changed 2 months ago by irl

Parent ID: #19332

These statistics need to exist before the metrics team can archive them in CollecTor.

comment:12 Changed 2 months ago by gaba

Milestone: Network Team 2019 Q1Q2

comment:13 Changed 2 months ago by gaba

Keywords: network-team-roadmap-2019-Q1Q2 added

comment:14 Changed 2 months ago by gaba

Milestone: Network Team 2019 Q1Q2

comment:15 Changed 2 weeks ago by phw

Cc: phw added

comment:16 Changed 2 weeks ago by phw

Here's a preliminary list of statistics that we may want, and why we want them. Needless to say, we need to figure out how to collect these statistics safely.

  • Approximate number of successful requests per distribution mechanism, per country, per bridge type.
    • This shows us the demand for bridges over time, and how much use BridgeDB sees.
    • It also teaches us what distribution mechanisms are the most useful (or at least popular).
  • Approximate number of denied requests per distribution mechanism, per country, per bridge type.
    • This may show us if people are interacting with BridgeDB unsuccessfully, despite good intentions.
    • It may also show us if somebody is trying to game the system.
    • Unfortunately, it's difficult to tell apart well-intentioned misuse from ill-intentioned misuse.
  • Approximate number of email requests per provider, per bridge type.
    • This would help us decide what email providers we should pay attention to.
    • This would also teach us what providers we could safely retire. For example, over at #28496, we are thinking about removing Yahoo. What fraction of requests would be affected by this?
  • Approximate number of HTTPS requests coming from proxies.
    • This may be an indicator of people trying to game the system.
  • Maybe the number of bridges per transport in BridgeDB (see #14453).

What am I forgetting?
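As a rough sketch of how the per-distributor, per-country, per-bridge-type counters above might be kept, the following class counts requests and only ever reports values rounded up to a multiple of a bin size. The class name, the key layout, and the bin size of 10 are assumptions for illustration; rounding alone is not a complete answer to the question of collecting these statistics safely.

```python
from collections import Counter


class RequestStats:
    """Count requests per (distributor, country, transport, success)."""

    def __init__(self, bin_size=10):
        self.bin_size = bin_size  # arbitrary choice for this sketch
        self.counts = Counter()

    def record(self, distributor, country, transport, success):
        # `success` is True for served requests and False for denied ones.
        self.counts[(distributor, country, transport, success)] += 1

    def snapshot(self):
        """Return the counts rounded up to the next multiple of bin_size."""
        b = self.bin_size
        return {key: -(-n // b) * b for key, n in self.counts.items()}
```

Only the output of `snapshot()` would be exported; the exact counts stay inside the process.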

comment:17 Changed 23 hours ago by phw

Keywords: prometheus added

I briefly discussed this with dgoulet and sysrqb. dgoulet suggested that we may want to export these statistics to our prometheus instance. The idea is to run an exporter on the BridgeDB host. This exporter would only expose the latest BridgeDB stats.
