Opened 8 years ago
Closed 6 years ago
#3261 closed project (implemented)
Analyze how wrong our bridge usage statistics are
Reported by: | karsten | Owned by: | karsten |
---|---|---|---|
Priority: | High | Milestone: | |
Component: | Metrics/Analysis | Version: | |
Severity: | | Keywords: | SponsorF20121101 |
Cc: | | Actual Points: | |
Parent ID: | | Points: | |
Reviewer: | | Sponsor: | |
Description
Our bridge usage statistics are based on the reports from bridges with at least 24 hours uptime. If bridges are shut down before that time, they won't tell us anything in order to protect their users' privacy.
We should find out what fraction of bridges have less than 24 hours continuous uptime to say how wrong our bridge usage statistics are.
See also Section 7 of our technical report for a more detailed description of the bridge usage statistics and their problems.
Child Tickets
Ticket | Status | Owner | Summary | Component |
---|---|---|---|---|
#5807 | closed | karsten | Propose better bridge usage statistics | Metrics/Analysis |
Attachments (3)
Change History (21)
comment:1 Changed 8 years ago by
Priority: | normal → major |
---|
comment:2 follow-up: 3 Changed 8 years ago by
I ran an analysis to see what fraction of bridges report statistics. More precisely, I looked at the fraction of relayed bytes per day that is covered by bridge statistics. In theory, this fraction should roughly correspond to the fraction of bridge users being included in our bridge statistics.
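The byte-weighted coverage described above can be sketched roughly as follows. This is only an illustration, not the analysis code used for the graph: the record layout `(day, bridge_id, relayed_bytes, has_stats)` and the function name are assumptions made for the example.

```python
from collections import defaultdict

def byte_weighted_coverage(records):
    """Return {day: fraction of relayed bytes covered by usage statistics}.

    records: iterable of (day, bridge_id, relayed_bytes, has_stats) tuples.
    This flattened layout is hypothetical; real input would be derived
    from bridge extra-info descriptors.
    """
    covered = defaultdict(int)
    total = defaultdict(int)
    for day, _bridge_id, relayed_bytes, has_stats in records:
        total[day] += relayed_bytes
        if has_stats:
            covered[day] += relayed_bytes
    # Skip days without any relayed bytes to avoid dividing by zero.
    return {day: covered[day] / total[day] for day in total if total[day] > 0}

# Made-up sample data: two bridges on two days.
sample = [
    ("2011-09-30", "A", 800, True),
    ("2011-09-30", "B", 200, False),
    ("2011-10-01", "A", 500, False),
    ("2011-10-01", "B", 500, True),
]
coverage = byte_weighted_coverage(sample)
```

With this weighting, a single fast bridge that reports statistics can dominate the day's fraction.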
See the attached graph. The fraction starts at around 10-20% in October 2010 and slowly increases to 20-30% in July 2011. The gaps in February and July 2011 come from Tonga downtimes. The rise to 50% in late September 2011 probably comes from a few fast and stable bridges pushing larger parts of bridge traffic.
Note that this analysis doesn't specifically look at the reasons why bridges don't report statistics. Possible reasons are less than 24 hours uptime, delay in descriptor publication (#4142), too old versions, etc. It's just meant as a general overview. More analysis needed.
comment:3 Changed 8 years ago by
Replying to karsten:
More precisely, I looked at the fraction of relayed bytes per day that is covered by bridge statistics. In theory, this fraction should roughly correspond to the fraction of bridge users being included in our bridge statistics.
Assuming most bridge users find out about a bridge via one of the bridgedb mechanisms, I think we should look at 'fraction of bridges' as the primary question rather than 'fraction of bytes'. Bridgedb doesn't look at capacity after all when deciding what addresses to give out.
So I would ask "Given this hour's networkstatus (written by Tonga), what fraction of the Running bridges never send us stats covering this hour?"
(Treating load as uniform across bridges is the wrong thing to do for users who learn their bridge through a non-bridgedb mechanism, like hearing from a friend what bridge they use. I wonder how we can estimate what fraction of bridge users learn about their bridge in what way. We could say that there probably aren't many such users because it involves manual interaction; or we could say that there aren't many users of the bridgedb approach because it gives out bridges that don't work in China so they're moot. I'm inclined toward the former.)
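The per-hour question posed in this comment could be sketched as below. It is a minimal illustration under assumptions: the inputs (a set of Running bridge identifiers and a mapping from bridge to reported statistics intervals) are hypothetical structures, and "covering this hour" is interpreted as an interval that contains the whole hour.

```python
from datetime import datetime, timedelta

def fraction_without_stats(running, stats_intervals, hour_start):
    """Fraction of Running bridges with no stats interval covering the hour.

    running: set of bridge identifiers marked Running in the hour's status.
    stats_intervals: dict of bridge id -> list of (start, end) datetimes for
        which the bridge eventually reported usage statistics (hypothetical).
    hour_start: datetime at the beginning of the hour in question.
    """
    if not running:
        return None
    hour_end = hour_start + timedelta(hours=1)
    missing = 0
    for bridge in running:
        # Assumption: an interval covers the hour only if it spans all of it.
        covered = any(start <= hour_start and end >= hour_end
                      for start, end in stats_intervals.get(bridge, []))
        if not covered:
            missing += 1
    return missing / len(running)

# Made-up example: four Running bridges, only "A" has stats for the hour.
hour = datetime(2011, 9, 30, 12)
running = {"A", "B", "C", "D"}
stats_intervals = {
    "A": [(datetime(2011, 9, 30, 0), datetime(2011, 10, 1, 0))],
    "B": [(datetime(2011, 9, 29, 13), datetime(2011, 9, 30, 12, 30))],
}
result = fraction_without_stats(running, stats_intervals, hour)  # 3 of 4 missing
```

Every Running bridge counts equally here, which matches the assumption that bridgedb hands out addresses without looking at capacity.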
Changed 7 years ago by
Attachment: | stats-coverage-bridges.png added |
---|
Fraction of bridges reporting statistics
comment:4 follow-ups: 6 7 8 Changed 7 years ago by
Replying to arma:
Assuming most bridge users find out about a bridge via one of the bridgedb mechanisms, I think we should look at 'fraction of bridges' as the primary question rather than 'fraction of bytes'. Bridgedb doesn't look at capacity after all when deciding what addresses to give out.
So I would ask "Given this hour's networkstatus (written by Tonga), what fraction of the Running bridges never send us stats covering this hour?"
You're right. Unfortunately, I cannot change the analysis to include network statuses, at least not easily. I'm only parsing bridge extra-info descriptors, and even that keeps my machine busy for a few hours for a year of data, let alone the time I'd have to spend on rewriting the analysis code.
But I changed the analysis to look at bridge uptime seconds per day that are covered by stats instead of written bytes. I'm adding up the seconds for which bridges report usage statistics and the seconds for which they report written or read bytes. The quotient of the two sums is the percentage we're looking for. This analysis should be quite close to what you describe. At least it gives us the idea whether we're talking about 10, 30, 50, 70, or 90% here.
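As a rough sketch of the quotient described above, assuming each bridge's day has already been reduced to two hypothetical numbers (seconds covered by usage statistics, and seconds covered by written or read byte histories):

```python
def uptime_weighted_coverage(bridges):
    """Quotient of stats-covered seconds over bandwidth-history seconds.

    bridges: iterable of (stats_seconds, history_seconds) pairs, one per
    bridge for a single day. The pair layout is an assumption made for
    this example, not the format of the actual analysis code.
    """
    bridges = list(bridges)  # allow generators; we iterate twice
    stats_total = sum(stats for stats, _history in bridges)
    history_total = sum(history for _stats, history in bridges)
    if history_total == 0:
        return None
    return stats_total / history_total

# Made-up day: one bridge up all day with stats, two up half a day without.
day = [(86400, 86400), (0, 43200), (0, 43200)]
fraction = uptime_weighted_coverage(day)
```

In this made-up day, one of three bridges reports statistics, but it accounts for half of the observed uptime seconds, so the uptime-weighted fraction is one half.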
See the attached graph that I just updated. The upper part contains the old approach where we weight by written bytes, and the lower part is the new analysis that weights by uptime seconds. So, the fraction of bridges reporting statistics has been at 20% until August 2011 and has then magically increased to 40%.
(Treating load as uniform across bridges is the wrong thing to do for users who learn their bridge through a non-bridgedb mechanism, like hearing from a friend what bridge they use. I wonder how we can estimate what fraction of bridge users learn about their bridge in what way. We could say that there probably aren't many such users because it involves manual interaction; or we could say that there aren't many users of the bridgedb approach because it gives out bridges that don't work in China so they're moot. I'm inclined toward the former.)
Do we have any data about users who learn about their bridges through a non-BridgeDB mechanism? You mean public bridges, right? Because we don't have statistics from private bridges, which is an unrelated problem. I don't know what data to use here, so I'm going to ignore the fact that non-BridgeDB bridge discovery mechanisms exist for now.
comment:5 Changed 7 years ago by
Milestone: | → Sponsor F: July 15, 2012 |
---|---|
Owner: | set to karsten |
Status: | new → assigned |
Type: | task → project |
Grabbing this ticket and turning it into a sponsor F project for July.
comment:6 Changed 7 years ago by
Replying to karsten:
Do we have any data about users who learn about their bridges through a non-BridgeDB mechanism? You mean public bridges, right? Because we don't have statistics from private bridges, which is an unrelated problem. I don't know what data to use here, so I'm going to ignore the fact that non-BridgeDB bridge discovery mechanisms exist for now.
Sounds like a fine plan.
comment:7 follow-up: 9 Changed 7 years ago by
Replying to karsten:
See the attached graph that I just updated. The upper part contains the old approach where we weight by written bytes, and the lower part is the new analysis that weights by uptime seconds. So, the fraction of bridges reporting statistics has been at 20% until August 2011 and has then magically increased to 40%.
Is it easy to make a new graph? Or even to automate the making of these graphs?
comment:8 follow-up: 10 Changed 7 years ago by
Replying to karsten:
So I would ask "Given this hour's networkstatus (written by Tonga), what fraction of the Running bridges never send us stats covering this hour?"
You're right. Unfortunately, I cannot change the analysis to include network statuses, at least not easily. I'm only parsing bridge extra-info descriptors, and even that keeps my machine busy for a few hours for a year of data, let alone the time I'd have to spend on rewriting the analysis code.
Weighting by uptime seconds seems like a weird approach. I have no idea what that ought to tell us.
How much work is it to get to an answer to the question I ask above?
comment:9 Changed 7 years ago by
Replying to arma:
Replying to karsten:
See the attached graph that I just updated. The upper part contains the old approach where we weight by written bytes, and the lower part is the new analysis that weights by uptime seconds. So, the fraction of bridges reporting statistics has been at 20% until August 2011 and has then magically increased to 40%.
Is it easy to make a new graph? Or even to automate the making of these graphs?
Making a new graph is easy, but it'll keep my machine busy for half a day. I'll have to do it tonight.
That also answers the question of whether automating the making of these graphs is easy: no. At least not without prior optimization, and I don't think it's worth the effort. I can make a new graph whenever we want one.
comment:10 follow-up: 11 Changed 7 years ago by
Replying to arma:
Weighting by uptime seconds seems like a weird approach. I have no idea what that ought to tell us.
It's not a weird approach. It's quite related to the approach that you suggested.
Here's how the current approach works: assuming we have 10 bridges with 1, 2, 3, ..., 10 hours uptime on a given day, and the 2-hours and the 4-hours bridge report statistics, the graph would show a fraction of bridges reporting statistics by uptime of (2+4)/(1+2+3+...+10). Uptime is the time for which bridges report bandwidth histories here.
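The arithmetic of this example can be checked with a few lines. The variable names are made up for illustration; only the numbers come from the example above.

```python
from fractions import Fraction

# Ten bridges with 1, 2, ..., 10 hours of uptime on the given day.
uptime_hours = range(1, 11)
# Only the 2-hour and the 4-hour bridge report statistics.
reporting = {2, 4}

covered = sum(h for h in uptime_hours if h in reporting)  # 2 + 4 = 6
total = sum(uptime_hours)                                 # 1 + ... + 10 = 55
fraction = Fraction(covered, total)                       # 6/55
```

So the graph would show 6/55, or roughly 11%, for that day.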
If we switch to Tonga's reachability information, we have a similar statistic. The only thing that changes is that we rely on Tonga telling us that the 1-hour bridge had the Running flag for 1 hour, the 2-hour bridge had it for 2 hours, and so on. That will fix situations where bridges think they're available but Tonga disagrees. But in theory, results should be quite similar or at least not totally off. (Yay, theory.)
How much work is it to get to an answer to the question I ask above?
I don't know. Half a day or a day? I want to make the analysis more precise by identifying reasons why bridges don't report statistics: less than 24 hours uptime, delay in descriptor publication (#4142), too old versions, no geoip file, etc. I think that requires a rewrite of the analysis tool anyway, so I should be able to include Tonga's Running flag, too.
comment:11 Changed 7 years ago by
Replying to karsten:
Replying to arma:
Weighting by uptime seconds seems like a weird approach. I have no idea what that ought to tell us.
Heh, after sending the last comment I realized that you might have understood "uptime seconds" as "uptime of the bridge until that day" which of course would be a weird approach. I hope my explanation clears up this misunderstanding. How would I phrase this in Good English?
comment:12 Changed 7 years ago by
Here's a schedule for finishing this deliverable by July 1:
- April 30: Answer the first half of the question ("what fraction of our bridges are not reporting usage statistics").
- May 31: (TBD by April 30)
- June 30: Finish a report that concludes the analysis.
- July to October: Start working on improved bridge usage statistics (optional).
Changed 7 years ago by
Attachment: | stats-coverage-bridges.2.png added |
---|
Fraction of bridges reporting statistics
comment:13 Changed 7 years ago by
I just updated the graph. It now contains a third line for the fraction of bridges with the Running flag that report statistics. Looks like that fraction went up from 25% to 80% in the past year. It's unclear to me why that happened. I hope I can answer that question by end of April.
Changed 7 years ago by
Attachment: | bridge-report-usage-stats.pdf added |
---|
Tech report: What fraction of our bridges are not reporting usage statistics? (DRAFT)
comment:14 Changed 7 years ago by
Please find a tech report draft attached. Comments welcome! I'd like to publish this report by April 30 as indicated in the schedule above.
comment:15 Changed 7 years ago by
The tech report is now available here.
I'm now more convinced that the way bridges report their statistics to the bridge authority is not our problem. It's the way we derive user numbers from unique IP addresses that is totally broken. We should try an approach that's similar to how we count directory requests on directory mirrors.
Revised schedule:
- June 30: Write a proposal for new bridge statistics based on counting directory requests per day and country. Implement the proposal, get it into a mergeable state, and test it on three of our own bridges that don't publish the new statistics to the bridge authority yet. Evaluate whether the new statistics can improve our user number estimates, and if so, enable reporting to the bridge authority and prepare deployment on all new bridges.
- October 31: If the new statistics have been deployed as expected, re-evaluate them based on a larger fraction of bridges reporting them to the bridge authority.
comment:16 Changed 7 years ago by
Milestone: | Sponsor F: July 1, 2012 → Sponsor F: November 1, 2012 |
---|
Time to update the schedule. I'm quite optimistic that we can use existing data to come up with improved bridge statistics (#5807) instead of writing a proposal, writing and testing code, and evaluating new data coming out of that. That concludes the June 30 substep. The October 31 substep is still valid, though the focus will be on evaluating existing data. Moving to the November milestone for the final substep.
comment:17 Changed 7 years ago by
Keywords: | SponsorF20121101 added |
---|---|
Milestone: | Sponsor F: November 1, 2012 |
Switching from using milestones to keywords for sponsor deliverables. See #6365 for details.
comment:18 Changed 6 years ago by
Resolution: | → implemented |
---|---|
Status: | assigned → closed |
Finished the planned report in #5807 which concludes this deliverable.
Summary: Wrote a report on what fraction of our bridges are not reporting usage statistics and a second report proposing an alternative approach for counting daily bridge users that is similar to how we estimate daily directly connecting users.
I'm excited to see this happen, especially the "how many bridges are we not hearing from" part.