Breakout session: measuring users while balancing data utility and privacy 
Members: Lorenzo (Lo), Ame (A), Feri (F), Linda (Li), Karsten (K), Sebastian (S), Martha (M), Isa (I)
Facilitator: Isa

Topics talked about (in chronological order): 
1. the user path 
2. concerns with data collection
3. overarching themes 
4. visions 

1: the user path

The user path starts in a traditional browser, continues through the download process and the Tor Launcher interface, and ends in Tor Browser.

We can't really measure anything that users do in the traditional browser or during the download, because that happens outside of Tor. The rest of the conversation was a brainstorm of metrics to measure in Tor Launcher and Tor Browser.

tor launcher (append-only list): 
+ whether a connection is successful (but how do we measure failed connections?)
+ what is clicked and how many times? 
+ what bridge was used
+ time to connect to the tor network
+ number of attempts until success
+ what order the bridges were chosen in 
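As a rough illustration of how the items in this list could sit together in one local record, here is a minimal sketch. Everything in it is hypothetical: the class name `LauncherSession`, its fields, and its methods are made up for this example and are not part of Tor Launcher. It also only keeps the values in memory and says nothing about whether or how they could be reported safely.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LauncherSession:
    """Hypothetical per-session record of the launcher metrics listed above."""
    clicks: Dict[str, int] = field(default_factory=dict)    # UI element -> click count
    bridges_tried: List[str] = field(default_factory=list)  # in the order they were chosen
    attempts: int = 0                                        # connection attempts so far
    connected: bool = False                                  # did any attempt succeed?
    seconds_to_connect: Optional[float] = None               # set on the first success

    def record_click(self, element: str) -> None:
        """Count one click on a named UI element."""
        self.clicks[element] = self.clicks.get(element, 0) + 1

    def record_attempt(self, bridge: str, succeeded: bool, elapsed_seconds: float) -> None:
        """Record one connection attempt and, on success, the time it took."""
        self.attempts += 1
        self.bridges_tried.append(bridge)
        if succeeded and not self.connected:
            self.connected = True
            self.seconds_to_connect = elapsed_seconds


# Usage: two failed attempts followed by a success.
session = LauncherSession()
session.record_click("configure")
session.record_attempt("bridge-a", succeeded=False, elapsed_seconds=30.0)
session.record_attempt("bridge-b", succeeded=False, elapsed_seconds=30.0)
session.record_attempt("bridge-c", succeeded=True, elapsed_seconds=12.5)
```

A session that ends with `connected` still `False` is a failed connection, which is exactly the case the first list item asks about: it can be recorded locally, but it cannot be sent anywhere until some later connection works.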

2: concerns with data collection 

S: we shouldn't build anything that can uniquely identify people. 
K: key is transparency with what we collect, along with discussion on picking metrics carefully. 
S: but we need to say what needs to be collected, and we need large adoption of data collection to get results that are representative of the users.
I,M: we don't want users to need to opt into data collection.
S: we need to make data collection default. we need metrics to make UX work succeed. 
L: we can collect data privately, using methods such as adding statistical noise.
S: we can't track something specific, like someone removing Tor Browser, while preserving privacy.
K: adding statistical noise correctly is a project in itself. we need an efficient way to automate graphing the data and getting feedback.
S: let's do self-reporting inside the browser. 
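
To make the "statistical noise" idea from this exchange concrete, here is a minimal sketch of one common technique, adding Laplace noise to a count before it is reported. This is an illustration only, not what was decided in the session: the function name `noisy_count` and the parameters `epsilon` and `sensitivity` are assumptions introduced for this example. As K notes, picking these parameters correctly is the hard part, and the sketch does not address it.

```python
import random


def noisy_count(true_count, epsilon, sensitivity=1.0, rng=random):
    """Return true_count plus Laplace noise with scale sensitivity / epsilon.

    epsilon and sensitivity are illustrative parameters (assumptions):
    a smaller epsilon means more noise and a less exact reported count.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    # The difference of two independent exponential samples with the same
    # rate is Laplace-distributed with mean 0 and the given scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


# Usage: report a noisy version of a hypothetical count of 100 events.
reported = noisy_count(100, epsilon=0.5)
```

Each reported value is off by a random amount, but the noise averages out to zero over many reports, so aggregate graphs stay usable while a single report reveals less.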

M: don't publish how many people haven't updated their software. If we do, attackers know what percentage of users is vulnerable.
K: we have always lost that battle, because that data is out there. Tor also downloads updates automatically, so they are only vulnerable when they are offline. 
M: I am talking more about mobile phones, where android versions are behind and vulnerable, and that's really hard to address. 
K: If this is an issue, we need to change how we are displaying this data.
S: I don't see this as an issue, this is just how things suck. 
M: This would only be a problem if people don't update automatically. 

I: we need to start drafting what we want to collect, whether that is counting cells or anything else. We need the community to accept that we want to collect data, and this is something we can do. 
I: remember when we prioritized what information we want? we also analyzed how risky each piece of data is, and then we came up with the final list of what we will collect. We need to do that again!
K: we need an internal discussion about privacy-preserving statistics, and we need to get our priorities straight before we go public.
A: what did you do previously? 
I: we talked about gathering statistics for onion services (reference to process above). 
K: we need to prioritize the data we want, yes. 
I: The key is to show the difference when our changes are made--so we can show our funders, or just so we know that a change was actually helpful.  
I: we want to influence the splash page and have it be a responsive design, influence users to go through the launcher differently, etc. 
I: we want to recommend apps on mobile on the splash page, and do other things in the desktop. 
K: we are already collecting the data, and we make changes today. We are just not processing the data.

K: our metrics are too coarse to measure things like subtle UI changes. 
F: we can measure by countries too, which gives a hint, but it's not obvious. 
L: we need metrics that we currently can't collect. I am running a user study and we need metrics like how much time they took to configure, which bridges were tried and in what order, whether they understood the text, how many buttons were clicked, etc.
K: should we have people opt into this data collection? 
I: anything would be useful, even if it's like 10% opt-in. 
Lo: why not do it by default? 
I: we don't want to ask users to do more or decide more. 
S: and not making it obvious would get us only people who are observant. 

<Linda fell behind on taking notes here>

S: most people should not click configure unless they really need to click it. 
F: maybe people don't know that they even need to configure. 
L: from user studies, most people actually get confused into clicking configure rather than connect. 
S: we need to change the fact that most people click configure when they don't need to. 
S: we need good explanations without overloading text. 

I: we need to draw attention to the little globe icon. 
I: does a user understand the globe? 
I: the security depends on the information there. 
I: we want to empower users to be able to change the connection configuration. 
I: making it easy to use makes for user retention. 
K: we can try to collect data about tor users vs non-tor users. that's tough.
S: I don't think that knowing more about tor makes them stay. I think it's similarity to a normal browser, and not increasing the cognitive load for the users. Emphasizing "before you use Tor, understand this" seems like a bad thing.
L: I agree with S. 
I: the fundraising campaign had a counter so that it was displayed only three times per user. Maybe we can inform people once every n times to get people educated.
I: we want teaching moments for the users. we shouldn't do it on first launch. 
S: that's a way the user can opt into this. 
L: it needs to be really low frequency: definitely less than once a day, and maybe they get a tip about 1 in 1000 times.
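
The tip-frequency ideas in this exchange (not on first launch, a small per-launch chance, less than once a day, a cap like the fundraising counter) can be sketched as one decision function. This is a hedged illustration, not an agreed design: the function name `should_show_tip` and all of its parameters are hypothetical, and reusing the fundraising cap of three as `max_tips` is an assumption made for the example.

```python
import random


def should_show_tip(launch_number, tips_shown, hours_since_last_tip=None,
                    rng=random, probability=1 / 1000, max_tips=3):
    """Decide whether to show an educational tip on this launch.

    All names and defaults are illustrative assumptions:
    - never on the first launch
    - never more than max_tips times in total
    - never more than once in any 24 hours
    - otherwise with a small per-launch probability
    """
    if launch_number <= 1:
        return False
    if tips_shown >= max_tips:
        return False
    if hours_since_last_tip is not None and hours_since_last_tip < 24:
        return False
    return rng.random() < probability


# Usage: the 40th launch, one tip shown so far, last tip three days ago.
show = should_show_tip(launch_number=40, tips_shown=1, hours_since_last_tip=72.0)
```

All inputs are counters that could live only on the user's machine, so a scheme like this would not by itself require collecting any data.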

L: it is valuable to have a research team to collect fine-grained data from people who explicitly gave permission.
L: you can interview participants about their thought process and decision-making, and ask things like, "did you understand what you just did?"
I: we need to talk about what we need to collect automatically, and what we need to research. 
M: can we have a list of what research needs to be done? 
L: definitely! 

A: how can we be delicate about the data we collect? is there a way to tell people we NEVER collect x, y, and z?
I: we drafted a set of ethics guidelines for collection.

3: overarching themes
- unable to identify users uniquely (old users, new users, returning users). 
- need to keep people anonymous.  
- ux team needs to be more synced up with the metrics team.

4: visions 
- data collection that is safe enough to need no opt-in and to cover all users.
- ability for the ux team to look at tor metrics to do research. 
- automated data cleaning, graphing, collecting, etc. 

Last modified on Mar 18, 2016, 10:25:48 PM