Opened 4 months ago

Last modified 2 months ago

#33157 new defect

Client generates SDP with "IN IP4 0.0.0.0", causing proxy to send "client_ip=0.0.0.0" and bridge to send "USERADDR 0.0.0.0:1"

Reported by: dcf Owned by:
Priority: Medium Milestone:
Component: Circumvention/Snowflake Version:
Severity: Normal Keywords:
Cc: cohosh, phw, arlolra, dcf Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

There is a pipeline of relaying the client IP address:

  • The proxy infers the client's IP address by grepping it out of the SDP during ICE negotiation (proxy, proxy-go) and attaches it to the WebSocket connection as a URL query parameter ?client_ip=A.B.C.D.
  • The bridge parses the client_ip query parameter and passes it on to tor with a USERADDR A.B.C.D:1 command on the ExtORPort (a sketch of this step follows the list).
  • tor does geoip lookups and aggregates statistics and ultimately sends them to Tor Metrics for country-specific graphs.
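
Here is a minimal sketch of the bridge-to-tor step (second item above). It assumes the bridge uses goptlib's pt.DialOr, which performs the ExtORPort authentication and sends the USERADDR and TRANSPORT commands; the function name dialORForClient and the "snowflake" method name are illustrative, not the actual bridge code.

// Hypothetical sketch: take the client_ip query parameter from the WebSocket
// request URL and hand it to tor. pt.DialOr sends USERADDR <addr> on the
// ExtORPort; the ":1" port is a placeholder, since only the address matters
// for geoip.
package bridge

import (
	"net"
	"net/url"

	pt "git.torproject.org/pluggable-transports/goptlib.git"
)

func dialORForClient(ptInfo *pt.ServerInfo, reqURL *url.URL) (*net.TCPConn, error) {
	clientIP := reqURL.Query().Get("client_ip") // "A.B.C.D", or "0.0.0.0" in this bug
	return pt.DialOr(ptInfo, net.JoinHostPort(clientIP, "1"), "snowflake")
}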

It looks like the pion SDP code puts 0.0.0.0 in the place where proxy and proxy-go look for the remote IP address. This causes the proxy to send ?client_ip=0.0.0.0 to the bridge, and the bridge to send USERADDR 0.0.0.0:1 to tor. I'm not sure that this happens every time; see below for bridge-extra-infos output.
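
As a rough illustration (this is not the actual remoteIPFromSDP, and the regex is simplified), extracting the address from the c= connection line happily returns 0.0.0.0, which then gets forwarded as client_ip:

// A simplified stand-in for the proxy's SDP grepping: pull the address out of
// the c= connection line. "0.0.0.0" parses successfully, so without an
// explicit IsUnspecified check it gets forwarded as client_ip=0.0.0.0.
package main

import (
	"fmt"
	"net"
	"regexp"
)

var connectionLine = regexp.MustCompile(`(?m)^c=IN IP4 (\S+)`)

func addrFromConnectionLine(sdp string) net.IP {
	m := connectionLine.FindStringSubmatch(sdp)
	if m == nil {
		return nil
	}
	return net.ParseIP(m[1])
}

func main() {
	sdp := "v=0\r\no=- 34318359 1580881353 IN IP4 0.0.0.0\r\nc=IN IP4 0.0.0.0\r\n"
	fmt.Println(addrFromConnectionLine(sdp)) // prints 0.0.0.0
}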

I found it while testing proxy-go with a localhost client and a patch like:

  • proxy-go/snowflake.go

    import (
     	"git.torproject.org/pluggable-transports/snowflake.git/common/messages"
    -	"git.torproject.org/pluggable-transports/snowflake.git/common/safelog"
    +	_ "git.torproject.org/pluggable-transports/snowflake.git/common/safelog"
     	"git.torproject.org/pluggable-transports/snowflake.git/common/websocketconn"

    func (c *webRTCConn) Write(b []byte) (int, error) {
     	// log.Printf("webrtc Write %d %+q", len(b), string(b))
    -	log.Printf("Write %d bytes --> WebRTC", len(b))
    +	// log.Printf("Write %d bytes --> WebRTC", len(b))
     	if c.dc != nil {

    func (c *webRTCConn) RemoteAddr() net.Addr {
     	clientIP := remoteIPFromSDP(c.pc.RemoteDescription().SDP)
    +	log.Printf("RemoteAddr %+q", c.pc.RemoteDescription().SDP)
     	if clientIP == nil {

    func makePeerConnectionFromOffer(sdp *webrtc.SessionDescription, config webrtc.C
     	dc.OnMessage(func(msg webrtc.DataChannelMessage) {
    -		log.Printf("OnMessage <--- %d bytes", len(msg.Data))
    +		// log.Printf("OnMessage <--- %d bytes", len(msg.Data))
     		var n int

    func main() {
     	//We want to send the log output through our scrubber first
    -	log.SetOutput(&safelog.LogScrubber{Output: logOutput})
    +	// log.SetOutput(&safelog.LogScrubber{Output: logOutput})
    +	log.SetOutput(logOutput)

For example, the beginning of an SDP string for me is

v=0
o=- 34318359 1580881353 IN IP4 0.0.0.0
s=-
t=0 0
a=fingerprint:sha-256 80:EE:E6:8D:55:07:CB:52:58:7A:CC:61:70:F9:F3:65:DB:4B:D3:69:CB:F9:68:C8:5F:E3:06:3D:D3:90:C1:E6
a=group:BUNDLE 0
m=application 9 DTLS/SCTP 5000
c=IN IP4 0.0.0.0

The client IP address inference, implemented in #18628, was always a bit of a hack, but it was effective enough, as evidenced by country counts in comment:4:ticket:29734. I just now looked at bridge-extra-infos-2020-02.tar.xz and it seems that we are still sometimes getting identified countries, but the largest count belongs to ??.

$ tar -O -xf bridge-extra-infos-2020-02.tar.xz | grep -A 24 '^extra-info flakey 5481936581E23D2D178105D44DB6915AB06BFB7F$' | grep -E '^dirreq-v3-reqs '
dirreq-v3-reqs ??=16,tr=16,cn=8,de=8,eg=8,in=8,ir=8,ph=8,us=8
dirreq-v3-reqs ??=16,tr=16,cn=8,de=8,eg=8,in=8,ir=8,ph=8,us=8
dirreq-v3-reqs ??=24,cn=8,es=8,fr=8,ir=8,tr=8
dirreq-v3-reqs ??=48,tr=16,ar=8,ca=8,eg=8,gb=8,ir=8,us=8
dirreq-v3-reqs ??=16,tr=16,cn=8,de=8,eg=8,in=8,ir=8,ph=8,us=8
dirreq-v3-reqs ??=24,cn=8,es=8,fr=8,ir=8,tr=8
dirreq-v3-reqs ??=48,tr=16,ar=8,ca=8,eg=8,gb=8,ir=8,us=8
dirreq-v3-reqs ??=24,cn=8,es=8,fr=8,ir=8,tr=8
dirreq-v3-reqs ??=24,cn=8,es=8,fr=8,ir=8,tr=8
dirreq-v3-reqs ??=24,cn=8,es=8,fr=8,ir=8,tr=8
dirreq-v3-reqs ??=48,tr=16,ar=8,ca=8,eg=8,gb=8,ir=8,us=8
dirreq-v3-reqs ??=16,tr=16,cn=8,de=8,eg=8,in=8,ir=8,ph=8,us=8
dirreq-v3-reqs ??=24,cn=8,es=8,fr=8,ir=8,tr=8
dirreq-v3-reqs ??=48,tr=16,ar=8,ca=8,eg=8,gb=8,ir=8,us=8
dirreq-v3-reqs ??=16,tr=16,cn=8,de=8,eg=8,in=8,ir=8,ph=8,us=8
dirreq-v3-reqs ??=16,tr=16,cn=8,de=8,eg=8,in=8,ir=8,ph=8,us=8
dirreq-v3-reqs ??=24,cn=8,es=8,fr=8,ir=8,tr=8
dirreq-v3-reqs ??=16,tr=16,cn=8,de=8,eg=8,in=8,ir=8,ph=8,us=8
dirreq-v3-reqs ??=24,cn=8,es=8,fr=8,ir=8,tr=8
dirreq-v3-reqs ??=24,tr=16,fr=8,om=8,ru=8,us=8
dirreq-v3-reqs ??=16,tr=16,cn=8,de=8,eg=8,in=8,ir=8,ph=8,us=8
dirreq-v3-reqs ??=48,tr=16,ar=8,ca=8,eg=8,gb=8,ir=8,us=8

Maybe it happens only intermittently. pion/sdp sets UnicastAddress: "0.0.0.0" unconditionally and I don't see where it is ever modified. Maybe the others are older non-pion clients?

A little searching indicates that IN IP4 0.0.0.0 has something to do with trickle ICE.

It seems that ultimately, we need a more reliable way for the proxy to infer the client's external IP address.

Child Tickets

Change History (12)

comment:1 Changed 4 months ago by cohosh

Possibly related: #19026

comment:2 Changed 4 months ago by cohosh

You know, we probably shouldn't even be using trickle ICE. As I understand it, the intended use case is to let peers send candidates to each other continually rather than all at once (which we definitely don't do). Our original reason for using it was that it allowed us to keep using the OnICEGatheringStateChange callback here (see #28942#comment:28).

I think it's worth revisiting that decision.
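
For reference, here is a hedged sketch of the non-trickle alternative: wait for candidate gathering to finish before handing the offer to the broker. It assumes a recent pion/webrtc (the v3 import path and the GatheringCompletePromise helper, which is driven by the same gathering-state transition that OnICEGatheringStateChange reports); this is not the current client code.

package main

import (
	"fmt"

	"github.com/pion/webrtc/v3"
)

// offerWithCandidates waits for ICE gathering to complete before returning
// the local description, so the SDP already embeds a=candidate lines with
// concrete addresses instead of relying on trickle ICE.
func offerWithCandidates(pc *webrtc.PeerConnection) (*webrtc.SessionDescription, error) {
	offer, err := pc.CreateOffer(nil)
	if err != nil {
		return nil, err
	}
	// Closed once the gatherer reaches the "complete" state.
	gatherComplete := webrtc.GatheringCompletePromise(pc)
	if err := pc.SetLocalDescription(offer); err != nil {
		return nil, err
	}
	<-gatherComplete
	return pc.LocalDescription(), nil
}

func main() {
	pc, err := webrtc.NewPeerConnection(webrtc.Configuration{
		ICEServers: []webrtc.ICEServer{{URLs: []string{"stun:stun.l.google.com:19302"}}},
	})
	if err != nil {
		panic(err)
	}
	defer pc.Close()
	// A data channel is needed so there is a transport to gather candidates for.
	if _, err := pc.CreateDataChannel("snowflake", nil); err != nil {
		panic(err)
	}
	desc, err := offerWithCandidates(pc)
	if err != nil {
		panic(err)
	}
	fmt.Println(desc.SDP)
}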

comment:3 Changed 4 months ago by dcf

I'll say that the bug here is not necessarily that the SDP contains 0.0.0.0; it's that the proxy uses 0.0.0.0 as the client address despite more meaningful information being available in the SDP. remoteIPFromSDP could choose one of the candidates from an a=candidate line rather than only looking at the c= line, for example. (Which one? It probably doesn't matter, as they are all likely to have the same geolocation, as long as you pick an external address candidate and not a LAN address candidate.) One way to fix the problem would be to ensure that clients generate SDP with a meaningful address in the c= line; another way would be to use a different method of extracting the address in the proxy.
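
A hedged sketch of the second option (the names candidateAddrFromSDP and isRFC1918 are hypothetical, not existing code): scan the a=candidate lines and take the first address that is not loopback, unspecified, or a LAN (RFC 1918) address.

package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// candidateAddrFromSDP returns the first candidate address that looks external.
// Candidate line layout (abridged):
//   a=candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
func candidateAddrFromSDP(sdp string) net.IP {
	scanner := bufio.NewScanner(strings.NewReader(sdp))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if !strings.HasPrefix(line, "a=candidate:") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) < 6 {
			continue
		}
		ip := net.ParseIP(fields[4])
		if ip == nil || ip.IsLoopback() || ip.IsUnspecified() || isRFC1918(ip) {
			continue
		}
		return ip
	}
	return nil
}

// isRFC1918 reports whether ip falls in a private IPv4 range.
func isRFC1918(ip net.IP) bool {
	for _, cidr := range []string{"10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"} {
		_, block, _ := net.ParseCIDR(cidr)
		if block.Contains(ip) {
			return true
		}
	}
	return false
}

func main() {
	sdp := "c=IN IP4 0.0.0.0\r\n" +
		"a=candidate:1 1 udp 2130706431 192.168.1.5 50000 typ host\r\n" +
		"a=candidate:2 1 udp 1694498815 192.0.2.7 54321 typ srflx\r\n"
	fmt.Println(candidateAddrFromSDP(sdp)) // 192.0.2.7
}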

comment:4 Changed 4 months ago by arlolra

I found it while testing proxy-go with a localhost client and a patch like:

Something like that patch was useful when working on #19026 so would you consider merging,
https://github.com/keroserene/snowflake/commit/dbd733e4b1430c046ec11e8052efdbac6010e58a

comment:5 Changed 4 months ago by dcf

#33385 is to have the server ignore a client IP address of 0.0.0.0, separate from the issue of proxies failing to extract a meaningful address.
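
A minimal sketch of the kind of guard #33385 describes, assuming the bridge reads client_ip from the WebSocket URL query (the helper name usableClientIP is hypothetical):

package main

import (
	"fmt"
	"net"
	"net/url"
)

// usableClientIP returns the client_ip query parameter only if it parses as
// an address worth reporting; for 0.0.0.0 (or loopback) it returns "", and
// the bridge would then skip the USERADDR command for that connection.
func usableClientIP(query url.Values) string {
	ip := net.ParseIP(query.Get("client_ip"))
	if ip == nil || ip.IsUnspecified() || ip.IsLoopback() {
		return ""
	}
	return ip.String()
}

func main() {
	fmt.Printf("%q\n", usableClientIP(url.Values{"client_ip": {"0.0.0.0"}}))   // ""
	fmt.Printf("%q\n", usableClientIP(url.Values{"client_ip": {"192.0.2.7"}})) // "192.0.2.7"
}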

comment:6 Changed 3 months ago by arlolra

Status: new → needs_review

Changing the status to request review of the tangential patch in comment:4.

comment:7 in reply to:  4 Changed 3 months ago by dcf

Replying to arlolra:

I found it while testing proxy-go with a localhost client and a patch like:

Something like that patch was useful when working on #19026 so would you consider merging,
https://github.com/keroserene/snowflake/commit/dbd733e4b1430c046ec11e8052efdbac6010e58a

It's okay with me but I would call the option --unsafe-logging instead of --unsafeLogging to match the style of the other options.

comment:8 Changed 3 months ago by arlolra

It's okay with me but I would call the option --unsafe-logging instead of --unsafeLogging to match the style of the other options.

In broker/broker.go, yes, dashes are used (e.g. disable-tls, acme-email); however, in client/snowflake.go we have camel case (e.g. logToStateDir, keepLocalAddresses).

We should probably consolidate this now. I imagine your preference is for the former?
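
For comparison, the dashed style with Go's flag package would look like this (a sketch, not a proposed patch; the help text is made up):

package main

import (
	"flag"
	"log"
)

func main() {
	// Dashed style, matching the broker's existing options like disable-tls:
	unsafeLogging := flag.Bool("unsafe-logging", false, "keep addresses in logs (hypothetical help text)")
	flag.Parse()
	log.Println("unsafe logging enabled:", *unsafeLogging)
}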

comment:9 Changed 3 months ago by dcf

Status: needs_review → new

Moved the unsafe logging discussion to #33665.

comment:10 Changed 2 months ago by dcf

Since the fix in #33385 on 2020-02-20, only about 20–30% of connections at the bridge have a meaningful client_ip. I believe this is causing us to undercount clients and/or bandwidth, even though we no longer set USERADDR for those 0.0.0.0 connections. In the absence of USERADDR, tor will use a 127.0.0.1 address, which like 0.0.0.0 is ignored for the purpose of bandwidth accounting. For instance, record_num_bytes_transferred_impl calls connection_is_rate_limited, which returns false for both 0.0.0.0/8 and 127.0.0.0/8 because tor_addr_is_internal treats them as internal addresses. (This is similar to #33693, but on the server, not on the client.)

Related metrics-team thread: https://lists.torproject.org/pipermail/metrics-team/2020-March/001139.html
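
The log lines below come from a per-day counter on the bridge. A hypothetical sketch of such a counter (not the actual server code) might look like:

package main

import (
	"log"
	"time"
)

// clientIPCounter tallies, per reporting period, how many connections
// carried a usable client_ip parameter.
type clientIPCounter struct {
	total, withIP int
}

func (c *clientIPCounter) record(hasClientIP bool) {
	c.total++
	if hasClientIP {
		c.withIP++
	}
}

func (c *clientIPCounter) logAndReset(period time.Duration) {
	log.Printf("in the past %d s, %d/%d connections had client_ip",
		int(period.Seconds()), c.withIP, c.total)
	c.total, c.withIP = 0, 0
}

func main() {
	var c clientIPCounter
	c.record(true)  // connection with a usable client_ip
	c.record(false) // connection with client_ip=0.0.0.0 or none
	c.logAndReset(24 * time.Hour)
}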

# zgrep -h client_ip /var/log/tor/snowflake-server.log* | sort
2020/02/20 18:03:47 in the past 86400 s, 476/476 connections had client_ip
2020/02/21 18:03:47 in the past 86400 s, 523/523 connections had client_ip
2020/02/23 23:49:59 in the past 86400 s, 87/330 connections had client_ip
2020/02/24 23:49:59 in the past 86400 s, 128/577 connections had client_ip
2020/02/25 23:49:59 in the past 86400 s, 153/663 connections had client_ip
2020/02/26 23:49:59 in the past 86400 s, 42/239 connections had client_ip
2020/02/27 23:49:59 in the past 86400 s, 158/510 connections had client_ip
2020/02/28 23:49:59 in the past 86400 s, 123/467 connections had client_ip
2020/02/29 23:49:59 in the past 86400 s, 97/451 connections had client_ip
2020/03/01 23:49:59 in the past 86400 s, 117/565 connections had client_ip
2020/03/02 23:49:59 in the past 86400 s, 161/584 connections had client_ip
2020/03/03 23:49:59 in the past 86400 s, 82/325 connections had client_ip
2020/03/04 23:49:59 in the past 86400 s, 111/467 connections had client_ip
2020/03/05 23:49:59 in the past 86400 s, 150/518 connections had client_ip
2020/03/06 23:49:59 in the past 86400 s, 204/768 connections had client_ip
2020/03/07 23:49:59 in the past 86400 s, 175/668 connections had client_ip
2020/03/08 23:49:59 in the past 86400 s, 135/643 connections had client_ip
2020/03/09 23:49:59 in the past 86400 s, 159/634 connections had client_ip
2020/03/10 23:49:59 in the past 86400 s, 198/863 connections had client_ip
2020/03/11 23:49:59 in the past 86400 s, 284/1327 connections had client_ip
2020/03/12 23:49:59 in the past 86400 s, 117/511 connections had client_ip
2020/03/13 23:49:59 in the past 86400 s, 145/638 connections had client_ip
2020/03/14 23:49:59 in the past 86400 s, 177/808 connections had client_ip
2020/03/15 23:49:59 in the past 86400 s, 349/1470 connections had client_ip
2020/03/16 23:49:59 in the past 86400 s, 404/1965 connections had client_ip
2020/03/17 23:49:59 in the past 86400 s, 264/1286 connections had client_ip
2020/03/18 23:49:59 in the past 86400 s, 279/1155 connections had client_ip
2020/03/19 23:49:59 in the past 86400 s, 296/1265 connections had client_ip
2020/03/20 23:49:59 in the past 86400 s, 314/1392 connections had client_ip
2020/03/21 23:49:59 in the past 86400 s, 290/1025 connections had client_ip
2020/03/22 23:49:59 in the past 86400 s, 236/637 connections had client_ip
2020/03/23 23:49:59 in the past 86400 s, 199/671 connections had client_ip
2020/03/25 01:37:46 in the past 86400 s, 164/543 connections had client_ip
                  date   n    d percent   (n = connections with client_ip, d = total connections)
1  2020-02-20 18:03:47 476  476   100.0
2  2020-02-21 18:03:47 523  523   100.0
3  2020-02-23 23:49:59  87  330    26.4
4  2020-02-24 23:49:59 128  577    22.2
5  2020-02-25 23:49:59 153  663    23.1
6  2020-02-26 23:49:59  42  239    17.6
7  2020-02-27 23:49:59 158  510    31.0
8  2020-02-28 23:49:59 123  467    26.3
9  2020-02-29 23:49:59  97  451    21.5
10 2020-03-01 23:49:59 117  565    20.7
11 2020-03-02 23:49:59 161  584    27.6
12 2020-03-03 23:49:59  82  325    25.2
13 2020-03-04 23:49:59 111  467    23.8
14 2020-03-05 23:49:59 150  518    29.0
15 2020-03-06 23:49:59 204  768    26.6
16 2020-03-07 23:49:59 175  668    26.2
17 2020-03-08 23:49:59 135  643    21.0
18 2020-03-09 23:49:59 159  634    25.1
19 2020-03-10 23:49:59 198  863    22.9
20 2020-03-11 23:49:59 284 1327    21.4
21 2020-03-12 23:49:59 117  511    22.9
22 2020-03-13 23:49:59 145  638    22.7
23 2020-03-14 23:49:59 177  808    21.9
24 2020-03-15 23:49:59 349 1470    23.7
25 2020-03-16 23:49:59 404 1965    20.6
26 2020-03-17 23:49:59 264 1286    20.5
27 2020-03-18 23:49:59 279 1155    24.2
28 2020-03-19 23:49:59 296 1265    23.4
29 2020-03-20 23:49:59 314 1392    22.6
30 2020-03-21 23:49:59 290 1025    28.3
31 2020-03-22 23:49:59 236  637    37.0
32 2020-03-23 23:49:59 199  671    29.7
33 2020-03-25 01:37:46 164  543    30.2

comment:11 in reply to: 10 Changed 2 months ago by arma

Replying to dcf:

Since the fix in #33385 on 2020-02-20, only about 20–30% of connections at the bridge have a meaningful client_ip. I believe this is causing us to undercount clients and/or bandwidth, even though we no longer set USERADDR for those 0.0.0.0 connections. In the absence of USERADDR, tor will use a 127.0.0.1 address, which like 0.0.0.0 is ignored for the purpose of bandwidth accounting.

I agree with your analysis.

I've opened #33747 to handle this situation better on the Tor side. But I encourage you to continue regarding "we don't send a proper ip address along" as a bug on the Snowflake side.

comment:12 in reply to:  11 Changed 2 months ago by arma

Replying to arma:

Replying to dcf:

Since the fix in #33385 on 2020-02-20, only about 20–30% of connections at the bridge have a meaningful client_ip. I believe this is causing us to undercount clients and/or bandwidth, even though we no longer set USERADDR for those 0.0.0.0 connections. In the absence of USERADDR, tor will use a 127.0.0.1 address, which like 0.0.0.0 is ignored for the purpose of bandwidth accounting.

I agree with your analysis.

For a short-term workaround, you can set "CountPrivateBandwidth 1" in the torrc of the snowflake bridge. I'd give that 80% odds of doing what you want. :)
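
For reference, that workaround is a one-line torrc change on the bridge:

# torrc on the snowflake bridge (sketch of the workaround above)
CountPrivateBandwidth 1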
