Opened 2 years ago

Closed 23 months ago

#23244 closed enhancement (implemented)

Onionoo documents should be the same across all tp.o instances

Reported by: iwakeh
Owned by: metrics-team
Priority: Medium
Milestone:
Component: Metrics/Onionoo
Version:
Severity: Normal
Keywords:
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Investigate current differences and define the prerequisites for achieving this goal.

I'll add examples for differences in comments, as this allows for different discussion threads.

Change History (11)

comment:1 Changed 2 years ago by iwakeh

The details documents have geoip-related differences (data retrieved on 2017-08-15), i.e., changed coordinates and different spellings:

h> "region_name":"Pennsylvania","city_name":"Lansdale","latitude":40.2262,"longitude":-75.2931,
m> "region_name":"Pennsylvania","city_name":"Lansdale","latitude":40.2415,"longitude":-75.2838,
                                                                      ^^^                  ^^^
h> "region_name":"Stockholm","city_name":"Norrtaelje",
m> "region_name":"Stockholm","city_name":"Norrtalje",
                                                ^^^

comment:2 Changed 2 years ago by iwakeh

Other differences might be due to different update times. An example from the details documents:

h> {"nickname":"mrkoolltor","fingerprint":"92808CA58D8F32CA34A34C547610869BF4E2A6EC","or_addresses":["77.120.94.233:9001"],"last_seen":"2017-08-15 08:00:00",...,"consensus_weight_fraction":3.3634333E-6,"guard_probability":0.0,"middle_probability":8.677337E-6,"exit_probability":0.0,"recommended_version":false,"measured":true}
m> {"nickname":"mrkoolltor","fingerprint":"92808CA58D8F32CA34A34C547610869BF4E2A6EC","or_addresses":["77.120.94.233:9001"],"last_seen":"2017-08-15 07:00:00",...,"consensus_weight_fraction":3.3669173E-6,"guard_probability":0.0,"middle_probability":8.666976E-6,"exit_probability":0.0,"recommended_version":false,"measured":true},
                                                                                                                                                   ^^                                            ^^^^^                                                    ^^^^^
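
A minimal sketch of how two entries for the same relay could be compared field by field. It only uses the fields visible in the excerpt above (the part elided with "..." is left out), and the helper name differing_keys is hypothetical, not part of Onionoo:

```python
# Sentinel so that a missing key is not confused with an explicit null value.
_MISSING = object()


def differing_keys(a: dict, b: dict) -> list:
    """Return the sorted keys whose values differ between two entries."""
    keys = set(a) | set(b)
    return sorted(k for k in keys if a.get(k, _MISSING) != b.get(k, _MISSING))


# Subset of the two entries quoted above (h> and m>).
h = {
    "nickname": "mrkoolltor",
    "last_seen": "2017-08-15 08:00:00",
    "consensus_weight_fraction": 3.3634333e-06,
    "middle_probability": 8.677337e-06,
    "measured": True,
}
m = {
    "nickname": "mrkoolltor",
    "last_seen": "2017-08-15 07:00:00",
    "consensus_weight_fraction": 3.3669173e-06,
    "middle_probability": 8.666976e-06,
    "measured": True,
}

print(differing_keys(h, m))
```

For this sample the differing fields are last_seen and the two consensus-derived values, which fits the assumption that one instance had processed a consensus that was one hour newer.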

comment:3 Changed 2 years ago by iwakeh

clients

h> "fingerprint":"FCC1A739D1FCF2FCC15DEF3AE86D346548B0165D",..,"countries":{"by":0.042976454,"ca":0.020426456,"ch":0.09072858,"cn":0.02642937,"de":0.01072343,"es":0.021451455,"fr":0.01072343,"gb":0.034091588,"ir":0.122246176,"jp":0.07436535,"kz":0.06670313,"ma":0.010213228,"pa":0.016547084,"ro":0.013131949,"ru":0.122246176,"sa":0.01072343,"tr":0.122246176,"us":0.10079472,"vn":0.08319506},"transports":{"\u003cOR\u003e":0.31281802,"obfs4":0.687182},"versions":{"v4":1.0}},"1_month":{"first":"2017-07-13 12:00:00","last":"2017-08-13 12:00:00","interval":86400,"factor":0.0028379379379379383,"count":32,"values":[570,557,613,null,704,704,557,569,557,422,422,275,288,422,422,422,422,422,422,422,422,569,557,569,851,986,839,999,974,851,986,795],"countries":{"by":0.029526738,"ca":0.041479144,"ch":0.04099562,"cn":0.05778515,"de":0.038172565,"es":0.019129056,"fr":0.02451852,"gb":0.025931904,"ir":0.14300486,"it":0.015917324,"jp":0.0392698,"kz":0.08019847,"nl":0.01588199,"ru":0.12153813,"tr":0.115521945,"us":0.09202813,"vn":0.04192548},"transports":
m> "fingerprint":"FCC1A739D1FCF2FCC15DEF3AE86D346548B0165D",..,"countries":{"by":0.032763224,"ca":0.010213228,"ch":0.09200638,"cn":0.02642937,"de":0.01072343,"es":0.021451455,"fr":0.01072343,"gb":0.034091588,"ir":0.12352398,"jp":0.07564315,"kz":0.06798094,"ma":0.010213228,"pa":0.016547084,"ro":0.013131949,"ru":0.12352398,"sa":0.01072343,"tr":0.12352398,"us":0.10207252,"vn":0.09468609},"transports":{"\u003cOR\u003e":0.31281802,"obfs4":0.687182},"versions":{"v4":1.0}},"1_month":{"first":"2017-07-13 12:00:00","last":"2017-08-13 12:00:00","interval":86400,"factor":0.0028379379379379383,"count":32,"values":[570,557,613,null,704,704,557,569,557,422,422,275,288,422,422,422,422,422,422,422,422,569,557,569,851,986,839,999,974,851,986,795],"countries":{"by":0.025394445,"ca":0.03734685,"ch":0.04151262,"cn":0.05778515,"de":0.038172565,"es":0.019129056,"fr":0.02451852,"gb":0.025931904,"ir":0.14352186,"it":0.015917324,"jp":0.0397868,"kz":0.08071547,"nl":0.01588199,"ru":0.122055136,"tr":0.11603895,"us":0.09254514,"vn":0.04657477},"transports":
                                                                                    ^^^^^^^^         ^^^^^^^^          ^^^^^^^^                                                                                          ^^^^^^^

m> is from the main Onionoo instance.

Some entries appear in a different order. If needed, I can also attach the entire file here.
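
Because entry order can differ between instances, a line-by-line diff overstates the differences. A rough sketch of an order-independent comparison, assuming each entry is a JSON object carrying a "fingerprint" key as in the excerpts above. The function names and the short placeholder fingerprints are made up for illustration:

```python
def index_by_fingerprint(entries):
    """Map fingerprint -> entry, so that list order no longer matters."""
    return {entry["fingerprint"]: entry for entry in entries}


def compare_instances(h_entries, m_entries):
    """Return (only in h, only in m, present in both but different)."""
    h_idx = index_by_fingerprint(h_entries)
    m_idx = index_by_fingerprint(m_entries)
    only_h = sorted(set(h_idx) - set(m_idx))
    only_m = sorted(set(m_idx) - set(h_idx))
    changed = sorted(
        fp for fp in set(h_idx) & set(m_idx) if h_idx[fp] != m_idx[fp]
    )
    return only_h, only_m, changed


# Placeholder data: same entries in a different order, one value changed,
# and one entry that exists only on the second instance.
h_docs = [
    {"fingerprint": "AAAA", "versions": {"v4": 1.0}},
    {"fingerprint": "BBBB", "versions": {"v4": 1.0}},
]
m_docs = [
    {"fingerprint": "BBBB", "versions": {"v4": 0.9}},
    {"fingerprint": "AAAA", "versions": {"v4": 1.0}},
    {"fingerprint": "CCCC", "versions": {"v4": 1.0}},
]

print(compare_instances(h_docs, m_docs))
```

Python compares dicts by content, so the order of keys inside an entry (for example inside "countries") does not count as a difference either.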

comment:4 Changed 2 years ago by iwakeh

bandwidth

Differences in the minutes of the "first" timestamp and slight differences in "count", which seem to be due to different start times of the updater.

h> "fingerprint":"D7EF14045DEDAE9E9E61D8CEF0F84F87703AD55D","write_history":{"3_days":{"first":"2017-08-12 04:37:30","last":"2017-08-15 03:07:30","interval":900,"factor":8.02902902902903,"count":283,
m> "fingerprint":"D7EF14045DEDAE9E9E61D8CEF0F84F87703AD55D","write_history":{"3_days":{"first":"2017-08-12 04:22:30","last":"2017-08-15 03:07:30","interval":900,"factor":8.02902902902903,"count":284,
                                                                                                              ^^                                                                                     ^
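
The two differences in this sample are consistent with each other: "first" is 15 minutes (one 900-second interval) earlier on m>, and "count" is larger by exactly one. A small check, assuming that "count" equals the number of intervals between "first" and "last" plus one, which holds for both lines above. The function name expected_count is hypothetical:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def expected_count(first: str, last: str, interval: int) -> int:
    """Number of data points if there is one per interval from first to last."""
    span = datetime.strptime(last, FMT) - datetime.strptime(first, FMT)
    return int(span.total_seconds()) // interval + 1


# Values taken from the h> and m> lines above.
print(expected_count("2017-08-12 04:37:30", "2017-08-15 03:07:30", 900))  # 283
print(expected_count("2017-08-12 04:22:30", "2017-08-15 03:07:30", 900))  # 284
```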

The following should have come from the copied data, but the documents still differ:

h> {"fingerprint":"96CF829ABA63490D052396E717250FA84F60667E","write_history":{"1_month":{"first":"2017-07-15 02:00:00","last":"2017-08-14 18:00:00",
m> {"fingerprint":"96CF829ABA63490D052396E717250FA84F60667E","write_history":{"1_month":{"first":"2017-07-14 22:00:00","last":"2017-08-14 18:00:00",
                                                                                                          ^^ ^^

All sampled relays that showed up only for the main instance in the 2017-08-15 diff have since disappeared from the current documents of both the main and the mirror instance.

comment:5 in reply to:  1 Changed 2 years ago by iwakeh

Replying to iwakeh:

The details documents have geoip-related differences (data retrieved on 2017-08-15), i.e., changed coordinates and different spellings:

h> "region_name":"Pennsylvania","city_name":"Lansdale","latitude":40.2262,"longitude":-75.2931,
m> "region_name":"Pennsylvania","city_name":"Lansdale","latitude":40.2415,"longitude":-75.2838,
                                                                      ^^^                  ^^^
h> "region_name":"Stockholm","city_name":"Norrtaelje",
m> "region_name":"Stockholm","city_name":"Norrtalje",
                                                ^^^

These differences shouldn't cause problems for clients. But it would be better to have consistent data, which is immediately achievable by using the same geoip edition.

In the long run ticket #21515 ought to provide a single source for geoip related data and thus solve the issue thoroughly.
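
To see what remains once the geoip edition is factored out, the geoip-derived fields could be dropped before comparing. A sketch under stated assumptions: the key set below consists of the fields shown in the excerpt plus "country" and "country_name", and the "nickname" value is a placeholder, not taken from the real documents:

```python
# Assumed set of geoip-derived keys in a details entry.
GEOIP_KEYS = {
    "country",
    "country_name",
    "region_name",
    "city_name",
    "latitude",
    "longitude",
}


def without_geoip(entry: dict) -> dict:
    """Copy of the entry with all geoip-derived keys removed."""
    return {k: v for k, v in entry.items() if k not in GEOIP_KEYS}


# Geoip values from the excerpt above; "nickname" is a placeholder.
h = {
    "nickname": "example",
    "region_name": "Pennsylvania",
    "city_name": "Lansdale",
    "latitude": 40.2262,
    "longitude": -75.2931,
}
m = {
    "nickname": "example",
    "region_name": "Pennsylvania",
    "city_name": "Lansdale",
    "latitude": 40.2415,
    "longitude": -75.2838,
}

print(h == m)                                # False: coordinates differ
print(without_geoip(h) == without_geoip(m))  # True: nothing else differs
```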

comment:6 Changed 2 years ago by iwakeh

Status: new → needs_information

Since both instances now use the same geoip database, the differences in the details documents have shrunk considerably.

Overall, only a small share of lines differs when looking at running nodes only:

doc-type	count	different lines	fraction
summary	9262	25	0.003
details	9262	649	0.070
bandwidth	9262	753	0.081
weights	6785	361	0.053
clients	2484	16	0.006

These should be fine for clients when rotating between instances.
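
The last column of the table above is the number of different lines divided by the count, i.e. a fraction between 0 and 1 rather than a percent value (0.070 corresponds to about 7 %). A short sketch that recomputes it from the first two number columns. The variable and function names are made up for illustration:

```python
# (doc-type, count, different lines) as listed in the table above.
rows = [
    ("summary", 9262, 25),
    ("details", 9262, 649),
    ("bandwidth", 9262, 753),
    ("weights", 6785, 361),
    ("clients", 2484, 16),
]


def fraction(count: int, different: int) -> float:
    """Share of differing lines, rounded to three decimals as in the table."""
    return round(different / count, 3)


for doc_type, count, different in rows:
    print(f"{doc_type}\t{count}\t{different}\t{fraction(count, different):.3f}")
```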

Still, some should be investigated, such as the diffs for details, weights, and bandwidth. -> new ticket

comment:7 Changed 2 years ago by karsten

Without looking at the differences yet: We might resolve some of them by simply doing #22033. And #16513 might take a little more effort but move us forward in this direction, too.

comment:8 in reply to:  7 Changed 2 years ago by iwakeh

Replying to karsten:

Without looking at the differences yet: We might resolve some of them by simply doing #22033. And #16513 might take a little more effort but move us forward in this direction, too.

Yes, these two tasks will make lots of diffs go away. #22033 removes most differences left in the clients docs.
#16513 is important, because we will otherwise see an increase of differences in a few weeks for the monthly values, too.
So, we already have the appropriate tickets, no need to open a new one as mentioned above.

I think this ticket could be closed as the work will be done in the two mentioned tickets.

comment:9 Changed 2 years ago by karsten

Resolution: duplicate
Status: needs_information → closed

Sounds good to me. Thanks for checking! Closing.

comment:10 Changed 23 months ago by iwakeh

Resolution: duplicate
Status: closed → reopened

Only reopening to close it with an appropriate resolution.

comment:11 Changed 23 months ago by iwakeh

Resolution: implemented
Status: reopened → closed

Resolved as 'implemented' b/c the ticket was about investigation of differences and defining measures to resolve these. One measure was to use identical geoip dbs, which is now deployed on both hosts. The other steps are the two tickets named. So, this is no duplicate, but done ;-)