Opened 11 months ago

Last modified 10 months ago

#25002 new enhancement

Make data and results from Onionoo deterministic

Reported by: iwakeh Owned by: metrics-team
Priority: Medium Milestone:
Component: Metrics/Onionoo Version:
Severity: Normal Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

As started with #16513, all documents and data generated by Onionoo should be made deterministic, as far as that is possible with reasonable effort.

One other step is discussed here.

More topics will be added as comments when they come up.

Child Tickets

Ticket  Type         Status  Owner         Summary
#25085  enhancement  closed  iwakeh        Make order of sorted results deterministic
#25091  enhancement  new     metrics-team  Make 'out/update' deterministic across instances

Change History (12)

comment:1 Changed 11 months ago by iwakeh

RdnsLookupRequest leads to many differences between the summary and the details documents of two instances even when run from the same machine within a short period of time. These are also the cause of most differences between the two main tp.o instances.

Last edited 11 months ago by iwakeh

comment:2 Changed 11 months ago by iwakeh

Another non-deterministic element in Onionoo docs is the order of entries. When there is a tie in the order parameter used, the order of the tied entries is arbitrary and will differ. A simple solution could be to always sort by fingerprint in case of a tie.
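
A minimal sketch of the tie-breaking idea, in Python for illustration only (Onionoo itself is not written in Python). The entry fields and the function name `sort_deterministically` are hypothetical and not taken from the Onionoo code. It relies on Python's sort being stable: sorting by fingerprint first and by the requested key second leaves tied entries in fingerprint order.

```python
from operator import itemgetter

# Hypothetical entries; field names and values are illustrative only.
relays = [
    {"fingerprint": "C3D4", "consensus_weight": 50},
    {"fingerprint": "A1B2", "consensus_weight": 50},
    {"fingerprint": "E5F6", "consensus_weight": 20},
]


def sort_deterministically(entries, order_key, descending=False):
    """Sort by order_key; entries with equal keys end up ordered by fingerprint."""
    # Pass 1: establish the tie-break order (fingerprint, ascending).
    by_fingerprint = sorted(entries, key=itemgetter("fingerprint"))
    # Pass 2: the stable sort keeps the fingerprint order among equal keys,
    # also when reverse=True is used.
    return sorted(by_fingerprint, key=itemgetter(order_key), reverse=descending)


result = sort_deterministically(relays, "consensus_weight", descending=True)
print([entry["fingerprint"] for entry in result])
```

With this approach the result no longer depends on the order in which the entries happened to be stored, so two instances holding the same data would return the same sequence.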

comment:3 in reply to: 1 Changed 11 months ago by karsten

Replying to iwakeh:

RdnsLookupRequest leads to many differences between the summary and the details documents of two instances even when run from the same machine within a short period of time. These are also the cause of most differences between the two main tp.o instances.

True. We might consider running these lookups in CollecTor and providing results using a new data format. Onionoo would then fetch and process those files.

comment:4 in reply to: 2 Changed 11 months ago by karsten

Replying to iwakeh:

Another non-deterministic element in Onionoo docs is the order of entries. When there is a tie in the order parameter used, the order of the tied entries is arbitrary and will differ. A simple solution could be to always sort by fingerprint in case of a tie.

True, that sounds easy enough to do.

comment:5 in reply to: 4 Changed 11 months ago by iwakeh

Replying to karsten:

Replying to iwakeh:

Another non-deterministic element in Onionoo docs is the order of entries. When there is a tie in the order parameter used, the order of the tied entries is arbitrary and will differ. A simple solution could be to always sort by fingerprint in case of a tie.

True, that sounds easy enough to do.

Please find a suggestion for a patch here.

comment:6 Changed 10 months ago by iwakeh

Maybe also use the valid-after timestamp for the value in the file out/update, or remove the file?

comment:7 in reply to: 5 Changed 10 months ago by karsten

Replying to iwakeh:

Replying to karsten:

Replying to iwakeh:

Another non-deterministic element in Onionoo docs is the order of entries. When there is a tie in the order parameter used, the order of the tied entries is arbitrary and will differ. A simple solution could be to always sort by fingerprint in case of a tie.

True, that sounds easy enough to do.

Please find a suggestion for a patch here.

Commit 0b2e5a2 looks good! Can you rebase to master, add a change log entry, and provide a metrics-web patch for updating the specification?

comment:8 in reply to: 6 Changed 10 months ago by karsten

Replying to iwakeh:

Maybe also use the valid-after timestamp for the value in the file out/update, or remove the file?

That might even be part of #16513. We cannot remove that file, because we need it for updating the index in the server part. But we could use something else as timestamp than the current system time. Not sure if valid-after timestamp is the best choice, because we have two such timestamps for relays and for bridges. Maybe we could use the last-modified time of status/summary here. Untested, just an idea.

comment:9 in reply to: 8 Changed 10 months ago by iwakeh

Replying to karsten:

Replying to iwakeh:

Maybe also use the valid-after timestamp for the value in the file out/update, or remove the file?

That might even be part of #16513. We cannot remove that file, because we need it for updating the index in the server part. But we could use something else as timestamp than the current system time. Not sure if valid-after timestamp is the best choice, because we have two such timestamps for relays and for bridges. Maybe we could use the last-modified time of status/summary here. Untested, just an idea.

The last-modified time of status/summary is not deterministic across instances. What about using the later of the two valid-after timestamps? This would sort of synchronize Onionoo with 'Tor time'?
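
To make the proposal concrete, here is a rough sketch in Python (for illustration only). It assumes that both valid-after times are available as UTC datetimes and that out/update holds a single timestamp in milliseconds since the epoch; both points, the function name `update_timestamp_millis`, and the example dates are assumptions, not taken from the Onionoo code.

```python
from datetime import datetime, timezone


def update_timestamp_millis(relays_valid_after, bridges_valid_after):
    """Return the later of the two valid-after times as epoch milliseconds."""
    # Depends only on the processed network data, not on the local clock,
    # so two instances with the same input data would produce the same value.
    latest = max(relays_valid_after, bridges_valid_after)
    return int(latest.timestamp() * 1000)


# Hypothetical example values.
relays_valid_after = datetime(2018, 1, 24, 12, 0, 0, tzinfo=timezone.utc)
bridges_valid_after = datetime(2018, 1, 24, 12, 33, 12, tzinfo=timezone.utc)

print(update_timestamp_millis(relays_valid_after, bridges_valid_after))
```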

comment:10 in reply to: 7 Changed 10 months ago by iwakeh

Replying to karsten:

...

Please find a suggestion for a patch here.

Commit 0b2e5a2 looks good! Can you rebase to master, add a change log entry, and provide a metrics-web patch for updating the specification?

Sure. Isolated this part as child ticket.

comment:11 in reply to: 9 Changed 10 months ago by karsten

Replying to iwakeh:

Replying to karsten:

Replying to iwakeh:

Maybe also use the valid-after timestamp for the value in the file out/update, or remove the file?

That might even be part of #16513. We cannot remove that file, because we need it for updating the index in the server part. But we could use something else as timestamp than the current system time. Not sure if valid-after timestamp is the best choice, because we have two such timestamps for relays and for bridges. Maybe we could use the last-modified time of status/summary here. Untested, just an idea.

The last-modified time of status/summary is not deterministic across instances. What about using the later of the two valid-after timestamps? This would sort of synchronize Onionoo with 'Tor time'?

Yes, sounds good!

comment:12 in reply to: 11 Changed 10 months ago by iwakeh

Replying to karsten:

...

The last-modified time of status/summary is not deterministic across instances. What about using the later of the two valid-after timestamps? This would sort of synchronize Onionoo with 'Tor time'?

Yes, sounds good!

Child ticket #25091.
