Opened 6 months ago

Last modified 6 months ago

#25002 new enhancement

Make data and results from Onionoo deterministic

Reported by: iwakeh Owned by: metrics-team
Priority: Medium Milestone:
Component: Metrics/Onionoo Version:
Severity: Normal Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

As started with #16513, all documents and data generated by Onionoo should be made deterministic, as far as that is possible with reasonable effort.

One other step is discussed here.

More topics will be added as comments when they come up.

Child Tickets

Ticket   Type         Status  Owner         Summary
#25085   enhancement  closed  iwakeh        Make order of sorted results deterministic
#25091   enhancement  new     metrics-team  Make 'out/update' deterministic across instances

Change History (12)

comment:1 Changed 6 months ago by iwakeh

RdnsLookupRequest leads to many differences between the summary and the details documents of two instances even when run from the same machine within a short period of time. These are also the cause of most differences between the two main tp.o instances.


comment:2 Changed 6 months ago by iwakeh

Another non-deterministic element in Onionoo docs is the order of entries. When there is a tie in the order parameter used, the resulting order is arbitrary and can differ between instances. A simple solution could be to always sort by fingerprint in case of a tie.
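The tie-break idea can be sketched as follows. This is only an illustration in Python, not Onionoo's actual (Java) code; the entries, the field names used as sort keys, and the helper `sort_deterministically` are hypothetical.

```python
from operator import itemgetter

# Hypothetical, simplified entries; real Onionoo documents carry many more fields.
relays = [
    {"fingerprint": "C3D4", "consensus_weight": 20},
    {"fingerprint": "A1B2", "consensus_weight": 20},
    {"fingerprint": "E5F6", "consensus_weight": 50},
]


def sort_deterministically(entries, order_field, descending=False):
    """Sort by order_field and break ties by ascending fingerprint."""
    # Python's sort is stable (also with reverse=True), so sorting by the
    # tie-breaker first and by the requested field second yields a total order.
    by_fingerprint = sorted(entries, key=itemgetter("fingerprint"))
    return sorted(by_fingerprint, key=itemgetter(order_field), reverse=descending)


if __name__ == "__main__":
    for entry in sort_deterministically(relays, "consensus_weight"):
        print(entry["fingerprint"], entry["consensus_weight"])
```

Because fingerprints are unique, the result no longer depends on the order in which the entries were read, so two instances holding the same data should return the same sequence.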

comment:3 in reply to:  1 Changed 6 months ago by karsten

Replying to iwakeh:

RdnsLookupRequest leads to many differences between the summary and the details documents of two instances even when run from the same machine within a short period of time. These are also the cause of most differences between the two main tp.o instances.

True. We might consider running these lookups in CollecTor and providing results using a new data format. Onionoo would then fetch and process those files.

comment:4 in reply to:  2 ; Changed 6 months ago by karsten

Replying to iwakeh:

Another non-deterministic element in Onionoo docs is the order of entries. When there is a tie in the order parameter used, the resulting order is arbitrary and can differ between instances. A simple solution could be to always sort by fingerprint in case of a tie.

True, that sounds easy enough to do.

comment:5 in reply to:  4 ; Changed 6 months ago by iwakeh

Replying to karsten:

Replying to iwakeh:

Another non-deterministic element in Onionoo docs is the order of entries. When there is a tie in the order parameter used, the resulting order is arbitrary and can differ between instances. A simple solution could be to always sort by fingerprint in case of a tie.

True, that sounds easy enough to do.

Please find a suggestion for a patch here.

comment:6 Changed 6 months ago by iwakeh

Maybe also use the valid-after timestamp for the value in the file out/update, or remove the file?

comment:7 in reply to:  5 ; Changed 6 months ago by karsten

Replying to iwakeh:

Replying to karsten:

Replying to iwakeh:

Another non-deterministic element in Onionoo docs is the order of entries. When there is a tie in the order parameter used, the resulting order is arbitrary and can differ between instances. A simple solution could be to always sort by fingerprint in case of a tie.

True, that sounds easy enough to do.

Please find a suggestion for a patch here.

Commit 0b2e5a2 looks good! Can you rebase to master, add a change log entry, and provide a metrics-web patch for updating the specification?

comment:8 in reply to:  6 ; Changed 6 months ago by karsten

Replying to iwakeh:

Maybe also use the valid-after timestamp for the value in the file out/update, or remove the file?

That might even be part of #16513. We cannot remove that file, because we need it for updating the index in the server part. But we could use something other than the current system time as the timestamp. Not sure if the valid-after timestamp is the best choice, because we have two such timestamps, one for relays and one for bridges. Maybe we could use the last-modified time of status/summary here. Untested, just an idea.

comment:9 in reply to:  8 ; Changed 6 months ago by iwakeh

Replying to karsten:

Replying to iwakeh:

Maybe also use the valid-after timestamp for the value in the file out/update, or remove the file?

That might even be part of #16513. We cannot remove that file, because we need it for updating the index in the server part. But we could use something other than the current system time as the timestamp. Not sure if the valid-after timestamp is the best choice, because we have two such timestamps, one for relays and one for bridges. Maybe we could use the last-modified time of status/summary here. Untested, just an idea.

The last-modified time of status/summary is not deterministic across instances. What about using the later of the two valid-after timestamps? This would, in a sense, synchronize Onionoo with 'Tor time'.
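A minimal sketch of that idea, in Python rather than Onionoo's Java. It assumes that out/update holds a single timestamp in milliseconds since the epoch, which this ticket does not confirm; the function name and the example times are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def update_timestamp_millis(relay_valid_after, bridge_valid_after):
    """Return the later of the two valid-after times as epoch milliseconds."""
    latest = max(relay_valid_after, bridge_valid_after)
    # Integer division by a timedelta avoids floating-point rounding.
    return (latest - EPOCH) // timedelta(milliseconds=1)


if __name__ == "__main__":
    # Made-up example values, not taken from real descriptors.
    relay_valid_after = datetime(2018, 1, 30, 12, 0, 0, tzinfo=timezone.utc)
    bridge_valid_after = datetime(2018, 1, 30, 11, 52, 23, tzinfo=timezone.utc)
    print(update_timestamp_millis(relay_valid_after, bridge_valid_after))
```

Since the value depends only on the processed descriptors and not on the local clock or file system, two instances that have processed the same data would write the same value.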

comment:10 in reply to:  7 Changed 6 months ago by iwakeh

Replying to karsten:

...

Please find a suggestion for a patch here.

Commit 0b2e5a2 looks good! Can you rebase to master, add a change log entry, and provide a metrics-web patch for updating the specification?

Sure. I isolated this part as a child ticket.

comment:11 in reply to:  9 ; Changed 6 months ago by karsten

Replying to iwakeh:

Replying to karsten:

Replying to iwakeh:

Maybe also use the valid-after timestamp for the value in the file out/update, or remove the file?

That might even be part of #16513. We cannot remove that file, because we need it for updating the index in the server part. But we could use something other than the current system time as the timestamp. Not sure if the valid-after timestamp is the best choice, because we have two such timestamps, one for relays and one for bridges. Maybe we could use the last-modified time of status/summary here. Untested, just an idea.

The last-modified time of status/summary is not deterministic across instances. What about using the later of the two valid-after timestamps? This would, in a sense, synchronize Onionoo with 'Tor time'.

Yes, sounds good!

comment:12 in reply to:  11 Changed 6 months ago by iwakeh

Replying to karsten:

...

The last-modified time of status/summary is not deterministic across instances. What about using the later of the two valid-after timestamps? This would, in a sense, synchronize Onionoo with 'Tor time'.

Yes, sounds good!

Child ticket #25091.
