Opened 8 months ago

Last modified 6 months ago

#33391 needs_review enhancement

Add new metadata fields and definitions

Reported by: acute
Owned by: metrics-team
Priority: Medium
Milestone:
Component: Metrics/Onionperf
Version:
Severity: Normal
Keywords: metrics-team-roadmap-2020
Cc: acute
Actual Points:
Parent ID: #33323
Points: 1
Reviewer: karsten
Sponsor: Sponsor59-must

Description

Define the instance metadata fields to help us differentiate experimental measurements.

Change History (10)

comment:1 Changed 8 months ago by acute

Points: 1

comment:2 Changed 8 months ago by gaba

Keywords: metrics-team-roadmap-2020Q1 added

comment:3 Changed 8 months ago by gaba

Parent ID: #33322 → #33323

comment:4 Changed 7 months ago by acute

To do this, I'd like to propose adding a function to the analysis code that allows metadata from a JSON file to be included. The file would hold a dictionary of metadata. Possible fields could describe the experiment type (e.g., EXP_TYPE), the git commit hash of the code that is running (e.g., EXP_COMMIT), or anything else that might be useful.
The JSON file would be specified on the command line as an optional argument:

onionperf measure ... --metadata <meta.json>
or
onionperf analyze ... --metadata <meta.json>

The key-value pairs would be added to the resulting JSON and tpf outputs.

This can help our team tell experimental and non-experimental data apart, and it also lets us and Mike distinguish between experiments at a fine granularity when several are running. Thoughts?
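
A minimal sketch of how the proposal above could look in Python, not Onionperf's actual code. The program name `onionperf-sketch`, the helper names `build_parser`, `load_metadata` and `attach_metadata`, and the choice to nest the pairs under a single `metadata` key are all assumptions made for illustration; only the `--metadata` option and the example field names come from this ticket.

```python
import argparse
import json


def build_parser():
    """Build a parser with an optional --metadata flag on both modes (hypothetical)."""
    parser = argparse.ArgumentParser(prog="onionperf-sketch")
    subparsers = parser.add_subparsers(dest="mode", required=True)
    for mode in ("measure", "analyze"):
        sub = subparsers.add_parser(mode)
        sub.add_argument("--metadata", metavar="FILE", default=None,
                         help="JSON file holding a dictionary of metadata")
    return parser


def load_metadata(path):
    """Read the metadata file and check that it holds a JSON object."""
    with open(path, encoding="utf-8") as handle:
        metadata = json.load(handle)
    if not isinstance(metadata, dict):
        raise ValueError("metadata file must contain a JSON object")
    return metadata


def attach_metadata(analysis, metadata):
    """Return a copy of the analysis output with the metadata under one key.

    Nesting under "metadata" is an assumption; the ticket only says the
    key-value pairs are added to the output.
    """
    result = dict(analysis)
    result["metadata"] = dict(metadata)
    return result
```

Because the flag defaults to `None`, runs that omit it would produce output without any metadata, which keeps the argument optional as proposed.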

comment:5 Changed 7 months ago by acute

Status: new → needs_review

comment:6 Changed 7 months ago by karsten

Reviewer: karsten

I'll take a look later today.

comment:7 Changed 7 months ago by karsten

The general plan sounds good to me!

Just one question: What's the point of adding that optional argument to the measure mode? Would the metadata be included in the logs somewhere for the analyze mode to extract?

comment:8 Changed 7 months ago by acute

On Monday, irl, karsten and I discussed three ways of identifying data coming from different instances of Onionperf:

  1. By source. Here we rely on sources being named in a certain way or coming from a certain domain.

Pros: The easiest method to implement; no changes required in Onionperf.
Cons: Does not scale, and relies on us and others remembering conventions and occasionally even updating metrics-web in the future.

  2. Using a fixed field to record whether the data is experimental or not. This could be true by default, and we could set it to false for the instances we run.

Pros: Easy to implement, and it is the field we would rely on in the near future.
Cons: Cannot be changed later on to include other types of measurements, and does not give researchers any granularity to differentiate between their own experiments.

  3. The method described in my first comment, using a variable METADATA field. This would be populated by the researcher/admin with useful information and then added to the JSON and tpf outputs when analysis is performed.

Pros: Very flexible; allows admins and researchers to include a wide variety of metadata. Does not put any restrictions on what we can add in the future. Simple to implement.
Cons: Relies on the researcher or admin to populate it. It would require custom code to parse if used in metrics-web, and to be useful there everyone would need to agree on semantics.
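
To make the difference between options 2 and 3 concrete, here is a rough sketch of how a consumer might read each one. The field name `experimental`, the `metadata` key, and both function names are hypothetical; the ticket does not fix any of them.

```python
def is_experimental(analysis):
    """Option 2: read a fixed boolean field; a missing field counts as experimental."""
    return bool(analysis.get("experimental", True))


def select_by_metadata(analyses, **wanted):
    """Option 3: keep analyses whose metadata matches every requested key-value pair."""
    selected = []
    for analysis in analyses:
        metadata = analysis.get("metadata", {})
        if all(metadata.get(key) == value for key, value in wanted.items()):
            selected.append(analysis)
    return selected
```

For example, `select_by_metadata(analyses, EXP_TYPE="some-experiment")` would return only the outputs tagged with that experiment type, which the single boolean of option 2 cannot express. It also shows the cost noted above: the filter only works if everyone agrees on what the keys and values mean.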


> Just one question: What's the point of adding that optional argument to the measure mode? Would the metadata be included in the logs somewhere for the analyze mode to extract?

The default measure mode calls the analysis function at midnight. The analysis function would append the metadata to the JSON and tpf outputs, but we need a way to signal that we want this data appended, which is why the measure mode would take the optional argument too.

Let me know if I've missed anything!

comment:9 Changed 6 months ago by gaba

Keywords: metrics-team-roadmap-2020 added; metrics-team-roadmap-2020Q1 removed
Sponsor: Sponsor59

comment:10 Changed 6 months ago by karsten

Sponsor: Sponsor59 → Sponsor59-must

Moving to Sponsor59-must, because we should really do these in order to call Sponsor59 done.
