Opened 9 months ago

Last modified 3 weeks ago

#25644 accepted task

Write white paper about CollecTor's data processing

Reported by: iwakeh
Owned by: irl
Priority: Medium
Milestone:
Component: Metrics/CollecTor
Version:
Severity: Normal
Keywords:
Cc: metrics-team
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor: Sponsor13

Description

This should become a Tor Tech Report detailing Tor Metrics data collection, aggregation, and presentation, as well as an overview of why and how data is collected, compared to other available frameworks.

This is activity 1.1 of Sponsor 13 and covers the data pipeline up to activity 2 (see ticket #24217).

Change History (6)

comment:1 Changed 9 months ago by iwakeh

Owner: changed from metrics-team to iwakeh
Status: changed from new to accepted

Starting on this with the background from last year's tech report.

Which frameworks should we not forget to look at?

comment:2 in reply to:  1 Changed 9 months ago by karsten

Replying to iwakeh:

> Starting on this with the background from last year's tech report.

Sounds great!

> Which frameworks should we not forget to look at?

Uhhmmm, fine question. How about Java EE and Spring?

comment:3 Changed 9 months ago by iwakeh

JSR 352 (Batch Applications for the Java Platform) was finalized in 2013. Since its main implementations are Java EE 7 and Spring Batch, this activity should cover both. Other suitable frameworks come from the streaming and data-processing fields. They usually focus on real-time processing, which is not CollecTor's concern, but they also provide solutions for the main batch processing tasks: retrieving data from a source, processing it, and writing it. We should therefore also look at Apache Flink, a streaming framework that explicitly features its own Batch DataSet API. Flink is also well integrated into Apache's Java tooling and framework environment.

Thus, the list of batch frameworks we evaluate is Java EE and Spring (as JSR 352 implementations) and Flink.
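
To make the comparison criteria concrete, here is a minimal, framework-independent sketch of the three batch tasks named above (retrieve, process, write), run in chunks. It is only an illustration: it does not use the JSR 352, Spring Batch, or Flink APIs, and every name in it (`run_batch`, `read_descriptors`, `normalize`, the sample strings) is a hypothetical stand-in rather than anything taken from CollecTor.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

In = TypeVar("In")
Out = TypeVar("Out")


def chunks(items: Iterable[In], size: int) -> Iterator[List[In]]:
    """Group items from the source into lists of at most `size` elements."""
    chunk: List[In] = []
    for item in items:
        chunk.append(item)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        # Emit the final, possibly shorter, chunk.
        yield chunk


def run_batch(
    read: Callable[[], Iterable[In]],
    process: Callable[[In], Out],
    write: Callable[[List[Out]], None],
    chunk_size: int = 2,
) -> int:
    """Retrieve items, process each one, and write results chunk by chunk.

    Returns the number of items written.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    written = 0
    for chunk in chunks(read(), chunk_size):
        results = [process(item) for item in chunk]
        write(results)
        written += len(results)
    return written


# Hypothetical stand-ins for a source, a processing step, and a sink.
def read_descriptors() -> Iterator[str]:
    yield from ["relay A ", " relay B", "relay C"]


def normalize(raw: str) -> str:
    return raw.strip().upper()


sink: List[List[str]] = []
total = run_batch(read_descriptors, normalize, sink.append, chunk_size=2)
```

Each framework on the list provides its own version of these three roles plus scheduling, restart, and error handling, which is the part worth comparing.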

comment:4 Changed 7 months ago by iwakeh

Owner: changed from iwakeh to metrics-team
Status: changed from accepted to assigned

comment:5 Changed 5 weeks ago by karsten

Sponsor: set to Sponsor13
Summary: changed from "Write white paper about CollecTor's data processing (Sponsor13, 1)" to "Write white paper about CollecTor's data processing"

We changed the plan a bit: we are now evaluating a rewrite of CollecTor's relaydescs module in Python (#28320). The remaining parts of the report stayed the same. Keeping this ticket for writing the report once a working prototype in Python exists.

comment:6 Changed 3 weeks ago by irl

Owner: changed from metrics-team to irl
Status: changed from assigned to accepted