Opened 6 weeks ago

Last modified 2 days ago

#24175 needs_revision enhancement

Use an embedded Jetty in metrics-web and use metrics-base as build environment.

Reported by: iwakeh Owned by: iwakeh
Priority: Medium Milestone:
Component: Metrics/Website Version:
Severity: Normal Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Avoid being dependent on external Tomcat instances.
Use metrics-base and Metrics' project structure.

These will be implemented based on #24174.

Child Tickets

TicketTypeStatusOwnerSummary
#24576defectclosedmetrics-teamCreation

Change History (11)

comment:1 Changed 6 weeks ago by iwakeh

Owner: changed from metrics-team to iwakeh
Status: newaccepted

comment:2 Changed 4 weeks ago by iwakeh

Status: acceptedneeds_review

Please find a first patch branch.
There is only one commit as this consists in moving files around, replacing scripts accordingly etc., all in all nothing that can be easily done incrementally.
Please test thoroughly and with some more data than I have available with some emphasis on the R related functionality.

What changed

Major changes are the move to an embedded Jetty and finally the move to using metrics-base only.
For the latter, the interim step (from a year ago) of first making all modules use a common build environment really helped with the current step to finally apply Metrics' java structure and use metrics-base.

I'm also going for a two-step-solution for configuration and operation issues here:
The current patch replaces all operational/deployment bash scripts with ant tasks and keeps the overall directory structure, as well as the current configuration situation, i.e., this patch is taylored to the current deployment server and not yet to easy test configuration, which is tackled in ticket #24328.
I didn't change any package names nor script (R, sql, python) names nor do we have a consistent logging approach as these are all new tickets: #24328, #24329, #24330, (and the ticket for moving ernie and renaming packages, need to look up the number).

Comparison to the old situation:

ant run-web-prepare is the former run-web.sh.
The scripts from shared/bin are replaced by ant tasks: advbwdist, connbidirect, hidserv, clients, legacy, onionperf, webstats, collectdescs. Task make-data-available copies everything as the earlier script 99-copy-stats-files.sh.
The R server start.sh is replaced by ant run-rserver.
The various modules run in their own sub-folders (if not changed in build.xml these will be generated/modules/<submodule> and logging for the logback parts happens to generated/modules/logs/...).

ant legacy-create-config copies the legacy config template to ${basedir}/legacy.config, which is used by ant legacy as config file. This is just because I found it, but it shouldn't be this way, discussion about such topics are referred to ticket #24328.

The creation of metrics-lib's javadoc has its own sub-task, which is called as part of the war file creation.

How to test web related things and new modules:

It is not necessary to run all modules nor to start an R server.
The war should run fine locally using java -DLOGBASE=logs -jar generated/dist/metrics-web-1.0.0-dev.war and then open 127.0.0.1:8080 in a browser, of course no graphs or tables will be supplied.

Open topic

On a fresh installation with only current data running detection.py only generates an empy (except for the header) userstats-ranges.csv, which prevents further processing.
Also, many other tasks take ages to complete on a fresh system and the chain of deployment in build.xml needs some tweaking.
Please, take a closer look and improve during this first round.

The objective for now should be to check if all works as expected with the minor changes mentioned above. The tweaking and improvement of configuration etc. should take place later in the newly created tickets.

Last edited 4 weeks ago by iwakeh (previous) (diff)

comment:3 Changed 4 weeks ago by karsten

Status: needs_reviewneeds_revision

Wow, that's a commit with three thousand, six hundred twenty-six lines.

Let's try to split that commit into five or ten or twenty commits to make the review at least somewhat more plausible.

I can help (next week, though, not this week)! I've split numerous commits in the past. The four Onionoo tickets around versions and recommended versions started out as a single commit which I then split up into four. It facilitates reviews a lot, and in this case I'd even say that it enables reviews.

Some ideas:

  • Let's start with changes to everything that gets the website running and switch to the data-processing modules after that.
  • We could even make changes to the various web-related parts step by step. For example, we could have one commit for updating the Rserve part and another one for producing the WAR file. Or maybe even more, I haven't looked.
  • The module changes can happen module by module, too.
  • Small changes that are unrelated to the overall change should come as separate commits. Some very quick suggestions for changes that could be separate commits:
    • Remove outdated HACKING file.
    • Take out that IOException in RelayDescriptorDatabaseImporter.
    • Remove unused imports from CollectorDirectoryProvider.
    • Update StrSubstitutor from Apache Lang to Lang3.
    • Handle null value in RObjectGenerator.
    • Remove unused imports from DirectoryListingTest. Possibly combined with the other removal of unused imports.

Oh, and regarding the open topics, let's try to handle them prior to merging. What data can I provide?

comment:4 in reply to:  3 Changed 4 weeks ago by iwakeh

Replying to karsten:

Wow, that's a commit with three thousand, six hundred twenty-six lines.

Well, that is a little overstating it as there are many moved folders, which
won't be reviewed file by file or line by line ;-)

...
Let's try to split that commit into five or ten or twenty commits to make the review at least somewhat more plausible. ...

I really didn't want to have half baked commits, with two build environments around etc. that causes trouble without need (btw the web part doesn't really deviate much from ExoneraTor).
It shouldn't be a problem to review a moved package tree.

But of course, I see your point and will try to come up with a few more commits.

Oh, and regarding the open topics, let's try to handle them prior to merging. What data can I provide?

I clearly stated above (naming it the first round) that I didn't intend to suggest any merge!

Data might help, true. Just wait with providing anything, as I might define a clearer set of data needed or make it work in this round.

comment:5 Changed 4 weeks ago by iwakeh

Status: needs_revisionaccepted

comment:6 Changed 9 days ago by iwakeh

Status: acceptedneeds_review

First of all thanks for the data, it helped a lot!

Please review the thirty something commits on this patch branch. I didn't change anything, only ironed out tiny things. The commits are for ease of review and partially examples for refactoring the modules, not so much suited for cherry picking.

I created a ticket for removing the dependency on server locale.

Most of comment:2 applies. The open topic is solved with having data. But there is a reproducibility issue, which this is not part of this restructuring ticket. The reproducibility issue concerns the two oldest dbs: userstats and tordir. The R issue (mentioned in comment:2) in the clients module doesn't show up when a sufficient amount of data is available.

What is not really reproducible from scratch is the tordir database (the oldest db here). This issue has nothing to do with restructuring as it concerns the initial db setup. The refresh_all call in the legacy module fails at call refresh_network_size. After adding more data and running the other refresh functions by hand (in arbitrary order): refresh_all suddenly works. This should be investigated in its own ticket in addition to the inconsistencies of the main db and the db setup, i.e. there are still functions in the data dump that are not in the sql scripts, for example: refresh_oldest_week.

After ant war and jar: steps for setting up metrics-web: ant run-web-prepare prepares everything, ant run-rserver starts the R daemon, and java -DLOGBASE=logs -jar generated/dist/metrics-web-1.0.0-dev.war starts jetty.

Two more tickets needed: one each for making the setup of userstats and tordir db reproducible from scratch.

(The onionperf and webstats commits are good examples for refactoring new modules.)

Last edited 9 days ago by iwakeh (previous) (diff)

comment:7 Changed 8 days ago by iwakeh

Note: two new ant tasks need to be added, but won't hinder review, one for running pgTAP tests and one for generating jsps from xml specs.

comment:8 Changed 3 days ago by iwakeh

See ticket 23830 comment 8 for a complete list of debian stretch packages.

comment:9 Changed 2 days ago by karsten

Status: needs_reviewneeds_revision

Thanks for splitting up that huge commit. These smaller commits are much easier to review!

I made it through commits 8627a0a to 6b54501 in your task-24175-stepbystep branch and have a few remarks:

  • In e9b4bb5 you're moving the users-q-and-a.txt file, but you're not updating links in metrics.json. That's going to produce a couple dead links.
  • Commit 9a80a5b says it's a fixup for b17c1bf, but I believe it's rather a fixup for 2f5fd6f. And I believe it's an incomplete fixup commit, leaving some accidentally made changes unchanged. I haven't checked in detail.

Those are the only findings from reading the diff. I might discover more issues while testing.

How do we proceed? Do you mind if I make these remaining fixes while rebasing your branch to master? I'd then share the rebased branch here while testing it locally.

comment:10 in reply to:  9 Changed 2 days ago by iwakeh

Replying to karsten:

Thanks for splitting up that huge commit. These smaller commits are much easier to review!

Good, that this worked for you!

I made it through commits 8627a0a to 6b54501 in your task-24175-stepbystep branch and have a few remarks:

  • In e9b4bb5 you're moving the users-q-and-a.txt file, but you're not updating links in metrics.json. That's going to produce a couple dead links.

Ergh, forgot to adapt these links.

  • Commit 9a80a5b says it's a fixup for b17c1bf, but I believe it's rather a fixup for 2f5fd6f. And I believe it's an incomplete fixup commit, leaving some accidentally made changes unchanged. I haven't checked in detail.

Please, continue the check. The diffs I checked before committing seemed fine, but you've made many of the changes in the concerned files.

Those are the only findings from reading the diff. I might discover more issues while testing.

How do we proceed? Do you mind if I make these remaining fixes while rebasing your branch to master? I'd then share the rebased branch here while testing it locally.

No objection. The new modules in the old style were not yet merged. So, this should work.

comment:11 Changed 2 days ago by karsten

Okay, I'll work on rebasing your branch to master and fixing those two minor issues next. (Hopefully this evening, possibly later over the weekend, worst case on Monday.)

Note: See TracTickets for help on using tickets.