Opened 2 years ago

Last modified 2 days ago

#24532 needs_review defect

Make metrics-web runs independent of server locale

Reported by: iwakeh Owned by: metrics-team
Priority: Medium Milestone:
Component: Metrics/Statistics Version:
Severity: Normal Keywords:
Cc: metrics-team Actual Points:
Parent ID: Points:
Reviewer: irl Sponsor:

Description

metrics-web script runs expect numbers formatted using the US locale. Java output is currently server locale dependent.
This affects mostly those of the roughly fifty 'String.format' calls that format numbers. Maybe, more.

Ideally, the java code should provide the correct output independent of the environments locale.

(Quick fix as applied in #24175: set locale on the java command line.)

Child Tickets

Change History (5)

comment:1 Changed 2 years ago by iwakeh

Summary: Make metrics-web runs independant of server localeMake metrics-web runs independent of server locale

comment:2 Changed 4 days ago by karsten

Cc: metrics-team added
Component: Metrics/WebsiteMetrics/Statistics
Reviewer: irl
Status: newneeds_review

This is as simple as setting the default locale for the instance of the JVM at the beginning of the main method. For metrics-web that's the patch below:

diff --git a/src/main/java/org/torproject/metrics/stats/main/Main.java b/src/main/java/org/torproject/metrics/stats/main/Main.java
index 4ea15b46..3a4baf75 100644
--- a/src/main/java/org/torproject/metrics/stats/main/Main.java
+++ b/src/main/java/org/torproject/metrics/stats/main/Main.java
@@ -8,10 +8,11 @@ import java.io.IOException;
 import java.lang.reflect.InvocationTargetException;
 import java.nio.file.Files;
 import java.nio.file.StandardCopyOption;
 import java.util.ArrayList;
 import java.util.List;
+import java.util.Locale;
 
 public class Main {
 
   private static final Logger log = LoggerFactory.getLogger(Main.class);
 
@@ -27,10 +28,12 @@ public class Main {
   /** Start the metrics update run. */
   public static void main(String[] args) {
 
     log.info("Starting metrics update run.");
 
+    Locale.setDefault(Locale.US);
+
     File[] outputDirs = new File[] { modulesDir, statsDir };
     for (File outputDir : outputDirs) {
       if (outputDir.exists()) {
         continue;
       }

That main method invokes all the other main methods of data-processing modules. But setting the locale in this first main method is sufficient for all subsequent module runs in the same JVM.

I tested this line in a throw-away EC2 instance, and it worked just fine.

If this fix looks good, we should apply the same fix to all other main methods of our various code bases:

org.torproject.metrics.collector.Main
org.torproject.metrics.exonerator.ExoneraTorDatabaseImporter
org.torproject.metrics.exonerator.ServerMain
org.torproject.metrics.onionoo.cron.Main
org.torproject.metrics.onionoo.server.ServerMain
org.torproject.metrics.web.ServerMain
org.torproject.metrics.web.UpdateNews

In case of Onionoo we can then simplify all String.format invocations containing a locale, like String.format(Locale.US, "%.12f", weights[i]));. It will be sufficient to specify the locale once at the beginning of the execution. Doesn't hurt to keep it there, but having less code wouldn't hurt, either.

comment:3 Changed 3 days ago by irl

Status: needs_reviewmerge_ready

Lgtm. I don't see any issues with adding this to the other code bases either.

comment:4 Changed 2 days ago by karsten

Status: merge_readyneeds_review

Please take another look at commit b301b27 in my task-24532 branch that contains the change above and that has all further Locale.US parts removed. The patch is a bit longer than the one above, so it might be good to have another quick review. It's tested using metrics-test integration tests. Thanks!

comment:5 Changed 2 days ago by karsten

I just pushed another commit 194068b to that branch that does the same thing with the time zone. In fact, there was a bug with the clients module not explicitly specifying UTC for timestamps imported into the database and relying on the default system time zone. This is fixed with that commit. Tested using metrics-test integration tests.

Note: See TracTickets for help on using tickets.