#21223 closed enhancement (fixed)

use the empty field consistently throughout the data sets

Reported by: iwakeh
Owned by: metrics-team
Priority: Medium
Milestone:
Component: Metrics/Website
Version:
Severity: Minor
Keywords:
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

userstats-combined.csv is the only CSV data set provided on the Metrics website that writes an empty value as a quoted empty string (""); all others simply leave the field empty.
Example rows:
userstats-combined.csv: 2017-01-09,bridge,sg,scramblesuit,"",10,0,8
clients.csv: 2011-03-06,relay,af,,,,,170,11

The empty field should also be used in userstats-combined.csv.
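
A quick way to check whether a CSV file still contains quoted empty fields is to count the lines on which `""` occupies a whole field. The following is a minimal sketch, not part of the Metrics code base; the helper name `count_quoted_empty` is hypothetical, and the pattern is a heuristic that assumes no quoted field contains the sequence `,"",` itself.

```shell
# Hypothetical helper: reads CSV on stdin and prints the number of lines
# that contain at least one quoted empty field ("").
count_quoted_empty() {
  # A quoted empty field is "" bounded by line start or a comma on the left
  # and by a comma or line end on the right.
  # grep -c exits non-zero when the count is 0, hence the "|| true".
  grep -c -E '(^|,)""(,|$)' || true
}

# The two example rows quoted in this ticket.
bridge_row='2017-01-09,bridge,sg,scramblesuit,"",10,0,8'
relay_row='2011-03-06,relay,af,,,,,170,11'

quoted_count=$(printf '%s\n' "$bridge_row" | count_quoted_empty)
plain_count=$(printf '%s\n' "$relay_row" | count_quoted_empty)

echo "userstats-combined.csv sample: $quoted_count"
echo "clients.csv sample: $plain_count"
```

On a real file the same helper can be fed with `count_quoted_empty < userstats-combined.csv`; a result of 0 would indicate that the file uses plain empty fields throughout.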

Change History (6)

comment:1 Changed 11 months ago by karsten

Status: new → needs_review

Makes sense. I'm currently running the following patch on the server, after testing it locally. We'll see later today whether that was successful or not:

diff --git a/shared/bin/80-run-clients-stats.sh b/shared/bin/80-run-clients-stats.sh
index fe93e44..f0ac1f6 100755
--- a/shared/bin/80-run-clients-stats.sh
+++ b/shared/bin/80-run-clients-stats.sh
@@ -13,7 +13,7 @@ done
 
 echo `date` "Exporting results."
 psql -c 'COPY (SELECT * FROM estimated) TO STDOUT WITH CSV HEADER;' userstats > userstats.csv
-psql -c 'COPY (SELECT * FROM combined) TO STDOUT WITH CSV HEADER;' userstats > userstats-combined.csv
+psql -c 'COPY (SELECT * FROM combined) TO STDOUT WITH CSV HEADER;' userstats | sed 's/""//g' > userstats-combined.csv
 
 echo `date` "Running censorship detector."
 R --slave -f userstats-detector.R > /dev/null 2>&1

comment:2 Changed 11 months ago by iwakeh

Wouldn't a constant field in the select statement do the trick?
A CSV export of
select date, country, transport, null as version, frac, low, high from combined
ought to give the empty field.

I didn't test this, but there should be a way to create the CSV in the wanted format and avoid post-processing.
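
One further argument against post-processing, sketched here under assumptions: the `sed 's/""//g'` step from comment 1 removes every pair of adjacent double quotes, not only those that form an empty field. In CSV, a doubled quote inside a quoted field is an escaped quote character, so such a field would be silently altered. The second row below is hypothetical and was made up for illustration; nothing in the ticket suggests that userstats-combined.csv contains such a value today.

```shell
# Same substitution as in the patch from comment 1.
strip_all() {
  sed 's/""//g'
}

# Row taken from the ticket description.
sample_row='2017-01-09,bridge,sg,scramblesuit,"",10,0,8'
# Hypothetical row with an escaped quote inside a quoted field: say "hi"
hypothetical_row='x,"say ""hi""",y'

stripped_sample=$(printf '%s\n' "$sample_row" | strip_all)
stripped_hypothetical=$(printf '%s\n' "$hypothetical_row" | strip_all)

echo "$stripped_sample"
echo "$stripped_hypothetical"
```

The first row comes out as intended with an empty field, while the hypothetical row loses its escaped quotes and its value changes from `say "hi"` to `say hi`. Producing the wanted format directly in the export avoids this class of problem.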

comment:3 Changed 11 months ago by karsten

Good point. How's this (together with undoing the earlier change):

diff --git a/modules/clients/init-userstats.sql b/modules/clients/init-userstats.sql
index 314ff58..87929b9 100644
--- a/modules/clients/init-userstats.sql
+++ b/modules/clients/init-userstats.sql
@@ -706,7 +706,7 @@ CREATE OR REPLACE VIEW combined AS SELECT
   a.transport,
 
   -- The IP address version of this estimate, which is always ''.
-  ''::TEXT as version,
+  NULL::TEXT as version,
 
   -- Estimated fraction of nodes reporting directory requests, which is
   -- used to extrapolate observed requests to estimated total requests in
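
The reason this change affects the CSV output is that PostgreSQL's `COPY ... WITH CSV` writes a NULL as an unquoted empty field by default, whereas an empty string is written as `""` so that the two can be told apart. The sketch below reuses the `psql` invocation style of the export script; the function name `export_version_probe` and the column aliases are hypothetical, and it assumes a reachable PostgreSQL server (the query touches no tables, so any database name works).

```shell
# Hypothetical probe: exports one row containing an empty string, a NULL,
# and a number, using the same COPY options as the export script.
# Assumption: psql can connect to the given database (default: userstats).
export_version_probe() {
  psql -c "COPY (SELECT ''::TEXT AS empty_string, NULL::TEXT AS null_value, 10 AS frac) TO STDOUT WITH CSV HEADER;" "${1:-userstats}"
}
```

Calling `export_version_probe` should print a header line followed by a data row in which the empty string appears as `""` and the NULL appears as nothing between two commas, which is the difference between the old and the new definition of the `version` column.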

comment:4 Changed 11 months ago by karsten

This seems to work just fine on the server. Here's the patch to review. Thanks!

comment:5 Changed 11 months ago by iwakeh

Status: needs_review → merge_ready

Seems to do what was intended :-)

comment:6 Changed 11 months ago by karsten

Resolution: fixed
Status: merge_ready → closed

Great, merged. Thanks! Closing.