Opened 9 years ago

Closed 6 years ago

#5792 closed defect (wontfix)

Fix metrics-db segfault which might be related to parsing assignment*.gz files

Reported by: karsten
Owned by: karsten
Priority: Medium
Milestone:
Component: Metrics/CollecTor
Version:
Severity:
Keywords:
Cc:
Actual Points:
Parent ID:
Points: 4
Reviewer:
Sponsor:

Description

Since May 6, 10:08 CEST, metrics-db's JVM sometimes segfaults with this output:

     [java] Java Result: 134
     [java] #
     [java] # A fatal error has been detected by the Java Runtime Environment:
     [java] #
     [java] #  SIGSEGV (0xb) at pc=0x00007febde77bdec, pid=6579, tid=140651047905024
     [java] #
     [java] # JRE version: 6.0_18-b18
     [java] # Java VM: OpenJDK 64-Bit Server VM (14.0-b16 mixed mode linux-amd64 )
     [java] # Derivative: IcedTea6 1.8.13
     [java] # Distribution: Debian GNU/Linux 6.0.4 (squeeze), package 6b18-1.8.13-0+squeeze1
     [java] # Problematic frame:
     [java] # V  [libjvm.so+0x3c3dec]
     [java] #
     [java] # An error report file with more information is saved as:
     [java] # /srv/metrics.torproject.org/db/hs_err_pid6579.log
     [java] #
     [java] # If you would like to submit a bug report, please include
     [java] # instructions how to reproduce the bug and visit:
     [java] #   http://icedtea.classpath.org/bugzilla
     [java] #

A quick analysis shows that the segfault happens while processing .gz-compressed bridge pool assignment files. I could not reproduce the problem, neither by reading the .gz file in question 100 times in a row nor by setting up a separate metrics-db instance. I switched from commons-compress-1.0.jar to commons-compress-1.4.jar anyway, even though the change log doesn't mention anything about a possible segfault.
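For reference, the repeated-read attempt can be sketched roughly as follows. This is a minimal sketch and not the metrics-db code: it uses the JDK's java.util.zip.GZIPInputStream instead of commons-compress so that it is self-contained, it assumes a Java 7 or later runtime, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of metrics-db.
class GzRereadCheck {

    /** Decompresses the file once and returns the number of decompressed bytes. */
    static long countDecompressedBytes(Path gzFile) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        try (InputStream in = new GZIPInputStream(Files.newInputStream(gzFile))) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                total += read;
            }
        }
        return total;
    }

    /**
     * Reads the file the given number of times and returns the decompressed
     * size. Throws if two runs disagree, which would hint at a decoding problem.
     */
    static long rereadAll(Path gzFile, int runs) throws IOException {
        long expected = -1;
        for (int i = 0; i < runs; i++) {
            long size = countDecompressedBytes(gzFile);
            if (expected == -1) {
                expected = size;
            } else if (size != expected) {
                throw new IllegalStateException(
                        "Run " + i + " read " + size + " bytes, expected " + expected);
            }
        }
        return expected;
    }

    public static void main(String[] args) throws IOException {
        // The path of the .gz file to test is passed as the first argument.
        Path gzFile = Paths.get(args[0]);
        long size = rereadAll(gzFile, 100);
        System.out.println("100 reads completed, " + size + " decompressed bytes each");
    }
}
```

A loop like this only exercises the decompression path in isolation, which may be why it did not trigger the crash: a fault inside libjvm.so can depend on heap state or JIT compilation in the long-running process.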

Change History (6)

comment:1 Changed 9 years ago by karsten

No, commons-compress-1.4.jar didn't fix the problem. I turned off parsing .gz-compressed assignment files until I have an idea how to even approach debugging a JVM segfault. If we miss assignments from .gz-compressed files until the bug is fixed, we'll have to sanitize old assignment files again.
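The workaround amounts to filtering out .gz-compressed files before parsing. A minimal sketch of that idea, assuming the files are selected by name; the class, method, and file names here are hypothetical and not taken from metrics-db:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of metrics-db.
class AssignmentFileFilter {

    /** Returns true if the file name ends in .gz, ignoring case. */
    static boolean isGzCompressed(String fileName) {
        return fileName.toLowerCase(Locale.ROOT).endsWith(".gz");
    }

    /** Returns the file names that are not .gz-compressed, in their original order. */
    static List<String> withoutGzFiles(List<String> fileNames) {
        List<String> kept = new ArrayList<String>();
        for (String fileName : fileNames) {
            if (!isGzCompressed(fileName)) {
                kept.add(fileName);
            }
        }
        return kept;
    }
}
```

Any file skipped this way stays unprocessed, so the skipped names would need to be recorded somewhere if the old assignment files are to be sanitized again later.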

comment:2 Changed 9 years ago by karsten

Points: 4

Skipping .gz-compressed files fixes the problem so far. But it's just a workaround.

This took me 1 point so far, but this can easily take another 3 points because of difficulties with reproducing the problem.

comment:3 Changed 8 years ago by karsten

Resolution: worksforme
Status: new → closed

The problem somehow disappeared. The part where I turned off parsing .gz-compressed assignment files got lost during an update at some point in the last few months, and there were no problems in October. Closing as worksforme.

comment:4 Changed 8 years ago by karsten

Resolution: worksforme (removed)
Status: closed → reopened

Lies. .gz-compressed assignment files are still skipped. Taking out that part to further investigate. Also re-opening.

comment:5 Changed 8 years ago by karsten

Priority: critical → normal

comment:6 Changed 6 years ago by karsten

Resolution: wontfix
Status: reopened → closed

We're not processing bridge pool assignment files anymore. Resolving.
