Tweak memory usage of hourly cronjob
While setting up an Onionoo mirror, I ran into memory problems with the hourly updater. It failed with the following two stack traces:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2367)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:480)
    at java.lang.StringBuffer.append(StringBuffer.java:309)
    at java.lang.StringBuffer.append(StringBuffer.java:300)
    at java.util.regex.Matcher.appendReplacement(Matcher.java:841)
    at java.util.regex.Matcher.replaceAll(Matcher.java:906)
    at java.lang.String.replaceAll(String.java:2162)
    at org.torproject.onionoo.docs.DocumentStore.storeDocumentFile(DocumentStore.java:278)
    at org.torproject.onionoo.docs.DocumentStore.store(DocumentStore.java:228)
    at org.torproject.onionoo.writer.DetailsDocumentWriter.updateRelayDetailsFiles(DetailsDocumentWriter.java:184)
    at org.torproject.onionoo.writer.DetailsDocumentWriter.writeDocuments(DetailsDocumentWriter.java:72)
    at org.torproject.onionoo.writer.DocumentWriterRunner.writeDocuments(DocumentWriterRunner.java:29)
    at org.torproject.onionoo.cron.Main.main(Main.java:55)
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2367)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
    at java.lang.StringBuffer.append(StringBuffer.java:237)
    at java.io.StringWriter.write(StringWriter.java:101)
    at org.apache.commons.lang.StringEscapeUtils.escapeJavaStyleString(StringEscapeUtils.java:196)
    at org.apache.commons.lang.StringEscapeUtils.escapeJavaStyleString(StringEscapeUtils.java:164)
    at org.apache.commons.lang.StringEscapeUtils.escapeJavaScript(StringEscapeUtils.java:131)
    at org.torproject.onionoo.docs.DetailsDocument.escapeJSON(DetailsDocument.java:21)
    at org.torproject.onionoo.docs.DetailsDocument.setContact(DetailsDocument.java:267)
    at org.torproject.onionoo.writer.DetailsDocumentWriter.updateRelayDetailsFiles(DetailsDocumentWriter.java:158)
    at org.torproject.onionoo.writer.DetailsDocumentWriter.writeDocuments(DetailsDocumentWriter.java:72)
    at org.torproject.onionoo.writer.DocumentWriterRunner.writeDocuments(DocumentWriterRunner.java:29)
    at org.torproject.onionoo.cron.Main.main(Main.java:55)
Neither stack trace points at an operation that should need large amounts of memory on its own, so I set up a cronjob that runs jcmd $pid GC.class_histogram once every minute. Here are the top 10 entries right before the JVM exited with the second stack trace:
num #instances #bytes class name
----------------------------------------------
1: 8533341 463113472 [C
2: 8533040 204792960 java.lang.String
3: 3705980 148239200 java.util.TreeMap$Entry
4: 2995129 143766192 java.util.TreeMap
5: 593855 114020160 org.torproject.onionoo.docs.NodeStatus
6: 604450 48686880 [Ljava.util.HashMap$Entry;
7: 2390562 38248992 java.util.TreeSet
8: 931813 29818016 java.util.HashMap$Entry
9: 604451 29013648 java.util.HashMap
10: 611484 14675616 java.lang.Long
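To make the histogram easier to read, here is a rough sketch that divides the counts above by the number of NodeStatus instances and by the instance counts themselves. The class and method names are made up for illustration, and the ratios are only indicative: the histogram does not say which objects are actually reachable from a NodeStatus.

```java
public class HistogramRatios {

    // Number of NodeStatus instances, copied from the histogram above.
    public static final long NODE_STATUS_INSTANCES = 593855L;

    /** Instances of some class per NodeStatus instance (indicative only). */
    public static double perNodeStatus(long instances) {
        return (double) instances / NODE_STATUS_INSTANCES;
    }

    /** Average shallow size in bytes of one instance of a class. */
    public static double avgBytes(long bytes, long instances) {
        return (double) bytes / instances;
    }

    public static void main(String[] args) {
        // All numbers below are copied from the histogram rows.
        System.out.printf("char[] per NodeStatus:  %.1f%n",
                perNodeStatus(8533341L));
        System.out.printf("TreeMap per NodeStatus: %.1f%n",
                perNodeStatus(2995129L));
        System.out.printf("TreeSet per NodeStatus: %.1f%n",
                perNodeStatus(2390562L));
        System.out.printf("avg bytes per char[]:     %.1f%n",
                avgBytes(463113472L, 8533341L));
        System.out.printf("avg bytes per NodeStatus: %.1f%n",
                avgBytes(114020160L, 593855L));
    }
}
```

Read this way, the profile shows roughly five TreeMap and four TreeSet instances for every NodeStatus, which is consistent with (but does not prove) NodeStatus holding several sorted collections per instance.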
From this profile, NodeStatus looks like a good candidate for saving some memory. I'll attach a branch with some memory tweaks as soon as I have a ticket number.
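For illustration only, one generic way to cut down on the number of TreeSet and TreeMap instances seen in the histogram is to allocate a collection only when the first element is added. This is a sketch under assumptions and not necessarily what the branch does: the class LazyStatus and its flags field are hypothetical and are not taken from the Onionoo sources.

```java
import java.util.Collections;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical stand-in for a status object; not the real NodeStatus. */
public class LazyStatus {

    // One shared, immutable empty set returned for all instances without flags.
    private static final SortedSet<String> EMPTY =
            Collections.unmodifiableSortedSet(new TreeSet<String>());

    // Hypothetical field; stays null until the first element is added,
    // so instances without flags never allocate a TreeSet.
    private SortedSet<String> flags;

    public void addFlag(String flag) {
        if (flags == null) {
            flags = new TreeSet<String>();
        }
        flags.add(flag);
    }

    /** Returns a read-only view, never null. */
    public SortedSet<String> getFlags() {
        return flags == null
                ? EMPTY
                : Collections.unmodifiableSortedSet(flags);
    }

    /** Exposed only so the sketch can show when allocation happens. */
    public boolean isAllocated() {
        return flags != null;
    }
}
```

Whether this helps depends on how many of the collections are actually empty at runtime, which the class histogram alone does not show.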