Opened 4 years ago

Last modified 10 months ago

#16225 assigned enhancement

Unify exception/error handling in metrics-lib

Reported by: karsten Owned by: metrics-team
Priority: Medium Milestone: metrics-lib 3.0.0
Component: Metrics/Library Version:
Severity: Normal Keywords: metrics-2018
Cc: iwakeh Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

There are now four different descriptor sources in metrics-lib: DescriptorParser, DescriptorReader, DescriptorCollector, and DescriptorDownloader.

We should think about unifying how we're handling exceptions and errors and telling the user about them. These thoughts should include whether or not to log exceptions and errors, though just doing that is probably not sufficient.

Let's go through the interfaces one by one:

public interface DescriptorParser {
  public List<Descriptor> parseDescriptors(byte[] rawDescriptorBytes,
      String fileName, boolean failUnrecognizedDescriptorLines)
      throws DescriptorParseException;
}

The (slightly modified, as compared to current master) parseDescriptors method handles a single file and throws an exception whenever something goes wrong. That works quite well, because that method blocks the caller, so they can easily wrap the call inside a try/catch block.

That's somewhat different in the next interface:

public interface DescriptorCollector {
  public void collectRemoteFiles(String collecTorBaseUrl,
      String[] remoteDirectories, long minLastModified,
      File localDirectory, boolean deleteExtraneousLocalFiles);
}

This method (which we should have called collectDescriptors, in retrospect) blocks the caller, too, but it's unclear whether it should abort its operation when it hits the first exception. Right now it's rather silent about problems and only prints stack traces to System.err. But that doesn't enable the caller to handle problems, neither. At least the method makes sure that it doesn't delete extraneous local files that might still exist remotely in case of an error.

The next interface is trickier:

public interface DescriptorReader {
  public Iterator<Descriptor> readDescriptors(File[] directories,
      File[] tarballs, SortedMap<String, Long> excludeFiles,
      boolean failUnrecognizedDescriptorLines, int maxDescriptorsInQueue);
}

Note that this interface does not yet exist, but it's what I could imagine doing to simplify the current interface and make it more similar to the DescriptorCollector interface. I could also imagine overloading this method.

The idea here is that the map passed in excludeFiles would be updated while reading descriptors, so that the caller could use that to update a local history file. (This would be documented, of course.)

However, it's unclear how we would handle problems at all here. What we could do is extend the implementation of the returned Iterator<Descriptor> to return a (runtime) exception whenever there was a problem reading the next descriptor that would be added next. Or maybe we should write our own Iterator-like interface for returning descriptors and have its hasNextDescriptor() and nextDescriptor() methods throw DescriptorParseException.

By the way, the current DescriptorReader interface has a getExceptions() method in the additional DescriptorFile interface that is returned instead of Descriptor. But that's something I'd like to get rid of, too. I also don't expect many users to look at those exceptions.

The last interface is the mostly dysfunctional DescriptorDownloader interface. I could imagine that we handle exceptions/errors there similar to DescriptorReader. Or we throw out that class, because it's only used by CollecTor, and generalizing its functionality might take more effort than writing it cleanly as part of CollecTor.

Thinking about the future, we might want to add another interface DescriptorWriter, and we should think about ways to handle exceptions/errors there, too.

There, I mostly sketched out the problem here, without having good solutions. But maybe we can come up with good answers in this discussion?

Child Tickets

Change History (24)

comment:1 Changed 3 years ago by iwakeh

Milestone: metrics-lib 1.4.0

comment:2 Changed 3 years ago by karsten

Milestone: metrics-lib 1.4.0metrics-lib 1.5.0

Move to 1.5.0 in order to get 1.4.0 out sooner.

comment:3 Changed 2 years ago by iwakeh

Milestone: metrics-lib 1.5.0metrics-lib 2.0.0

comment:4 Changed 2 years ago by karsten

Severity: Normal
Status: newneeds_review

There, I started writing some code to move the discussion forward here. That code is loosely based on the thoughts above. Please review this commit, not for merging but as a basis for further discussions.

comment:5 Changed 2 years ago by iwakeh

Milestone: metrics-lib 2.0.0metrics-lib 1.7.0

comment:6 Changed 22 months ago by iwakeh

Status: needs_reviewneeds_information

Ok, I start from the opposite direction and hopefully we meet in the perfect middle :-)

The use-case this ticket mostly refers to is bulk processing, i.e. there is a bunch of descriptors somewhere (file system, remote server) and metrics-lib fetches, synchronizes, and parses these files. One constraint is that there will always be some descriptors that cannot be parsed.

From this scenario, I would conclude that methods like readDescriptors, parseDescriptors, collectDescriptors don't throw DescriptorParseExceptions, but simply process as much as possible and provide a list of problematic descriptors at the end, which the API using program can choose to ignore or process.
Does this sound like the right use-case description?

Regarding the code-example: configuration of a descriptor source by fluent-style (or method chaining) is fine, but metrics-lib currently has the DescriptorSourceFactory approach, which would need to be adapted. That is, I see two things: the ideas around the code of DescriptorSourceBuilder are ideas about a new way of configuration and not exception/error handling in metrics-lib, i.e. a different ticket (there is one for redesign of the interface hierarchy, I'll look for it). Second, the current configuration and descriptor source instanciation need to be considered together (in that other ticket).

comment:7 Changed 22 months ago by karsten

Here's a related question from #16151 (discussion around DescriptorCollector) that was the sole reason for keeping that ticket open: "When receiving something other than a 200 response code while fetching the directories or files shouldn't that be reported in some way? Things like redirects, bad requests, not authorized, or others might be important for the caller?"

comment:8 in reply to:  6 ; Changed 22 months ago by karsten

Replying to iwakeh:

Ok, I start from the opposite direction and hopefully we meet in the perfect middle :-)

Okay, let's try that! :)

The use-case this ticket mostly refers to is bulk processing, i.e. there is a bunch of descriptors somewhere (file system, remote server) and metrics-lib fetches, synchronizes, and parses these files. One constraint is that there will always be some descriptors that cannot be parsed.

Agreed.

From this scenario, I would conclude that methods like readDescriptors, parseDescriptors, collectDescriptors don't throw DescriptorParseExceptions, but simply process as much as possible and provide a list of problematic descriptors at the end, which the API using program can choose to ignore or process.
Does this sound like the right use-case description?

It does sound like the right use-case description!

The part that needs more thoughts is when and how we should provide problematic descriptors and related error cases. Providing them at the end can easily lead to out-of-memory situations, because we might have to keep more and more problematic descriptors in memory until we're finally done parsing. Maybe we can include problematic descriptors in line with non-problematic ones.

How about this: whenever we can't parse a descriptor, we include a Descriptor instance with the raw contents we cannot parse in the descriptor queue. And we include a new method Descriptor#getException() that returns the DescriptorParseException that we ran into, or null if we didn't run into an exception while parsing.

Similarly, if we run into an exception while downloading files from a remote server or while reading files from the file system, so before splitting contents into descriptor-sized chunks and attempting to parse those, we include a Descriptor instance without descriptor contents and with just the exception.

I could see how this works even without breaking existing code, because any code processing Descriptor instances should check whether they're instanceof whatever they had hoped to get before casting the instance and accessing its getters. At least that's what I'm doing, and I'm skipping any unexpected descriptors. If an application would want to learn more about reasons why a descriptor was skipped, it can log the exception or handle it otherwise. It can even distinguish between I/O and parse exceptions. But if it doesn't care as much, it can simply rely on Descriptor subtypes to be valid, parsed descriptors.

Regarding the code-example: configuration of a descriptor source by fluent-style (or method chaining) is fine, but metrics-lib currently has the DescriptorSourceFactory approach, which would need to be adapted. That is, I see two things: the ideas around the code of DescriptorSourceBuilder are ideas about a new way of configuration and not exception/error handling in metrics-lib, i.e. a different ticket (there is one for redesign of the interface hierarchy, I'll look for it). Second, the current configuration and descriptor source instanciation need to be considered together (in that other ticket).

Yes, you're right, this issue is orthogonal and should be discussed on a different ticket. I'll create one.

Thanks for your very valuable input above!

comment:9 in reply to:  8 Changed 22 months ago by karsten

Replying to karsten:

Replying to iwakeh:

Regarding the code-example: [...]

Yes, you're right, this issue is orthogonal and should be discussed on a different ticket. I'll create one.

Added as #22196.

comment:10 in reply to:  8 ; Changed 22 months ago by iwakeh

Replying to karsten:

...

From this scenario, I would conclude that methods like readDescriptors, parseDescriptors, collectDescriptors don't throw DescriptorParseExceptions, but simply process as much as possible and provide a list of problematic descriptors at the end, which the API using program can choose to ignore or process.
Does this sound like the right use-case description?

It does sound like the right use-case description!

Maybe, add that to the javadoc somewhere :-)

The part that needs more thoughts is when and how we should provide problematic descriptors and related error cases. Providing them at the end can easily lead to out-of-memory situations, because we might have to keep more and more problematic descriptors in memory until we're finally done parsing. Maybe we can include problematic descriptors in line with non-problematic ones.

How about this: whenever we can't parse a descriptor, we include a Descriptor instance with the raw contents we cannot parse in the descriptor queue. And we include a new method Descriptor#getException() that returns the DescriptorParseException that we ran into, or null if we didn't run into an exception while parsing.

Similarly, if we run into an exception while downloading files from a remote server or while reading files from the file system, so before splitting contents into descriptor-sized chunks and attempting to parse those, we include a Descriptor instance without descriptor contents and with just the exception.

If we were using java8, I'd suggest using java.util.Optional<Descriptor>.
Actually your suggestions goes in that direction, too.
Maybe, a new descriptor class InvalidDescriptor could be useful here? It would be accessible via the instanceof-method and have limited information: at least the Exception and maybe also its origin (url, path), in case of a file also bytes.

comment:11 in reply to:  10 ; Changed 22 months ago by karsten

Replying to iwakeh:

Replying to karsten:

...

From this scenario, I would conclude that methods like readDescriptors, parseDescriptors, collectDescriptors don't throw DescriptorParseExceptions, but simply process as much as possible and provide a list of problematic descriptors at the end, which the API using program can choose to ignore or process.
Does this sound like the right use-case description?

It does sound like the right use-case description!

Maybe, add that to the javadoc somewhere :-)

Yes, we should, once we fully support that use case. Right now, if DescriptorParser is tasked with parsing a given file and runs into something it cannot parse, it just throws that exception and calls it a day (see #22139).

The part that needs more thoughts is when and how we should provide problematic descriptors and related error cases. Providing them at the end can easily lead to out-of-memory situations, because we might have to keep more and more problematic descriptors in memory until we're finally done parsing. Maybe we can include problematic descriptors in line with non-problematic ones.

How about this: whenever we can't parse a descriptor, we include a Descriptor instance with the raw contents we cannot parse in the descriptor queue. And we include a new method Descriptor#getException() that returns the DescriptorParseException that we ran into, or null if we didn't run into an exception while parsing.

Similarly, if we run into an exception while downloading files from a remote server or while reading files from the file system, so before splitting contents into descriptor-sized chunks and attempting to parse those, we include a Descriptor instance without descriptor contents and with just the exception.

If we were using java8, I'd suggest using java.util.Optional<Descriptor>.

It seems like we won't have Java 8 in the near future (see #19622), but can you elaborate how that would help here?

Actually your suggestions goes in that direction, too.
Maybe, a new descriptor class InvalidDescriptor could be useful here? It would be accessible via the instanceof-method and have limited information: at least the Exception and maybe also its origin (url, path), in case of a file also bytes.

I briefly thought about such a new type for this case, too. But the only reason I found was that there wouldn't be a method Descriptor#getException() that sometimes returns null. However, adding such a type would mean that whoever is interested in exceptions/errors would have to do another instanceof check and downcast. Are there other aspects?

comment:12 in reply to:  11 Changed 22 months ago by iwakeh

Replying to karsten:

[snip]

How about this: whenever we can't parse a descriptor, we include a Descriptor instance with the raw contents we cannot parse in the descriptor queue. And we include a new method Descriptor#getException() that returns the DescriptorParseException that we ran into, or null if we didn't run into an exception while parsing.

Similarly, if we run into an exception while downloading files from a remote server or while reading files from the file system, so before splitting contents into descriptor-sized chunks and attempting to parse those, we include a Descriptor instance without descriptor contents and with just the exception.

If we were using java8, I'd suggest using java.util.Optional<Descriptor>.

It seems like we won't have Java 8 in the near future (see #19622), but can you elaborate how that would help here?

I didn't sink too much time into the possible design options. It would also mean a change from the blocking method calls to always returning something. Actually, that would be achievable with Futures much better, where the API user decides when to wait. But, just some random thoughts here, no change proposal intended.

Actually your suggestions goes in that direction, too.
Maybe, a new descriptor class InvalidDescriptor could be useful here? It would be accessible via the instanceof-method and have limited information: at least the Exception and maybe also its origin (url, path), in case of a file also bytes.

I briefly thought about such a new type for this case, too. But the only reason I found was that there wouldn't be a method Descriptor#getException() that sometimes returns null. However, adding such a type would mean that whoever is interested in exceptions/errors would have to do another instanceof check and downcast. Are there other aspects?

Yes, the usefulness of one or the other approach depends on the programming style of the API using client. I don't feel strongly about either.

Regarding the methods returning a collection type (like unrecognized fields for torperf/onionperf or the get-exception method above), I think it would be more convenient to return an empty (immutable) collection of the wanted type (like emptySet instead of null.

Last edited 22 months ago by iwakeh (previous) (diff)

comment:13 Changed 22 months ago by iwakeh

Milestone: metrics-lib 1.7.0metrics-lib 1.8.0

comment:14 Changed 21 months ago by karsten

Milestone: metrics-lib 1.8.0metrics-lib 1.9.0

It seems that we're not ready for 1.8.0 yet, and we want to release that next week. Pushing back to 1.9.0.

comment:15 Changed 21 months ago by karsten

See also #22141 which has an implementation of some aspects discussed here.

comment:16 Changed 20 months ago by iwakeh

Also consider different Loggers that enable sorting of log-statements via configuration. (see #22141#comment:15)

comment:17 Changed 20 months ago by karsten

Milestone: metrics-lib 1.9.0

We already achieved some of the aspects here by merging #22141, and now we ran out of time for the 1.9.0 release which is scheduled for tomorrow. Let's work more on the remaining aspects here when 1.9.0 and 2.0.0 are out. Removing from planned milestones.

comment:18 Changed 19 months ago by iwakeh

Milestone: metrics-lib 2.2.0

comment:19 Changed 18 months ago by karsten

Owner: changed from karsten to metrics-team
Status: needs_informationassigned

Handing over to metrics-team, because I'm not currently working on this.

comment:20 Changed 17 months ago by iwakeh

Milestone: metrics-lib 2.2.0metrics-lib 2.3.0

Move to next milestone in order to accommodate interim release.

comment:21 Changed 17 months ago by karsten

Keywords: metrics-2018 added

comment:22 Changed 17 months ago by karsten

Keywords: metrics-2017 added; metrics-2018 removed

comment:23 Changed 14 months ago by iwakeh

Keywords: metrics-2018 added; metrics-2017 removed

Will be completed in 2018.

comment:24 Changed 10 months ago by iwakeh

Milestone: metrics-lib 2.3.0metrics-lib 3.0.0
Note: See TracTickets for help on using tickets.