Problem/Motivation

Now that we're using Guzzle for outgoing http requests, the responsibility of detecting and logging errors has become slightly more complicated. Guzzle exceptions are one of three things. Quoting mtdowling:

The first two have a response attached, and getting a message from them is pretty easy ($e->getResponse->getReasonPhrase()). The last one has no response, and can be thrown for network problems. Getting a message from these is pretty easy too, but you have to look in a totally different place ($e->getError()). The other complication is that CurlExceptions don't have an HTTP status code, but do have a curl error code. In drupal_http_request(), we used to handle this kind of "network error" by returning a negative version of the error number that we had.

Obviously, this can be handled by using stacked catch blocks to special case the CurlExceptions, but this adds a lot of duplicated code for every time we need to handle a Guzzle exception.

Proposed resolution

Create a Guzzle exception wrapper class whose sole purpose is to determine a reason and a code for the failure. Then, error handling is as simple as:

I can perfectly see why these are separate exceptions — I essentially implemented a very similar set in Mollom PHP library.

The separation is necessary, because the calling code needs to react differently for each exception type/class.

For example, a 4xx client error always means that retrying the same request does not make any sense, because it's malformed, misconfigured, or unauthorized, etc.pp. Therefore, you can stop trying whatever you planned to do, because the request in itself is wrong.

Contrary to that, a 5xx server error means that potentially just this one request could not be processed; you can either retry, try a different server, put it in a queue and retry later, or error out. The application logic is different for each use-case and definitely requires custom handling for every particular purpose. In any case, it would be wrong to mix the error handling into the 4xx handling.

Lastly, there's the case of low-level networking errors, which borderline belong into the 4xx category because they are also caused by a client (platform/infrastructure) misconfiguration, but since cURL/PHP is not able to issue a request in the first place, it would be wrong to fake a server response where no request was in the first place. Additionally, these kind of errors cannot be fixed on the application level - most often you have to consult your server admin or hosting provider in order to open ports, adjust the firewall, correct php.ini settings, or whatever else might be configured wrongly. In any case, the application (Drupal) is not able to fix it, so the application actually needs to throw an appropriate error message (ideally including guidance for how to fix it).

In essence, the separate exceptions make perfect sense to me, and I'd strongly recommend to not wrap them.

@sun: Yes, that makes sense, I'm not suggesting to wrap or combine them.

Just provide a common interface (either specific messages or an actually useful getMessage()) so that you have an easy way to log it to watchdog, without requiring multipe catch blocks and checks for specific exeptions. In many cases in core, that's all we do right now (and more is often not possible, because that's background stuff like aggregator).

This is currently blocking progress on various convert drupal_http_request() to guzzle issues, I don't care that much how it looks, but we need some standard way to handle this, log something to watchdog and if it is a user-triggered action, provide him some useful feedback.

In case of a low-level curl exception (and those are very easy to produce by having a typo in the domain name for example, missing protocol, ...), the site is going to die with a fatal error because $response is NULL.

CurlException is a networking exception whereas BadResponseException is an HTTP exception. It looks like CurlException should actually implement RequestException and not BadResponseException. CurlExceptions will never have a response object because the HTTP transaction never completes when a curl error is encountered. This would be a significant BC break though, so I either need to rethink it or wait until 3.1.0.

In the meantime, I've updated the cURL exception's message so that it does not include a var_export of the curl transfer information. You should be able to just rely on $e->getMessage() now right? Alternatively, you could catch the CurlException above BadResponseException and handle them differently.

Right, that you can do that and react on a different error differently is great. But right now, you are required to and that's a lot of duplicated or ugly code if all we want to do is log an error.

That leads to developers doing it wrong, see the code in #4. That has been written by the people who actually know Guzzle, has been reviewed by others who know it and it dies with a Fatal error if you have a typo in a URL or any of the other many reasons that can lead to a CurlException. How do you imagine will it look like in contrib and custom code? :)

As explained by @mtdowling in #7, the problem there is actually that CurlException should not be implementing BadResponseException as there is no Response but that can't be fixed without a BC break. And the getMessage() in the version that is currently in HEAD logs a lot of rather useless information but this has already been improved upstream, so we can update to the next stable version (or dev but then we need to do another update so that we have a stable version).

All I'm asking here is that getMessage() or another method returns enough information in a standardized way so that we can easily catch and and log an exception if something goes wrong without having to do something as ugly as #1862538: Convert drupal_http_request usage in update.fetch.inc to Guzzle. And things are improving already :) If we can for example just do a watchdog_exception('update', $e) as we do in other places where we log exceptions, then that's often enough and sometimes already more than we're doing right now.

I decided that it was important to make CurlException implement RequestException rather than BadResponseException since treating these two types of exceptions in the same manner will lead to issues (e.g. attempting to work with a response object that isn't there). Because Drupal 8 is still in development and error handling hasn't been decided, I was hoping Drupal could update to 3.1.0.

If you're wanting to catch the different types of exceptions and reduce bloat, you could just catch RequestException. This will catch all request related exceptions. This will also allow you to check if a request has a response, and if so, log whatever you like. Further, the getMessage() method should tell you everything you need to know about the request/response/curl error.

If you do not want to catch exceptions, then you could add an event listener to do whatever you want (e.g. log whatever to wherever you'd like). This listener would always be fired before the exception. Let me know if you're interested in this.

The feed from test seems to be broken because of error "500 500 Service unavailable (with message)".

Was confused for a moment because of the "500 500" but that's actually Drupal that does this :p

The question is, do we really want to *require* two catch blocks just to display or log a simple error message? In Drupal, we frequently use the term Developer Experience (DX) as a counter piece to UX and this is not really good DX :) I think we can do better, but I'm not quite sure myself what a good exception message for these two cases would be. Suggestions?

I personally don't see a problem with the verbose exception message. More information == easier to figure out how to fix the issue.

I also don't see a problem with two try/catch blocks. Exception handling is not, and IMO, should not be some universal implementation. It makes sense for this feed reader or whatever, but not for every other instance in Drupal 8 that uses Guzzle.

If you don't want the exception to look like that, then add a custom event listener that intercepts these exceptions. Using an event listener would allow you to throw an exception with whatever message you like.

I'm inclined to agree with mtdowling. Good DX does not and should not mean "the developer never thinks about error handling". When you're dealing with network processing, there's a LOT that can go wrong you need to handle. Just be glad we aren't dealing with persistent connections you have to setup and then manually tear down. I did that in Java once and it was about 80% exception handling around all of the possible ways it could explode. :-)

Here is a patch that fixes aggregator to do the minimally required exception handling right now + redirect url handling which was removed in the original patch.

All I'm saying is that the more complicated the exception handling is the more errors we'll see in contrib/custom land. As said before, even the initial patch for the aggregator conversion, authored and reviewed by people which know Guzzle resulted in a fatal error by adding an invalid URL through the UI. That, at least is now an uncatched exception so that you can see the error but that's not optimal either...

The feed is saved elsewhere, it is exactly the same logic as before the initial conversion, just accessing it in a different way.

Looking it over again, is there a reason that BadResponseException's getMessage() method isn't what we're doing here anyway?

That getMessage() is *not* the same as what we are doing is the whole point of this issue :) getMessage() gives you *everything*. A dump of the request and response objects, including the whole response body.

To my naïve eyes, this exception handling looks like basically the same code copy/pasted twice with only one or two slight variations, which normally we would try and reduce to one copy of the code, which I guess is the point of this issue. As a drive-by module developer, I'm not sure that I would think to "use" two different types of exceptions. I just want to know if something I did failed or not, and print out an error message when it happens, and move on with my life. However, the arguments from mtdowling and Crell seem in the opposite direction, and msonnabaum backs them up on IRC.

It does seem odd though that the error is printed completely differently depending on the exception. $response->getStatusCode() . ' ' . $response->getReasonPhrase() in the case of BadResponseException and $e->getMessage() in the case of RequestException. And apparently $e->getMessage() on a BadResponseException returns enough data to crash a browser, whereas using it on a RequestException is hunky-dory? I dunno, smells fishy/inconsistent to me.

I guess I'll leave this for someone a bit more OOP-y like catch or Dries.

It does seem odd though that the error is printed completely differently depending on the exception. $response->getStatusCode() . ' ' . $response->getReasonPhrase() in the case of BadResponseException and $e->getMessage() in the case of RequestException. And apparently $e->getMessage() on a BadResponseException returns enough data to crash a browser, whereas using it on a RequestException is hunky-dory? I dunno, smells fishy/inconsistent to me.

That's pretty much exactly the response I expected from you and is what I'm thinking as well :)

This is also the reason the two different catch blocks are necessary at all. I still think there is a big difference in being able to react on different errors differently and having to do it, just because the way you need to handle the exception itself differently.

As @mtdowling explained above, we could change this within Drupal by adding a listener and converting the exception to another one. But I think Drupal 8 is about working together and improve the upstream projects instead of working around them :)

I agree with webchick and berdir on this, having the multiple catch blocks is just going to mean contrib/custom code ends up missing something, and having to access the error message differently is also going to be impossible to remember. I'm not really bothered by the verbose exception message though fwiw.

Leaving RTBC a bit longer - there's nothing wrong with the patch in itself but it'd be good if we could persuade mtdowling upstream and bring that in.

Looking at #1887046: Convert drupal_http_request usage in install.core.inc to Guzzle. that just went in:
The client code is entirely unaware of Guzzle (you just do a drupal_container()->get('http_default_client') and work on it), yet handling errors is made by catching a Guzzle exception class - you thus need to "use" a Guzzle namespace at the top of the file.
Seems a bit weird that the only place where we hardcode a reference to Guzzle is to include a namespace for an exception ? Like, it feels swappable but not really is...

yched: I don't think we should really consider guzzle swappable. There was discussion elsewhere about renaming http_default_client to http_client, because really, it's the one core uses. You could always bring in your own PSR-0-ified library if you wanted to, but I don't think most people will have a reason to do so.

I agree it's a bit odd that the only literal class names come in for error handling, but that's a minor point and doesn't relate to swappability.

@yched: There might be no hard reference, but the code operates on the classes and interfaces defined by Guzzle. There are no generic interfaces, so it's not really possible to switch it out as another implementation would not be able to know what it has to provide.

And yes, I agree that http_default_client should be renamed to just http_client, we had that discussion in the guzzle conversion issue. I think the default there doesn't imply that you can replace it with $OtherHttpClient but that it's the http client with our default configuration (like the user agent and the simpletest plugin that allows a request to operate within the same test) and you are free to define your own service with different plugins and so on. But removing the default doesn't prevent that in any way.

The pull request has been merged that removed request and response, thanks @mtdowling!

The [stuff in brackets] does show up. If we want to have a generic way to display the error, then I think we'll have to live with that. If not, then that means sticking with multiple catch blocks for the cases where we want it. Which is fine if we want that and the improved exception hierarchy gives us that possibility now.

My personal goal for this issue has been solved: Removing the *requirement* to have multiple catch blocks and/or having non-intuitive code to handle exception objects differently. It's pointless to make it a proper english string by default for example because there is no way to translate it and just "404 Not Found" is too short for a exception message. So the string above looks like a sane default to me.

The remaining question now is if we want to go with the patch form #18 *in this case* (which means it will look like it currently does in 7.x) or switch to a single catch block that will then look like the example from #32. It would IMHO be OK to have two catch blocks here for improved usability (not sure if users can make sense of it anyway ;))

Talked with Berdir in IRC. As of the latest Guzzle (we may need to resync the library through Composer, though), module developers can either have a single catch block and be happy, or have multiple catch blocks (as here) if they want to get more robust with their error handling. Up to them based on their specific case.

Assuming the bot still approves of the most recent patch, I'm going to go ahead and RTBC this. If for some reason it doesn't still apply, the bot will tell us.