Exception Handling with Explicit IO
In a previous post, we added exception handling to a hypothetical API server written in Scala. Since Scala implicitly allows arbitrary I/O anywhere, modeling exception handling by adding a layer of indirection using Either
values was essentially free. I wanted to try implementing the same logic in Haskell, which would force us to deal with intermingling exception handling intermingled with explicit I/O, creating two layers of abstraction that we’ll have to juggle.
The Problem
Let’s do a quick recap of the problem. We are implementing an HTTP API that accepts only POST requests. We have simple Request
and Response
types.
Our domain types are User
s and Resource
s. We assume that this app is responsible for auth, so we may create a User
value from the Request
auth token without any I/O. However, we assume that resources are shared between this app and others, so we need to do I/O to get a handle on a Resource
.
Once we have the User
, Resource
, and request body, we execute some database mutation (left undefined) using those three parameters.
handlePost :: Request -> Response
is our entry point. The whole program can be summarized as follows:
- Make sure the request method is POST.
- Make sure the request body is not empty.
- Get the user.
- Make sure the “Authorization” header is present.
- Make sure the auth token is valid.
- Make sure there is a user associated to the auth token.
- Get the resource.
- Make sure the resource at the request path exists.
- Execute the request.
- Make sure the user has permission to modify the resource.
- Make sure the modification was successfully applied.
- Return an appropriate response.
Each Make sure … step represents a failure where we may need to exit early, returning an appropriate response. Here is the project spec, with nine canned responses:
Next, we’ll take a look at our undefined terms,
Terms that yield Bool
s are prefixed by is and terms that do I/O are suffixed by io. Finally, (and just because we can) we’ll make a mass-import module,
Implementation
Implementing getUser
We’ll augment getUser
with error handling, so its signature will be Request -> Either Response User
, and as we do, the effect implies :: Bool -> Response -> Either Response Unit
will come in handy.
Easy peasy. getResource
will be a somewhat more challenging.
First Attempt at getResource
First note the return type,
We cannot return an Either Response (IO Resource)
, because that would mean we decide if we should exit early without doing any I/O, and then return an IO
value if we don’t exit early. In fact, since is_resource_not_found_io
has type IO Bool
, we need to do I/O in order to construct our Either
value, so the appropriate return type is IO (Either Response Resource)
.
The above method looks concise and clear. Unfortunately, it doesn’t compile. implies
expects a Bool
, but we have an IO Bool
to work with.
Second Attempt at getResource
We add an implies'
which is something like implies
but is compatible with I/O. The return
on the last line maps over the IO
in order to lift the Resource
to an Either Response Resource
. This code compiles, but it does not do what we want it to do.
To see why, suppose is_resource_not_found_io
comes back false. Then implies'
returns an IO (Either Response ())
, but there is no associated bind assignment (e.g., no <-
), so the data that the Either Response ()
represents is completely ignored, and the next line that creates a handle to the nonexistent resource gets executed anyway, potentially leading to data corruption and other undefined behavior. But doesn’t do
notation handle the short circuit logic for us? do
notation for Either
does, but notice that this do
block is for IO
. It’s completely indeferent to whether the result of implies'
is a Left
value or a Right
value, proceeding to the final line either way. This is harder than I thought it’d be.
Third Attempt at getResource
We need to find a way to conditionally call the_resource_io
. We can do this by introducing a local assignment of type () -> IO Resource
which, when called, calls the_resource_io
.
The result of implies'
is bound to notFound
, shedding the IO
and leaving an Either Response ()
. Conceptually, we want to apply doResource
to notFound
, preserving the short-circuit logic provided by Either
. To accomplish this, we use the standard library function traverse
(definition below, with type signature specialized to our use case).
Notice in the Left
case, we ignore f
and return a lifted Left
value, preserving our desired short-circuit semantics. Armed with traverse
, our getResource
function finally mixes I/O and failure handling correctly.
I should note that traverse
works whenever you replace IO
and Either e
with an arbitrary Applicative
and Traversable
, respectively.
Implementing execute
We use what we learned from getResource
when writing execute
. Since we don’t want to write to the resource if the user doesn’t have permission, we want to apply is_executed_io
conditionally. We create a temporary () -> IO Bool
to wrap it. Our do
block will be over IO
, so we will need to bind some of our Either
values using >>=
explicitly.
permitted
has type Either Response ()
. We use traverse
to conditionally apply doExecute
and assign the result (of type Either Response Bool
) to executed
. We want to finish with not executed `implies` badConnection
, but we need to do some type Tetris to make everything fit. not <$> executed
has type Either Response Bool
. We use >>=
to bind its Bool
result to (`implies` badConnection)
, resulting in an Either Response ()
. Finally, return
lifts us to IO (Either Response ())
, as desired.
Implementing handlePost
As we implement handlePost
, we’ll make use of the following helper function:
tunnel
commutes the outer Either
with the IO
and then maps over the IO
to combine the two Either
s inside.
We’ll want to conditionally apply getResource
and execute
using the same conditional-execution trick we used in their definitions, so we’ll add a ()
argument (that we simply ignore) to each.
handlePost
needs to take a Request
to an IO Response
, completely eliminating any Either
context. Let’s take a look at handlePost
and then discuss.
We begin by checking and collecting our preconditions. chkMth
has type Either Response ()
, errBdy
has type Either Response String
, errUsr
has type Either Response User
, and precon
has type Either Response ()
. We use precon
to conditionally apply getResource
and execute
, and we use tunnel
to remove nesting. errExe
has type Either Response ()
and gates the final success
response. Finally, result
has type IO (Either Response Response)
, over which we map either id id
to produce the desired IO Response
, completing the program.
Parting Thoughts
This is hard work. I’ve heard the saying “monads give you one free abstraction,” and the meaning of that saying really hit home while I was doing this exercise. Once we had two monads in play, the code became significantly more complicated. Fortunately, we have some help. The heavy lifters are fmap
(via <$>
), the applicative <*>
, >>
and >>=
, and traverse
. In general, I find that when I’m stuck in a quagmire of types, traverse
(and it’s cousins, for
and sequenceA
) are exactly what I need to dig my way out. Always keep them in mind.
I made extensive use of standard Haskell tools, such as Hoogle and the REPL. I found that it helps to pay close attention to compiler errors.
Another crutch I lean heavily on is scratch paper and pencil (not pen). For every function here except for getUser
, I had to break out the old-fashioned notepad and write some type signatures out. Here I am trying to figure out the implementation of execute
.
Notice I write out what I have and what I want. If you are a Haskell nomad, I highly recommend you develop your own paper/pencil crutch. Don’t underestimate the insight you can gain from changing your vantage point this way. Specifically, paper and pencil helps me with equational reasoning (a kind of value-level reasoning) and type unification (a kind of type-level reasoning). I should probably write posts about those one of these days.
Most of our woes are caused by using do
notation for IO
when what we really want is do
notation for Either
. One way to avoid these woes is to use explicit >>=
for our IO
, reserving do
notation for our conditional logic. I doubt that this would simplify the code significantly. A different approach we could try is using monad transformers to create a composite monad that combines the sequential execution semantics of IO
with the short-circuit semantics of Either
. I’ve been meaning to learn how to use monad transformers, so if I get motivated enough I’ll write up a post reimplementing this program.
Appendix
Here are working code examples. With all of these files in the same directory and GHC installed, you can run the tests with runghc Test.hs
and open individual files in GHCi.
Spec.hs
Undefined.hs
Project.hs
Eithers.hs
Test.hs