I believe that what differentiates geniuses and other mortals is not in what they were born with, but what they figured out — about thinking, about themselves. And how early they figured that out, because the earlier, the more practice they had at thinking differently.

Critically, they figured out how to think, and they understood the meaning of deliberate practice.

The word genius means different things to different people. Some attribute those with high IQ (or similar metrics of intelligence) as genius. Others speak of gifts such as an eidetic memory, or how fast a person can consume knowledge.

Hollywood has also spared no effort in marketing such “genius”. Consider the famous TV serials such as Prison Break (Michael Schofield), Suits (Mike Ross), Criminal Minds (Dr. Reid), and to some extent, The Big Bang Theory (Sheldon Cooper), just to toss a few.

But genius could also mean those who produced great, life-changing results. But more than just the results, the amazement comes from the often insanely inconceivable process in which these geniuses produced those results — in the likes of men such as Leonardi da Vinci, Niels Bohr, Thomas Edison. I’m talking about these.

But don’t get me wrong. I’m not one of those who believes we’re born absolutely equal and everything else is a function of your upbringing, exposure and opportunities thrown your way. We’re not equal. Some of us are endowed with beauty, some with incredible intelligence and memory that they passed the Bar for fun, some with enough charisma to stage a revolution, and some with eloquence sufficient to talk a fish out of the water, so to speak.

However, I believe that no matter what gifts you have, or don’t, an overwhelming difference between the smart ones, and the less so, lies in whether or not you’ve, again, figured out how to think, and understood the meaning of deliberate practice.

 

Figuring Out How to Think

Figuring out how to think is absolutely fundamental in being better, smarter, and more creative. In other words, you need to think about how to think.

The article How Geniuses Think sheds some insight into how these great geniuses think. I’ll summarize it for you here in one sentence.

It’s not what you think, it’s how you approach your thinking.

Most of us, when faced with something new, are totally conditioned to revert back to what we know, which by definition almost always means what we were taught, and apply the similarities to the new thing at hand. After all, we humans are incredible pattern matching machines. Let’s try it.

What’s one divided by three?

> One-third.

Anything else?

It’s natural, it’s what we’re good at, and it’s what we are going to do.

But that’s not creative thinking, that’s not pushing boundaries, that’s not seeing new patterns and dimensions. To see a new pattern is to discard the old patterns, to turn off or temporarily disable that pattern matching subconscious machinery and enable new, baby-style learning. A baby does not have a cache of memories or experiences or pre-conceived notions to pattern match against. Everything is new. That’s how to think.

That’s not to say the first instinct cannot be the “obvious” answer. But it’s critical to not stop there.

Here’s another one:

What’s the answer? (Btw the pre-school part is a very, very big hint. Too big, in fact).

Solvable?

Creative geniuses are geniuses because they know “how” to think, instead of “what” to think.

Sociologist Harriet Zuckerman published an interesting study of the Nobel Prize winners who were living in the United States in 1977. She discovered that six of Enrico Fermi’s students won the prize. Ernst Lawrence and Niels Bohr each had four. J.J. Thompson and Ernest Rutherford between them trained seventeen Nobel laureates. This was no accident. It is obvious that these Nobel laureates were not only creative in their own right, but were also able to teach others how to think creatively. Zuckerman’s subjects testified that their most influential masters taught them different thinking styles and strategies rather than what to think.

In summary, the article distills down a number of ways of “how” to think:

  1. Explore all approaches, not just the ones you knew.

    Geniuses think productively, not reproductively. When confronted with a problem, they ask “How many different ways can I look at it?” Genius often comes from finding a new perspective that no one else has taken.

  2. Make novel combinations, even when two concepts are completely foreign to each other.

    Leonardo da Vinci saw a relationship between the sound of a bell and the ripples a pebble makes hitting water. This enabled him to make the connection that sound travels in waves.

  3. Take two opposites, and forcibly hold them together till they make sense, or don’t.

    Physicist Niels Bohr believed that if you held opposites together, then you suspend your thought and your mind moves to a new level. The suspension of thought allows an intelligence beyond thought to act and create a new form.

  4. Express yourself, in many different ways.

    Einstein always found it necessary to formulate his subject in as many different ways as possible, including diagrammatically. He thought in terms of visual and spatial forms, rather than thinking along purely mathematical or verbal lines of reasoning.

  5. Produce, a lot.

    Thomas Edison held 1,093 patents, still the record. He guaranteed productivity by giving himself and his assistants idea quotas. His own personal quota was one minor invention every 10 days and a major invention every six months.

If you’re still not at the article yet, go read it.

However, unless you’ve already figured out how to think (and that means you’re probably one of those geniuses, or somewhere pretty close, and I salute you for it), just being told how to think is not quite the meaningful recipe. The secret sauce is seeing someone do it, preferably again and again, in front of you. That’s where your mentor comes in. I’m not referring to the traditional concept of a teacher (or sifu, or master), but having the opportunity to be exposed to the right people.

The other thing about us humans is that we’re great mimicry machines. We can and love to mimic. We’re born to mimic, and not in a negative way. It’s literally how we learn, and how we survive. But we can’t mimic, if there’s no one to mimic. Or worse, if we’re surrounded by mediocrity, it is very easy to mimic the mediocrity.

Having a person, or better yet, a group of such people surrounding you sets the bar such that you do not even conceive being lesser than that. That becomes the minimum, the survival state, the rule. Once we are operating in such an environment, you have much less chance of not learning and becoming better.

“A noble man compares and estimates himself by an idea which is higher than himself; and a mean man, by one lower than himself.”

- Macrus Aurelius

Hence, all I can say is, if there’s an opportunity to learn from the best, and if that’s what you want to be, you should go seek out the best. It’s not a bad idea to be the dumbest guy in the room.

 

Deliberate Practice

Practice makes perfect. We’ve all heard about that mantra since we were kids going to school. However, I think somewhere along the way of that being drummed into us, the essence of that phrase got lost. Or perhaps the person or persons trying to drill that into us never really quite understood what that phrase meant in the first place. That’s pretty likely, actually.

Practice indeed makes perfect. The first time you do something, you are a beginner, a novice, a newbie. You’ll stumble, you’ll make mistakes. You’ll look dumb. If you don’t get scared off (or you don’t have a choice), then you do it again, and you’ll get a little better. Maybe you make new mistakes, but you’ll probably make smarter mistakes, and you should repeat the old mistakes less. And then you do it again, and again, and again, till you’re a master at it. Till you’re respected, a guru. And that’s the most dangerous point.

Once you’ve become a master, you’re no longer practicing, you’re repeating. Repeating something at a given skill level is not practice. Repeating does not make perfect. You need to stop repeating. You need to deliberate practice.

Deliberate practice is a highly structured activity engaged in with the specific goal of improving performance. Deliberate practice is different from work, play and simple repetition of a task. It requires effort, it has no monetary reward, and it is not inherently enjoyable.

When you engage in deliberate practice, improving your performance over time is your goal and motivation. That’s not to say that deliberate practice can’t be designed to be fun, but it isn’t inherently enjoyable on it’s own.

If you want to gain skills rapidly or approach expert-level status at something, you must understand the importance of deliberate practice and learn how to incorporate it into your daily life.

Herein lies the difference, at the start, whether it’s something you’re interested in, or you’re forced to do (maybe it’s your job), in your mind and to the world you are willing to admit that you’re new at it, and you’re bad at it. You just started. Hence, you’re willing to try, to ask, to make mistakes, to learn from them.

However, once you start getting good, start getting respect, and start having others consult you for your opinion and advice, then you’re in this zone where you have something to lose when you say “I don’t know”, and when you make mistakes, especially in front of other people. Furthermore, it could be a double whammy because you already have a baseline skill that others perhaps do not, you could start to get complacent. And it’s not wrong, after all, you have something that others don’t, so it’s true that you’re “better”. And that’s when you are no longer practicing nor learning, but just repeating.

The key point is that once you’re already “good” at something, you need to move on, you need to stretch and deliberately practice that stretch. You want to put yourself squarely outside your comfort zone, but not so far out that it becomes irrelevant to your quest of becoming better, or that it demoralizes you to the extent that you revert right back into that comfort zone.

For instance, if you’re a great pianist in classical music, you need to move on into other areas, best those unlike classical music, like learn the pop style, or the jazz style. You’re outside your comfort zone, but your training and skill as a classical pianist will accelerate your ability to pick up the new style of music. And you’ll be the greater pianist for it, able to integrate elements from other styles into your playing and composition, effortlessly, when the situation calls for it.

If you’re a great programmer in C, you need to move on to other languages, best those unlike C, like Lisp, or Haskell, or Forth. Again, you’re still in your comfort zone, your understanding of programming, of the machine, of code and of design will propel you to learn the new dimension of the new language — the new constructs, ideas, idioms. Again, you’ll be able to seamlessly call upon your new ideas, making your code more elegant, more beautiful.

Take for example programmers who have experience in both the imperative and functional style. Their code often fuses the functional style of programming in areas which are best expressed in that style, even when the main code base is imperative. Similarly, programmers who delve into firmware programming in assembly produce crazy (in a good way) code, because they truly understand the machine. We sometimes jokingly-insult them and call them Real Programmers. Jokes and folklore aside, these are the guys who are truly in sync with the machine, and it shows in their (higher-level) code. Learning a new dimension will only make you a better programmer. You’ll know where your current tools fall short, ways to get around those problems, concepts and ideas from other languages which you can model and incorporate. And that’s just about programming languages.

After all, the best athletes cross-train.

However, again knowing this and being motivated to deliberately practice isn’t the complete recipe. No doubt, it will go a long way in making you a more highly-skilled person.

But here’s where the mentor can come in and accelerate that. Because at this point, there is a question: what exactly should i deliberately practice? In today’s wide open world, the possibilities are endless. A mentor will be able to propose some directions, as well as debate with you about the pro and cons of that direction, because he’s done it before.

There’s no need to follow his or her path, but it acts as a lighthouse. You may agree and follow, or you may disagree and go your own way. Or you may adopt bits and pieces. In any case, you are better off in your decision-making process.

But that said, even without guidance, there’s no reason to not deliberate practice.

There are many things in life that annoy or discourage you, may more than threaten to de-rail you, or actually derail you. In the midst of all that, to me, my constant, my rock, is God, my wife and my family.

This picture embodies just that.

Every programmer, or really just anybody who does code, probably has come across a myriad of coding styles — some pleasant, some not so pleasant. In the midst of the entirety of the “coding style”, is this little, but very significant, segment on naming conventions. Indeed coding style represents a lot more than how you name your variables, functions, methods and classes, but it would be an easy argument to say that naming convention is one of the biggest, if not the biggest, influence on how a piece of code looks like. After all, it’s the first thing your eyes will notice — it is the very look of the code.

Over time, two naming conventions have become dominant: camelCase (and it’s cousin PascalCase), and under_scores. There is an interesting article here, titled CamelCase vs underscores: Scientific showdown that does a sort of informal, semi-scientific study on which naming convention is superior.

I’m going to severely abuse Python here, to illustrate to an extreme extent, the two conventions:

class handle_email(object):
  def send_email(message):
    try:
      smtp_obj = smtp_lib.SMTP('smtp.server.org', 25)
      smtp_obj.send_mail('from@server.org', 'to@server.org', message)
      smtp_obj.quit()

    except smtp_exception:
      print "Error: unable to send email"
class HandleEmail(object):
  def sendEmail(message):
    try:
      smtpObj = smtpLib.SMTP('smtp.server.org', 25)
      smtpObj.sendMail('from@server.org', 'to@server.org', message)
      smtpObj.quit()

    except SMTPException:
      print "Error: unable to send email"

In case you didn’t visit the Whatthecode article linked above, I’m going to take a leaf from the author and ask you to choose, instinctively and before reading on: which do you like? It’d be quite unlikely that both are equally as good or as bad, since everybody feels differently. If you voted the first, then you’re clearly a under_score kinda guy. If not, you’re a camelCase person.

I boil down my choice of which is better to an aesthetic factor, and three key factors. What I’m saying here is that this is my personal, over-the-years and thought-through opinion, and my justification for it. I am not touting this at fact. Hence, that said, I’d like to take a shot at:

  1. Which simply looks better?
  2. Which gives you the most information?
  3. Which is easier to code in?
  4. Which is better for comprehending code?

 

1. Which simply looks better?

This is perhaps the most not-related-to-coding question, as it deals with nothing more than aesthetics. This has more to do with what the eye (and brain) perceives as beauty, than with actual issues such as comprehension and conciseness, which will be dealt with later. As such, the answer to this question is really just a matter of personal taste.

I personally will vote for the under_score convention here, as it sits well with my notion of what beautiful prose should look like — well-spaced, easy to pick apart key words, more spread out lines, and so on.

My opinion: under_score.

 

2. Which gives you the most information?

I’d argue that camelCase, together with PascalCase, has the highest fidelity of information. There are two reasons for that. Firstly, camelCase and PascalCase are distinctive, yet belong to the same naming convention. Hence, using camelCase to represent one set of things (say, function names, variable names), and PascalCase to represent another (say, class names, module names), gives immediate clues into the origin and purpose of a given “thing”. Once you’ve assimilated this and it has become second nature, you’ll be reading and comprehending code much faster, perhaps without even realizing it.

Secondly, the camelCase/PascalCase convention is the more condensed of the two conventions. numItems and MilitaryBoat is shorter than num_items and military_boat. On a single identifier, the length may not make much of a difference. However, code isn’t made up of one or two identifiers on a single line (with a few exceptions), it’s made up of numerous. I believe that the amount of -relevant- information in a given line has a direct translation to how easy it is to comprehend the code. We, as humans, don’t have infinite abilities to keep everything in our heads, and a visual reference is very important in aiding understanding when we cannot memorize and connect everything mentally.

Hence, the conciseness of representation has a huge bearing on me. As a small point, many underscores on a line is also.. ugly. Again, that’s personal ;)

My opinion: camelCase/PascalCase.

 

3. Which is easier to code in?

Again, for this I’d go with camelCase/PascalCase. Hitting shift on a new word is just easy to hit (at least on QWERTY keyboards) than the darn underscore key. Even for decent touch-typists, of which I consider myself, the error rate on hitting the underscore hit is easily infinitely higher than hitting the shift key, because you can hardly miss the shift key.

Hence, camelCase/PascalCase is often much, much easier to type, fast.

My opinion: camelCase/PascalCase.

 

4. Which is better for comprehending code?

There’s no real difference between reading other people’s code and reading your own code after some period of time. They all rely on your ability to comprehend the code that you see. Taking comments out of the picture, which should not be there to replace or fix badly named identifiers anyway,

I’d say that camelCase/PascalCase is again, better. camelCase/PascalCase accentuates the interactions between identifiers, which is the (more) important issue, rather than figuring out the name of the identifier.

Let me make the counter-argument first: under_scores closely mimic the way words are crafted into sentences in English, and hence is certainly easier to read.

I don’t agree. Yes, it is easier to read — as words. It is not easier to comprehend — as code. Code is not about one or two lines, it’s about blocks. This is akin to paragraphs in a language, not sentences. A single line of code hardly as much meaning in the grand scheme of things. The act of introducing blanks (represented by underscores) splits words up so it’s easy to read them quickly, but you lose the meaning that you are looking at an identifier. Taken from a “sentence” point of view, you lose the ability to distinguish the interactions of identifiers with each other. Taken from a “paragraph” point of view, you have a bunch of harder “sentences” to read.

Consider this:

remaining_chars = end_of_file_index - current_file_index

and this:

remainingChars = endOfFileIndex - currentFileIndex

The interactions between the three identifiers (in this case, variables) is extremely clear in the second case, but less so in the first case. Throw in your curly braces, indentations, other forms of “whitespace”, and things become less and less clear. The obvious counter-argument is that IDEs will beautifully color identifiers, making my point moot. I disagree that it’s moot though. Yes, indeed syntax highlighting resolves the issue to a large extent. However, my gripe is that firstly, not all syntax highlighting schemes are appropriately designed. Secondly, there are often times when you browse through code that is not syntax highlighted. Thirdly, I prefer having two forms of pattern matching for my brain to hook on (shape, and color), rather than one (just colors).

Let’s try this again, with colors this time:

remaining_chars = end_of_file_index - current_file_index

and this:

remainingChars = endOfFileIndex - currentFileIndex

Different, yes. Sufficient, perhaps, But two levels of distinction is still better.

My opinion: camelCase/PascalCase.

 

My Personal Feel

My conclusion is this: camelCase/PascalCase has a good number of things going for it, sufficient to overcome the few issues that it poses.

I am not oblivious to some of the issues with this naming convention. I know about the fact that sometimes it doesn’t deal well with special words (e.g. URLCharacters), single letters (e.g. MyIPhone), and so on. However, I do feel that these can be gotten around with by choosing better names.

I also realize that camelCase/PascalCase takes some time to get your brain around identifying capitals as word boundaries. However, once you’ve gotten that part down (which doesn’t take long, really), the benefits are apparent.

The human brain is a intricate and extremely powerful pattern matching machine. The introduction of a character that not just represents, but looks like the blank space character impedes on critical function of our pattern matching abilities — the ability to know when we are looking at a single “thing”, an identifier, and when we are not (i.e. we are looking at interactions between identifiers).

In addition, camelCase/PascalCase naturally introduces greater variation in the way things are named, allowing for conscious or even subconscious deduction of the meaning of an identifier. I’m talking about the fact that when I see PascalCase, I immediately thing class, and when I see camelCase, I immediately think function or variable (assuming that’s the convention of whatever you’re reading). Hence, when I read

Vehicle.getType()

I know what Vehicle and getType are. Immediately. No mousing over to check what the IDE tells me, no attempting to jump to function definitions, etc.

vehicle.get_type()

just doesn’t convey the same fidelity of information. IDE or not.

I’ve been using Vim for a while, and I’m also a hardcore Directory Opus (Dopus) fan. I find that both of them complement my work well, allowing me to greatly speed up the way I handle files and edit them. However, I’ve always wanted a way to move from the very keyboard-centric Vim to a full-fledged file manager. That may be Dopus, or plain vanilla Windows Explorer, or whatever your favorite file manager utility happens to be.

This is what Dopus looks like, with my preferences configured.

Dopus showing the root drives.

Dopus showing some files.

But I digress.

My gripe is that extracting the file name and path in Vim is a tad inconvenient, not to mention that after extracting it to, say, the clipboard, I then have to open up the Windows Explorer, paste in the full path, remove the file name, hit enter, before I can finally browse the directory in a usual manner.

Certainly, Vim has many options, and for those who may be thinking: “Well, Vim can return you just the file path”, yes, I’m aware. My point is really just to illustrate that moving from Vim’s view of editing a given file, to browsing the directory that the file lives in, is not exactly the most intuitive process in the world. Also, while Vim certainly has some options for dealing with directories, and some of them are not bad at all, such as the venerable NERDTree, it’s really not what I need. I don’t just need to browse a directory structure, but I often need to shuffle files around, rename them, map network shares, and so on. That is firmly in the realm of file managers, not an editor — Vim. Hence, I want my file manager.

As such, here’s a little Vim function (it’ll sit nicely in your vimrc file, or anywhere else you prefer to put it), that does all that for you. I like <leader>-based keystrokes, so for me, hitting <leader>-e (natural and similar to Win-E to me) will pop the file in Dopus, if Dopus exists on the system. If Dopus isn’t there, then it falls back to the traditional Windows Explorer. Such a fallback is for reliability, and also because I use Vim on multiple VMs that will not have Dopus installed. Finally, it also highlights the file so you can see it easily.

" Open current file in Explorer (Dopus version)
function OpenPathInExplorer()
    let filepath=expand('%:p')
    let dopuspath="C:\\Program Files\\GPSoftware\\Directory Opus\\dopusrt.exe"

    if filereadable(dopuspath)
      :exe '!start "'. dopuspath .'" /cmd Go "' . filepath . '" NEW'
    else
      :exe '!start explorer.exe /select,"' . filepath . '"'
    endif
endfunction
nmap <leader>e :call OpenPathInExplorer()

If you don’t like the Dopus bits, feel free to strip them. Alternatively, here’s a plain, simple, vanilla, works-on-all-Windows-systems version:

" Open current file in Explorer (simple version)
function OpenPathInExplorer()
    let filepath=expand('%:p')
    :exe '!start explorer.exe /select,"' . filepath . '"'
endfunction
nmap <leader>e :call OpenPathInExplorer()

There we go. The command can be seen at status area of Vim. I can’t show the keystrokes, though. ;)

Hope you find this useful. :)

It is assumed that the reader more or less understands the continuation-passing style. The first thing that the Cont monad does is to abstract away the need to pass continuations explicitly in continuation-passing style.

The Cont monad abstracts away the handling of continuations, by creating a series of “partially applied” (I’m using this quite loosely) functions that are all waiting for their continuation. The add function was originally add x y, and add_cps became add_cps x y k. This meant that add_cps explicitly took a continuation function k.

What happens if we have this?

-- add
add :: Int -> Int -> Int
add x y = x + y

add_cps :: Int -> Int -> (Int -> r) -> r
add_cps x y k = k (add x y)

add_abstracted_cps :: Int -> Int -> ((Int -> r) -> r)
add_abstracted_cps x y = (\k -> k (add x y))

We may see add_abstracted_cps as a “partially applied” function of add_cps, and the missing argument is the continuation function. Hence, we can define a bunch of such “abstracted” functions which are partially applied functions.

Notice that the type signature for the “abstracted” version of the function is exactly the same as that of the cps version, only that due to partial application, it returns a function that takes a continuation and returns a type r. This is made explicit using parentheses, though it’s not necessary (the parentheses part). Put another way, we are just converting between two equivalent forms using lambda expressions, as follows:

f :: Int -> Int
f x = x+1         -- This is equivalent to this, both
f = \x -> x+1     -- in the function body and in the type signature

The above are equivalent.

Now, back to partial application, suppose we have the following:

add :: Int -> Int -> Int
add x y = x + y

add_cps :: Int -> Int -> (Int -> r) -> r
add_cps x y k = k (add x y)

add_abstracted_cps :: Int -> Int -> ((Int -> r) -> r)
add_abstracted_cps x y = (\k -> k (add x y))

Now, add_abstracted_cps is just add_cps partially applied, and is just missing the continuation before it can be evaluated (i.e. it returns a function that expects k, the continuation). However, so far the use of the words “partially applied” here is a bit loose, in the sense that you’ll notice that add_abstracted_cps still has the variables x and y, and those have certainly not been applied. We’ll see why I keep saying partially applied in just a moment. However, for now, let’s do the same for all of the functions and put them all together:

-- add
add :: Int -> Int -> Int
add x y = x + y

add_cps :: Int -> Int -> (Int -> r) -> r
add_cps x y k = k (add x y)

add_abstracted_cps :: Int -> Int -> ((Int -> r) -> r)
add_abstracted_cps x y = (\k -> k (add x y))

-- square
square :: Int -> Int
square x = x * x

square_cps :: Int -> (Int -> r) -> r
square_cps x k = k (square x)

square_abstracted_cps :: Int -> ((Int -> r) -> r)
square_abstracted_cps x = (\k -> k (square x))

-- pythagoras
pythagoras :: Int -> Int -> Int
pythagoras x y = add (square x) (square y)

pythagoras_cps :: Int -> Int -> (Int -> r) -> r
pythagoras_cps x y k = square_cps x (\x_squared ->
                       square_cps y (\y_squared ->
                       add_cps x_squared y_squared (\sum_of_squares ->
                       k sum_of_squares)))

pythagoras_abstracted_cps :: Int -> Int -> ((Int -> r) -> r)
pythagoras_abstracted_cps x y = (\k ->
                                  square_cps x (\x_squared ->
                                  square_cps y (\y_squared ->
                                  add_cps x_squared y_squared (\sum_of_squares ->
                                  k sum_of_squares))))

Now, suppose that we fully apply the *_abstracted_cps functions, we are going to get a function that expects k, the continuation. If we apply k to that, then we can “evaluate” the function body. Hence, *_abstracted_cps when fully applied, is just *_cps that has been partially applied just up to (but not including) the continuation parameter, k. Hence my use of the words “partially applied” earlier. Thank you for tolerating my loose use of those words.

Hence, to reiterate, the *_abstracted_cps functions, after that have been evaluated (by applying all the required parameters), returns a computation (a function) that requires a continuation function in order to fully evaluate. That continuation is received as the first (and only) paramter to the function. We’re going to be seeing this kind of function a lot, hence, we’ll give it a more intuitive name. Let’s call them waiting functions. Thus, waiting functions are functions that are waiting on a continuation, as the first and only parameter.

 

The Cont monad

Clearly, the fact that we have to handle the continuation function explicitly can be abstracted cleanly away using a monad. This is implemented in the Cont monad. Let’s take a look at the definition first. We are going to see some very big similarities. However, the first thing that we need to constantly keep in mind, is what exactly the monad represents. Here goes: the Cont monad “stores”, as the inner value of the monadic value, the fully applied *_abstracted_cps functions. That’s our waiting function. Here is it again:

The monadic value “contains” the computation that is just waiting for its continuation (the waiting function).

The Cont monad is defined in Control.Monad.Cont, hence the following line is necessary:

import Control.Monad.Cont

Here’s the data definition of Cont:

newtype Cont r a = Cont { runCont :: ((a -> r) -> r) }

First off, the type of the monadic value of the Cont monad represents ((a -> r) -> r), which looks very familiar, since it’s the type signature of the waiting function. It’s pretty much the rear end of all our cps function type signatures. The continuation itself is given by the type (a -> r), and as you can see, represents the first and only parameter of the waiting function. The waiting function, with a continuation as its argument evaluates to a value of type r, since the continuation returns a result of type r. Thus, a Cont “contains” a waiting function.

Hence, this would work:

Cont (pythagoras_abstracted_cps 100 200)

 

return and >>=

Now let’s take a look at how the Cont monad defines return and >>=.

instance Monad (Cont r) where
  return a  = Cont (\k -> k a)
  mv >>= f  = Cont (\k -> runCont mv (\a -> runCont (f a) k))

return is simply the continuation version of identity. Consider the following:

id :: Int -> Int
id x = x

id_cps :: Int -> (Int -> r) -> r
id_cps x k = k x

id_abstracted_cps :: Int -> (Int -> r) -> r
id_abstracted_cps x = (\k -> k x)

That’s exactly return in the Cont monad, with the addition of applying the monad’s constructor to the result, of course. It creates a computation that is going to return just plain x (its argument), once it has its continuation supplied. Sounds familiar? :)

>>= is just a little bit more complicated. Remember that the monadic values contains the computations just waiting for their continuation. Now suppose we have more than one of these functions. For instance, we want to combine the continuation versions of both add and square together.

square_cont 4 >>= add_cont 200

Before examining how >>= works in the Cont monad, remember that mv >>= f embodies the concept of extracting the inner value of mv, and somehow relate f to the inner value. The inner value is the computation waiting for its continuation (call this c), and f is a function that is going to take the result of the computation of c (which is bound to a, by the definition of continuations), and produce a new computation that is waiting for its continuation (call this c’).

Okay that was a mouthful. Let’s go through this step by step. First of all, remember that the mechanism, or the convention if you will, of functions written in continuation-passing style, is that we have a function that contains some code, and that function takes, as an “extra” argument, a continuation, to which it passes the result. This means that the function’s code itself, intuitively, will not contain any references to the continuation function passed into it, with the exception of the last “line” of the function which calls that continuation function. Look at the following function yet again:

add :: Int -> Int -> Int
add x y = x + y

add_cps :: Int -> Int -> (Int -> r) -> r
add_cps x y k = k (add x y)

The only use of “k” is to call the continuation, and this is the last action. Continuations are defined as such, since the idea is to pass control to the continuation function. We can thus think of the function body being completely independent of the continuation, except for the final pass.

Again, by convention, the continuation is going to take the “result” of the function body as its argument. In this case, k is going to take a single argument, with the result of add x y.

Now we are ready to look back at the definition for >>=:

newtype Cont r a = Cont { runCont :: ((a -> r) -> r) }
instance Monad (Cont r) where
  return a  = Cont (\k -> k a)
  mv >>= f  = Cont (\k -> runCont mv (\a -> runCont (f a) k))

mv contains the computation (waiting for its continuation), which we extract using the runCont helper function. Hence, runCont mv produces that computation function (as we just said). It’s waiting for its continuation, so we pass it the continuation (\a -> runCont (f a) k). Remember that the definition of continuations is such that body of the computation can be “evaluated” to produce a “result”, independent of the continuation? That “result”, again by definition, is going to be passed into the continuation, which takes it in as a single argument. In other words, the “result” of the body of the computation is going to be bound to a in the continuation. This represents the notion of “evaluating” mv to produce its “result”.

Then we have the monadic function f. We recall that monadic functions are going to take a value somehow extracted (and perhaps tweaked a little) from the monadic value mv, and produce a monadic value. In our case of the Cont monad, what f is expecting is the result of the computation in mv. We hence give it exactly that, and that result was bound to a. The output, and therefore the result of (f a) is a monadic value, which is a new computation waiting for its continuation, again. We extract that computation using runCont (again), give it a continuation, k (which won’t evaluate because it’s waiting for k, see the lambda), and that completes the process. This represents the notion of “evaluating” f to produce the final “result”.

Note that we’ve been using the word “evaluate” and “result” again quite (okay, very) loosely. Nothing gets “evaluated” in mv >>= f, because we’re dealing with computations, or the notion of computations, depending on how you want to see it. Remember, Haskell is lazy. All that’s there is a promise to evaluate, when it’s required. Hence, the final “result” is nothing more than a large computation that is still waiting on a continuation — the top level continuation. Once we supply this does the entire computation have what is required to finally and truly evaluate, and when it does, it actually produces a result.

Let’s finally use the Cont monad:

-- add
add_cont :: Int -> Int -> Cont r Int
add_cont x y = return (add x y)

{- This creates the computation waiting for its continuation:
     Cont (\k -> k (add x y)) -}

-- square
square_cont :: Int -> Cont r Int
square_cont x = return (square x)

{- This creates the computation waiting for its continuation:
     Cont (\k -> k (square x)) -}

-- pythagoras
pythagoras_cont :: Int -> Int -> Cont r Int
pythagoras_cont x y =
    square_cont x >>= (\x_squared ->
      square_cont y >>= (\y_squared ->
        add_cont x_squared y_squared >>= (\sum_of_squares ->
          return sum_of_squares)))

-- pythagoras (with do-notation)
pythagoras_cont':: Int -> Int -> Cont r Int
pythagoras_cont' x y = do
  x_squared <- square_cont x
  y_squared <- square_cont y
  sum_of_squares <- add_cont x_squared y_squared
  return sum_of_squares

test1 = runCont (pythagoras_cont 3 4) print
test2 = runCont (pythagoras_cont' 3 4) (+100)

So far, we've understood how to write in continuation-passing style, and how the Cont monad takes care of turning functions into continuation-passing style by turning all functions in the Cont monad into computations that expect a continuation. Hence, all the functions that we define in the Cont monad are such expect-a-continuation functions. It also defines the >>= operator to compose such computations, and the >>= operator takes care of passing the right continuations through the composed functions, making everything nice and implicit.

 

A note on the "right" continuations

Observe that the "right" continuations change (obviously) as the computations "progress". Because we are composing computations using the monadic >>= bind operator, and because of how the Cont monad defines >>=, the "right" continuation is always going to contain the computations on the right hand side of >>=. In other words, given mv >>= f, when "evaluating" mv, the continuation to mv is going to contain the computation represented by f. Given mv >>= f >>= g, then the continuation to mv contains the computation of f (and not g), and the continuation to f contains the computation of g. What then is contained in the continuation of g? It's going to be the top-level continuation, the one that we pass in when we say runCont (...) print. In this example, the top-level continuation is print.

Note that if we write our code slightly differently (though equivalently), we get something a little bit different (but equivalent). Consider the code:

mv >>= (\x -> (f x) >>= (\y -> (g y)))

Now, the continuation to mv is going to contain the computation of (\x -> (f x) >>= (\y -> (g y))), and the continuation to (f x) is going to contain the computation of (\y -> (g y)). In this way, we can see things in a slightly different light. The continuation at any point is always the current-continuation, where the current-continuation represents "the rest of the program". In the case of Haskell, because we're trapped in the Cont monad, it represents "the rest of the monadic computation". However, the two forms are no different in practice.

 

Gaining even more control

Now, can we obtain even more control over our continuations, "jumping" to continuations whenever we want (and not just, as we've seen, at the end of a function under the Cont monad), and pass whatever we want into the continuation (and not just, as we've again seen, the currently computed value)? Indeed so, after all, what's stopping us from writing code that calls the continuation, k, at any point that we like? This is trivially simple in the continuation-passing style, for we have direct access to the continuations itself. For instance, we could call k under a condition, and do something else when the condition is false, as follows:

pythagoras_cps' :: Int -> Int -> (Int -> r) -> r
pythagoras_cps' x y k = square_cps x (\x_squared ->
                        square_cps y (\y_squared ->
                        add_cps x_squared y_squared (\sum_of_squares ->
                        if (sum_of_squares > 100) then k (sum_of_squares-10)
                        else k sum_of_squares)))

test3 = pythagoras_cps' 10 10 print
test4 = pythagoras_cps' 1 1 print

Hence, a close analogy in imperative languages is the return statement (not the monadic return function), whereby we use return to "exit" a function, and pass return an argument to, well, return. Similarly, in continuations, since we are "returning" to the continuation, we can simply call the continuation to exit, and pass the continuation an argument to, well, continue with.

However, how about when using the Cont monad? Remember that the Cont monad "hides" the continuation, by turning the computation (e.g. the pythagoras function) into what we called a waiting function. That means that we don't have direct access to the continuation, which is the whole purpose of the monad in the first place. How then can we access the continuation so that we have the power and flexibility we just saw?

We do that using a helper function called callCC (okay, whether you see it as a helper function or not is your choice). What is callCC? It's the commonly called call-with-current-continuation. This is a queer little fella, and we should get some intuitive idea about callCC first. So, what is it?

callCC takes a waiting function, and it is going to create a Cont (which contains that waiting function), and it's going to run it. But with what? It's waiting for a continuation. The obvious answer is that it's going to run it with the continuation it's given, as it's being composed via the >>= operator. See the section on "A note on the 'right' continuations" (above) for why. And the "right" continuation is -- the current-continuation. So what's the big deal? Well, the big deal is that without callCC, all the "right" continuations were implicit. Your code could not access the implicitly created continuations (all the \a -> ... stuff in the definition of >>=). With callCC, it exposes the continuation, by passing the current-continuation explicitly into the waiting function, and that being a waiting function, has an explicit argument that is expecting a continuation. First consider this code:

addOne :: Int -> Cont r Int
addOne n = return (n+1)

doSomething :: Int -> Cont r Int
doSomething n = return n >>= (\x -> return (x-2))

test5 = runCont (doSomething 5) print

Nothing special above. Now let's put in a callCC, but not invoke the current-continuation passed to callCC.

doSomething' :: Int -> Cont r Int
doSomething' n = return n >>= (\x -> callCC (\breakOutOfHere -> return (x-2)))

test6 = runCont (doSomething' 5) print

Looks good, but doesn't do anything cool. Now let's invoke it conditionally.

doSomething'' :: Int -> Cont r Int
doSomething'' n = return n >>= (\x -> callCC (\breakOutOfHere ->
                                                if (x>5) then breakOutOfHere 100000
                                                else return (x-2)))

test7 = runCont (doSomething'' 5) print
test8 = runCont (doSomething'' 10) print

Now, we literally get to break out when we want, and pass (or throw) out whatever value we want when we break out. Now we have a fully hidden and implicit implementation of continuations, with the option to expose the continuation whenever we want, via callCC.

Thus, you saw that invoking the current-continuation (it has a name now) at any point in time inside the code of the waiting function, breaks out of the waiting function. Break out to where, you may ask. Again, the continuation is the current-continuation from the point of the callCC call. Hence, it breaks just straight out of the waiting function, to the point right after.

 

Type and definition of callCC

Now let's take a look at the type and definition of callCC:

              arg = waiting function        result
                        |                      |
          v---------------------------v    v------v
callCC :: ((a -> Cont r b) -> Cont r a) -> Cont r a

callCC f = Cont $ \k -> runCont (f (\a -> Cont $ \_ -> k a)) k

So, how does callCC do what it claims to do? First, we see that it takes f, a waiting function. Like all waiting functions, it needs to take in a continuation. We already figured that out. It then says that it wants to "run" f, so it has to call runCont on f. Again, that is quite simple. However, what does it pass to f? Normally (not in callCC), we would pass in the "right" continuation. However, for callCC, it doesn't want to do that. It wants to let the programmer break out of the waiting function, and where is that point exactly? It's the in "outer" continuation, which is in k. Now the function f is a waiting function that, when "evaluated", generates a result. As per our normal way of passing results into continuations, the "a" in our example here is going to take the result of f. But this time, callCC explicitly passes that result into k, which is the "outer" continuation. It breaks out with the result of f.

Now, the type. The argument to callCC is of type ((a -> Cont r b) -> Cont r a), which we can see is a function, f. We recall that this function is a waiting function, and that it takes a continuation. The continuation therefore has the type (a -> Cont r b). Why is the type different from the type of the previous continuations that we saw? That's because callCC is going to pass the current continuation, and the current continuation is going to represent the rest of the program. Remember also that we're in the Cont monad. So what does the "rest of the program" return? Well, it has to return something in the Cont monad! Hence the type of ((a -> Cont r b) -> Cont r a) for the function f.

This article is on what is loosely but oft referred to as “monadic functions”. It contains my personal and intuitive way of thinking of monadic functions, and it may not suit you well. However, for who out there who this may help, I hope it makes that little difference for you.

I notice that in most monad tutorials, there is tremendous detail in how monadic composition works, as well as examples on how the >>= bind operator is implemented in various commonly seen monads (Maybe, [], State, etc), but there is very little what exactly are monadic functions. The key idea that can be gleamed is that monadic functions are the stuff that act on monads, through the help of the >>= bind operator (which essentially finds a way to meaningfully “extract” the value that the monad contextualizes), and sometimes the return function as well. Also, we often see that monadic functions, say f and g, can be suitably composed as follows:

f, g :: a -> m b
f >>= g

We also come across examples, usually just before introducing the beautiful do-notation, that look like this:

f, g :: a -> m b
someFunction x = return x >>= (\a ->
                 ... some other code ...)

And then, somewhere in our journey, we come across code that looks like this (this example is in the IO monad, but is not peculiar to the IO monad):

someFunction x = do
  a <- getLine
  b <- getLine
  putStrLn (show x ++ a ++ b)

And then we may, perhaps, wonder: getLine is a function, but it's type signature is not that of a -> m b, or for the IO monad, a -> IO b. In fact, its signature is:

getLine :: IO String

So how is that a monadic function? And yet our intuition tells us that it's fine. Firstly, it works. Secondly, it's producing a monadic value, which gets "assigned" to a. We then mentally translate this to the de-sugared notation:

someFunction x = getLine >>= (\a ->
                   getLine >>= (\b ->
                     putStrLn (show x ++ a ++ b)))

And then things become clearer, because the type signature that we're looking for is embodied in the two anonymous functions. I don't know about you, but this bothered me for a while. Not from the formal correctness of it all, as it's obvious that's not a problem, but the intuitive sense of a "monadic function".

So let me try to sort of resolve the issue in a structured way in this article. If all this is already crystal clear to you, and you're thinking "this guy doesn't get it", then you may want to not read this, as this contains my personal and intuitive way of thinking of monadic functions, and it may not suit you well. However, for who out there who this may help, I hope it makes that little difference for you.

First off, some terminology:

  1. A monad is a typeclass that defines >>= and return (minimally).
  2. A monadic value is a value contextualized by a type constructor.
    (Examples would be (Just 4), or [100, 200])
  3. The inner value of a monadic value is the actual value that was contextualized by a type constructor.
    (Examples would be 4, and each of (100, 200))
  4. A monadic function is a function that produces a monadic value.
    (Note that we said nothing about its input type)

Given mv >>= f, the main reason for requiring that >>= has the type of >>= :: m a -> (a -> m b) -> m b, and hence, f having the type of f :: a -> m b, is because we want a means to pass the inner value of mv to the monadic function f. However, this doesn't make everything apparent.

Hence, creating an example out of the Maybe monad, suppose we have the following monadic functions:

mf1, mf2 :: Int -> Maybe Int
mf1 x = return (x*2)
mf2 x = return (x*3)

And we want to compose them:

cmf :: Int -> Maybe Int
cmf x = Just x >>= mf1 >>= mf2

Now suppose that we have some more "elaborate" monadic functions:

mf1, mf2 :: Bool -> Int -> Maybe Int
mf1 b x = if b then Just (x*2) else Just (x*4)
mf2 b x = if b then Just (x*3) else Just (x*9)

Oh now these functions take some extra stuff, and are no longer nice and simple like the original mf1 and mf2. Certainly (and very obviously so), direct composition does not work, as the type signatures are all messed up:

cmf' x = return x >>= mf1 >>= mf2

However, it doesn't mean that this doesn't work:

cmf x = do
  a <- mf1 True x
  b <- mf2 False a
  return (a*b)

Which is the same as this:

cmf x = mf1 True x  >>= (\a ->
        mf2 False a >>= (\b ->
        return (a*b)))

The issue here is that our "definition" of a monadic function is a tad loose. Often, you will find that where the words monadic function is used, it can either refer to:

  1. Functions of the form f :: a -> m b, where a is the type of the inner value of the monad. (Call these classic monadic functions)
  2. Functions of the form f :: anything -> m b, where the input of the function really doesn't matter. (Call these loose monadic functions)

We note that for composability, as in cmf' x = return x >>= mf1 >>= mf2, our monadic functions mf1 and mf2 needed to be in classic form, where the input is only the type of the inner value of the monad.

However, any function that produces the monadic value can indeed be used "inside" the monad. By "inside" I mean code that acts within a monad, which is the code enclosed in the do-notation, and its corresponding de-sugared form. We can do this by making a wrapper around this loose monadic function, whereby the wrapper is a classic monadic function itself. I.e. a wrapper that takes in a single value of the type of the monad's inner value.

In our example with our two loose monadic functions mf1 and mf2, we successfully "composed" them as follows:

cmf x = mf1 True x  >>= (\a ->
        mf2 False a >>= (\b ->
        return (a*b)))

which is:

cmf x = mf1 True x  >>=
          (\a -> mf2 False a >>=
            (\b -> return (a*b)))

Again, the anonymous functions are the wrappers which are, themselves, classic monadic functions. How so? Take a look at the anonymous function that has a parameter "a". It takes a single input value, named "a", and the type of "a" (by type inference) is the inner value of mf1 (by definition of >>=). It's result is the result of the call of mf2, which we know is a loose monadic function and hence will produce a monadic value. Thus the result of the anonymous function is a monadic value. Hence, (\a -> mf2 False a) :: a -> mb. The exact same argument goes for the second anonymous function.

This leads to the following conclusion. The fundamental reason for requiring the f in mv >>= f to be of the type f :: a -> m b, with emphasis on the "a", is because in general, this function is designed to receive the value produced by another (may be "loose") monadic function, or simply stated by some monadic value, or created by a monadic function that does "nothing" but return a monadic value (think getLine :: IO String). Hence, all of these work (we're in the Maybe monad):

classicMonadicFunction x = Just (x+2)
looseMonadicFunction x y z = Just (x+y+z)
looseMonadicFunction' = Just 4

f = Just 4 >>= classicMonadicFunction                  -- Needs a wrapper
f = Just 4 >>= (\x -> looseMonadicFunction x 100 200)  -- Don't need a wrapper

f = do
  classicMonadicFunction 4

f = do
  Just 4 >>= classicMonadicFunction

f = do
  a <- looseMonadicFunction 10 100 200
  classicMonadicFunction a

f = do
  a <- classicMonadicFunction 4
  looseMonadicFunction a 100 200

f = do
  a <- Just 4
  classicMonadicFunction a

f = do
  a <- looseMonadicFunction'
  looseMonadicFunction a 100 200

And you can work out all the rest of the combinations yourself. This gives more flexibility in designing and implementing monadic functions, as they may be loose, classic, or just monadic values themselves (Just 4).

Hence remember, the only true constraint in designing monadic functions (or when using them), is that they have to return a monadic value. Failure to do so will result in a type error under all conditions.

Why you can forget about loose and classic monadic functions:
Because the do-notation creates the necessary "wrappers" for you, you often don't have to think about classic or loose monadic functions. Any function which is going to throw back a monadic value (and do whatever side-effects it wants to), is going to work inside the do-block. It's after all, the same as you creating wrappers around every single monadic function (classic or loose), and thus guaranteeing that the f in mv >>= f is going to be of the correct type, and then calling your monadic functions (classic or loose) with the correct set of parameters. Hence, the wrappers take care of the fact that the assumption of the >>= bind operator is that it needs to find a meaningful way to pass in the "extracted" inner value, leaving everything else you write nice and simple. Put another way. since the do-notation handles the creation of all wrappers, you can use any monadic function (loose or classic), without ever thinking of whether they're loose or classic anymore (meaning you can forget about this article! ;)). Hence, all you have to know, intuitively, is that any function that returns a monadic value is a monadic function, and is good to go!

Going back to our getLine, a real example of where all these occurs is in indeed the getLine function of the IO monad. Recall that the type of the getLine function is as follows:

getLine :: IO String

This function is of the same type signature as our looseMonadicFunction'. It takes in nothing, because all of its functionality is in its side-effects: Grabbing a line from the terminal. It's only result is a monadic value, which contextualizes the string that it grabs inside the IO monad.

The putStrLn function, on the other hand, is a classic monadic function. It should be obvious why, from it's type signature. It returns a monadic value, and takes a single parameter of the type of the inner value:

putStrLn :: String -> IO ()

The previous post on Fundamentals of Monads suggested that monads empower the programmer with abstraction of context (algebraic data type) details, as well as order of evaluation, with the latter bringing us one step closer to the model of imperative languages when we need it, while still maintaining all the beautiful properties of the functional language that Haskell is. If you haven’t read the previous post, you may want to, as some of the terminology and intuition described there is used here.

However, for Haskell to be truly useful, not only do we need to guarantee order of execution (when it’s needed), but we also need a way to maintain state (again, when it’s needed). Too many problems are stateful (i.e. needs side-effects) by nature, and pure functional languages don’t quite handle the notion of state (as least in the primitive language constructs) very intuitively.

A simple solution that is often given to Haskell beginners, is that state can be captured by literally passing the state around functions, whereby each function has access to the state, and can modify the state.

However, that approach is tedious, and has a whole bunch of boilerplate. Again, can we find a way to abstract away that whole notion of state, so that a programmer can ignore the fact that he needs to pass state around, while still having the side-effect’ing notion of a state?

Indeed we can. Enter the State monad.

Note: At this point, because we’re introducing the state monad, we’re going to focus purely on the state monad. Haskell offers other ways of handling state, as well as more powerful monads that do this. This is just the very beginning.

As usual, a monad is characterized by the data type definition, and the definition for >>=. Here’s what the state monad looks like:

data State s a = State s -> (a, s)

Alternatively, to get a “getter” function for free, and using newtype instead, we can define it as follows (this is the formal definition):

newtype State s a = State { runState :: s -> (a, s) }

Now, this looks very different from the Maybe monad, and the List monad. The key difference is that the so-called monadic “value” (or rather, a more general and precise word would be “computation”, and since that’s what it truly is we’ll use that here, properly) is not a “value”, as in the intuitive sense that 4, or 100, is, but rather a function. See why it’s a computation? By the way, a computation can result in a value, but in this case, it’s a function.

In short, remember that monads give context to it’s value (or computation). In this case, our value (or computation) is a function, and we’re going to give it a context. That context is the state. Also, since the function acts on a state, it has to receive the state, logically. Hence, the function takes a state, of type s, and produces a result and a new state, represented in a tuple of types (a, s). Hence, remember, the state monad gives a function a context of state. Since those functions act on state (hey, it’s a state monad), we’ll call them state-transformation functions.

So, we have a state monad, that contextualizes a function with the context of state. This implies that the function we’re talking about manipulates state in some way. We’ll call them state-transformation functions.

runState gives us a “reference” to that function for free. In other words, runState “extracts” the function for us.

 

Is State a Monad? Not quite.

Looking at the type of State, we notice that it is:

State s a

That’s two free variables. If you were alert while figuring out monads, you’d realize that monads take only one free variable. Look:

class Monad m where
  return     :: a -> m a
  (>>=)      :: m a -> (a -> m b) -> m b
  (>>)       :: m a -> m b -> m b
  fail       :: String -> m a

Everywhere, you see “m a” and “m b”, and by type inference, that means that m is a type constructor that takes one parameter. So how can State be a monad?

Indeed, State itself is not, but (State s) is. And that is why the type declaration for State seems to reverse the a and s. It needs to partially apply and get (State s), so that (State s) is a monad, and the parameter is a. What this means is that the type of the state, s, is constant, while the type of the result, a, is not, and indeed by change as we >>= stuff. Put another way, from the Monad typeclass above, m is (State s).

As such, for the state monad, the type definitions looks like:

class Monad (State s) where
  return     :: a -> State s a
  (>>=)      :: State s a -> (a -> State s b) -> State s b
  (>>)       :: State s a -> State s b -> State s b
  fail       :: String -> State s a

This is actually more important than it looks. The monad is (State s), which means that the monadic function expects a type of a (which is not a function, nor a state, but a resultant output value), and acts on that value, and returns a state-transformation function. Yours truly was, for a while, confused because I thought that, like simpler monads, the monadic function takes a state-transformation function (wrapped in the monad), acts on it, and returns a state-transformation function — just like how a monadic function for Maybe takes a value, acts on it, and returns a Maybe type. It does not!

The monadic function for state monads take a resultant output value, acts on it, and returns a state-transformation function.

 

Getting an Intuitive Sense of State

At this point, it would do us some good to revise the intuitive concept of a monad, and see exactly how it applies to state. To recall, a monad, in general, comprises of the following ideas:

  1. A monad gives a context to a value.
  2. It also allows us to abstract away details of the context.
  3. We also had the concept of a monadic value, which is the value which the monad contextualizes
  4. And we also had the concept of a monadic function, which is a function that acts on a monadic value.

How does that specifically apply to the state monad?

  1. A state monad gives a context to a function.
  2. It also allows us to abstract away details of the context.
  3. We also had the concept of a monadic value, which is the function which the state monad contextualizes
  4. And we also had the concept of a monadic function, which is a function that acts on a monadic value.

The tricky bit is the last point. For simple monads like Maybe, recall that the monadic function is a function that acts on the inner value of the monadic value (the 4 in Just 4), does something with it, and outputs a monadic value. Hence, this would be right:

monadicFunction :: Int -> Maybe Int
monadicFunction x = Just (x + 1)

Just 4 >>= monadicFunction    ==> Just 5

How about for our state monad? Our monadic value represents a function. Hence, what is our monadic function? We’ll use exactly the same words here, to see the parallel.

Our monadic function is a function that takes a resultant output value, does something with it, and outputs a state-transformation function.

Yes, I’m belabouring the same point as in the previous section, because it’s important.

 

Exploring return

Here’s the definition of return, for the state monad:

return a = State (\s -> (a, s))

Quite simple. The constructor expects a function that takes a state and returns a result and a modified state, as explained earlier. When we give a “simple” value to return, we wrap it up in a state, and generate a function that, when given some state, just throws back the simple value, with no state modification. Can’t be simpler than that.

However, a common point of confusion here is what is the “state” that the function expects? We’re so used to wrapping values that sometimes we forget this is a function, so I’ll say it here again, this is a function. The state that the function expects, we do not know. It expects that state as a parameter so that it can act on the state! If this is obvious to you, please ignore it.

 

Exploring >>=

How about the >>= bind operator? At this point, let’s not confuse ourselves with the official definition of >>=. That will come very soon. But let us get some more intuition first: the >>= operator is a mechanism for composing state-transformation functions. That’s the outcome we want out of >>=.

So what, when dealing with state, what do we want the bind operator >>= to mean? Remember, monads are all about giving some context, and >>= is all about abstracting away that context (plus extracting values for sequenced computations). So, again, what does >>= mean?

Intuitively, and very accurately so, it’s just function composition! Yes, like head . tail composes a function that returns the second element of a list. But things are a bit special here: we are composing state-transformation functions, not ordinary (pure) functions like head and tail. Since state is sequential, we need to be careful to produce the correct intermediate state when composing functions.

Once again, we must remember not to get confused between state-transformation functions, which lives inside the state monad as part of the monadic value, and the monadic function, which is the f in m >>= f.

The state-transformation function, f’, looks like this:

f' :: s -> (a, s)

This state-transformation function is, again to repeat, wrapped up in our state monad, as the “value”. So, if we really have such a function, f, as defined above, we could certainly do this:

State f'

and happily wrap f up in a state monad. But that’s beside the point here.

The monadic function, f, is up to what the programmer wants to do with the resultant value. An example may look like this:

f :: a -> State s a              -- which you can think of as f :: a -> (s -> (a, s))
f x = State (\s -> (x*2, s))

Now, remember the type of the >>= operator:

m >>= f  :: m a -> (a -> m b) -> m b

What the bind operator >>= is doing, is it’s going to take a monadic value (containing our state-tranformation function), strips out the context, get the result of the state-transformation function, feed that as the resultant output value into the monadic function (f), which is going to produce another monadic value (containing a state-transformation function). It’s then going to give the new state-transformation function the new state produced by the first state-transformation function.

If that sounds like a mouthful, it is. Let’s go through this step by step.

First and foremost, here’s what >>= wants to achieve:

  1. >>= wants to compose state-transformation functions.
  2. It wants to take a state-transformation function wrapped in a state monad, and a monadic function.
  3. It knows that the monadic function is going to act on the result of the state-transformation function, and produce yet another state-transformation function
  4. It knows that at the end, it has two state-transformation functions to handle
  5. It wants to join the state-transformation functions, and correctly thread the state through them.
  6. It produces a new state-transformation function, which is the composition of the two functions, with properly threaded state.

The >>= bind operator looks like this:

m >>= f  = State (\s ->
                      let (a, s') = (runState m) s
                      in runState (f a) s')

Let’s unpack that. m is the monad that contains a state-transformation function. Remember that runState basically pulls out that function. we can see it as a helper function that just accesses the inner value of the monad). Certainly, we could have patterned matched as well, like this:

State f' >>= f  = State (\s ->
                             let (a, s') = f' s
                             in runState (f a) s')

And here’s how it achieves what it wants to achieve:

  1. (runState m) extracts the state-aware function, and applies a state to it. Whatever that state-aware function was supposed to do, it was waiting for a state, and hence was (and is still) just a computation (think of it as a piece of code, a function). We apply some state to it, and hence we can “run” it, but not quite. The state that we are applying to it is still a parameter of the new function that we are composing. So (f’ s), or (runState m) s, still remains a computation (a function) that is waiting on some state.
  2. Anyway, assuming that at some point that gets evaluated (when we provide the state), we bind that result to a, and its production of a new state, to s, in the tuple (a, s’).
  3. We then apply the monadic function, f, to the result of the previous computation, a. Remember that the monadic function is expecting the resultant output value. It’s also going to produce, as output, a state-transformation function.
  4. Hence, the new function produced by (f a) is yet another state-transformation function, waiting on a state, wrapped in a state monad. Hence we use runState to pull that out. However, going by the way we need to thread states, we give it the new state, s’, produced by the previous computation.
  5. The final result is a composition of the function wrapped in m, and the new function, f.

We can continue along this path, “composing” functions together which each modify the state (s gets modified to s’), into one huge function that contains all the modifications (think many nested lets), all waiting on intermediate states, whereby the outermost function is waiting on the initial state. We provide that initial state, and the whole gigantic function evaluates, and produces a new state, and a result.

Also, remember that monads provide a guarantee of order of execution, by means of lambda functions (see the post on Fundamentals of Monads. Note also that our >>= operator is doing exactly that. Let’s see what happens when we compose more than one function.

m >>= f >>= g

  ==> (State (\s1 ->
                 let (a1, s1') = (runState m) s1
                 in runState (f a1) s1')) >>= g

  ==> (State (\s2 ->
                 let (a2, s2') = (runState (State (\s1 ->
                                                   let (a1, s1') = (runState m) s1
                                                   in runState (f a1) s1'))) s2
                 in runState (g a2) s2'))

Notice how the lambda expressions are forcing the correct order of evaluation, and how the state is correctly threaded through the final expression? s2, in this case, is the initial state that, when provided, triggers the whole evaluation of the composed state-aware functions inside m, of f and of g:

runState (State (\s2 ->
                 let (a2, s2') = (runState (State (\s1 ->
                                                   let (a1, s1') = (runState m) s1
                                                   in runState (f a1) s1'))) s2
                 in runState (g a2) s2')) initState

Every (algebraic) data type is created for a very specific purpose. A data type isn’t made because you feel like making one (even if you can). Those data types which exist for no reason, in general, will have difficulty being generalized (into monads, as we will see later). However, let’s assume we’re all keen and astute programmers who won’t do that for the heck of it, and that we’re actually trying to accomplish something.

With that, let’s take a whirl around the idea of algebraic data types, and what they really mean. As mentioned, when a data type is created, it’s because it is trying to represent a particular concept or property.

Assumptions: You understand the Haskell type system (typeclasses, algebraic data types, type declarations, etc), and Haskell syntax in general.

For instance, suppose we want to represent failure. Haskell’s Maybe type does that. The very reason we type something as a Maybe, is because we want to represent failure on that something. As an example, if a function f decides whether or not an integer is a perfect square, the intuitive result that the function should return is a boolean — true or false. Either something is a perfect square, or is not. This should be simple.

Hence, a C function may look something like this:

bool isPerfectSquare (int i) {
  if (code_to_check_for_perfect_square)
    return true;
  else
    return false;
}

However, what if we pass that function a negative number? By definition, over integers, negative numbers cannot be perfect squares. Hence, we would want to say that the function failed.

In most imperative languages (C/C++, Java, etc), we would use a failure value to represent failure. Perhaps we would also like to abort the “execution” of that function. However, we cannot use a boolean here anymore, because we essentially need a third result, and there are only two possible values for a boolean. Hence, a common (and ugly) solution in C would be to “upgrade” (essentially a downgrade) the return type to an integer. Hence, a C function may look something like this:

int isPerfectSquare (int i) {
  if (i < 0)
    return -1;

  if (code_to_check_for_perfect_square)
    return 1;
  else
    return 0;
}

There are similar strategies, but basically they all involve using a "magic" value to represent failure. That may be -1, NULL, 0, or whatever other value pleases the programmer.

However, such strategies place the implicit meaning of the "magic" value on the programmer -- the human. The programmer says that -1 means failure, and humanly enforces it. The compiler (or the code) doesn't know that, and hence cannot enforce it. If another programmer comes along and takes isPerfectSquare() for a ride, and decides to alter it and state that -1 means something else, and -999 now means failure, then any code that relies on isPerfectSquare() is going to break. This is not a good approach.

Haskell solves this by allowing the programmer to define new custom types that gives a context to various things (values). For example, in the isPerfectSquare function, Haskell lets a programmer state, through a type, that the function has failed, or succeeded, and if it succeeded, what the result is. The custom data type here is the Maybe type, and is defined as follows:

data Maybe a = Just a | Nothing

Hence, suppose we have a result of True (or False), we can type the result (a boolean) with the Maybe type, by saying "Just True" (or "Just False"). Note that here, it is assumed that you understand the Haskell type system, at least where type constructors, value constructors and polymorphic/parameterized types are concerned. And if the function failed, we can just return a Nothing.

 

Giving a Context

The idea here is not Haskell's type system, but the idea of typing a value. Alternatively, you can see it as encapsulating, or wrapping, a value. Regardless of how you see it, the key thing is that we are now placing a value (the boolean, in this example), into a context. The context is failure. Hence, in the example above, we just placed the resultant True or False into a context. There is always a very specific reason for creating a particular custom type (Maybe, in this case), and for Maybe, the reason is to represent failure.

If there is another data type, then it must have a reason to have been created.

For instance, if you created a data type called Person:

data Person = Person String String String deriving (Show)

Then the reason for the Person data type is that you, the programmer, has decided that a Person has three string fields (maybe first name, last name, middle name).

If you create a data type called Tree, as follows:

data Tree = Empty | Tree a (Tree a) (Tree a) deriving (Show)

then a possible reason is that you are representing indecision, whereby the tree illustrates all possible decisions -- all outcomes for a given input.

Similarly, for a list:

data List = Empty | List a (List a) deriving (Show)

For all these data types, whenever we type a value as that data type, we are giving that value a context. And that context has a meaning in which you can articulate. For instance, a Tree 4 has type Tree Int, and what it means is that you have a decision with input 4. Since the tree has no branches, there are no decisions and outcomes yet. If you have a Tree 4 (Tree 5) (Tree 6), it represents a decision with input 4, that can yield a 5 or a 6 (if you "traverse" the tree). Similarly, if you have a list like List 5 (List 6 (List 7)), you may be representing non-determinism, whereby a value can take either 5, 6 or 7.

Note: it is important to note that data types have a reason for their existence. If not, it becomes difficult to see how they can be generalized into a monad.

 

Abstracting Away (Each and Every Different) Context

Now, what happens with our isPerfectSquare function? Let's write the proper Haskell version of isPerfectSquare, complete with the Maybe type to represent failure. We also modify it a bit, because we don't just want a boolean, we want the value of the function succeeds. In other words, if it's not a perfect square, we return Nothing, if it is, we return the number itself. Let's also add another similar function that decides if something is a big number a not. We decide that numbers larger than 1024 are "big":

isPerfectSquare :: Integer -> Maybe Integer
isPerfectSquare n =
    let sq = floor (sqrt (fromIntegral n :: Double)) in
    case (sq * sq == n) of
      True  -> Just n
      False -> Nothing

isBigNumber :: Integer -> Maybe Integer
isBigNumber n
    | n < 1024   = Nothing
    | otherwise  = Just n

Great, nice and simple. Now what if we want to treat those two functions as primitives, and develop a isBigAndPerfectSquare function, which tests if an integer is larger than 1024, and is a perfect square? One way to do that would be to check if an integer is a perfect square (isPerfectSquare), and if it's bigger than 1024 (isBigNumber). Not the most efficient, but just for illustration purposes. Hence we can write this:

isBigAndPerfectSquare :: Integer -> Maybe Integer
isBigAndPerfectSquare n =
    case (isPerfectSquare n) of
      Nothing  -> Nothing
      Just x   -> case (isBigNumber n) of
                    Nothing  -> Nothing
                    Just y   -> Just y

Again, there are a ton of better ways to write it. It's for illustration. Let's go further.

isMagicNumber :: Integer -> Maybe Integer
isMagicNumber n
    | n == 262144  = Just n
    | otherwise    = Nothing

isBigAndPerfectSquareAndMagic :: Integer -> Maybe Integer
isBigAndPerfectSquareAndMagic n =
    case (isPerfectSquare n) of
      Nothing  -> Nothing
      Just x   -> case (isBigNumber n) of
                    Nothing  -> Nothing
                    Just y   -> case (isMagicNumber n) of
                                  Nothing  -> Nothing
                                  Just z   -> Just z

By now something is becoming very apparent. The reason for the custom data type is worming its way into your code. Everytime you have a "primitive" function (such is isPerfectSquare) which returns a result (or computes to a value) of a data type that has a context (i.e. Maybe), you have to check that context (is it a Just x, or a Nothing?) should you wish to use the result.

To state that again, when a result has a context (i.e. Maybe, to represent failure), using that result means unwrapping the context to (a) extract the value out so that you can actually act on the value, and (b) checking the entire context to see if you need to handle special cases. The special case is failure, the value is the x in Just x. We performed the value extraction by pattern matching against Just False and Just True, and we checked the special case by pattern matching against Nothing. Once again, in this case it's simple, because Maybe is quite simple. If you have a more complex data type, you have to check and extract more stuff. Very quickly, this becomes a pain. Also notice one more thing: The point at which things became a pain was when two "primitives" were sequenced together. If we have a function that just calls isPerfectSquare once, it would be fine. The function could declare a return type of Maybe Bool, and call isPerfectSquare, and return what isPerfectSquare returns, which is a Maybe Bool. All's fine and dandy. Things got annoying the moment we sequenced calls to isPerfectSquare. In other words, when we want to act on the value of isPerfectSquare, and maybe call isPerfectSquare again, or even another function that will take the value we extracted and act on it, that all that special case and extraction stuff occurred. Therein lies a key point: when we sequence these calls, things get messy.

Key idea:
Hence, whenever we have a context that we build up or assign, because we want to represent something, it means we have to deal with the context. Since every context is different (represents different things), the manner in which to:
   (b) Handle special cases
   (a) Extract the value
is going to be different.

There are two ways we can handle (a) and (b). We can make the programmer (who uses the type) responsible for value extraction and special case handling. Or, we can make the type creator responsible for it. Of course, they may be the same person, but that's the fundamentals of abstraction/generalization anyway, right?

What we've seen above is making the programmer (who uses the type) responsible for it. He had to write code that checks for the special case (Nothing) and the value extraction (True and False). Not only is that a pain (lots of typing!), but it also distracts away from the real thing the code is doing (check for power of 4 and power of 8). It's hard to read, annoying to debug, and so on. Basically, it violates the principle of abstraction.

So, how do we make the type creator responsible for all that? Guess what? We abstract the process of sequencing, and define how (a) values can be extracted, and (b) special cases can be handled, when such functions (Integer -> Maybe Bool, in this example) are sequenced. It's very important to note that the idea of abstraction here acts on sequenced functions.

Hence, we define a special operator, which we call >>=, which represents sequencing. Hence, for the above example, when dealing with Maybe, we define >>= for Maybe as follows:

(>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b
(>>=) Nothing f   = Nothing
(>>=) Just x f    = f x

Responsibilities of >>=:
The bind operator >>= has a few jobs to do. As alluded to above, it certainly has to define how to extract the value, and handle special cases. However, more generally, this is what is has to do or can do:
   (a) Handle special cases
   (b) Extract the value
   (c) Modify the value (if it wants to)
   (d) Find a way to meaningfully interact with function f

So, the operator takes a Maybe type value, and a function, and somehow feeds the value to the function. The value is simple: It's just that -- Just 16, Just 4, Nothing. The function f is interesting. It represents the function(s) that we want to use in sequence. A possible function, in our example, is isPerfectSquare. Note that this is not limited to sequencing the same function.

Hence, the idea is this:

isBigAndPerfectSquare :: Integer -> Maybe Integer
isBigAndPerfectSquare n =
    Just n >>= isPerfectSquare >>= isBigNumber

isBigAndPerfectSquareAndMagic :: Integer -> Maybe Integer
isBigAndPerfectSquareAndMagic n =
    Just n >>= isPerfectSquare >>= isBigNumber >>= isMagicNumber

What a MASSIVE reduction in code and improvement in readability. All that value extraction (Just x) and special cases (Nothing) were all taken care of by the beautiful >>= operator. Sequencing is now easy, and clean.

Notice also that we "fed" Just n into the various functions, and not plain n. That's because the >>= operator abstracts away the unwrapping (value extraction) process, and it needs something that is wrapped to unwrap. The type signature of >>='s first parameter confirms this. It's a Maybe a. Hence, we have to wrap our first n. This is good, if you think about it. n should exist with it's context, not floating around. The only time it "floats" as a plain value, is when it's inside a function, such as isPerfectSquare and isBigNumber. At all other times, it is safely wrapped in it's data type -- Maybe. That keeps things clean.

However, having the programmer deal with wrapping things up is all fine and dandy, but would it be nice if there was a generic way to just wrap things up, no matter what data type the programmer is using? Would it be possible to make the type creator responsible for that aspect as well? Turns out there is. And since we're making the type creator responsible for more and more of these abstraction stuff, maybe all of this behaviour can be made part of a typeclass, and again, indeed it is. That's your monad typeclass, which we are ready to introduce.

class Monad m where
  return     :: a -> m a
  (>>=)      :: m a -> (a -> m b) -> m b
  (>>)       :: m a -> m b -> m b
  (>>) m m'  =  m >>= \_ -> m'
  fail       :: String -> m a

First thing to note: The return types of all the functions defined in the typeclass are monads. We've already introduced >>=, and the meaning here is exactly the same (intuitively at least, since we haven't formally introduced it) as what we've been doing above. Basically, it takes a monadic value (our Just x, above), and feeds it into a function that expects the actual value (the inner value, or x, in our example), does something to the value, and spits out yet another monadic value.

Hence, Just x >>= isPerfectSquare feeds Just n into isPerfectSquare, and the definition for >>= results in the extraction of x and calling of isPerfectSquare x (remember when we made the type creator do all that extraction work and special case handling work?). isPerfectSquare x produced either a Nothing or a Just y, which is of our monadic type. The type creator must define >>=. In other words, what it means to extract and handle the edge cases (yes, again! this is important!)

>> is a convenience function, which does exactly what >>= does, but it doesn't care about the input. It just spits out the second parameter, a monadic value, m'.

fail is a function, which takes an error string and gives a monad, which represents failure, possibly including the given error string. We'll ignore this because we hardly define it.

Hence, if we wanted to make Maybe a monad (it already is), then we would do the following:

instance Monad Maybe where
  return x        = Just x
  Nothing >>= f   = Nothing
  Just x  >>= f   = f x
 

We should, by now, understand how and why >>= is defined. Return is where we said we wanted to make the type creator responsible for wrapping something into a type (a monad). In the case of Maybe, the simplest way to wrap something (actually the only way, for Maybe) is to put it into a Just. Nothing is not a good idea, because Nothing means failure. If we want to wrap the value 4, we don't want to generate failure, we want Just 4.

Hence, writing monadically, we should do this:

isBigAndPerfectSquare :: Integer -> Maybe Integer
isBigAndPerfectSquare n =
    return n >>= isPerfectSquare >>= isBigNumber

isBigAndPerfectSquareAndMagic :: Integer -> Maybe Integer
isBigAndPerfectSquareAndMagic n =
    return n >>= isPerfectSquare >>= isBigNumber >>= isMagicNumber

 

Creating Order Of Evaluation

Remember that in a lazily evaluated language like Haskell, there is no guarantee if and when expressions (or computations) are evaluated. For instance, given the following code:

myFunction :: Int -> Bool
myFunction x = if x > 100 then True else False

We have no guarantee if and when True, or False, will be evaluated. Everything is lazy. Similarly:

myFunction :: Integer -> [Integer]
myFunction x = take 10 [x..]

does not generate the entire list, only the first ten elements. In short, we cannot determine when evaluation occurs.

However, while this has a LOT of benefits (performance, conciseness, infinitity, etc), when dealing with the real, external world, many times evaluation order matters, and matters a lot. For instance, it matters when writing to a file, and reading from one. Same with databases (they can be viewed as fancy files). And so on. Hence, we need a way to guarantee that the order of evaluation can be predictable when we need it to be predictable.

How can we do this in a purely functional, lazily evaluated language? By the observation that we can force evaluation order by ensuring that a computations X require a computation Y to compute. Hence, Y must evaluate before X.

If we express X and Y as anonymous functions, we can do something like this:

((\y ->  y / ((\x -> x + 1) 3)) 12)

This guarantees that the (x+1) function must run first, because the (y/x) function relies on the computed result (the value) of the (x+1) function for it to be evaluated. Hence (x+1) returns 4, given its argument, which is fed to the (y/x) function to becomes (y/4). the (y/4) function takes the parameter 12, and computes the result of 3.

Note how similar this is to assignment in imperative languages! In fact, this is the exact equivalent of the imperative code:

x = 3
x = x + 1
y = 12
y = y / x

Imperative languages determine order of execution simply by order of listing. The compiler MUST generate code that evaluates those 4 statements in that order (optimizing compilers can rearrange, but the meaning must not change). Since declarative languages cannot have such multi-statement "programs", and can only consist of functions which consist of expressions which evaluate, we then observe that execution "flow" can be simply expressed by a bunch of interdependent nested functions, exactly as above. In essence, we are forcing the compiler (or the evaluator) to evaluate the (x+1) computation before the (y/x) function, because we created a scenario in which it simply had no choice but to do so, no matter how lazy it wanted to be.

How does this relate to monads? Or rather, how do monads help to create order of evaluation in a nice and simple way? Make the following observation:

This function (exactly the same as above):

isBigAndPerfectSquareAndMagic :: Integer -> Maybe Integer
isBigAndPerfectSquareAndMagic n =
    return n >>= isPerfectSquare >>= isBigNumber >>= isMagicNumber

is computationally identical to this function:

isBigAndPerfectSquareAndMagic n =
    return n >>= (\x ->
        isPerfectSquare x >>= (\y ->
            isBigNumber y >>= (\z ->
                isMagicNumber z)))

The ONLY difference is that now the "fed" and "unpacked" values (from the contextualized monadic value) are given nice names (x, y and z), and the functions are called with those names, instead of being implicit. To be obvious:

return n >>= isPerfectSquare
  ==> Just n >>= isPerfectSquare   -- by definition of return
  ==> isPerfectSquare n            -- by defintion of >>=

which is exactly the same as:

return n >>= (\x -> isPerfectSquare x)
  ==> Just n >>= (\x -> isPerfectSquare x)  -- by definition of return
  ==> (\x -> isPerfectSquare x) n           -- by definition of >>=
  ==> isPerfectSquare n                     -- by function application

Hence, when we sequence expressions using monads, via >>=, we are in effect, guaranteeing their order of evaluation! Hence:

return n >>= (\x ->
    isPerfectSquare x >>= (\y ->
        isBigNumber y >>= (\z ->
            isMagicNumber z)))

is identical to the imperative code:

x = construct(n)
y = isPerfectSquare x
z = isBigNumber y
return(z)

neglecting the Maybe stuff, and construct means the monad's "return", and return means the imperative style "return" of actually returning a value from a function explicitly.

Since we now have order of evaluation guarantee, we can essentially "write" imperative looking code, in Haskell! Now this is not going to break referential transparency, because at this point, we are just forcing order of evaluation, not dealing with state, or variables. In other words, it would be nice if we could write the above Haskell code something as follows, right?

x = return n
y = isPerfectSquare x
z = isBigNumber y
isMagicNumber z

Since the = operator in Haskell already has a meaning, we give it another operator instead. How about:

x <- return n
y <- isPerfectSquare x
z <- isBigNumber y
isMagicNumber z

Looks good. Haskell provides exactly that in its do-notation, as in the following real Haskell code:

isBigAndPerfectSquareAndMagic n = do
  x <- return n
  y <- isPerfectSquare x
  z <- isBigNumber y
  isMagicNumber z

Wonderful looking syntax. It's just syntatic sugar, so it really and precisely means the previous looking code with all the lambda expressions (anonymous functions). One thing to note, is that the last expression in the do-notation cannot have an "assignment" type thing. That's because, if you translate the original code with all the lambda expressions, you realize that the last lambda is the expression to be evaluated and returned. Making an assignment there does not make sense, and is not permitted. If you try, Haskell tells you this:

  The last statement in a 'do' construct must be an expression.

So what if we want to return something else? Say, for some weird reason, we want isBigAndPerfectSquareAndMagic to always return 1000. Then we do this:

isBigAndPerfectSquareAndMagic n = do
  x <- return n
  y <- isPerfectSquare x
  z <- isBigNumber y
  w <- isMagicNumber z
  return 1000

Of course! We use return to pack up 1000 as a monadic value (Just 1000, in this case), and since it's the last expression, we return it. See the complete abstraction over the monad (Maybe, again, in this case) at work here?

And of course, all this is possible only because Maybe is a monad, which has the wonderful >>= operator implemented by the conscientious! With >>=, we get:

Properties of Monads:
1. Abstraction from context (unpacking, special cases)
2. Guaranteed order of evaluation

That's the fundamental property of monads (there's loads more).

A Queer Example

However, just to clarify things and hopefully get a better mental picture of how Haskell really treats types, let’s play with a really weird typeclass, and find a way to make an instance of it.

class Weird p where
  weird :: j k -> p k j

The weird function takes one parameter, and returns a single, concrete type. We look for -> to determine parameters and return values. However, the type of the argument and the return value is a little more complicated. Let’s work it out.

Since (j k) must be a concrete type, we can derive the kinds of j and k quite easily, because k itself must be a concrete type, and j is clearly a type constructor, that takes a single concrete type (represented by k).

k :: *
j :: * -> *

Since we have derived the kinds of j and k, we can figure out p, since p is also clearly a type constructor, taking two types, k and j. And we know the kinds of k and j.

p :: * -> (* -> *) -> *

What this Weird typeclass is saying, is that instances of it, represented by p, must be of kind: * -> (* -> *) -> *. This is no different from having a typeclass as follows:

class Eq a where
  (==) :: a -> a -> Bool

In this case, a is obviously a concrete type, and has kind of a :: *.

Similarly,

class Functor f where
  fmap :: (a -> b) -> f a -> f b

In this case, f is clearly a type constructor taking one concrete type, and hence f :: * -> *. Hence, instances of the Functor typeclass, as you are probably already familiar with by now, expects a type constructor taking one concrete type. Examples of those are [] and Maybe.

Hence, instances of Weird must be TYPE CONSTRUCTORS, taking one conrete type, and another type constructor (that takes one concrete type). Hence, the kind is

p :: * -> (* -> *) -> *.

Let’s construct everything ourselves.

Since k is a concrete type, we can construct one trivially. Note that we don’t really have to do this, we could just as well use any existing such type, such as Int. We can test this using:
:t SomeK

data SomeK = SomeK deriving (Show)

Since j is a type constructor that expects a concrete type we can also construct j quite easily. Again note that we don’t really have to do this construction, we could just use an existing such type, such as Maybe (I’m using this loosely). We can test this using:
:t SomeJ SomeK

data SomeJ a = SomeJ a deriving (Show)

And finally, we know that p is a type constructor that expects two parameters: first a concrete type, and then a type constructor that is going to take a single concrete type. We construct that as well. This we probably have to do, because there isn’t an easily available type that looks like this. In order to ensure that “b” is treated as a type constructor, we “apply” it to “a”. We also reverse the resulting type (constructing with SomeP b a results in type SomeP a b) because that’s what the weird function expects. We want to match it. We can test this using:
:t SomeP (SomeJ SomeK)

data SomeP a b = SomeP (b a) deriving (Show)

Now let’s make an instance of Weird. Remember that the weird function takes one argument, of type (j k). However, in the function itself, we don’t have to specify the type (since it’s already been specified in the class, obviously). We’ll just call that argument x, in Haskell style. Next, we want to create the function’s body such that it returns a (p k j). We don’t really care what the function body does, because this is a play on types. We just care about the return type.

class Weird where
  weird :: j k -> p k j

The easiest, immediate thing that is going to return a (p k j) is the SomeP that we constructed (again obviously because we were basing the construction on the requirements of weird. Hence, the SomeP value constructor is going to create a type of SomeP a b, where a represents k, and b represents j. In order for the SomeP value constructor to produce SomeP a b (or equivalently SomeP k j), it’s going to take SomeP b a (or equivalently SomeP j k). (j k) is exactly the type of x. Hence, SomeP x will trigger the type constructor of SomeP (j k), and create a TYPE of SomeP k j. That’s exactly (p k j).

instance Weird SomeP where
  weird x = SomeP x

We have an instance of Weird! Not saying it’s useful, but well, it’s just to help understanding.

Types of Types

We’ve seen, at an observer level, how Haskell deals with types and type classes. But how does it actually formalize that so that it can get the checking right? While values have types (and expressions generate values), and we can use types to guarantee that functions which expect a type gets exactly that type. How then does Haskell guarantee that type constructors get the type it expects? The answer is elegant and beautiful — Haskell gives types, types. That means that each type, has a meta-type, and it’s called a “kind”.

:k StringString ==> StringString :: *
:k StringAnything ==> StringString :: * -> *

What this means is that StringString is going to return a concrete type all on its own (remember that constructors with no type variables create concrete types directly). StringAnything, on the other hand, is a type constructor that takes a concrete type and returns a concrete type. That means that plain old StringAnything is not a concrete type (again, similar to what we discussed earlier). Similarly, Maybe is a type constructor that takes a concrete type to produce a concrete type, and Maybe itself is not a concrete type.

We’re used to seeing functions as follows:

myFunction1 :: Int -> Bool

And also functions that take type variables that represent a concrete type

myFunction2 :: a -> Bool        (usually with class constraints added)
myFunction3 :: a -> b           (usually with class constraints added)

What if we have a function that takes a type constructor?

myFunction4 :: c a -> Bool
myFunction4 x = False

Look at this function. It takes one parameter, of type “c a”. Not just “a”, but “c a”. It’s obvious that “c” is being used as a type constructor, and in fact, it takes exactly one concrete type, “a”. An example for “c” would be “Maybe”. And we could apply myFunction4 as follows:

myFunction4 (Just 4)      ==> False
:t myFunction4            ==> c a -> Bool
:t myFunction4 (Just 4)   ==> Bool

What’s the kind of “a” and “c” then? We know that (c a) must produce a concrete type, because function parameters have to take a concrete type, not some airy-fairy type. Hence:

(c a) :: *

We also know that “c” is a type constructor that takes one concrete type, hence this follows:

c :: * -> *
a :: *

Let’s try one more:

myFunction5 :: d c a -> Bool
myFunction5 x = False

Can you derive the following?

Since      (d c a) :: *
and        a :: *

Therefore  c :: * -> *
and        d :: (* -> *) -> * -> *

Well, that’s my summary of Haskell types.