* Functors Redux We've learned by now how a lot of types (well, type constructors really) are instances of =Functor=, like =[]=, =Maybe=, =Either a= and a =Tree= type that we made on our own. We saw how we can map functions over them for great good. In this section, we'll take a look at two more instances of functor, namely =IO= and =(->) r=. If some value has a type of, say, =IO String=, that means that it's an I/O action that, when performed, will go out into the real world and get some string for us, which it will yield as a result. We can use =<-= in /do/ syntax to bind that result to a name. We mentioned that I/O actions are like boxes with little feet that go out and fetch some value from the outside world for us. We can inspect what they fetched, but after inspecting, we have to wrap the value back in =IO=. By thinking about thir box with little feet analogy, we can see how =IO= acts like a functor. Let's see how =IO= is an instance of =Functor=. When we =fmap= a function over an I/O action, we want to get back an I/O action that does the same thing, but has our function applied over its result value. #+begin_src haskell instance Functor IO where fmap f action = do result <- action return (f result) #+end_src The result of mapping something over an I/O action will be an I/O action, so right off the bat we use /do/ syntax to glue two actions and make a new one. In the implementation for =fmap=, we make a new I/O action that first performs the original I/O actio and calls its result =result=. Then, we do =return (f result)=. =return= is, as you know, a function that makes an I/O action that doesn't do anything but only presents something as its result. The action that /a do/ block produces will always have the result value of its last action. That's why we use return to make an I/O action that doesn't really do anything, it just presents =f result= as the result of the new I/O action. We can play around with it to gain some intuition. It's pretty simple really. Check out this piece of code: #+begin_src haskell main = do line <- getLine let line' = reverse line putStrLn $ "You said " ++ line' ++ "backwards!" putStrLn $ "Yes, you really said " ++ line' ++ "backwards!" #+end_src The user is prompted for a line and we give it back to the user, only reversed, Here's how to rewrite this by using =fmap=: #+begin_src haskell main = do line <- fmap reverse getLine putStrLn $ "You said " ++ line ++ " backwards!" putStrLn $ "Yes, you really said " ++ line ++ " backwards!" #+end_src If we look at what =fmap='s type would be if it were limited to =IO=, it would be =fmap :: (a -> b) -> IO a -> IO b=. =fmap= takes a function and an I/O action and returns a new I/O action that's like the old one, except that the function is applied to its contained result. Another instance of =Functor= that we've been dealing with all along but didn't know was a =Functor= is =(->) r=. The function type =r -> a= can be rewritten as =(->) r a=, much like we can write =2 + 3= as =(+) 2 3=. When we look at it as =(->) r a=, we can see =(->)= in a slighty different light, because we see that it's just a type constructor that takes two type parameters, just like =Either=. But remember, we said that a type constructor has to take exactly one type parameter so that it can be made an instance of =Functor=. That's why we can't make =(->)= an instance of =Functor=, but if we partially apply it to =(->) r=, it doesn't pose any problems. If the syntax allowed for type constructors to be partially applied with sections (like we can partially apply =+= by doing =(2+)=, which is the same as =(+) 2=), you could write =(->) r= as =(r ->)=. How are functions functors? Well, let's take a look at the implementation, which lies in =Control.Monad.Instances=. #+begin_src haskell instance Functor ((->) r) where fmap f g = (\x -> f (g x)) #+end_src First of all, let's think about =fmap='s type. It's =fmap :: (a -> b) -> f a -> f b=. Now what we'll do is mentally replace all the =f='s, which are the role that our functor instance plays, with =(->) r='s. We'll do that to see how =fmap= should behave for this particular instance. We get =fmap :: (a -> b) -> ((->) r a) -> ((->) r b)=. Now what we can do is write the =(->) r a= and =(-> r b)= typee as infix =r -> a= and =r -> b=, like we normally do with functions. What we get now is =fmap :: (a -> b) -> (r -> a) -> (r -> b)=. So this type remind us of Function Composition, we pipe the output of =r -> a= into the input of =a -> b= to get a function =r -> b=. Next up, we're going to look at the *functor laws*. In order for something to be a functor, it should satisfy some laws. All functors are expected to exhibit certain kinds of functor-like properties and behaviors. They should reliably behave as things that can be mapped over. *The first functor law states that if we map the =id= function over a functor, the functor that we get back should be the same as the original functor*. If we write that a bit more formally. it means that =fmap id = id=. So essentially, this says that if we do =fmap id= over a functor, it should be the same as just callint =id= in the functor. Remember, =id= is the identity function, which just returns its parameter unmodified. It can also be written as =\x -> x=. *The second law says that composing two functions and then mapping the resulting function over a functor should be the same as first mapping one function over the functor and then mapping the other one.* Formally written, that means that =fmap (f . g) = fmap f . fmap g=. Or to write it in another way, for any functor /F/, the following should hold: =fmap (f . g) F = fmap f (fmap g F)=. If we can show that some type obeys both functor laws, we can rely on it having the same fundamental behaviors as othes functors when it comes to mapping. We can know that when we use =fmap= on it, there won't be anything other than mapping going on behind the scenes and that it will act like a thing that can be mapped over, i.e. a functor. * Applicative functors In this section, we'll take a look at applecative functors, which are beefed up functors, represented in Haskell by the =Applicative= typeclass, found in the =Control.Applicative= module. When we were mapping functions over functors, we usually mapped functions that take only one parameter. But what happens when we map a function like =*=, which takes two parameters, over a functor? If we have =Just 3= and we do =fmap (*) (Just 3)=, what do we get? From the instance implementation of =Maybe= for =Functor=, we know that if it's a =Just something= value, it will apply the function to the =something= inside the =Just=. Therefore, doing =fmap (*) (Just 3)= result in =Just ((*) 3)=, which can also be written as =Just (* 3)= if we use sections. #+begin_src haskell :t fmap compare (Just 'a') #+end_src #+RESULTS: : fmap compare (Just 'a') :: Maybe (Char -> Ordering) We see how by mapping "multi-parameter" functions over functors, we get functors that contain functions inside them. So now what can we do with them? Well for one, we can map functions that take these functions as parameters over them, because whetever is inside a functor will be given to the function that we're mapping over it as a parameter. #+begin_src haskell :{ let a = fmap (*) [1,2,3,4] :} fmap (\f -> f 9) a #+end_src #+RESULTS: : Prelude> [9,18,27,36] But what if we have a functor value of =Just (3 *)= and a functor value of =Just 5= and we want to take out the function from =Just (3 *)= and map it over =Just 5=? With normal functors, we're out of luck, because all they support is just mapping normal functinos over existing functors. Meet the =Applicative= typeclass. It lies in the =Control.Applicative= module and it defines two methods, =pure= and =<*>=. It doesn't provide a default implementation for any of them, so we have to define them both if we want something to be an applicative functor. The class is defined like so: #+begin_src haskell class (Functor f) => Applicative f where pure :: a -> f a (<*>) :: f (a -> b) -> f a -> f b #+end_src This simple three line class definition tells us a lot! Let's start at the first line. It starts the definition of the =Applicative= class and it also introduces a class constraint. It says that if we want to make a type constructor part of the =Applicative= typeclass, it has to be in =Functor= first. That's why if we know that if a type constructor is part of the =Applicative= typeclass, it's also in =Functor=, so we can use =fmap= on it. The first method it defines is called =pure=. Which takes a value and puts it in some sort of default (or pure) context - a minimal context that still yields that value. The =<*>= function is really interesting. It has a type declaration of =f (a -> b) -> f a -> f b=. Which remembers a lot of =fmap=. Whereas =fmap= takes a function and a functor and applies the function inside the functor, =<*>= takes a functor that has a function in it and another functor and sort of extracts that function from the first functor and then maps it over the second one. When I say /extract/, I actually sort of mean /run/ and then extract, maybe even /sequence/. * The newtype keyword In the previous section, we saw that there are actually more ways for the list type to be an applicative functor. One way is to have =<*>= take every function out of the list that is its left parameter and apply it to every value in the list that is on the right, resulting in every possible combination of applying a function from the left to a value in the right list. #+begin_src haskell [(+1), (*100), (*5)] <*> [1,2,3] #+end_src #+RESULTS: The second way it to take the first function on the left side of =<*>= and apply it to the first value on the right, then take the second function from the list on the left side and apply it to the second value on the right, and so on. Ultimately, it's kind of like zipping the two lists together. But lists are already an instance of =Applicative=, so how did we also make lists an instance of =Applicativi= in this second way? If you remember, we said that the =ZipList a= type was introduced for this reason, which has one value constructor, =ZipList=, that has just one field. We put the list that we're wrapping in that field. Then, =ZipList= was made an instance of =Applicative=, so that when we want to use lists as applicatives in the zipping manner, we just wrap them with the =ZipList= constructor and then once we're done, unwrap them with =getZipList=: #+begin_src haskell getZipList $ ZipList [(+1), (*100), (*5)] <*> ZipList [1,2,3] #+end_src #+RESULTS: So, what does this have to do with this /newtype/ keyword? Well, think about how we might write the data declaration for our =ZipList a= type. One way would be todo it like so: #+begin_src haskell data ZipList a = ZipList [a] #+end_src A type that has just one value constructor and that value constructor has just one field taht is a list of things. We might also want to use record syntax so that we automatically get a function that extracts a list from a =ZipList=: #+begin_src haskell data ZipList a = ZipList { getZipList :: [a] } #+end_src The /newtype/ keyword in Haskell is made exactly for these cases when we want to just take one type and wrap it in something to present it as another type. In the actual libraries, =ZipList a= is defined like this: #+begin_src haskell newtype ZipList a = ZipList { getZipList :: [a] } #+end_src Instead of the /data/ keyword, the /newtype/ keyword is used. Now why is that? Well for one, /newtype/ is faster. If you use the /data/ keyword to wrap a type, there's some overhead to all that wrapping and unwrapping when your program is running. But if you use /newtype/, Haskell knows that you're just using it to wrap an existing type into a new type (hence the name), because you want it to be the same internally but have a different type. With that in mind, Haskell can get rid of the wrapping and unwrapping once it resolves which value is of what type. So why not just use /newtype/ all the time instead of /data/ then? Well, when you make a new type from an existing type by using the /newtype/ keyword, you can only have one value constructor and that value constructor can only have one field. But with /data/, you can make data types that have several value constructors and each constructor can have zero or more fields: #+begin_src haskell data Profession = Fighter | Archer | Accontant data Race = Human | Elf | Orc | Goblin data PlayerCharacter = PlayerCharacter Race Profession #+end_src When using /newtype/, you're restricted to just one constructos with one field. We can also use the /deriving/ keyword with /newtype/ just like we would with /data/. We can derive instances for =Eq=, =Ord=, =Enum=, =Bounded=, =Show= and =Read=. If we derive the instance for a type class, the type that we're wrapping has to be in that type class to begin with. It makes sense, because /newtype/ just wraps an existing type. So now if we do the following, we can print and equate values of our new type: #+begin_src haskell newtype CharList = CharList { getCharList :: [Char] } deriving (Eq, Show) #+end_src ** Using /newtype/ to make type class instances Many times we want to make our types instances of certain types classes, but the type parameters just don't hatck up for what we want to do. It's easy to make =Maybe= an instance of =Functor=, because the =Functor= type class is defined like this: #+begin_src haskell class Functor f where fmap :: (a -> b) -> f a -> f b #+end_src So we just start out with: #+begin_src haskell instance Functor Maybe where #+end_src And then implement =fmap=. All the type parameters add up because the =Maybe= takes the place of =f= in the definition of the =Functor= type class and so if we look at =fmap= like it only worked on =Maybe=, it ends up behaving like: #+begin_src haskell fmap :: (a -> b) -> Maybe a -> Maybe b #+end_src Now what if we wanted to make the tuple an instance of =Functor= in such a way that when we =fmap= a function over a tuple, it gets applied to hte first component of the tuple? That way. doing =fmap (+3) (1,1)= would result in =(4,1)=. It turns out that writing the instance for that is kind of hard. With =Maybe=, we just say =instance Functor Maybe where= because only type constructors that take exactly one parameter can be made an instance of =Functor=. But it seems like there's no way to do something like that with =(a,b)= so that the type parameter =a= ends up being the one that changes when we use =fmap=. To get around this, we can /newtype/ our tuple in such a way that the secont type parameter represents the type of the first component in the tuple: #+begin_src haskell newtype Pair b a = Pair { getPair :: (a,b) } #+end_src And now we can make it an instance of =Functor= so that the function is mapped over the first component: #+begin_src haskell instance Functor (Pair c) where fmap f (Pair (x,y)) = Pair (f x, y) #+end_src As you can see, we can pattern match on types defined with /newtype/. We pattern match to get the underlying tuple, the we a pply the function =f= to the first component in the tuple and then we use the =Pair= value constructor to convert the tuple back to our =Pair b a=. So now if we convert a tuple into a =Pair b a=, we can use =fmap= over it and the function will be mapped over the first component: #+begin_src haskell getPair $ fmap (*100) (Pair (2,3)) #+end_src ** On newtype laziness We mentioned that /newtype/ is usually faster that /data/. The only thing that can be done with /newtype/ is turning an existing type into a new type, so internally, Haskell can represent the values of types defined with /newtype/ just like the original one, only it has to keep in mind that the their types are now distinct. This fact means that not only is /newtype/ faster, it's alto lazier. Like we've said before, Haskell is lazy by default, which means that only when we try to actually print the results of our functions will any computation take place. Now, consider the following type: #+begin_src haskell data CoolBool = CoolBool { getCoolBool :: Bool } #+end_src #+RESULTS: It's your run-of-the-millalgebraic data type that was defined with the /data/ keyword. It has one value constructor, which has one field whose type is =Bool=. Let's make a function that pattern matches on a =CoolBool= and returns the value ="hello"= regardless of whether the =Bool= inside the =CoolBool= was =True= of =False=: #+begin_src haskell helloMe :: CoolBool -> String helloMe (CoolBool _) = "hello" #+end_src #+RESULTS: : :3:10-17: error: : Not in scope: data constructor ‘CoolBool’ Instead of applying this function to a normal =CoolBool=, let's throw it a curveball and apply it to =undefined= #+begin_src haskell helloMe undefined #+end_src #+RESULTS: : "*** Exception: Prelude.undefined : CallStack (from HasCallStack): : error, called at libraries/base/GHC/Err.hs:79:14 in base:GHC.Err : undefined, called at :10:9 in interactive:Ghci6 Yikes, an exception! Now why did this exception happen? Types defined with the /data/ keyword can have multiple value constructors (even though =CoolBool= only has one). So in order to see if the value given to our function conforms to the =(CoolBool _)= pattern, Haskell has to evaluate the value just enough to see which value constructor was used when we made the value. And when we try to evaluate an =undefined= value, even a little, an exception is thrown. Instead of using the /data/ keyword for =CoolBool=, let's try using /newtype/: #+begin_src haskell newtype CoolBool = CoolBool { getCoolBool :: Bool } #+end_src #+RESULTS: And let's apply =helloMe= to an =undefined= value: #+begin_src haskell helloMe undefined #+end_src #+RESULTS: : hello It worked! When we use /newtype/, Haskell can internally represent the values of the new type in the same way as the original values. It doesn't have to add another box around them, it just has to be aware of the values being of different types. And because Haskell knows that types made with the /newtype/ keyword can only have one constructor, it doesn't havi to evaluate the value passed to the function to make sure that it conforms to the =(CoolBool _)= pattern because /newtype/ types can only have one possible value constructor and one field. *Whereas /data/ can be used do make your own types from scratch, /newtype/ is for making a completely new type out of an existing type.* * Monoids Consider the following: =*= is a function that takes two numbers and multiplies them. If we multiply some number with a =1=, the result is always equal to that number. It doesn't matter if we do =1 * x= or =x * 1=, the result is always =x=. Simirlaly, =++= is also a function which takes two things and returns a third, only instead of multiplying numbers, it takes two lists and concatenates them. And much like =*=, it also has a certain value which doesn't change the other one when used with =++=, that values is an empty list: =[]=. There's another thing that these two operations have in common that may not be as abvious as our previous observations: when we have three or more values and we want to use the binary function to reduce them to a single result, the order in which we apply the binary function to the values doesn't matter. It doesn't matter if we do =(3 * 4) * 5= or =3 * (4 * 5)=, The result is =60=. The same goes for =++=. We call thir property /associativity/. =*= is associative, and so is =++=, but =-=, for example, is not. By noticing and writing down there properties, we have changed upon /monoids/! A monoid is when you have an associative binary function and a value which acts as an identity with respect to that function. When something acts as an identity with respect to a functino, it means that when called with that functino and some other values, the result is always equal to that other value. =1= is the identity with respect to =*= and =[]= is the identity with respect to =++=. Which is why the =Monoid= type class exists, it's for types which can act like monoids. Let's see how the type class is defined: #+begin_src haskell class Monoid m where mempty :: m mappend :: m -> m -> m mconcat :: [m] -> m mconcat = foldr mappend mempty #+end_src First of all, we see that only concrete types can be made instances of =Monoid=, because the =m= in the type class definition doesn't take any type parameters. This is different from =Functor= and =Applicative=, which rewuire their instances to be type constructors which take one parameter. The first function is =mempty=, it's not really a function, since it doesn't take parameters, so it's a polymorphic contant, kind of like =minBound= from =Bounded=. =mempty= represents the identity value for a particular monoid. Next up, we have =mappend=, which is the binary function. It takes two values of the same type and returns a values of that type as well. It's worth noting that the decision to name =mappend= as it's named was kind of unfortunate, because it implies that we're appenting two things in some way. While =+= does take two lists and appent one to the other, =*= doesn't reappy do any appending, it just multiplies two numbers together. The last function is this type class definition is =mconcat=. It takes a list of monoid values and reduces them to a single values by doing =mappend= between the list's elements. It has a default implementation, which just takes =mempty= as a stating values and folds the list from the rigth with =mappend=. Because the default implementation is fine for most instances, we won't concern ourselves with =mconcat= too much from now on. When making a type an instance of =Monoid=, it suffices to just implement =mempty= and =mappend=. The reason =mconcat= is there at all is because for some instances, there might be a more efficient way to implement =mconcat=, but for most instances the default implementation is just fine. We mentioned that there has to be a value that acts as the identity with respect to the binary function that the binasy function has to be associative. It's possible to make instances off =Monoid= that don't follow there rules, but such instances are of no use to anyone because when using the =Monoid= type class, we rely on its instances acting like monoids. Otherwise, what's the point? That's why when making instances, we havi to make sure they follow these laws: - ~mempty `mappend` x = x~ - ~x `mappend` mempty = x~ - ~(x `mappend` y) `mappend` z = x `mappend` (y `mappend` z)~ Haskell doesn't enforce there laws, so we as the programmer have to be careful that our instances do indeed obey them.