two things to know

(2) randomIO :: Random a => IO a

The one function you really need to know about is randomIO. (The type of this function is Random a => IO a. Don't worry if you do not understand the type; it suffices to know that it involves IO.) In this example, we use it to generate a random Int:

import System.Random

main = do
  r <- randomIO
  print (r + 1 :: Int)
  -- Note re the ':: Int' above: Haskell can't figure out from
  -- the context exactly what type of number you want, so we
  -- constrain it to Int

One neat feature is that you can randomly generate anything that implements the Random typeclass. In the example below, we generate a random Bool. Notice how we do not do anything differently, except to treat the result as a Bool (i.e. by applying not to it).

import System.Random

main = do
  r <- randomIO
  print (not r)

A useful exercise, if you know about typeclasses, is to implement Random for one of your own types. The toEnum function may be useful.
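One possible answer to the exercise, as a minimal sketch rather than the only way to do it. The Colour type is a made-up example, and the instance leans on its derived Enum and Bounded instances, so it only suits simple enumeration types:

```haskell
import System.Random

-- Hypothetical example type (not from the post); the derived
-- Enum and Bounded instances do most of the work.
data Colour = Red | Green | Blue
  deriving (Show, Eq, Enum, Bounded)

instance Random Colour where
  -- Pick an Int between the positions of the two constructors,
  -- then convert it back with toEnum.
  randomR (lo, hi) g =
    let (i, g') = randomR (fromEnum lo, fromEnum hi) g
    in (toEnum i, g')
  -- With no range given, use the whole type.
  random = randomR (minBound, maxBound)

main :: IO ()
main = do
  c <- randomIO
  print (c :: Colour)
```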

more advanced stuff

You can use randomRIO :: Random a => (a,a) -> IO a to generate random numbers constrained within a range.
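For instance, a die roll might look like the sketch below. The name rollDie is just an illustration; for integer types both ends of the range are included:

```haskell
import System.Random

-- Roll a six-sided die; the result can be 1, 6, or anything in between.
rollDie :: IO Int
rollDie = randomRIO (1, 6)

main :: IO ()
main = do
  r <- rollDie
  putStrLn ("You rolled a " ++ show r)
```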

Instead of using the functions randomIO and randomRIO, you can separate obtaining a random number generator from using it. Doing so allows you to minimise your reliance on the IO monad. It also makes your code easier to debug, because you can opt to always pass the same generator to your code and make life much more predictable. See the functions random and randomR for details.
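A small sketch of that separation, assuming the standard mkStdGen and newStdGen from System.Random. The function twoRolls is a hypothetical name; the point is that it is pure and simply threads the generator from one call to the next:

```haskell
import System.Random

-- Two die rolls from one generator. Each call to randomR returns a
-- value together with an updated generator, which we pass along.
twoRolls :: StdGen -> (Int, Int)
twoRolls g0 =
  let (a, g1) = randomR (1, 6) g0
      (b, _)  = randomR (1, 6) g1
  in (a, b)

main :: IO ()
main = do
  -- A fixed seed gives the same rolls on every run, handy for debugging.
  print (twoRolls (mkStdGen 42))
  -- For real use, fetch a fresh generator; this is the only IO step.
  g <- newStdGen
  print (twoRolls g)
```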

A potentially handy trick is to generate an infinite list of random numbers, which you can then pass to a function. See the randoms function for details.
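As a rough illustration of the infinite-list trick, here with coin flips rather than numbers. The helper countHeads is a made-up name; it is an ordinary pure function that neither knows nor cares that its input list is random:

```haskell
import System.Random

-- Count the True values among the first n elements of a list.
countHeads :: Int -> [Bool] -> Int
countHeads n = length . filter id . take n

main :: IO ()
main = do
  g <- newStdGen
  -- randoms yields an infinite list; laziness means only the
  -- elements we actually look at are ever generated.
  let flips = randoms g :: [Bool]
  print (take 10 flips)
  print (countHeads 10 flips)
```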

Edit: fixed s/randomR/randomIO/

Random numbers are the kind of thing I use rarely enough that by the time I want to use them, I have forgotten the relevant details, but frequently enough that I get annoyed whenever it happens.

Hopefully these notes will be useful to somebody in a similar situation.

2008-07-27

Pandoc is a universal document converter. You feed it documents in one format (say, HTML) and it spits them out in another one (say, ODF). Assuming it works correctly, Pandoc has the potential to replace all those little one-to-one converters (e.g. latex2html) in my toolbox. Just the one simple Pandoc.

And now, thanks to fiddlosopher (John MacFarlane?), it knows how to write Mediawiki files! Mediawiki? That's the syntax/software that powers Wikipedia, Wikibooks, and a whole slew of organisational or community wikis (like HaskellWiki).

Hey, Haskellers probably have a lot of LaTeX documents lying around. Maybe this is their chance to get them on Haskell wiki?

We're halfway to being able to do a roundtrip between LaTeX and Mediawiki! All we need is for somebody [maybe John :-D] to implement a Mediawiki reader for Pandoc and things could get mighty interesting... Oh and yes, and if anybody is working on a wiki with direct LaTeX support, hats off to you! Sometimes Mediawiki is a fact of life, though.

2008-07-23

I guess this isn't big enough to go on the haskell@ mailing list: I have uploaded Krasimir Angelov and Iavor S. Diatchki's Data.Tree implementation of zippers onto hackage. The package is called rosezipper and it is available under the BSD3 license.

For the interested, "The Zipper is an idiom that uses the idea of “context” to the means of manipulating locations in a data structure." (Haskell wiki).

For me, zippers are just a very nice way to navigate and edit trees. By "nice", I mean elegant, efficient and purely functional. Before learning about zippers, I only knew how to navigate trees from top to bottom, but if I wanted to go back up a node, or visit a sibling node, I basically had to start over from the root. Zippers allow me to walk the tree in any direction, visiting a node's parent, children and siblings without starting over from the top. This kind of thing is especially handy for Natural Language Processing people, basically, anybody who eats trees for a living.
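To make the idea concrete, here is a hand-rolled sketch of a zipper over Data.Tree. It is not the rosezipper API: the names Loc, Crumb, firstChild, right, up, modifyLabel and toTree are all chosen for this illustration. The trick is to keep the focused subtree together with a trail of "crumbs" recording what was left behind at each level:

```haskell
import Data.Tree (Tree(..), flatten)

-- What we left behind when stepping down one level: the parent's label,
-- the siblings to our left (nearest first) and the siblings to our right.
data Crumb a = Crumb a [Tree a] [Tree a]

-- A location: the subtree in focus plus the trail back to the root.
data Loc a = Loc (Tree a) [Crumb a]

fromTree :: Tree a -> Loc a
fromTree t = Loc t []

firstChild :: Loc a -> Maybe (Loc a)
firstChild (Loc (Node x (c:cs)) bs) = Just (Loc c (Crumb x [] cs : bs))
firstChild _ = Nothing

-- Move to the next sibling without going back to the root.
right :: Loc a -> Maybe (Loc a)
right (Loc t (Crumb x ls (r:rs) : bs)) = Just (Loc r (Crumb x (t:ls) rs : bs))
right _ = Nothing

-- Move to the parent, rebuilding it from the crumb.
up :: Loc a -> Maybe (Loc a)
up (Loc t (Crumb x ls rs : bs)) = Just (Loc (Node x (reverse ls ++ t : rs)) bs)
up _ = Nothing

modifyLabel :: (a -> a) -> Loc a -> Loc a
modifyLabel f (Loc (Node x cs) bs) = Loc (Node (f x) cs) bs

-- Walk back up to the root and return the whole (edited) tree.
toTree :: Loc a -> Tree a
toTree loc@(Loc t _) = maybe t toTree (up loc)

example :: Tree String
example = Node "S" [Node "NP" [], Node "VP" []]

-- Go down to NP, across to VP, edit its label, and zip back up.
edited :: Maybe (Tree String)
edited = do
  np <- firstChild (fromTree example)
  vp <- right np
  return (toTree (modifyLabel (++ "'") vp))

main :: IO ()
main = print (fmap flatten edited)
```

Every move is a constant amount of work on the crumb at the head of the trail, which is what makes sibling and parent visits cheap.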

If you would like to learn more, I would recommend Apfelmus's very friendly tutorial (part of the Haskell wikibook).

Thanks to Krasimir and Iavor for implementing this and for allowing me to package it up.

2008-07-22

Here's another coding-project idea: I would like to see a hex editor that knows how to display characters in encodings other than ASCII (specifically: I want to debug messed-up UTF-8 text files).

Google and apt-cache search reveal no such editor, at least not in the free/open-source worlds, nowhere in Linux or MacOS X freeware land. On Debian-based systems, there are a couple that handle some Japanese encodings, but nothing that deals with UTF-8.

Likely features:

toggle between an ASCII-only mode and a show-as-UTF-8 mode

good UI for the fact that UTF-8 characters have a variable length in bytes

graceful handling of encoding errors

Haskellers could possibly do this as a part (plugin?) of Yi, or maybe just a completely standalone product.

And if you want a slightly simpler project, a UTF-8 hex dumper would be good. Hmmph... come to think of it, maybe it would have been more productive to just go write that instead of this blog post.

Edit: Well, I went ahead and made a stupid little dumper for my needs. Here is the output on some sample corrupted UTF-8

Highlighting by hand. I should probably go figure out how to colourise the corrupted characters. Or maybe I should just go ahead and package this, put it up on hackage? Make it available via darcs? I would need a decent name. So far, I have hexy-xxy and hexdump-utf8, neither of which is that great :-/
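For anybody tempted by the simpler project, the core of such a dumper could be sketched roughly as follows. This is not the dumper mentioned above; all the names (Chunk, seqLen, chunks, dump) are invented for the sketch. It only checks the byte-level shape of each sequence (a lead byte followed by the right number of continuation bytes), and makes no attempt to catch overlong encodings or surrogates:

```haskell
import Data.Bits ((.&.))
import Data.List (intercalate)
import Data.Word (Word8)
import Numeric (showHex)

-- Either a structurally plausible UTF-8 sequence, or a stray byte.
data Chunk = Good [Word8] | Bad Word8
  deriving (Show, Eq)

-- Expected sequence length, judged from the lead byte alone.
seqLen :: Word8 -> Maybe Int
seqLen b
  | b < 0x80           = Just 1
  | b .&. 0xE0 == 0xC0 = Just 2
  | b .&. 0xF0 == 0xE0 = Just 3
  | b .&. 0xF8 == 0xF0 = Just 4
  | otherwise          = Nothing   -- continuation or invalid byte

isCont :: Word8 -> Bool
isCont b = b .&. 0xC0 == 0x80

-- Group bytes by character; a byte that does not fit is reported on
-- its own and scanning resumes right after it.
chunks :: [Word8] -> [Chunk]
chunks [] = []
chunks (b:bs) =
  case seqLen b of
    Just n
      | (conts, rest) <- splitAt (n - 1) bs
      , length conts == n - 1
      , all isCont conts -> Good (b : conts) : chunks rest
    _ -> Bad b : chunks bs

hex :: Word8 -> String
hex b = (if b < 0x10 then "0" else "") ++ showHex b ""

render :: Chunk -> String
render (Good bs) = unwords (map hex bs)
render (Bad b)   = "[" ++ hex b ++ "?]"

dump :: [Word8] -> String
dump = intercalate " | " . map render . chunks

main :: IO ()
main = putStrLn (dump [0x63, 0xC3, 0xA9, 0xA9, 0x61])
```

Grouping the bytes per character is one way to deal with the variable-length problem from the feature list, and marking stray bytes instead of aborting is a simple form of graceful error handling.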

2008-07-21

A year and a half ago, I posted what seemed to be the simplest recipe for reading and writing UTF-8 in Haskell. In this post, I will provide an even simpler recipe, made possible by Eric Mertens' utf8-string package.

For those who are not familiar with Haskell, its internal representation for characters is Unicode, but for IO it effectively assumes that it is reading and writing in the ISO-8859-1 format. This used to be annoying for those of us who wanted to work with the UTF-8 encoding, but now there is a very simple solution, perfect for those of us who don't want to think too much and just get the job done.

the example

The sample problem from my last post was to take a UTF-8 encoded file as input, reverse all its lines, writing the results in the same file, with a ".rev" extension appended to its name. The solution might be self-explanatory if you are used to Haskell, but I will make some minor comments below, just in case.

In the above code, we use some drop-in replacements for some System.IO functions. Some of these functions are also provided in the Prelude, so we must hide them so that they do not overlap with what we import. (Alternatively, we could import the UTF-8 ones qualified, which could be handy in contexts where we want the option of reading and writing in UTF-8 without committing to it). The rest is straightforward. Notice that we do not jump through any hoops whatsoever. In fact, you can pretty much take any pre-existing Haskell program that you have written and turn it into a UTF-8 version by changing the import statements.

The utf8-string package is available on HackageDB. Thanks to Eric M. for providing this little wrapper! It's a perfect example of the kind of thing which seems obvious... after somebody else has thought to do it.
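For comparison, here is a sketch of the same line-reversing task written against nothing but the base library. This is not the utf8-string listing from the post: it swaps in a different technique, and it assumes a GHC whose System.IO provides hSetEncoding and utf8, which newer compilers do. The names reverseLines and reverseUTF8File are made up for the sketch:

```haskell
import System.Environment (getArgs)
import System.IO

-- The pure part: reverse every line of the text.
reverseLines :: String -> String
reverseLines = unlines . map reverse . lines

-- Read a file as UTF-8 and write the reversed lines to <file>.rev,
-- also as UTF-8.
reverseUTF8File :: FilePath -> IO ()
reverseUTF8File f =
  withFile f ReadMode $ \hin -> do
    hSetEncoding hin utf8
    contents <- hGetContents hin
    -- The output is written while the input handle is still open,
    -- so the lazily read contents are fully consumed in time.
    withFile (f ++ ".rev") WriteMode $ \hout -> do
      hSetEncoding hout utf8
      hPutStr hout (reverseLines contents)

main :: IO ()
main = getArgs >>= mapM_ reverseUTF8File
```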
