Posts Tagged haiku

I subjected Haiku Detector to some serious stress-testing with a 29MB text file (that’s 671481 sentences, containing 16810 haiku, of which some are intentional) a few days ago, and kept finding more things that needed fixing or could do with improvement. A few days in a nerdsniped daze later, I have a new version, and some interesting tidbits about the way Mac speech synthesis pronounces things. Here’s some of what I did:

Tweaked the user interface a bit, partly to improve responsiveness after 10000 or so haiku have been found.

Made the list of haiku stay scrolled to the bottom so you can see the new ones as they’re found.

Added a progress bar instead of the spinner that was there before.

Fixed a memory issue.

Changed a setting so that it should work in Mac OS X 10.6, as I said here it would. I didn’t have a 10.6 system to test it on, though, and it turns out it does not run on one. I think 10.7 (Lion) is the lowest version it will run on.

Added some example text on startup so that it’s easier to know what to do.

Made it a Developer ID signed application, because now that I have a bit more time to do Mac development (since I don’t have a day job; would you like to hire me?), it was worth signing up to the paid Mac Developer Program again. Once I get an icon for Haiku Detector, I’ll put it on the app store.

Fixed a few bugs and made a few other changes relating to how syllables are counted, which lines certain punctuation goes on, and which things are counted as haiku.

That last item is more difficult than you’d think, because the Mac speech synthesis engine (which I use to count syllables for Haiku Detector) is very clever, and pronounces words differently depending on context and punctuation. Going through words until the right number of syllables for a given line of the haiku are reached can produce different results depending on which punctuation you keep, and a sentence or group of sentences which is pronounced with 17 syllables as a whole might not have words in it which add up to 17 syllables, or it might, but only if you keep a given punctuation mark at the start of one line or the end of the previous. There are therefore many cases where the speech synthesis says the syllable count of each line is wrong but the sum of the words is correct, or vice versa, and I had to make some decisions on which of those to keep. I’ve made better decisions in this version than the last one, but I may well change things in the next version if it gives better results.
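To make the line-filling idea concrete, here is a minimal sketch in Python. It is not Haiku Detector's actual code: the `rough_syllables` counter is a crude vowel-group heuristic I'm assuming as a stand-in for the speech synthesiser, and `split_into_haiku` is a hypothetical name. The point it illustrates is that each line is recounted as a whole every time a word is added, rather than summing per-word counts, because a context-sensitive counter need not be additive.

```python
import re

def rough_syllables(text):
    """Crude stand-in for a real syllable counter: one syllable per vowel group."""
    words = re.findall(r"[A-Za-z']+", text)
    return sum(max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)

def split_into_haiku(text, count=rough_syllables, pattern=(5, 7, 5)):
    """Greedily fill each line with words; return the lines, or None if they don't fit."""
    words = text.split()
    lines, i = [], 0
    for target in pattern:
        line = []
        # Recount the whole line after each word, since pronunciation can
        # depend on the neighbouring words and punctuation.
        while i < len(words) and count(" ".join(line)) < target:
            line.append(words[i])
            i += 1
        if count(" ".join(line)) != target:
            return None  # overshot the target or ran out of words
        lines.append(" ".join(line))
    # Any leftover words mean the text is longer than one haiku.
    return lines if i == len(words) else None

print(split_into_haiku("At a minimum, a menu displays a list of menu items."))
```

Swapping a smarter function in for `count` is where the decisions described above come in: whether a punctuation mark stays at the end of one line or moves to the start of the next can change what that function returns.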

Here are some interesting examples of words which are pronounced differently depending on punctuation or context:

ooohh

Pronounced with one syllable, as you would expect

ooohh.

Pronounced with one syllable, as you would expect

ooohh..

Spelled out (Oh oh oh aitch aitch)

ooohh…

Pronounced with one syllable, as you would expect

H H

Pronounced aitch aitch

H H H

Pronounced aitch aitch aitch

H H H H H H H H

Pronounced aitch aitch aitch

Da-da-de-de-da

Pronounced with five syllables, roughly as you would expect

Da-da-de-de-da-

Pronounced dee-ay-dash-di-dash-di-dash-di-dash-di-dash. The dashes are pronounced for anything with hyphens in it that also ends in a hyphen, despite the fact that when splitting Da-da-de-de-da-de-da-de-da-de-da-de-da-da-de-da-da into a haiku, it’s correct punctuation to leave the hyphen at the end of the line:

Da-da-de-de-da-
de-da-de-da-de-da-de-
da-da-de-da-da

Though in a different context, where – is a minus sign, and meant to be pronounced, it might need to go at the start of the next line. Greater-than and less-than signs have the same ambiguity, as they are not pronounced when they surround a single word as in an html tag, but are if they are unmatched or surround multiple words separated by spaces. Incidentally, surrounding da-da in angle brackets causes the dash to be pronounced where it otherwise wouldn’t be.

US

Pronounced ‘you es’, unless in a capitalised sentence such as ‘TAKE US AWAY’, where it’s pronounced ‘us’

I also discovered what I’m pretty sure is a bug, and I’ve reported it to Apple. If two carriage returns (not newlines) are followed by any integer, then a dot, then a space, the number is pronounced ‘zero’ no matter what it is. You can try it with this file; download the file, open it in TextEdit, select the entire text of the file, then go to the Edit menu, Speech submenu, and choose ‘Start Speaking’. Quite a few haiku were missed or spuriously found due to that bug, but I happened to find it when trimming out harmless whitespace.
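If you want to check whether a text of your own contains the pattern that triggers this, something like the following Python sketch should do it. The regular expression is just my reading of the description above (two carriage returns, an integer, a dot, a space), and `find_suspect_numbers` is a made-up helper name.

```python
import re

# Two carriage returns (not newlines), then an integer, a dot and a space.
SUSPECT = re.compile(r"\r\r(\d+)\. ")

def find_suspect_numbers(text):
    """Return (offset, number) for every number that follows the pattern."""
    return [(m.start(1), m.group(1)) for m in SUSPECT.finditer(text)]

sample = "Intro\r\r12. First item\n\n3. Second item\r\r7. Third"
print(find_suspect_numbers(sample))
```

When reading a real file, open it with `open(path, newline="")` so that Python does not translate the carriage returns into newlines before the pattern gets a chance to match.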

Apart from that bug, it’s all very clever. Note how even without the correct punctuation, it pronounces the ‘dr’s and ‘st’s in this sentence correctly:

the dr who lives on rodeo dr who is better than the dr I met on the st john’s st turnpike

However, it pronounces the second ‘st’ as ‘saint’ in the following:

the dr who lives on rodeo dr who is better than the dr I met in the st john’s st john

This is not just because it knows there is a saint called John; strangely enough, it also gets this one wrong:

the dr who lives on rodeo dr who is better than the dr I met in the st john’s st park

I could play with this all day, or all night, and indeed I have for the last couple of days, but now it’s your turn. Download the new Haiku Detector and paste your favourite novels, theses, holy texts or discussion threads into it.

If you don’t have a Mac, you’ll have to make do with a few more haiku from the New Scientist special issue on the brain which I mentioned in the last post:

Being a baby
is like paying attention
with most of our brain.

But that doesn’t mean
there isn’t a sex difference
in the brain,” he says.

They may even be
a different kind of cell that
just looks similar.

It is easy to
see how the mind and the brain
became equated.

We like to think of
ourselves as rational and
logical creatures.

It didn’t seem to
matter that the content of
these dreams was obtuse.

I’d like to thank the people of the xkcd Time discussion thread for writing so much in so many strange ways, and especially Sciscitor for exporting the entire thread as text. It was the test data set that kept on giving.

I’ve been sitting on some improvements to Haiku Detector for a while, and it’s about time I released the new version. I had been planning to put this version on the app store, but I’m waiting to hear back from somebody about an icon for it. So for now, you can download it without going through the store. It should work on Mac OS X 10.6 or later.

This version finds haiku made up of multiple sentences rather than only those made of 17-syllable sentences. I also fixed the bug which caused it to crash occasionally when dealing with very long texts. To celebrate, I’ll go through some of the same texts I did when I first released Haiku Detector, and see what new haiku are discovered. To start with, John Scalzi’s Old Man’s War. This version of Haiku Detector finds 304 haiku in it. Sometimes, sentences can be included in more than one haiku:

“I’m sorry. My sense
of humor was surgically
removed as a child.”

“My sense of humor
was surgically removed as
a child.” “Oh,” I said.

“Oh,” I said. “That was
a joke,” she said, and stood up,
extending her hand.
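The overlap in those three comes from looking at every run of consecutive sentences, not just at sentences one at a time. Here is a rough Python sketch of that idea, under my own assumptions rather than as the app's real code: the sentence splitter is a naive regular expression, `rough_syllables` is a vowel-group heuristic standing in for speech synthesis, and `haiku_candidates` is a hypothetical name. It only finds runs with 17 syllables in total; each run would still need to be checked for a 5-7-5 split.

```python
import re

def rough_syllables(text):
    """Crude stand-in for a real syllable counter: one syllable per vowel group."""
    words = re.findall(r"[A-Za-z']+", text)
    return sum(max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def haiku_candidates(text, count=rough_syllables, total=17):
    """Return every run of consecutive sentences whose syllables sum to `total`."""
    sentences = split_sentences(text)
    found = []
    for start in range(len(sentences)):
        for end in range(start + 1, len(sentences) + 1):
            run = " ".join(sentences[start:end])
            n = count(run)
            if n == total:
                found.append(run)
            if n >= total:
                break  # adding more sentences can only overshoot
    return found
```

Because every starting sentence gets its own scan, a sentence in the middle of a passage can end one candidate and begin the next, which is exactly what happens with “Oh,” I said.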

Here are some of my favourites of the multi-sentence haiku:

She asked, still without
actually looking up
at me. “Pardon me?”

“Okay,” I said. “Mind
if I ask you a question?”
“I’m married,” she said.

“Well, she doesn’t have
to live with you, now does she.”
“How was the cookie?”

“Our friend Thomas would
make it to mile six before
his heart imploded.”

This one sounds like it could be a metaphysical statement about what consciousness is in general:

Your consciousness is
perceiving the small time lag
between there and here.

“I would not presume
to assume, Master Sergeant!”
‘Presume to assume’?

My wife’s out here, sure.
But she’s happy to live her
new life without me.

I found a lot of new haiku in the CMS paper announcing the discovery of the Higgs boson, but they were all combinations of names from the stupendous author list. Since I included some from New Scientist last time, here are some from the issue of New Scientist that I am currently reading, a special issue on the human brain:

Imaging techniques
are allowing us to see
the brain in action.

The sound waves broke up
the synchronous firing,
ending the seizure.

Thought experiments
Sometimes an experiment
is impossible.

The ancient Greeks knew
about thought experiments
in mathematics.

These two go together:

Does that mean we should
revise our definition
of intelligence?

Until recently,
the same one had been used since
the 1950s.

I have many ideas for improving Haiku Detector, and I’d still like to see if I can detect the best-sounding haiku using linguistic tagging, but before that I’m thinking of rewriting the whole thing in Swift as a learning exercise. Since I don’t have a day job at the moment, I have a bit of free time if I strategically ignore sections of my to-do list. Actually, on that note, here are some particularly obvious haiku from the Mac OS X and iOS Human Interface guidelines:

At a minimum,
a menu displays a list
of menu items.

A picker displays
a set of values from which
a user picks one.

That will do for now. I hope you enjoy playing with the new version of Haiku Detector.

I originally wrote Haiku Detector because my friend Gry saw Times Haiku and wondered whether there were any haiku in her Ph. D. thesis. The other day I heard back about the haiku she found. It turns out that even the title of the thesis is a haiku:

Developments for
studies of the extremes of
nuclear matter

Here’s another one, which could be about anything. The last line is a bit of an anticlimax.

As of today, the
origin of this strength is
not well understood.

When I read this one, I wondered if miniball was a mini-golf style version of another ball game:

At HIE-ISOLDE
the MINIBALL would be used
for the same purpose.

The impurities
of 48,50Ti
are easily seen.

After seeing these, I sent her the as-yet-unreleased new version of Haiku Detector, which can detect haiku made up of several sentences. Having mostly had my name on papers authored by the entire CMS collaboration, I expected her to find a lot of haiku in the author list. But ISOLDE is much smaller, and also this is her thesis that she wrote, not some paper whose author list she got tacked onto. So she got some from references:

Kitatani, S.
Goko, H. Toyokawa,
K. Yamada, T.

C 47,
537
(1993).

and some things with section numbers tacked on:

2.1.1
Open shell nuclei and
collective models

This matrix is the
starting point for the Oslo
method. 45

That last one has so many possibilities. I like to think of it as being about an electronic band called The Oslo Method which released a 45rpm record about The Matrix. Unfortunately, nobody can be told what the haiku is. You have to see it for yourself. And indeed, you can see the other haiku she found on the #MyHaikuThesis tag on Twitter.

I noticed something interesting while writing this post — some of the ‘haiku’ Gry found include gamma (γ) symbols:

Haiku Detector on her Mac has treated them as having zero syllables, as if they are not pronounced, and I think I recall characters like that not being pronounced in the Princeton Companion to Mathematics. But I just checked on my Mac running Mac OS X Yosemite, and the speech synthesis (which Haiku Detector relies on for syllable counting) pronounces γ as ‘Greek small letter gamma’, so Haiku Detector does not find those erroneous haiku. I think that this might be a new feature in Yosemite.

But here’s where it gets weird: you’d think that it’s just reading ‘Greek small letter gamma’ because that’s the Unicode name of the character. I tried with a few emoji and other special characters, and that hypothesis is upheld. But the Unicode character named ‘chicken’ (🐔) is pronounced ‘chicken head’. Spooky. Another strange thing is that there is no Unicode ‘duck’ character.
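If you'd like to compare what your Mac says against the official character names, Python's standard `unicodedata` module will tell you the names. This is only a small sketch for checking names; it says nothing about how the speech synthesiser chooses its pronunciations, and `spoken_guess` and `has_character_named` are helper names I made up.

```python
import unicodedata

def spoken_guess(ch):
    """Lower-cased Unicode name of a character, or None if it has no name."""
    name = unicodedata.name(ch, None)
    return name.lower() if name else None

def has_character_named(name):
    """True if the Unicode data bundled with this Python has a character called `name`."""
    try:
        unicodedata.lookup(name)
        return True
    except KeyError:
        return False

print(spoken_guess("γ"))           # greek small letter gamma
print(spoken_guess("\U0001F414"))  # chicken
```

The answer from `has_character_named` depends on which version of the Unicode database your Python ships with, so a name that is missing in one version may turn up in a later one.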

If you’ve been paying attention, you probably know why I happened to come across those oddities. I’ll have to investigate them later, though; right now I’m in Edinburgh for NSScotland, and it’s about time I looked at some tourism information.

So, Haiku Detector; what now? Maybe look for supersymmetric haiku?

Update: It seems that in Mac OS X 10.8, γ is not pronounced, and 🐔 is pronounced ‘chicken emoji’. Other emoji also have ‘emoji’ in their pronunciations, while still others are not pronounced. I wonder if pronunciations were added (and later edited to remove the ‘emoji’) for certain emoji, and now the default pronunciation has changed from nothing to the Unicode name. So ‘🐔’ ended up with the explicit pronunciation ‘chicken head’ while others which were not previously pronounced use their Unicode names. So this should be a haiku in Yosemite, though for some reason Haiku Detector does not detect it:

When I discovered that the court proceedings of the Old Bailey were available online, naturally I had to see whether they contained any haiku. The archive is too huge to put into Haiku Detector all at once, so I just checked the ‘on this day in…’ link whenever I had time. The most haiku-rich I’ve seen so far was from a wounding case on 8 September 1773, which, now that I think about it, should not have appeared as an ‘on this day…’ link yet. I had to clean up the text a little first, to remove all the Q.s and speakers’ names. Here are some of the 55 haiku that were left.

These ones sound like some kind of metaphor for the fiddly final steps towards achieving goals, and the monsters that might demotivate us from climbing toward those goals, but which are secretly part of ourselves:

How far is it from
the upper step of the stairs
to the door itself?

Upon the landing.
Was the door within view of
you at that time? Yes.

The General must
have seen you coming up two
or three steps at least?

How far had you got
up stairs before you saw Hyde?
Did you hear Hyde’s voice?

Who else was with you
there? I cannot remember
any one but me.

Where did you wait while
Hyde went into the house? At
the top of the street.

The world’s simplest riddle:

Yes. Where did you go
when you came into the house?
Into the entry.

And some more intriguing questions:

After Lee struck me:
the knife dropped upon the ground.
Was it by a blow?

Had he no blow with
the butt end of a pistol?
Not that I know of.

You say you knew the
General very well; do
you think he knew you?

When you came back what
part of the family did
you find below stairs?

In what condition
was the door when he fired
the second pistol?

What did he tell him?
That a parcel of fellows
were below with sticks.

Did you observe the
hole in the door case that was
made by the pistol?

Did you look through the
door to see the direction
the ball had taken?

Was the General
upon his legs or not? He
was upon his legs.

Some which sound like bloody massacres until you get to the last line:

I believe this is
the knife you was cutting the
bread and butter with.

Was James in the room
with you while you was cutting
the bread and butter?

Finally, a few which sound a bit dirty (or so I am told) if you have that kind of mind:

I added some features to Haiku Detector so that it will find haiku made of more than one sentence, though I haven’t released the new version yet, since I’d like to release it on the Mac App store (even though it will probably still be free, at least at first) to see how that works, and to do that I’ll need an icon first. If you know anyone who can make Mac icons at a reasonable price, let me know. Meanwhile, New Scientist has released a new ‘collection’ called The Unknown Universe, so why not mine it for haiku? The topics are ‘The early universe’, ‘The nature of reality’ (again), ‘The fabric of the cosmos’, ‘Dark materials’, ‘Black holes’, ‘Time’ (again) and ‘New directions’.

Let’s start at the very beginning, the early universe:

Can we really be
sure now that the universe
had a beginning?

At first, that seems like a terrible place to break the sentence to start a new line. But what if we pretend, until we get to the next line, that ‘Can we really be?’ is the whole question? Because that’s the real reason people wonder about the universe.

Now, here’s a multi-sentence one, which conveniently has a full sentence as the first line:

I’m still behind on New Scientist, so I’m now reading the issue which has a special feature on Shakespeare. It seemed like a good issue to look for poetry in. Here are the haiku that Haiku Detector detected in the articles about Shakespeare. The first is a strategically-syllabicised book promo:

His book The Science
of Shakespeare is published this
month (St Martin’s Press)

The next has a supporting quote from the Bard himself:

Supporting quote: “If
sack and sugar be a fault,
God help the wicked.”

but this one is my favourite:

Most of all he swings
between moods superbly high
and desperately low.

That doesn’t seem like enough stuff for a blog post. Luckily, the issue just after the special issue that I already found haiku in has a feature on ‘stuff’, so here’s the only haiku from that:

I suppose it could make sense if somebody named Darren hopes that the self is an image:

The self may be a
necessary illusion
(image, Darren hopes)

The others are from the main text:

But we surely still
have the same self today that
we had yesterday.

For most people, most
of the time, the sense of self
is seamless and whole.

These ones are about sleep, perchance about dreaming:

Our emotional
undercurrents seem to be
the guiding force here.

This one requires ‘2008’ to be pronounced ‘two thousand eight’, not ‘two thousand and eight’:

In 2008,
hints emerged that these might be
the deeper stages.

The fountain of youth
may have been as close as our
bedrooms all along.

So it’s puzzling that
we still don’t really know why
it is that we sleep.

And finally, one on the final sleep, death:

When the risk is slight,
mild concern may be all that
is appropriate.

That’s all from that special issue of New Scientist, though the latest issue is dedicated to Shakespeare, so I hope to find some poetry in it. If there’s anything else you’d like me to mine for haiku, let me know!

While I was writing a poem a day, there would be times when I’d just feel like writing prose, for a break. I was hoping that this prose pressure would build up and I’d write something amazing when NaPoWriMo ended. Now that I’m trying to prioritise writing a short story for a competition, poems are trying to force their way out. So I still could manage 30 poems in 30 days, but I’m not going to pressure myself to post them by each midnight, and I won’t feel bad about posting found haiku when I don’t have a poem ready.

There’s only one unintentional haiku on the subject of consciousness, but it’s a good one:

You may think you know
the reasons, but they could be
a work of fiction.

Two about life:

These discoveries are
bringing an old paradox
back into focus.

There is a simple
way to get huge amounts of
energy this way.

One of these days I’ll add in some linguistics-based heuristics or a learning algorithm to rank the haiku; haiku lines ending in prepositions are often not as good, for example, and splitting the adjective from the following noun is a little weird too.
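As a taste of what the simplest version of such a heuristic might look like, here is a Python sketch that penalises line breaks falling right after a preposition. It is only an illustration under my own assumptions: the word list is a small hand-picked sample, the articles in it are my addition, and only the first two lines are checked, since the last line ends the sentence anyway. Spotting an adjective split from its noun would need real part-of-speech tagging, which this doesn't attempt.

```python
import re

# Hand-picked sample of weak words to end a line on (assumption, not exhaustive).
# The articles are an extra guess on top of the prepositions mentioned above.
WEAK_ENDINGS = {
    "of", "to", "in", "on", "at", "by", "for", "with", "from", "into",
    "the", "a", "an",
}

def weak_line_endings(haiku_lines):
    """Count the line breaks (all lines but the last) that follow a weak word."""
    penalty = 0
    for line in haiku_lines[:-1]:
        words = re.findall(r"[A-Za-z']+", line.lower())
        if words and words[-1] in WEAK_ENDINGS:
            penalty += 1
    return penalty

def rank(haiku_list):
    """Sort haiku so that those with fewer weak line breaks come first."""
    return sorted(haiku_list, key=weak_line_endings)

thesis_title = ["Developments for", "studies of the extremes of", "nuclear matter"]
imaging = ["Imaging techniques", "are allowing us to see", "the brain in action."]
print(rank([thesis_title, imaging])[0])
```

With these two examples, the thesis title collects a penalty of two (for ‘for’ and ‘of’) and the imaging haiku collects none, so the imaging one is ranked first.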

The section on time has the most and best haiku. This pleases me, because the largest text I tested Haiku Detector on when I first wrote it was the forum thread about the xkcd Time comic. There were a lot of haiku in there, and pointing them out encouraged people to write more.

So clocks tell us that
time is inextricably
linked somehow to change.

Now, more than ever,
we have to face up to our
ignorance of time.

If time’s arrow is
not in the laws of physics,
where does it come from?

Why do human brains
only remember the past
and not the future?

WE ALL, regardless
of our cultural background,
experience time.

Traditionally they
have lived by small-scale farming,
hunting and fishing.

Nonetheless, we could
do some interesting things with
our own time machine.

On the subject of time, I’d better hurry up and go out. Tune in next week for New Scientist’s unintentional haiku on the self, sleep, and death.

I’m not only behind on poems, I’m also behind on reading New Scientist magazine, so I’m just starting on a special issue with the ‘big questions’ with articles about reality, existence, God, consciousness, life, time, self, sleep, and death. This seemed like a good place to find interesting unintentional haiku, so I ran Haiku Detector over the first three sections. Perhaps I’ll do the rest on later Saturdays, to give myself a weekly break during poetry writing month.

There’s only one unintentional haiku on the subject of reality:

Afterwards, we map
the locations of all the
thousands of flashes.

These three are about existence:

“Small simulations
should be far more numerous
than large ones,” he says.

Sadly that means you
will never be able to
meet your other you.

A few researchers
even think it could happen
in the next decade.

That last one works for many great scientific quests, at any time. Here are some about God… or… Santa?

More interesting still
was a second version of
the experiment.

Santa knows if you’ve
been bad or good but does he
know all that you do?

Kurukkan suggested using Haiku Detector to find the unintentional haiku in Edgar Rice Burroughs’ ‘A Princess of Mars’. This seemed like a fine idea to me. I haven’t read it yet, but I’ve heard of it, and there was even another movie based on it (‘John Carter’) released recently. There are quite a few haiku which have a nice twist in the last line; one even has a rhyme. I’ve trimmed out some that really don’t work, but since they’re not much effort to read anyway, I’ve left in some that still sound picturesque even if they don’t break nicely into the lines. If you’re not into Mars fiction, there are some haiku about a real Mars mission, and an opportunity for you to send your own haiku to Mars, at the end.

On regaining the
plaza I had my third glimpse
of the captive girl.

“Some day you shall know,
John Carter, if we live; but
I may not tell you.

And now the signal
has been given to resume
the march, you must go.”

“I am glad you came,”
she said; “Dejah Thoris sleeps
and I am lonely.

I have twice wronged you
in my thoughts and again I
ask your forgiveness.