Example Haiku Generator 4: Parts of Speech
A feature that RiTa offers, and that we have not yet taken advantage of, is that the words in its dictionary are tagged not only with pronunciation data (which is how we tell how many syllables are in a word) but also with the word's part of speech, or rather its parts of speech, since a word can be used in multiple ways. This allows us to make slightly more nuanced distinctions when we ask for random words.
We'll make a slight change to our code and allow our random word function to request words of a given part of speech. This is a very minor code change: it just means adding another parameter to the syllableCount function that specifies the part of speech (pos). Here is the revised function:
String syllableCount(String pos, int n) {
return lex.randomWord(pos, n);
}
You can replace the existing function in your code with this one, or you can add it alongside the original. Defining two functions with the same name but different parameters is called "overloading", and it is commonly done to make programs conceptually simpler to use. The compiler determines which version to call from the arguments you pass. The two versions COULD do completely different things, but it is good practice for them to behave as similarly as possible.
While the code may be only slightly changed, the grammar is not. To take advantage of this new capability, we need to add a great deal to the grammar definition. We can no longer use the simple rule <1> to get a random one-syllable word; we have to say what KIND of one-syllable word we want. This adds a lot of definitions to our grammar. The grammar below looks complicated, but if you examine it, you'll see that it follows the same basic structure as the previous examples, with many more of the definitions spelled out. In that way, it is a blend of the first haiku generator, where each word was literally spelled out, and the second, where we defined each word only by its syllable count.
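To make the idea of overloading concrete, here is a minimal, self-contained sketch in plain Java. It does not use RiTa: the tiny word list, the class name OverloadSketch, and the pick helper are all assumptions made up for illustration, standing in for the real lexicon. The point is only that both versions of syllableCount can live side by side and share the same behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the RiTa lexicon, used only to illustrate overloading.
class OverloadSketch {
    // Each entry is {word, part-of-speech tag, syllable count} (made-up sample data).
    static final String[][] WORDS = {
        {"frog", "nn", "1"}, {"pond", "nn", "1"}, {"water", "nn", "2"},
        {"old", "jj", "1"}, {"quiet", "jj", "2"},
        {"jumps", "vbz", "1"}, {"whispers", "vbz", "2"}
    };
    static final Random RNG = new Random();

    // Original form: any word with n syllables, regardless of part of speech.
    static String syllableCount(int n) {
        return pick(null, n);
    }

    // Overload: only words tagged with the given part of speech.
    static String syllableCount(String pos, int n) {
        return pick(pos, n);
    }

    // Shared helper: collect the matching words and choose one at random.
    // Returns an empty string when nothing matches.
    static String pick(String pos, int n) {
        List<String> matches = new ArrayList<>();
        for (String[] w : WORDS) {
            boolean posOk = (pos == null) || w[1].equals(pos);
            if (posOk && Integer.parseInt(w[2]) == n) {
                matches.add(w[0]);
            }
        }
        if (matches.isEmpty()) {
            return "";
        }
        return matches.get(RNG.nextInt(matches.size()));
    }

    public static void main(String[] args) {
        System.out.println(syllableCount(2));        // any two-syllable word
        System.out.println(syllableCount("nn", 2));  // a two-syllable noun
    }
}
```

Because both versions delegate to the same helper, they stay as similar as possible: the only difference is whether the part-of-speech filter is applied.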
{
"<start>": "<5-line> % <7-line> % <5-line>",
"<5-line>": "<one-noun> <four-verb> |<wh-pronoun> <three-verb> <one-adverb> |<one-pronoun> <three-verb> <one-adverb> |<one-pronoun> <three-verb2> <one-adverb> |<one-determine> <one-adject> <three-plu-noun> | <one-determine> <one-adject> <three-noun> |<one-noun> <two-verb> <two-noun> |<one-noun> <two-verb> <two-plu-noun> |<wh-pronoun> <two-verb> <one-noun>, <one-interject> |<one-determine> <one-list> <two-adject> <one-noun>",
"<7-line>": "<one-interject> <one-interject> <5-line> |<two-adject> <5-line> |<5-line> <one-noun> - <one-super> |<5-line> <two-super>",
"<one-super>": "`syllableCount(\"jjs\",1);`",
"<two-super>": "`syllableCount(\"jjs\",2);`",
"<one-list>": "`syllableCount(\"ls\",1);`",
"<one-interject>": "`syllableCount(\"uh\",1);`",
"<one-adject>": "`syllableCount(\"jj\",1);`",
"<two-adject>": "`syllableCount(\"jj\",2);`",
"<one-determine>": "`syllableCount(\"dt\",1);`",
"<one-prop-noun>": "`syllableCount(\"nnp\",1);`",
"<one-noun>": "`syllableCount(\"nn\",1);`",
"<two-noun>": "`syllableCount(\"nn\",2);`",
"<two-plu-noun>": "`syllableCount(\"nns\",2);`",
"<three-noun>": "`syllableCount(\"nn\",3);`",
"<three-plu-noun>": "`syllableCount(\"nns\",3);`",
"<one-adverb>": "`syllableCount(\"rb\",1);`",
"<one-pronoun>": "`syllableCount(\"prp\",1);`",
"<wh-pronoun>": "`syllableCount(\"wp\",1);`",
"<two-verb>": "`syllableCount(\"vbz\",2);`",
"<three-verb>": "`syllableCount(\"vbz\",3);`",
"<three-verb2>": "`syllableCount(\"vbp\",3);`",
"<four-verb>": "`syllableCount(\"vbz\",4);`"
}
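With this many alternatives, it is easy to write a line pattern whose syllables do not add up. The following is a rough sketch of a checker in plain Java, not part of RiTa. It assumes the naming convention used in the grammar above, where the first part of each rule name gives its syllable count (one, two, three, four, with wh counted as 1 and 5-line as 5). The class and method names are hypothetical, and the check only reads the rule names; it does not verify that each definition really passes the matching number to syllableCount.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: sums syllables per alternative based on rule names alone.
class SyllableCheck {
    // Maps the first part of a rule name to the syllable count it implies.
    static final Map<String, Integer> PREFIX = new HashMap<>();
    static {
        PREFIX.put("one", 1);
        PREFIX.put("two", 2);
        PREFIX.put("three", 3);
        PREFIX.put("four", 4);
        PREFIX.put("wh", 1);  // <wh-pronoun> is defined with a count of 1
        PREFIX.put("5", 5);   // <5-line>
        PREFIX.put("7", 7);   // <7-line>
    }

    // Matches <prefix-rest> and captures the prefix before the first hyphen.
    static final Pattern RULE = Pattern.compile("<([^>\\-]+)-[^>]*>");

    // Adds up the implied syllables of every rule reference in one alternative.
    static int syllables(String alternative) {
        Matcher m = RULE.matcher(alternative);
        int total = 0;
        while (m.find()) {
            Integer n = PREFIX.get(m.group(1));
            if (n == null) {
                throw new IllegalArgumentException("Unknown rule: " + m.group());
            }
            total += n;
        }
        return total;
    }

    // True when every |-separated alternative sums to the expected count.
    static boolean allSumTo(String rule, int expected) {
        for (String alt : rule.split("\\|")) {
            if (syllables(alt.trim()) != expected) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String sevenLine = "<one-interject> <one-interject> <5-line> |<two-adject> <5-line> "
            + "|<5-line> <one-noun> - <one-super> |<5-line> <two-super>";
        System.out.println(allSumTo(sevenLine, 7));
    }
}
```

Running the same check on each new alternative you add is a quick way to catch a pattern that would break the 5-7-5 shape.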
There is a lot of room for improvement here: this version generates far fewer different haikus than the others, because the grammar is much shorter. As an exercise, try adding some more rule definitions to it. You can flesh out the higher-level rules to make more combinations, add more terminal productions to choose different words, or both.
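If you add many terminal productions, typing the escaped quotes by hand gets error-prone. As one possible shortcut, the sketch below builds a grammar line in the same format as the definitions above from a rule name, a part-of-speech tag, and a syllable count. The class RuleBuilder and the method terminalRule are hypothetical names, and the rule two-adverb is just an example of a definition that is not in the grammar yet.

```java
// Hypothetical helper that prints terminal productions in the grammar's format.
class RuleBuilder {
    // Builds one line such as: "<two-adverb>": "`syllableCount(\"rb\",2);`"
    static String terminalRule(String name, String pos, int n) {
        return "\"<" + name + ">\": \"`syllableCount(\\\"" + pos + "\\\"," + n + ");`\"";
    }

    public static void main(String[] args) {
        // Example: a two-syllable adverb and a two-syllable past-tense verb.
        System.out.println(terminalRule("two-adverb", "rb", 2) + ",");
        System.out.println(terminalRule("two-past-verb", "vbd", 2));
    }
}
```

You would still paste the generated lines into the grammar yourself and reference the new rule names from a higher-level rule such as <5-line>.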
Here is the list of the part-of-speech tags you can use:
- cc: Coordinating conjunction
- cd: Cardinal number
- dt: Determiner
- ex: Existential there
- fw: Foreign word
- in: Preposition or subordinating conjunction
- jj: Adjective
- jjr: Adjective, comparative
- jjs: Adjective, superlative
- ls: List item marker
- md: Modal
- nn: Noun, singular or mass
- nns: Noun, plural
- nnp: Proper noun, singular
- nnps: Proper noun, plural
- pdt: Predeterminer
- pos: Possessive ending
- prp: Personal pronoun
- prp$: Possessive pronoun
- rb: Adverb
- rbr: Adverb, comparative
- rbs: Adverb, superlative
- rp: Particle
- sym: Symbol
- to: to
- uh: Interjection
- vb: Verb, base form
- vbd: Verb, past tense
- vbg: Verb, gerund or present participle
- vbn: Verb, past participle
- vbp: Verb, non-3rd person singular present
- vbz: Verb, 3rd person singular present
- wdt: Wh-determiner
- wp: Wh-pronoun
- wp$: Possessive wh-pronoun
- wrb: Wh-adverb