Gwern has answered my prayers and taught GPT-2 poetry.
GPT-2 is the language processing system that OpenAI announced a few weeks ago. They are keeping the full version secret, but have released a smaller prototype version. Gwern retrained it on the Gutenberg Poetry Corpus, a 117 MB collection of pre-1923 English poetry, to create a specialized poetry AI.
I previously tested the out-of-the-box version of GPT-2 and couldn’t make it understand rhyme and meter. I wrongly assumed this was a fundamental limitation: “obviously something that has never heard sound can’t derive these complex rhythms just from meaningless strings of letters.” I was wrong; it just didn’t have enough training data. Gwern’s retrained version gets both of these right, and more too. For example:
Thou know’st how Menoetiades the swift
Was dragged, of Hector and the fierce compeers
And Phrygian warriors. So, we will dispatch
Your bodies, then, yourselves to burn the ships
In sacrifice; with torches and with bells
To burn them, and with oxen to replace
Your gallant friends for ever. But I wish
That no man living has so long endured
The onset of his foes, as I have power
To burn or storm; for mighty Hector erst
Was slain, and now returns his safe return
This is all perfect iambic pentameter. I know AP English students who can’t write iambic pentameter as competently as this.
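(Iambic pentameter means ten syllables per line in unstressed/stressed pairs. You can spot-check the syllable count, though not the stress pattern, with a crude vowel-group heuristic. A rough sketch for the curious, not anything Gwern's model uses:)

```python
import re

def syllables(word: str) -> int:
    """Rough syllable count: number of vowel groups, minus a silent final 'e'.

    A heuristic only -- it miscounts some words (e.g. classical names
    like 'Menoetiades'), but works for most everyday English.
    """
    word = re.sub(r"[^a-z']", "", word.lower())
    groups = re.findall(r"[aeiouy]+", word)
    n = len(groups)
    if word.endswith("e") and not word.endswith(("le", "ee")) and n > 1:
        n -= 1  # treat final 'e' as silent, as in 'replace'
    return max(n, 1)

def looks_like_pentameter(line: str) -> bool:
    """Ten syllables is necessary (not sufficient) for iambic pentameter."""
    return sum(syllables(w) for w in line.split()) == 10

print(looks_like_pentameter("In sacrifice; with torches and with bells"))
```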
(by the way, both “compeers” and “erst” are perfectly cromulent words from the period when people wrote poems like this; both show up in Shelley)
It has more trouble with rhymes – my guess is that a lot of the poetry it was trained on was blank verse. But when it decides it should be rhyming, it can keep it up for a little while. From its Elegy Written in a Country Churchyard fanfic: