Machine Translation and the Porpoise Corpus

I might have mentioned that I got to do some world traveling for my work recently. Seeing rural Tanzania was an experience that I still don’t really have good words to describe. But this is not a post about that. This is a post about a sticky idea I got stuck on in some science fiction I was reading during my multi-day to and fro travel.

On my around-the-world-in-4.5-days journey, I read the Jewish feminist sci-fi novel He, She, and It by Marge Piercy. It’s got a classic hard AI theme, about a robot that is so, so human… I’d recommend it. But dilemmas of whether a robot can make a minyan in the Reform tradition of 2059 have not stuck in my mind the way this one line about whales has:

The great whales—we had just about killed off the last of them before we began to translate their epic and lyric poetry.

Okay, I’m a little embarrassed by it when I re-read it, but seriously, could we do it? That is, does a serious attempt to translate whale songs into English have a chance in this modern age? I once had a dusty book about an effort by Carl Sagan and his buddies to learn to communicate with dolphins in the 1960s, but technology has seriously advanced since then.

The last talk I saw on statistical machine translation was an effort to do Arabic-to-English translation without telling the computer anything about the structure of sentences in either language.

Whale-to-English translation is at least one step harder, since there is a whale-speech-to-whale-text component that needs to precede the machine translation part (and, I suppose, there is the possibility that whale songs cannot be translated into English).

A few questions: Has it already been done/proven impossible? Do you think we could do it? Do you have a vast collection of whale songs available to aid in the quest?

Regarding question 3, after a little searching, I’ve found the lab at Cornell that probably has the necessary data set. They seem more interested in counting and tracking whales than translating them, but I’ve seen many a health researcher be protective of this sort of precious data. I wonder if the Bioacoustics Research Program could share a few thousand hours of recordings.


6 responses to “Machine Translation and the Porpoise Corpus”

  1. Aaron Clauset coincidentally has a recent blog post about his more serious research on whales.

  2. And further investigation into that book about talking to dolphins makes me cautious about following this route too far…

    From the Kirkus Review of John Lilly’s The Mind of the Dolphin: “It was bound to happen; the signs were there six years ago. Lilly, a man respected by his colleagues for his neurophysiological studies and for a fine flair for electronics, has freaked out. He has flipped on Flippers. He has become the Leary of dolphinology; the leader of a cult of spiritual discovery which says that the proper study of mankind is dolphins.”

  3. andy

    oh yeah, i read this, and i’m gonna go rush out and get a trademark on that tagline. maybe the domain name too while i’m at it…

    keep me posted!

  4. Dude, if you haven’t read the Illuminatus Trilogy, I suggest you go and do that right now…

    ‘“Epics,” said Hagbard. “They’re mad for epics. They have their whole story for the past forty thousand years in epic form. No books, no writing; how could they handle pens with their fins, you know? All memorization. Which is why they favor poetry. And their poems are marvelous, but you must spend years studying their language before you know that. Our computer turns their works into doggerel. It’s the best it can do. When I have the time, I’ll add some circuits that can really translate poetry from one language to another. When the Porpoise Corpus is translated into human languages, it will advance our culture by centuries or more. It will be as if we’d discovered the works of a whole race of Shakespeares that had been writing for forty millennia.”’

  5. Casey

    Interesting story on Here and Now about whales and their chatter….