Extremely likely, especially with the increasing ability of LLMs to decode unknown languages. Then the test would be for us to produce these sounds and see if the whales respond as expected.
You could train an LLM on all existing whale sounds, get it to “listen” to live whales and respond with what it “thinks” it should, then do human analysis of the results, maybe find one shred of meaning, rinse and repeat.
That's literally impossible. Imagine trying to learn Japanese by talking to a Japanese man on the phone, with neither of you being able to understand or see each other or what you're each doing. Without shared context communication is impossible. Best case, you and the Japanese man would create a new context and a new shared language that would be neither English nor Japanese that would allow you to communicate about whatever ideas fit through the phone line. Maybe words like "sound", "word", "stop", etc.
Impossible in a single step, but perhaps "impossible" is too strong a word given the possibilities that arise once you consider how the statistics of words, or sounds, are connected. If you can work out the statistical correlations between groups of sounds, you can start to get an idea of how they are interrelated. That's a stepping stone on the path to understanding.
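A toy sketch of what that correlation step could look like, assuming the recordings have already been segmented into discrete sound units (the segmentation is itself the hard, unsolved part); pointwise mutual information between co-occurring units is about the simplest such measure:

```python
from collections import Counter
import math

def cooccurrence_pmi(sequences, window=3):
    """Rough pointwise mutual information between sound units that occur
    within `window` positions of each other, across many sequences."""
    unit_counts = Counter()
    pair_counts = Counter()
    total = 0
    for seq in sequences:
        unit_counts.update(seq)
        total += len(seq)
        for i, a in enumerate(seq):
            for b in seq[i + 1 : i + 1 + window]:
                pair_counts[tuple(sorted((a, b)))] += 1

    n_pairs = sum(pair_counts.values())
    pmi = {}
    for (a, b), c in pair_counts.items():
        p_ab = c / n_pairs
        p_a = unit_counts[a] / total
        p_b = unit_counts[b] / total
        pmi[(a, b)] = math.log(p_ab / (p_a * p_b))
    return pmi

# Hypothetical segmented "codas": each sequence is a list of unit labels.
codas = [["A", "B", "B", "C"], ["A", "B", "C"], ["D", "A", "B"]]
print(sorted(cooccurrence_pmi(codas).items(), key=lambda kv: -kv[1])[:5])
```

Pairs with high PMI show up together far more often than chance, which tells you nothing about meaning yet, but it does tell you which units to pay attention to.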
>the statistical correlation between groups of sounds
That assumes that the speaker is similar to the person correlating the sounds. For example, if you had statistical data for utterances of English sounds in the context of Magic the Gathering tournaments, and you tried to decipher the speech of a Swahili electrical engineer talking about transistors, you could very well decipher something that's seemingly coherent but entirely incorrect.
It would be an overgeneralization to assume that whales speak about things in the same statistical patterns that humans do.
If you scroll down, the very first step they describe is for collecting datasets of existing translations. They aren't translating even unknown human languages, let alone completely alien ones.
I dunno, sometimes the language might be contextual, and utterances might not be understood without taking into account the context of what is occurring, or the speaker. Yes, I know human language can be subject to these variables too. Anyhow, it's all speculation, and the dream of talking to animals is surely exciting.
I've heard that "da kine" in Hawai'i Creole English historically was, and still may be, used exactly in situations where the speakers share plenty of context, allowing them to figure out what it denotes, but leaving listeners largely unenlightened.
In a language such as Thai, pronouns are left out in most cases, and only added when you need to disambiguate. No plurals either, requiring you to add this information with extra words when it matters. But nobody forces you to communicate effectively, or use Oxford commas.
> Imagine if they are communicating using a lot of pronouns.
That's fine. The idea is to record them with a lot of metadata in situ: what is going on with the whales (are they feeding? are they traveling? are they in a new location or somewhere they have been for a while? how many whales are there?) and also their surroundings (sea state, surface weather, position and activity of boats, prey animals, etc.).
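Purely as an illustration of what that in-situ metadata might look like attached to each clip (all field names are made up):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class WhaleRecording:
    """One audio clip plus the behavioural and environmental context
    observed while it was captured. All fields are hypothetical."""
    audio_path: str
    timestamp: str                      # ISO 8601
    location: tuple[float, float]       # latitude, longitude
    group_size: int                     # how many whales are present
    behaviour: str                      # e.g. "feeding", "traveling", "resting"
    residency: str                      # "new location" vs "resident for days"
    sea_state: int                      # Beaufort scale
    surface_weather: str
    nearby_vessels: int
    prey_observed: Optional[str] = None
    notes: str = ""

clip = WhaleRecording(
    audio_path="clips/2023-08-01_0412.wav",
    timestamp="2023-08-01T04:12:00Z",
    location=(-15.2, -145.7),
    group_size=6,
    behaviour="feeding",
    residency="resident for days",
    sea_state=2,
    surface_weather="overcast",
    nearby_vessels=1,
    prey_observed="squid",
)
```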
You would need some way to convert the whale LLM to human language though. Otherwise you would just be making a pre-trained GPT-4 for whales. One option would be to label data according to induced reactions in whales to whale-language completions (i.e., let the LLM complete whale language and use the reactions to try to induce some understanding). But it feels unlikely we would get further than providing a ChatGPT for whales that only they can understand.
You wouldn't necessarily need that. You don't actually need translated text for every single language pair an LLM will learn to translate.
i.e. train an LLM on English, French, and Spanish data, where the only parallel text is English-French. Can this LLM still translate to and from Spanish? Yeah.
You still have a bridge, and each of those languages is not just from the same species but the same language family. If there's English to French and French to Spanish, there's a semantic relationship between English and Spanish.
There exists no bridge to whale, any more than there is to aliens from Alpha Centauri.
Common concepts are common; what species the language comes from is not as relevant as you think. Text and image space, two entirely different modalities, are so related in high-dimensional space that you can translate between them with just a simple linear projection layer.
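That "simple linear projection" is roughly the trick LLaVA-style models use between a vision encoder and a language model. A hedged PyTorch sketch of the idea, with placeholder names and dimensions, and an audio encoder standing in for the vision one:

```python
import torch
import torch.nn as nn

class ModalityProjector(nn.Module):
    """Map embeddings from one modality's space (e.g. a whale-sound
    encoder's outputs) into another's (e.g. a text LLM's hidden size)."""
    def __init__(self, src_dim: int, tgt_dim: int):
        super().__init__()
        self.proj = nn.Linear(src_dim, tgt_dim)

    def forward(self, src_embeddings: torch.Tensor) -> torch.Tensor:
        return self.proj(src_embeddings)

# Hypothetical dimensions: a 512-d sound encoder feeding a 4096-d LLM.
projector = ModalityProjector(src_dim=512, tgt_dim=4096)
sound_embeddings = torch.randn(8, 512)   # batch of 8 encoded sound units
llm_ready = projector(sound_embeddings)  # shape: (8, 4096)
```

The projector is only as good as the paired data you train it on, which is exactly the thing we don't have for whales.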
My guess: train a generative model to predict whale sounds, based on recordings of real ones, and hope that the resulting latent space will map to that of a human-trained LLM. We'd need a stupidly large amount of recordings of whale songs, a tokenization scheme, and a few already-translated sounds/phrases to serve as starting points for mapping the latent spaces.
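If we really did have a few translated anchor pairs, the classic way to align two embedding spaces is a linear map fit on those anchors (orthogonal Procrustes, as used in cross-lingual word-embedding work). A sketch, assuming both sets of embeddings already exist and the anchor pairings are known:

```python
import numpy as np

def fit_orthogonal_map(src_anchors: np.ndarray, tgt_anchors: np.ndarray) -> np.ndarray:
    """Orthogonal Procrustes: find orthogonal W minimizing ||src @ W - tgt||.
    src_anchors, tgt_anchors: (n_anchors, dim) embeddings of the same concepts
    in the whale-model space and the human-LLM space respectively."""
    u, _, vt = np.linalg.svd(src_anchors.T @ tgt_anchors)
    return u @ vt

# Toy example with made-up 4-d embeddings for 3 anchor concepts.
whale_space = np.random.randn(3, 4)
human_space = np.random.randn(3, 4)
W = fit_orthogonal_map(whale_space, human_space)
mapped = whale_space @ W   # whale embeddings expressed in the human LLM's space
```

With only a handful of anchors this is badly underdetermined, but it shows why even a few translated phrases would be worth so much.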
Exactly. Also, I think an alternative to LLM that is more generally trained towards identifying large linguistic patterns across a language could be cross referenced with the aforementioned more standard llm to at least point to some possible meanings, patterns, etc
We'd need contextual tracking of what the whales are actually doing/communicating to match to the songs. An LLM would be excellent at finding any correlated patterns between the language and actions, and then mapping those to similar English concepts, but that all requires the behavioral data too. Cameras strapped to whales maybe?
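Even before reaching for an LLM, the crudest version of that correlation search is just conditional frequencies of sound units given the observed behaviour. A toy sketch with made-up labels:

```python
from collections import Counter, defaultdict

# Hypothetical (sound-unit sequence, observed behaviour) pairs from tagged whales.
observations = [
    (["A", "B", "C"], "feeding"),
    (["A", "B"], "feeding"),
    (["D", "E"], "traveling"),
    (["A", "D"], "traveling"),
]

behaviour_given_unit = defaultdict(Counter)
for units, behaviour in observations:
    for u in set(units):
        behaviour_given_unit[u][behaviour] += 1

for unit, counts in behaviour_given_unit.items():
    total = sum(counts.values())
    probs = {b: round(c / total, 2) for b, c in counts.items()}
    print(unit, probs)   # e.g. A -> {'feeding': 0.67, 'traveling': 0.33}
```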
Would just need a way to tokenize, then use predictions to map back to some positive interaction symbol. Something like: we think a certain phrasing means "food-fish-100m-down", and whales respond consistently to that.