You could train an LLM on all existing whale sound recordings, have it "listen" to live whales and respond with what it "thinks" it should, then do human analysis on the results, maybe find one shred of meaning, rinse and repeat.
That's literally impossible. Imagine trying to learn Japanese by talking to a Japanese man on the phone, with neither of you able to understand the other or see what the other is doing. Without shared context, communication is impossible. Best case, you and the Japanese man would create a new context and a new shared language, neither English nor Japanese, that would let you communicate about whatever ideas fit through the phone line. Maybe words like "sound", "word", "stop", etc.
Impossible in a single step, perhaps, but "impossible" is too strong a word once you consider how the statistics of words, or sounds, are connected. If you can work out the statistical correlation between groups of sounds, you can start to get an idea of how they are interrelated (see the sketch below). This is a stepping stone on the path to understanding.
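To make that concrete, here's a minimal sketch of one such statistic: pointwise mutual information (PMI) over adjacent sound units, which flags pairs that co-occur more often than chance. The unit labels and the toy sequence are hypothetical stand-ins for a recording that has already been segmented somehow.

```python
# Sketch: co-occurrence statistics over a sequence of discretized sound
# units. PMI highlights unit pairs that appear together more often than
# their individual frequencies would predict.
import math
from collections import Counter
from itertools import product

# Hypothetical stand-in for a whale-song recording segmented into units.
sequence = ["A", "B", "A", "C", "B", "A", "B", "C", "A", "B"]

unigrams = Counter(sequence)
bigrams = Counter(zip(sequence, sequence[1:]))
total_uni = sum(unigrams.values())
total_bi = sum(bigrams.values())

def pmi(x, y):
    """PMI of adjacent units x, y: log p(x, y) / (p(x) * p(y))."""
    p_xy = bigrams[(x, y)] / total_bi
    p_x = unigrams[x] / total_uni
    p_y = unigrams[y] / total_uni
    return math.log(p_xy / (p_x * p_y))

for x, y in product(unigrams, repeat=2):
    if bigrams[(x, y)]:
        print(f"{x} -> {y}: PMI = {pmi(x, y):+.3f}")
```

None of this yields meaning by itself, but it does expose structure: which units attract each other, which repel, which only appear in certain contexts.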
>the statistical correlation between groups of sounds
That assumes the speaker's statistical patterns resemble those of whoever is correlating the sounds. For example, if you had statistical data for utterances of English sounds in the context of Magic the Gathering tournaments, and you tried to decipher the speech of a Swahili electrical engineer talking about transistors, you could very well "decipher" something that's seemingly coherent but entirely wrong.
It would be an overgeneralization to assume that whales speak about things in the same statistical patterns that humans do.
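You can see the pitfall even with the crudest decipherment heuristic, aligning two vocabularies by frequency rank. Both corpora below are hypothetical toy data; the point is that the alignment comes out looking systematic while being meaningless.

```python
# Sketch of the domain-mismatch pitfall: pair words across two corpora by
# frequency rank. With mismatched domains, the "translation" table is
# produced confidently and is entirely wrong.
from collections import Counter

mtg_corpus = "tap tap mana creature tap mana attack creature tap mana".split()
ee_corpus = ("gate transistor gate voltage drain "
             "gate voltage source gate voltage").split()

mtg_ranked = [w for w, _ in Counter(mtg_corpus).most_common()]
ee_ranked = [w for w, _ in Counter(ee_corpus).most_common()]

# Rank-based mapping: looks orderly, pairs nothing alike.
for a, b in zip(ee_ranked, mtg_ranked):
    print(f"{a!r} -> {b!r}")
```

The whale version of this mistake would be worse, since we don't even share a sensory world with them, let alone a domain of discourse.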