Our brains 'time-stamp' sounds to process the words we hear (nyu.edu)
162 points by hhs on Nov 7, 2022 | 102 comments



There's an interesting sort of thought-trap when thinking about these things - which this article doesn't fall into - but I'll mention it anyway because it's fun. Dennett calls it the Cartesian Theatre - and the idea is that when thinking about how the brain works, we may mistakenly imagine that once the brain has (say) processed these timestamped sounds, it then puts them all back together somewhere to 'play them back' to our consciousness. But that's a paradox, because then you'd need another consciousness to interpret the reconstructed sound. Dennett likened it to imagining a little homunculus inside your brain that is watching a screen that plays back your conscious experience. Of course it can't work like that, because it becomes recursive.

So when the brain 'time stamps' these sounds (as they put it) it (probably) doesn't then need to ever put them back in the right order again. That bit of processing is done. A corollary to this idea is that consciousness is (most likely) spread throughout the brain so there is no 'one place' where things come back 'together'. That also means there is no one instant in time where we become 'conscious' of things. If it's spread throughout the brain it must necessarily be smeared across a (fairly short) interval of time too.

I think these days, with neural nets being better understood, perhaps we don't fall into this thought trap so much.


Dennett is making a (naive) philosophical point about the nature of consciousness, not a linguistic point about the nature of language perception.

You don't need a Cartesian Theatre, or even strict reassembly. I'd expect something more like a hierarchical/nested structure of template recognition for common word sequences and sentence structures.

Brains notice certainly when out of order are words. But brains can still make sense of them - with some extra effort - as long as there's still some templated structure left to work with.

You need to randomise much longer sequences before the templating breaks down.

None of this says anything relevant about what consciousness may or may not be. It's still the same old problem of qualia, only now it's about qualia that are perceived as linguistic and conceptual relationships, not trivial perceptual identifications. ("Dog", "orange", "philosopher", etc.)


> You don't need a Cartesian Theatre

Thank goodness.

I don't understand how your post sits opposed to mine. They seem to be in agreement?


I think the other poster wants consciousness to stay mysterious and unrelated to pattern recognition. But I'm not sure where it ends up then, given the agreement about the lack of a Cartesian Theater.


> Brains notice certainly

Nice. ;)


> A corollary to this idea is that consciousness is (most likely) spread throughout the brain so there is no 'one place' where things come back 'together'. That also means there is no one instant in time where we become 'conscious' of things.

People can start acting with seemingly-conscious intent before they consciously become aware of the action. This delay can be quite a few seconds. Under ideal conditions with brain imaging, it's possible to guess, some time in advance, what a person will do before they announce their own awareness of their intent (e.g. press one of two buttons).

I've had hints of this experience subjectively, from time to time. Action, then thought, in that order. I'd like a sip of water; with particularly careful awareness of relative sequencing, I'm fairly sure that, sometimes, when this thought percolates to the surface of my mind, my hand is already moving towards the glass.

Sometimes I wonder if the conscious self is mostly just a passive observer, constantly coming up with post-hoc rationalizations to explain why the body just acted the way it did. Decidedly unnerving, but that seems to be the nature of our being.

Multiple loci of consciousness are, IMO, fairly strongly supported by what happens with brain damage. Very rarely, people with damage to the corpus callosum linking the hemispheres of the brain develop something like alien hand syndrome, where half of their body adopts complex and seemingly-intentional behaviour that the person is quite unaware of consciously. This manifestation is sometimes described as having a personality and desires, which are often similar, but not identical, to the conscious part of that person.


People can act with "seemingly-conscious" intent without ever becoming "consciously" aware of the action. Eric Schwitzgebel wrote a lot about this; one of my favorites of his is How Well Do We Know Our Own Conscious Experience? The Case of Human Echolocation.

https://faculty.ucr.edu/~eschwitz/SchwitzAbs/Echo.htm


There is an excellent (very) short story by Ted Chiang called "What's expected of us" that I think you'll like.

https://www.nature.com/articles/436150a


Great story. But the end bothered me. Particularly this part:

“Some of you will succumb and some of you won't, and my sending this warning won't alter those proportions.”

Of course sending the warning will have an effect because it becomes part of the conditioning of the people who read it! They don’t get to choose how they will be affected by it, but it will certainly have an effect. To say that a person has no free will is not to say that they are not affected by their environment.


> Of course sending the warning will have an effect

Compared to not sending it? Of course – but that's a counterfactual. The present leads to the future, where this message is sent; the act of sending the message did not alter the past.


It's something, alright. I just find it a little... strange.

Compatibilism isn't a new invention, and Chiang ought to have heard of it, but it isn't mentioned at all in the story.


> People can start acting with seemingly-conscious intent before they consciously become aware of the action

This idea is largely based on Libet's experiments in the 80s, and recent studies have concluded that what he was measuring was not the brain 'making a decision' but more or less just random noise which seemed to become significant due to the way the data was processed. It was widely reported in 2019, e.g. see here https://www.theatlantic.com/health/archive/2019/09/free-will...


> Under ideal conditions with brain imaging, it's possible to guess, some time in advance, what a person will do before they announce their own awareness of their intent (e.g. press one of two buttons).

I think I read somewhere that this result couldn't be reproduced with a more rational task. The original experiment asked the participants to press one of two buttons when some signal was triggered, and asked them to remember what time they saw when they chose which button to press, all while hooked up to some EEG machine. The spike corresponding to their choice was consistently happening before the self-reported time.

However, if I'm remembering correctly, when further experiments were carried out with a more rational choice, they were unable to find a similar gap that would precede the reported time of choice. A possible conclusion is that in fact the original experiment had found nothing interesting about conscious choice, but something interesting about how we make random choices - when the normal variance in brain activity happens to spike in some way, the timing/place/some other characteristic of the spike is used later by our consciousness to determine the result of the choice.

Unfortunately, I don't know how to find the paper I remember, so I don't know if I'm remembering it right and/or if it is regarded as a good paper or not - so don't put too much stock in what I've said above.


You're talking about Libet's famous experiment in 1984 and the more recent articles (it was widely reported in 2019) arguing that, because of the artificiality of the setup (as you say, participants were asked to just act randomly), Libet had essentially made a sort of statistical error and was not measuring 'decisions' at all. e.g. see https://www.theatlantic.com/health/archive/2019/09/free-will...



Why is that a trap? It stands to reason the little man requires only a smaller brain to process the tiny world of the homunculus theater. And the little man in his head is smaller still. At the bottom of the recursive stack is a single neuron. No contradiction.


Indeed. Sounds like a fine metaphor for the experience of memory to begin with, which absolutely loses detail as it's compounded. Remembering a memory may well just be adding another layer at the top of that recursive stack.


>So when the brain 'time stamps' these sounds (as they put it) it (probably) doesn't then need to ever put them back in the right order again. That bit of processing is done.

I'm not sure I buy it. It won't have a "homunculus", but it could very well have a pipeline of autonomous processing centers, where the second part needs to have them in order.

uniq is not a homunculus here either, but still needs its input in order:

  cut -d, -f1 xxx.csv | sort | uniq | wc -l
(yeah, sort -u would make a separate uniq unnecessary here, but that's not the point)


Our brains bounce around in time constantly. Every time your eyes microsaccade, your brain is filling in the gaps after the fact with its best guess at what you were looking at, which is why periodic movements can appear "elongated" when you first look at them (look at a clock and the second hand seems to linger a bit too long for the first second).

Our brains absolutely do not need to process things in chronological order, but it seems we have evolved for our egos to believe we do, for evolutionary advantage (although, in the last few thousand years we've been doing everything we can to do an end run around that).


Our perceptive unconscious or pre-conscious self does not need events to be in order to assemble a meaningful stream of cognition, but I'd argue our conscious layer does. The part of us that has the ability to believe or not believe things is by definition a modeling appliance. It takes in information from the outside world through the various interfaces available to it and uses that information to build a causal model. It's often wrong, but it does fundamentally want to see the universe as 'this happened, which caused that in the past, which means this other event may have a similar outcome'. I wonder if this modelling appliance is a major part of the ego and what we consider the 'self'.


At some point, the ordering of events _has_ to become 'information about the ordering of events' and once that happens, they no longer need to be in order.


> it could very well have a pipeline of autonomous processing centers, where the second part needs to have them in order.

It might, but it might not. I guess the point is that re-integration is not _necessarily_ required. The thought trap is the idea that things _have_ to be re-integrated for 'consciousness to see them'.


>A corollary to this idea is that consciousness is (most likely) spread throughout the brain so there is no 'one place' where things come back 'together'

While it's true that consciousness-aware features are processed in a distributed manner, I don't think it follows to say that consciousness is spread throughout the brain. The core function of consciousness is integrated processing of disparate sensory data into a single decision stream. In other words, various sensory experiences can all influence our decision-making in arbitrary ways, and this "availability for high level decision-making" is what we identify as consciousness.

The location of this integration-decision apparatus seems like the best place to "locate" the seat of consciousness. While cortical areas are essential for specific kinds of processing, we shouldn't "locate" consciousness "in" those areas. This is because the output of a cortical area is modulations in signals sent to various downstream target areas. If we were to artificially modulate the downstream targets with an exact duplicate of the pattern of output from the cortical area in question, we would have the same conscious experience. The conscious experience isn't in the signal or its modulation, it is in a very specific kind of integrated processing. That is, there is some kind of representational translation that takes disparate frequency modulated signals and integrates them into a communal representational schema to support the single decision stream. Presumably this communal representation should have a unique computational strategy that stands out as a change in signal dynamics as you follow a stream of sensory processing up the hierarchy. Consciousness would be localizable to within this boundary.


Just a point about recursion. Recursion doesn't immediately imply that something gets smaller and smaller until it bottoms out (or never returns). Things can be self-referential without getting smaller. Code is data and data is code. Quines are a thing. Fixpoints and combinators are things.


The reason the Cartesian theatre is problematic is that it pushes the work of 'understanding' or 'processing' down the chain for the homunculus to sort out. So it's not so much the recursive thing-within-thing size difference; it's more that it's problematic because it dodges the hard bits.


>I think these days, with neural nets being better understood, perhaps we don't fall into this thought trap so much.

From what I've read, the designers of AI/ML systems are less and less able to definitively explain how the algorithm works, or how the system is going to respond to a given input. I suppose for 'sentient' AI that's the goal, but I think it is a bit scary if we get a result from a system, and nobody can tell you why or how it was computed.


That's kind of what I meant. The inner workings of ai/ml are mysterious to us but we are familiar with the idea of a 'black box' that can do something like 'find a face in this photo' and we know inside the black box there's a tangled network of weighted connections. We don't imagine a Cartesian theatre inside the black box. But maybe 50 years ago we might have? So perhaps we are getting better at reasoning about how the mind might work. People used to use clockwork as a metaphor for the mind, when clockwork was all they knew. Now we have better metaphors.


> I think it is a bit scary if we get a result from a system, and nobody can tell you why or how it was computed.

You don't need ML for that. That's Wednesday in any corporation.


But with ML, that's the target, since it's doing its own thing.

Your Wednesday example is just because tribal knowledge is becoming extinct as they lay off the old-timers nearing retirement before doing a brain dump. The younger people know they don't know it and rewrite it in a new language, but don't even know what that new language is doing in the background, because they've imported so many 3rd-party libraries via a newly written class that essentially just reaches out to the original black box but looks new and shiny.


Heh, true. I was thinking more about things like China's social credit score, which uses ai/ml.


By this logic, with humans having millions of neurons in other parts of the body, such as the gut, we may have consciousness spread out over the body, not just the brain.


Perhaps there's a reason for the cliches - "go with your gut", "what is your gut telling you?". It's pretty common to feel like some emotions (anxiety, excitement, shame) are radiating from the gut. Maybe there is actually a physiological link for certain base emotions?


It's common for meditation practitioners to relax their body (relax tense muscles) in order to gain access to calmer states of mind.


Antonio Damasio is a neuroscientist who's done work in the area, and has decent popsci books about this idea. I read the older ones, "Descartes' Error" and "The Feeling of What Happens", and they're fun, good reads. Apparently he's written more on the subject as well.


Not sure about consciousness, but some of its processing is a given; the gut is even called a "second brain" by scientists in related fields.


Probably related: I’m not a native English speaker, and I have noticed that sometimes I don’t understand what someone is telling me in this language, just one or two words, and it kind of gets saved in a buffer in my mind, and they keep talking, and after a few seconds what was in the buffer gets “processed” and I understand it at the same time as whatever is being said to me currently. It feels weird.


I'm a native English speaker and I have the same thing, where someone will say something I don't understand, and then in the middle of asking what they mean, I realise what they were trying to say (The processing just took longer than I expected)


I think I know the buffer you are talking about, and there are at least two insights I’ve learned from it. 1) It can contain any sound, and it de-coheres when you attempt to make meaning out of the sound. 2) You can train yourself to expand the length of the buffer, to the point of recalling sounds that occurred several minutes ago, without focusing on them, like rewinding a tape that runs regardless of your awareness.

You’ve trained yourself already, to an extent, so that your tape rolls as far as it takes for a full sentence to fit and error-correct for the parts you don’t know. Notice the boundary of meaning and non-meaning you have to tread on when experiencing speech you’re not completely familiar with.

I’m also bilingual so perhaps learning a language is one of many ways to deconstruct our perceptions.


From my experience with learning languages I'd say that's perfectly normal when you have a conversation in a language that you haven't mastered yet. (Though your English seems fine.) The brain simply needs more time to process the input, whether it's because you're not familiar with a word, or haven't yet gotten used to the speaker's pronunciation or their intonation/melody.


I’ve had this happen in places where multiple languages are spoken.

Hearing something in Language B, but not realizing, so the brain processes it in Language A. Then not being able to understand for a few seconds until the brain realizes that what I heard is actually Language B, and suddenly the first seconds of the speech make sense.


I had this years later from The Simpsons of all things. In "Who Shot Mr. Burns (Part 2)" Tito Puente has a musical number "Señor Burns"[0] where the singer calls Burns "el diablo con dinero" which I remembered the sound of but not the meaning until I learned some Spanish like 10 years later.

[0] https://www.youtube.com/watch?v=p9kdDet7G14


I have an oddly similar sensation when becoming aware of a sound that woke me.


Even as a native English speaker I experience this from time to time.

I'll idly hear someone speaking - usually at quite some distance - but their words are very indistinct, I'm unable to make out what they're saying. I'll think to myself "what language are they speaking?", but after a second or two, there's some kind of frequency tuner in my brain that settles on the correct band and suddenly everything falls into place and I can understand them perfectly.


This is something that also happens as you age - even if your hearing remains decent, Central Auditory Processing Disorder becomes an issue. I imagine it will be more pronounced with a second language, so that's something you have to look forward to.

https://www.rehab.research.va.gov/jour/05/42/4suppl2/martin....


I'm surprised nobody has mentioned the connection to AI. Many models need "positional encodings" as part of their input in order to understand the spatial or temporal relationship between different input tokens. It's not at all surprising to me that the brain would have an equivalent.

It's not always clear what form these encodings should take. If we could figure out the actual encodings used by the brain, I bet we would find that they have big advantages over the ones we're currently using.
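For anyone curious what those look like in practice, here's a minimal sketch (my own illustration in Python/NumPy, not something from the article) of the sinusoidal positional encodings used in the original Transformer paper; the vectors are simply added to the token embeddings so the model can recover token order:

  import numpy as np

  def sinusoidal_positions(seq_len, d_model):
      # One row per token position, one column per embedding dimension.
      positions = np.arange(seq_len)[:, None]                # (seq_len, 1)
      dims = np.arange(d_model)[None, :]                     # (1, d_model)
      angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
      angles = positions * angle_rates
      enc = np.zeros((seq_len, d_model))
      enc[:, 0::2] = np.sin(angles[:, 0::2])                 # even dims: sine
      enc[:, 1::2] = np.cos(angles[:, 1::2])                 # odd dims: cosine
      return enc

  print(sinusoidal_positions(seq_len=8, d_model=16).shape)   # (8, 16)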


Came here looking for this comment. Even as someone who doesn't have feelings one way or another about any brain-neural network similarity, this one struck me the same way. Thanks for mentioning it.


Or the inverse - like "garbage" DNA, our minds encode memories with "inefficient" baggage stamped onto them... which is very likely how we tie emotion to memory. Perhaps a mind stamping with simple math could be much more "efficient".


My wife jokingly describes what I do as “buffering” where I’m not paying attention and she will interrupt me and I’ll catch up, almost by re-hearing the last 2-4 seconds of what she said.

It’s a bizarre sensation but I wonder if it’s actually rather common.


I do that too - it's an interesting sensation to realise you should have been paying attention and "Replay" the last few seconds of conversation to build a context. Unfortunately it's only a few seconds...

The most interesting related thing I've experienced is hearing an unexpected noise in the night (once it was a mouse in my room) and being woken up by whatever part of my brain monitors that stuff, and then having the really visceral sensation of the unexpected sound being played back to my now conscious brain to try and give context.


It’s a subtype of ADD. I have it too. Sluggish cognitive tempo.

I don’t suppose you’re also a very visual person, who prefers to solve math problems geometrically rather than algebraically, etc. There may be a link between that buffering and the ability to turn abstract shapes in one’s mind.


Oh wow, you just managed to describe me completely. I'm inattentive-type ADHD (diagnosed as an adult), and all of my thought processes are super visual; constructing visual mental models in space is how I problem solve and a lifetime ago in school I was top of the math class when I could construct visual models of the problem but struggled when I couldn't.

My diagnostic ADD testing was inconclusive because I was able to excel at working memory and shape tests because I am able to hold onto a ton of information if I can write it on my mental blackboard etc.

When I'm listening to a conversation, what I'm hearing often produces a stream of involuntary visuals representing related information and subjects, like I have a mental background process showing me flash-cards of possibly useful information.

I also have the issue where someone will talk to me and there's this delay where I often appear to be non-responsive for a moment or two until my brain spools-up and then I mentally lock into the conversation.

Anyway, if you have any references or additional learnings you've found related to this topic, please share.


I have both of these characteristics. Where did you learn it was a subtype of ADD?


Intensive independent research. I learned of it around 2010 so I forget the exact trail. Eventually I hit on a medical journal that documented it, and I was like “that’s me.”


I have this too - and I thought it was related to my complete inability to multitask.

In a small enough buffer time, I can catch up and context switch and “hear” what was said… but after a certain point, it’s gone (probably out of working memory).


I'm sure I'm missing some important nuance or something, but it sounds like what they're saying is, "You can understand spoken words because your brain processes the sounds in the order you hear them."

This sounds like the most obvious thing in the world to me. Everything about me experiences time sequentially. If I remember things, I can cast my mind back in time, but I still remember things from beginning to end, not in some other order.

I'd have assumed my brain processes my visual inputs sequentially too. I can't even think what other option there might be. Somehow everything I've seen or heard over some set period of time hits me all at once?

What am I missing here?


> Everything about me experiences time sequentially.

That's not entirely true. Ever look at a clock with a second hand and see the hand linger a little longer the first second? Congratulations, you just saw things out of order. Your brain did not receive input for the period of time when your eyes were in motion towards the clock, so it lied to you and filled in the missing data after the fact once it started getting visual input again.


...but it still put it before the hand moved, so it's still sequential, even if the gap was filled in with something that didn't happen (the hand standing still) rather than my brain guessing that the hand moved once before the one I saw, isn't it?


Similar things happen with vision BTW. When you start looking into it, what our eyes and ears and the connected parts of the brain actually do is pretty crazy. Our sensory systems are eventually consistent, and there are glitches, but amazingly it all mostly works.


Essentially our hardware isn't perfect and there are delays, so the software has a bunch of patches and workarounds to interpolate what it believes is occurring, which happens to be right a lot of the time. It's normally only with illusions that this reality confronts us.


> When you start looking into it


Ha! I wish I could say that was intentional. Thanks for pointing it out.


I think it could be almost described as a full on 3D rendering engine, we are getting the raw data from outside ourselves, but what we see is completely built up inside our own minds. Does that make sense?


It has been illuminating for me over the past few weeks to have lost hearing in my right ear as a result of a lingering sinus infection; I understood that identifying the direction a sound might come from would be difficult with only one ear, but I had no idea how crucial it was for speech comprehension, particularly when any other noise is also present.

I've found that I can barely even discern that speech is present within a mix of noises, much less am I able to comprehend the speech until such time as I can reduce the other noises to a minimal level, and directionally point my ear at the source of said speech. Conference calls were quite a delight there for a while.


It's much much more like our new AI image generators than a physical simulation of light and volume with mathematical precision. Imagination is a "good enough" fluctuation of seemingly random stimulation of neurons that we're able to judge as being quite approximate to the same stimulation pattern that we get from our physical sensors. Thinking about a scene or music and "hearing it" is like a unit test.


I like that. We are always processing information in 3D, not just vision but you can imagine hearing someone behind you and your brain immediately places them in 3D space based on your prior knowledge of your surroundings.

It's amazing sensor fusion because you can see/hear/smell/touch something all at the same time and associate it correctly with one object.


We hugely underestimate how processed all of our senses are.

Hearing doesn't listen to pressure waves. It does some very complex real time source separation to distinguish between different sound sources.

Then it performs overtone and resonance matching to identify different speakers.

Then it follows up with phoneme recognition to identify words - which somehow identifies phoneme invariants across a wide range of voice types, including kids, male/female, local/foreign and social register(class)/accent.

Then it recognises emotional cues from intonation, which again are invariant across a huge range of sources.

And then finally it labels all of that as linguistic metadata, converts the phonemes into words, and parses the word combinations.

It's not until you try to listen to a foreign language that you hear the almost unprocessed audio for what it is. And even that still has elements of accent and intonation recognition.


Non-linguistic elements of verbal communication are so universal that nobody even really notices when fictional alien species in media communicate to human protagonists. It's noncontroversial to the audience that hostile/non-hostile, instruction, friendliness, cooperation, etc., are all embedded in the tones of all animals, robots, and even tree creatures throughout the universe.


This isn't surprising given that it's well known that we share more than 90% of our DNA with tree creatures across the universe


I’ve heard garbled words over a bad connection that I didn’t understand, only to have my brain parse them seconds later without intentional effort. It makes me wonder, is the language center processing the memorized version of sound here or is it reprocessing at the lower level?


This is basically an everyday experience for me, and not limited to telephone calls. My hearing is not great, and I’ll often ask someone to repeat something only to finally finish parsing it about a second after I asked because in the intervening time my brain reconstructed a signal from a bunch of noise.


I've had the same experience my entire life. That said, testing shows my hearing isn't great, but it's not horrible either. I even have an exceptional ability to identify actors solely by their voice when others can't identify them at all. I've often wondered if there is something else going on that runs deeper than just surface level "hearing". Like I'm hearing slower but more deeply.


Congratulations, you might have APD: https://en.wikipedia.org/wiki/Auditory_processing_disorder

I suspect I have it, too. I seem to experience sound differently than other people. Separating voices from background noise is HARD, but I'm also very good (I think) at recognizing actors by voice.


I don’t know if I’m hearing more deeply, but basically the same experience: failed every hearing test I ever took but I’m not deaf either. Just need to rely more on post-processing than others, and the best I’ve been able to figure is I didn’t get enough socialization at critical points of my early development, which is true, I just can’t prove that it’s also related to my hearing.


Fascinating... this happens to me a lot also, but I never really realized what I was experiencing until reading your comment. Later on I will feel guilty asking people to repeat things when I actually understood them. I never considered what you are saying, that I didn't yet have the information when I asked them, but later did.

I often wonder if my hearing is poor, or if I am just overly sensitive to the possibility of mis-hearing people, that I would rather err on the side of confusion. I overhear a lot of other people talking that I can tell misunderstand each other, but neither are aware, and I wouldn't want to do that.


And, if it's your native language, you can't help but process it.

This is why I like talking to 4-year-olds: they see the world as it truly is, and can communicate it back out. They don't have all the conditioned learning the rest of us have, but can see a clearer picture without bias.


>And then finally it labels all of that as linguistic metadata, converts the phonemes into words, and parses the word combinations.

How did the scientific studies show this?


You're asking for a summary of the entire fields of speech perception and psycholinguistics. A good place to start is the groundbreaking speech perception experiments done at Haskins Laboratories in the 1950s by people such as Alvin Liberman.


I remember in grade 8 my geography teacher noticing I wasn't paying attention and asking: "What did I just say?"

The sound of his last couple sentences were in my recent memory (timestamped?), but it seems I hadn't fully parsed them yet. When he asked, I took a brief moment to parse them, then to his disappointment I repeated them verbatim to him.

I'll never forget the way my brain processed delayed audio that day!


I wonder if this is part of why it's nearly impossible to form sentences when you hear an echo of your own voice on zoom or something, like your brain perceives it as a duplicate chunk that was mis-timestamped.


I do wonder myself if there is some research on this. It is irrationally annoying when I hear an echo of myself while talking, so much so that I just can't proceed; it would be nice to know why that is.


Look at research on stuttering. There are treatments for stuttering that involve replaying one's own voice back to stutterers with a delay.

https://en.wikipedia.org/wiki/Delayed_Auditory_Feedback


I wonder if Auditory Processing Disorder could be related to this mechanism.

https://en.wikipedia.org/wiki/Auditory_processing_disorder


"Speech consists of a continuously-varying acoustic signal. Yet human listeners experience it as sequences of discrete speech sounds, which are used to recognise discrete words. To examine how the human brain appropriately sequences the speech signal, we recorded two-hour magnetoencephalograms from 21 participants listening to short narratives. Our analyses show that the brain continuously encodes the three most recently heard speech sounds in parallel, and maintains this information long past its dissipation from the sensory input. Each speech sound representation evolves over time, jointly encoding both its phonetic features and the amount of time elapsed since onset. As a result, this dynamic neural pattern encodes both the relative order and phonetic content of the speech sequence. These representations are active earlier when phonemes are more predictable, and are sustained longer when lexical identity is uncertain. Our results show how phonetic sequences in natural speech are represented at the level of populations of neurons, providing insight into what intermediary representations exist between the sensory input and sub-lexical units. The flexibility in the dynamics of these representations paves the way for further understanding of how such sequences may be used to interface with higher order structure such as lexical identity."

Neural dynamics of phoneme sequences reveal position-invariant code for content and order | https://doi.org/10.1038/s41467-022-34326-1


This reminds me of the correlogram modeling of auditory perception where the cochlea uses autocorrelation to encode temporal information of a sound signal. A neat idea that helps describe a lot of time encoded auditory processes.

Short explanation: https://ccrma.stanford.edu/~malcolm/correlograms/index.html?... [stanford.edu]
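As a rough illustration of the underlying computation (my own Python sketch, using a made-up 220 Hz tone rather than anything from the linked page): autocorrelating a short frame of signal produces peaks at lags equal to the signal's period, and a correlogram is essentially that computation stacked over time and across frequency channels:

  import numpy as np

  fs = 16000                        # sample rate in Hz (assumed for the example)
  t = np.arange(0, 0.03, 1 / fs)    # a 30 ms frame
  x = np.sin(2 * np.pi * 220 * t)   # a 220 Hz tone standing in for voiced speech

  # Autocorrelation of the frame: peaks appear at multiples of the period.
  ac = np.correlate(x, x, mode='full')[len(x) - 1:]
  lag = np.argmax(ac[20:]) + 20     # skip the trivial zero-lag peak
  print(fs / lag)                   # ~220: the periodicity recovered from timing alone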


There was an excellent PBS show called The Brain with Dr David Eagleman which, in one episode, went into detail on the way the brain re-syncs what it sees with what it hears, essentially doing a tiny bit of time travelling to ensure that your world makes sense. It's well worth seeking out.


Is this why we turn down the car volume when looking for a house number? The resync process is taking too much bandwidth, or messing with our visual/temporal system, so reducing the audio input helps the system.


I couldn't find the show, but he has a related lecture on YouTube

https://www.youtube.com/watch?v=vv_e99qbJ4U


This was fascinating, thank you


Fantastic, thank you!


I experience something like audio dyslexia. At the phonetic level, jumbling syllables vs whole words.

Reading the other comments, seems like many others learned to cope by consciously buffering, allowing parsing and comprehension to catch up.

I'll often have multiple tentative interpretations. So then I intentionally replay the sounds I think I heard, as well as wait for more context (because a lot of speech and conversation is redundant), before settling on an answer.

I eventually trained myself to not habitually say "What?" I eventually figured out that just pisses people off.

It'll be hysterical if I've had something like ADHD all this time. Would explain so much.


It implies a single buffer, so there is a delay in cognition, which implies our consciousness is always lagging behind physical reality.


That seems so efficient, to do it all at once like that. The brain seems to be this massively parallel/buffered biological machine with an assembly pipeline to the consciousness center(s)? Like a mix of specific architecture layout (software 1.0) with a bunch of pattern matchers (software 2.0), maybe, but I'm a complete layman here obviously.


I wonder if our brains do the same timestamping when we listen to someone sing a song. Our brain database will not have enough examples of those exact words being sung in that musical way. I guess that's why we need a little more focus to understand the lyrics of a song vs someone talking to us.


Children, however, seem to prefer the musical version and find it easier to focus on and understand, especially around 1-2 yrs old.

Perhaps the tempo of music helps build and establish this timestamping ability, and is then no longer needed.


Reminds me of positional encoding in attention based networks.


I was thinking that too, but this description makes it seem more like a type of windowing rather than the position encoding in a transformer (which is fixed): "the information [...] gets passed between different neural populations in a predictable way, which serves to time-stamp each sound with its relative order.”
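A toy way to picture that windowed version (my own sketch, not the paper's actual model): keep only the few most recent speech sounds, each tagged with how long ago it started, so relative order is recoverable from elapsed time rather than from an absolute position:

  from collections import deque

  # Hypothetical illustration: a buffer of the ~3 most recent phonemes,
  # each carrying its content plus the time elapsed since its onset.
  buffer = deque(maxlen=3)

  def hear(phoneme, onset_ms, now_ms):
      buffer.append({"phoneme": phoneme, "onset": onset_ms})
      return [(p["phoneme"], now_ms - p["onset"]) for p in buffer]

  print(hear("k", 0, 80))       # [('k', 80)]
  print(hear("ae", 80, 160))    # [('k', 160), ('ae', 80)]
  print(hear("t", 160, 240))    # [('k', 240), ('ae', 160), ('t', 80)]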


This is why (unmentioned company) runs on timeseries.


I figured there had to be something like that.


> The researchers found that the brain processes speech using a buffer

This is not news to anyone with ADHD who regularly says "what?" before immediately figuring out what was already said and responding to it. It feels exactly like reprocessing an existing auditory buffer.


Is this specific to ADHD? I do this a lot, and I am ADD, but I thought this was a common thing.


Yep, ADHD, and I do the same.

It’s around a 5-second buffer.


[flagged]


It's sad to see how many people in tech circles blindly bought into the "just wear a mask" bullying without ever considering that there are consequences for others.

The fact my comment got voted down so quickly illustrates that.

Communication is complicated, and when you remove the ability for a neurodivergent person to effectively communicate, you really change their entire life.



