The sound of the voice is largely a result of things happening behind the mouth, in the throat. I think most of what people mean when they say "mouth noises" comes from the lips and cheeks.
I thought that only vowels were considered to be "throat" sounds, and that most consonants came from the mouth? I guess a lot of them come from the tongue and teeth (like "s" and "t"), but I'm not sure how I could make an "m" or a "p" sound without using my lips.
When you make a noise with the voicebox, that noise is what's heard. It's modulated by the lips and tongue, but the lips and tongue don't add noises of their own; they just shape it.
You have a point about "s" and "th", which are a kind of whistling between tongue and palate.