The answer to this question is complicated. JavaScript's string encoding is roughly UTF-16, which uses 2-byte code units, but the code unit you read may be part of a surrogate pair, so depending on the first one, you must read the next 2 bytes to complete your character.
And of course, this basic explanation doesn’t really do justice to your question, because depending on your definition of what a “character” is, you may need to take ligatures etc. into account.
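A quick sketch of both points that you can paste into a console (the particular emoji and the combining accent are just illustrative picks):

```js
const s = '💩';                  // U+1F4A9, outside the Basic Multilingual Plane
console.log(s.length);           // 2 — two UTF-16 code units
console.log(s.charCodeAt(0));    // 55357 (0xD83D, high surrogate)
console.log(s.charCodeAt(1));    // 56489 (0xDCA9, low surrogate)
console.log(s.codePointAt(0));   // 128169 (0x1F4A9, the full code point)

// And "character" gets fuzzier still: a combining accent is its own code point.
console.log('e\u0301');          // renders as "é"
console.log('e\u0301'.length);   // 2
```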
The 2-byte part applies, if not the rest, and it’s not really “on disk” but rather “memory for the data type,” right?
Given the nature of the questions, I presumed they were interested in knowing “how does JavaScript load strings into memory, anyway?”
And to answer that question, your rough heuristic should be “2 bytes per character,” not 1, even for the ASCII range. That just leads to additional questions, though, because it seems odd at first.
In order to support all of Unicode, there’s a reserved set of values within those 2 bytes that lets the encoding be extended to reach the full Unicode range.
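Concretely, the reserved values are the surrogate range 0xD800–0xDFFF, and a pair of them combines into one code point via the standard UTF-16 arithmetic (a sketch of the decoding step, not any engine’s actual code):

```js
// Standard UTF-16 surrogate-pair decoding, sketched out.
function decodeSurrogatePair(high, low) {
  // Expects high in 0xD800–0xDBFF and low in 0xDC00–0xDFFF.
  return 0x10000 + (high - 0xD800) * 0x400 + (low - 0xDC00);
}

decodeSurrogatePair(0xD83D, 0xDCA9); // 0x1F4A9, i.e. '💩'.codePointAt(0)
```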
Back to the original measurement: for the string “hello world”, I believe a JavaScript `sizeof`, if it existed, would report 24 bytes (22 for the characters, and 2, give or take, for either a NULL terminator or a length header).
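You can get at that 22-byte character-data figure empirically, e.g. in Node (using the Buffer API purely as a convenient byte counter; it says nothing about the engine’s internal layout):

```js
// Bytes of the encoded character data only — no engine metadata included.
Buffer.byteLength('hello world', 'utf16le'); // 22 — 11 code units × 2 bytes
'hello world'.length * 2;                    // 22 — the same rough heuristic
```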
The thread was originally about CRA vs Vite size on disk (or, implicitly, if we're applying it to real-world applications, the network cost in CI job startup times). And like I said, surrogate pairs don't apply to ASCII.
See this[0] for reference. Note how the first code unit must fall within a certain range in order to signal a surrogate pair. This range quite deliberately falls outside the ASCII range. JS parsers take advantage of this to parse ASCII substrings faster by special-casing that range, since checking for a valid character across the entire Unicode range is quite a bit more expensive[1].
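The check itself is just a range comparison on the 16-bit code unit, which is why the ASCII fast path is cheap (a sketch of the idea, not how any particular parser implements it):

```js
// Code units 0xD800–0xDBFF are high (leading) surrogates; ASCII is 0x00–0x7F,
// so the two ranges can never overlap.
const isHighSurrogate = (u) => u >= 0xD800 && u <= 0xDBFF;
const isAscii = (u) => u <= 0x7F;

isHighSurrogate('💩'.charCodeAt(0)); // true
isAscii('h'.charCodeAt(0));          // true
```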
IMHO nitpicking about memory consumption of the underlying data structure is a bit meaningless, since the spec doesn't actually enforce any guarantees about memory layout. An implementation can take more memory for a pointer to the prototype, a cached hash code or length, etc., and there are also considerations such as whether the underlying data structure is polymorphic or monomorphic due to the JIT, whether the string is boxed or unboxed, whether it's implemented in terms of C strings vs. slices, etc.
Regardless, it doesn't change the fact that the octet sequence "hello world" takes 11 bytes in ASCII/UTF-8 encoding (disregarding implementation metadata).
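Which is easy to verify (TextEncoder always encodes to UTF-8 and is available in browsers and modern Node):

```js
new TextEncoder().encode('hello world').length; // 11 — one byte per ASCII character
```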