The point is exactly that this is one token. It's a string literal with opening delimiter `"` and closing delimiter `{`, and that whole token itself serves as a kind of opening "brace". Alternatively, you can see `{` as a contraction of `" +`. Meaning, aside from the brace-balancing requirement, `"foo {` does the same as `"foo " +` would.
As yet another alternative, you could imagine a language that concatenates around string literals by default, similar to how C behaves for sequences of string literals. In C,
"foo" "bar" "baz"
is equivalent to
"foobarbaz"
Similarly, you could imagine a language where
"foo" some_variable "bar"
would perform implicit concatenation, without needing an explicit operator (as in `"foo" + x + "bar"`). And then people might write it without the inner whitespace, as:
"foo"some_variable"bar"
My point is that
"foo{some_variable}bar"
is really just that (plus a condition requiring balanced pairs of braces). You can also re-insert the spaces for emphasis:
"foo{ some_variable }bar"
The fact that people tend to think of `{some_variable}` as an entity is sort-of an illusion.
> How does this change how you highlight either?
You would highlight the `"...{`, `}...{`, and `}..."` parts like normal string literals (they just use curly braces instead of double quotes at one or both ends), and highlight the inner expressions the same as if they weren't surrounded by such literals.
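As a rough sketch of that token view (not taken from any real lexer), here is a toy splitter in Python. The name `split_template` and the simplified syntax are assumptions for illustration: double-quoted literals only, `{`/`}` as hole delimiters, no escape sequences, and no string literals nested inside the holes.

```python
def split_template(src: str) -> list[tuple[str, str]]:
    """Split one interpolated literal into string-part and code-part tokens.

    Toy syntax (an assumption of this sketch): double-quoted, `{`/`}`
    delimit holes, no escapes, no string literals inside the holes.
    """
    if len(src) < 2 or src[0] != '"' or src[-1] != '"':
        raise ValueError("expected a double-quoted literal")
    tokens = []
    start = 0  # index where the token currently being scanned begins
    depth = 0  # brace depth; 0 means we are inside a string part
    for i, ch in enumerate(src):
        if ch == "{":
            if depth == 0:
                # String part ends here; its closing delimiter is `{`.
                tokens.append(("str", src[start:i + 1]))
                start = i + 1
            depth += 1
        elif ch == "}":
            if depth == 0:
                raise ValueError("unbalanced '}'")
            depth -= 1
            if depth == 0:
                # Hole ends here; the next string part opens with `}`.
                tokens.append(("code", src[start:i]))
                start = i
    if depth != 0:
        raise ValueError("unbalanced '{'")
    tokens.append(("str", src[start:]))
    return tokens


for kind, text in split_template('"foo{ some_variable }bar"'):
    print(kind, repr(text))
```

The `str` tokens would get the string-literal color and the `code` tokens would be highlighted as ordinary expressions. The `depth` counter is the balancing requirement: it is the one piece of state that plain `+` concatenation would not need.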
Fair enough. The point, as you have acknowledged, is that unlike `+` you have to treat `{` specially for balancing (and separately from the `"`).
> The fact that people tend to think of `{some_variable}` as an entity is sort-of an illusion.
I guess. I just don’t know what being an illusion means formally. It’s not an illusion to the person that has to implement the state machine that balances the delimiters.
> You would highlight the `"...{`, `}...{`, and `}..."` parts like normal string literals (they just use curly braces instead of double quotes at one or both ends), and highlight the inner expressions the same as if they weren't surrounded by such literals
Emacs does it this way FWIW. But I’m not sure how important it is to dictate that the brace can’t be a different color.
In any event, I can agree your design is valid (Kotlin works this way), but I don’t necessarily agree it is any more valid than, say, how Python does it, where there can be format specifiers and implicit conversion to string is performed, neither of which you get with concatenation. I’m not seeing the clear, definitive advantage of interpolated strings being equivalent to concatenation vs. some other type of method call.
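To make the Python comparison concrete, a small sketch (the variable names are made up for illustration): the f-string converts values to text implicitly and accepts a format specifier, while `+` needs both spelled out and raises `TypeError` otherwise.

```python
n = 42
x = 3.14159

# Interpolation: implicit conversion to text, plus a format specifier.
interpolated = f"n={n}, x={x:.2f}"

# Concatenation: conversion and formatting have to be written explicitly.
concatenated = "n=" + str(n) + ", x=" + format(x, ".2f")

print(interpolated == concatenated)

# Without the explicit str(), concatenating str and int is an error.
try:
    "n=" + n
    raised = False
except TypeError:
    raised = True
print(raised)
```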
The other detail is order of evaluation or sequencing. String concatenation may behave differently. I'm not sure I agree it is wrong, because at the end of the day it is distinct-looking syntax. Illusion or not, it looks like a neatly enclosed expression, and concatenation looks like something else. That they might parse, evaluate, or behave differently isn't unreasonable.
That should probably not be one token.
> My view on this is that it shouldn’t be interpreted as code being embedded inside strings
I’m not sure exactly what you’re proposing and how it is different. You still can’t parse it as a regular lexical grammar.
How does this change how you highlight either?
Whatever you call it, to the lexer it is a special string: it has to know how to match it, and the delimiters are materially different from concatenation.
I might be being dense but I’m not sure what’s formally distinct.