Values in interpreters written in C are frequently implemented as (manually) discriminated unions - i.e. unions whose members share a field at the start that indicates the type and layout of the remainder - because that's a handy way of implementing the polymorphism required for a straightforward interpreter. It's pretty much necessary to use structs inside unions in order to have more than one field per layout; the struct is just grouping, so it doesn't need a type name.
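To make that concrete, here's a minimal sketch of the pattern. All names (`value`, `TAG_INT`, `make_str`, and so on) are made up for illustration and are not taken from any real interpreter. It relies on the C rule that structs in a union which share a common initial sequence can have that sequence read through any of them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tags; a real interpreter would have many more. */
enum value_tag { TAG_INT, TAG_STR };

/* Every member starts with the same field, so the tag can be read
 * through `basic` no matter which layout is actually stored. */
union value {
    struct { enum value_tag tag; } basic;
    struct { enum value_tag tag; long i; } integer;
    /* Unnamed struct types: they only group the fields of one layout. */
    struct { enum value_tag tag; const char *ptr; size_t len; } string;
};

static union value make_int(long i)
{
    union value v;
    v.integer.tag = TAG_INT;
    v.integer.i = i;
    return v;
}

static union value make_str(const char *s)
{
    union value v;
    v.string.tag = TAG_STR;
    v.string.ptr = s;
    v.string.len = strlen(s);
    return v;
}

/* Manual dispatch on the discriminator: the "polymorphism" part. */
static size_t value_length(const union value *v)
{
    switch (v->basic.tag) {
    case TAG_INT:
        return 1;                /* arbitrary: an int counts as one item */
    case TAG_STR:
        return v->string.len;
    }
    assert(!"unknown tag");
    return 0;
}
```

Nothing checks that the tag and the layout agree, which is the "manually" part: the code that writes a value has to keep them in sync itself.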
So without looking at any of the MRI source, I'd be willing to guess that most, if not all, of its structures representing Ruby values start with a field of type RBasic, and that type contains the information necessary to distinguish and interpret the remainder of the value.
Yeah. I just looked at the source because I was confused about how it knew how long the embedded string was (since the length field is in the other half of the union and Ruby strings can have embedded \0 bytes). RBasic is a struct containing a VALUE referring to a class and a VALUE "flags" that tends to have a lot of bit fiddling done to it.
Apparently out of the non-reserved bits, one is used to tell whether the string is embedded or not and five more are combined to give the string's length. Makes sense!
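Roughly, the trick looks like the sketch below. The bit positions, the buffer size, and every name here (`STR_EMBED_FLAG`, `STR_LEN_MASK`, `struct str`, ...) are assumptions for illustration, not MRI's actual definitions; the only things carried over from the source are the ideas of one "embedded" bit and a five-bit length stored in the flags word.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bit layout, chosen only for this sketch. */
#define STR_EMBED_FLAG (1UL << 0)                /* set: bytes live inline */
#define STR_LEN_SHIFT  1
#define STR_LEN_MASK   (0x1FUL << STR_LEN_SHIFT) /* five bits: 0..31 */
#define STR_EMBED_MAX  24                        /* assumed inline capacity */

struct str {
    unsigned long flags;
    union {
        struct { size_t len; char *ptr; } heap;  /* length stored here... */
        char ary[STR_EMBED_MAX];                 /* ...or bytes stored here */
    } as;
};

/* The length comes from the flags when embedded, so strlen() is never
 * needed and embedded \0 bytes are fine. */
static size_t str_len(const struct str *s)
{
    if (s->flags & STR_EMBED_FLAG)
        return (s->flags & STR_LEN_MASK) >> STR_LEN_SHIFT;
    return s->as.heap.len;
}

static void str_set_embedded(struct str *s, const char *bytes, size_t len)
{
    assert(len <= STR_EMBED_MAX);
    memcpy(s->as.ary, bytes, len);
    /* Clear the old length bits, then set the flag and the new length. */
    s->flags = (s->flags & ~STR_LEN_MASK)
             | STR_EMBED_FLAG
             | ((unsigned long)len << STR_LEN_SHIFT);
}

static void str_set_heap(struct str *s, char *ptr, size_t len)
{
    s->as.heap.ptr = ptr;
    s->as.heap.len = len;
    s->flags &= ~(STR_EMBED_FLAG | STR_LEN_MASK);
}
```

Five bits cap the embedded length at 31, which is plenty as long as the inline buffer is no bigger than that.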