Yes, I do agree with you - live image programming has to be composable/comprehensible/reproducible, and crucial state shouldn't be in anonymous objects. (I've even been thinking of replacing mutable objects with pure functions modifying a tree of data.) Types are another direction, and the work on Strongtalk has proved influential for popular VMs.
But we don't need to go back from objects to files, except for the purpose of interacting with the OS. Richer structures actually help comprehensibility. For instance, revision control could operate at a structural level. UNIX would have been much nicer if something like nushell had been adopted from the beginning, and the 'little pieces' used to build the system worked on structured data.
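To make the 'little pieces on structured data' idea a bit more concrete, here is a rough sketch in Python (not nushell itself). The stage names `where`, `sort_by`, `select` and the `pipe` helper are made up for illustration, loosely modeled on nushell-style pipelines, and the file listing is invented sample data.

```python
from typing import Callable, Iterable, Iterator

# One structured row, e.g. {"name": "a.txt", "size": 120, "type": "file"}.
Record = dict
Stage = Callable[[Iterable[Record]], Iterator[Record]]


def where(pred: Callable[[Record], bool]) -> Stage:
    """Keep only the rows for which pred(row) is true."""
    def stage(rows: Iterable[Record]) -> Iterator[Record]:
        return (r for r in rows if pred(r))
    return stage


def sort_by(key: str, reverse: bool = False) -> Stage:
    """Order rows by one named field."""
    def stage(rows: Iterable[Record]) -> Iterator[Record]:
        return iter(sorted(rows, key=lambda r: r[key], reverse=reverse))
    return stage


def select(*keys: str) -> Stage:
    """Project each row down to the named fields."""
    def stage(rows: Iterable[Record]) -> Iterator[Record]:
        return ({k: r[k] for k in keys} for r in rows)
    return stage


def pipe(rows: Iterable[Record], *stages: Stage) -> list:
    """Feed rows through each stage in order, like a shell pipeline."""
    for s in stages:
        rows = s(rows)
    return list(rows)


# Invented sample data standing in for a directory listing.
files = [
    {"name": "notes.txt", "size": 120, "type": "file"},
    {"name": "src", "size": 4096, "type": "dir"},
    {"name": "image.png", "size": 20480, "type": "file"},
]

big_files = pipe(
    files,
    where(lambda r: r["type"] == "file"),
    sort_by("size", reverse=True),
    select("name", "size"),
)
print(big_files)
```

Each stage only ever sees named fields, so nothing has to re-parse columns of text the way an awk/cut pipeline would. That is roughly the shape of something like `ls | where type == file | sort-by size` in nushell.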
"Programs consist of modules. Modules provide the units to divide the functional and organizational responsibility within a program."
"An Overview of Modular Smalltalk"
https://dl.acm.org/doi/pdf/10.1145/62083.62095