Brain Oriented Programming (tobeva.com)
111 points by pbw on Aug 15, 2020 | 73 comments



This is a nice story, but the reality is that popular languages and libraries tend to expose API surfaces that are broad and shallow, not ones that are narrow and deep.

For example, a Python string object has something like 40 methods. I counted about 85 instance methods on org.joda.time.DateTime from Java.[1] Turning to JavaScript, every method in Underscore.js lives in a single gigantic namespace.[2]

Experience suggests two related factors that favor the proliferation of big, shallow APIs.

Modern IDEs offer rich auto-completion with pop-up docs. This makes it much easier to look at the available list of fields and methods and choose the one that's appropriate.

A second, related reason is that IDEs generally don't have tooling that is nearly as rich to help you identify the chain of calls you need to make to obtain an instance of some type that has the method you want. Do you get a Connection from a ConnectionContext, a ConnectionContextManager, or a ConnectionContextManagerCache? And how do you get one of those?

Shallow APIs and big namespaces tend to favor easy lookup. Deep APIs where every type only has a handful of methods make you memorize them.

[1] http://joda-time.sourceforge.net/apidocs/org/joda/time/DateT...

[2] https://underscorejs.org/


I agree, one of OOP's problems is conflating code organization with data structures and access scopes. This is especially true when each class must also be its own file.

In FP, in contrast, you might have one namespace that contains your entire set of functions. Each function is still just a mapping from its inputs to its outputs. So when you read a function, there is nothing else you need to think about, just the function itself, which most likely fits in only 7+-2 lines of code.

Now, if you find that the functions need some logical grouping to help you find them, or if you need to split things into more than one file to share code with other projects or to let multiple people work on it more easily, you can do that as well and move some functions out to other namespaces. But doing so only affects your import statements, not the structure of your data or anything else of that sort.

Similarly, in FP, things are immutable for the most part, which means you don't have to think about all the other parts of the code that might modify the value of your variable, or list, or map, etc.
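For example (a minimal Python sketch of the idea; the names are invented for illustration), a pure function returns a new value instead of mutating its argument, so no other code can be surprised by a change:

    # hypothetical pure function: builds a new tuple instead of
    # mutating the argument, so callers keep their original value
    def add_score(scores, new_score):
        return scores + (new_score,)

    scores = (90, 85)
    updated = add_score(scores, 77)   # (90, 85, 77)
    # scores is still (90, 85); nothing else in the program changed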

Another issue with OOP is touched on in the article as well: the OOP structure doesn't map onto everything you'll need to work with. That's where the whole ORM issue comes in. Tables, for example, are really hard to represent as objects, which in turn makes working with them and manipulating them more difficult than it needs to be.


i have found larger real world fp codebases to have essentially the same organizational problems as OOP codebases, just expressed in different ways.

most of the problem wrt code organization has to do with communicating meaning to humans. fp codebases written by people with a weak and evolving understanding of their problem domain are prone to the same sort of spaghettish expression of intent as OOP codebases in similar spaces, albeit often less verbosely by LOC.

the assumption that all you have to think about is how the function you are currently looking at transforms its inputs to its outputs is often incorrect, or only true in small or idealized cases.

this isn't a diss on FP styles as a whole, i prefer them, but it is not in my experience a great model for everything nor do i find it to significantly aid clarity of purpose in many cases. it is well suited to a lot of shapes of problems that currently pop up often, particularly stream transformation / aggregation types of things.

edit: for clarity, the thing that happened in this industry where "OOP" became synonymous with "how you write code / how you model problems" was a terrible mistake, and the influence of fp mindsets has on the whole been a positive, especially considering that context. anything that disconnects serious thinking about how what you're writing is understood by humans and run by machines, and how that intersects with your domain, is imo a mistake. however, this is a bit idealistic of me.


One of the reasons Linus cites for disliking C++ is that he dislikes C++ programmers. He considers the average C++ programmer to be of much lower quality than the average C programmer. And I don't really doubt him.

I think the same can be said for FP vs. OOP. People really into FP have passed through a filter. If you take a very mediocre OOP programmer and track them over the next 5 years, they are unlikely to get into FP. But if you take a very bright, motivated, curious OOP programmer who likes to test limits and explore, they are a prime candidate for drifting into FP. So I think it's obvious the caliber of FP programmers is higher on average than that of C++ or Java programmers.

As you say, though, it's possible to write bad code in any language using any programming paradigm. Not just possible, it's trivial! So I don't think FP magically solves all your problems. I think FP has some great properties and does 100% avoid some specific pitfalls that plague OOP, but it's not magic.

Great FP programmers can write amazing programs, but great OO programmers can as well. Just like you can write great (or horrible) literature in English, French, German, or Chinese.


While I recognize what you're saying here wrt types of programmers and what they're likely to get into, I am not convinced that this doesn't have more to do with the relative popularity of the terms, plus the tendency of people who are "into" a subject to explore it and be open to different ways of thinking about it.

neither fp nor oop are well defined terms in actual usage; they're more smears of approaches really. moods, almost. We have an awful lot of this sort of thing, and often a term will get smeared out into a kind of emotional suggestion plus some things that help you avoid types of bugs or accomplish a specific sort of goal.

I don't like this much but for a lot of situations it just means you have to understand things well enough to talk about them without using their name or take the time to establish contexts of meaning w/whoever you are talking to, both of which are good things to have in programmers in general.

That is to say: I think you are correct that they have passed through a filter, particularly if they came from an oop background, but I suspect that filter isn't much related to fp vs oop principles as a matter of comparison.

edit: just noticed you're the author of the article. here's a thought: the reason you want to avoid mutability can be framed in terms of it causing the number of things you have to hold in your head to very quickly exceed nine, often in ways you may not be aware of unless you have pretty comprehensive knowledge about the runtime characteristics of your entire system's stack.


I disagree that FP and OOP aren't sufficiently distinguished from each other. There is substantial academic literature on both methodologies, and they are very different worlds in theory and in practice.

I do agree, though, that it's possible and commonplace to do a blend of both, plus structured programming, plus just general hackery. I've often seen OO systems with a big pile of pure functions, particularly math functions, because math mostly is just functions. And it's quite nice! You don't have to worry about the interactions; the advantages of pure functions are quite clear.

I agree with you that mutability can explode the amount of state you need to be aware of. Milewski just tweeted in response to the article "The problem with OO is that, even if you expose 7 things on the surface, there are hidden dependencies that you cannot ignore: object A may store a pointer to B. You can write complex functional programs, but the syntax forces ugliness to the surface for all to see."

https://twitter.com/BartoszMilewski/status/12946650880543784...
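A tiny sketch of the hidden dependency he means (the A and B classes are invented for illustration):

    class B:
        def __init__(self):
            self.value = 0

    class A:
        def __init__(self, b):
            self.b = b        # hidden dependency: A stores a pointer to B

    b = B()
    a = A(b)
    b.value = 42              # mutating b silently changes what a sees
    print(a.b.value)          # 42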

However, that's the nice thing about small objects: they better contain the mutability chain reaction. A small inner object cannot see anything else, whereas in a big kitchen-sink object every attribute and every method is in play.


not saying they aren't distinguished. am saying that a huge amount of the discussion around them in the workforce is done by people who couldn't define them for you, and that a lot of everything in our field is driven by people who don't understand things copying the shape of them and repeating stuff they heard from people who sound like they do. in general, these aren't people who are going to cross paradigms as a matter of honing their craft, and the ones that are tend towards being good programmers in the end.

i'd further say that that mass lack of understanding is closer to the reason for there being such a mess in many oop systems, in combination with economic pressures, despite common OOP practices not being good for lots of things they got used for.

to quote you elsewhere in the thread:

> The reason bad OO is so common is some codebases become economically valuable right around when they start falling apart due to poor design, but you can hire more and more people if the software is bringing in more and more money.

this is a good insight, but i assure you this happens in fp codebases written by inexperienced programmers, or by poorly managed inexperienced businesses who hired someone and burned them out, in exactly the same way, for the same reasons, and those fp codebases are just as bad, especially now that fp is an "attract the 10xers" item for companies. (would you rather figure out where the thing mutating the state under your nose is, or where the hidden assumption that this state will mutate is? i'd prefer doing neither)

Milewski is correct, with the caveat that people write the ugliness and build and maintain upon it, producing systems that themselves have tons of spooky action at a distance and get mitigated by bugpatch burn marches just the same.


I guess I’ve lost the plot here. Sturgeon’s Law says 90% of everything is crap. Half of programmers and their programs are below average. The article is just saying that objects with lots of attributes can be problematic and so we should avoid doing that. I stand by that suggestion.


oh yeah i absolutely agree with you there honestly i just got too in the weeds here, sorry


Simplified and abstracted: great programmers think things through. They don't just figure out how to make things work. They think about the tools themselves (in this case, the programming language) and consider whether certain patterns in a PL produce the best outcome, either in semantics or in performance.

I think what Linus saw was the pattern where bad programmers tend to intersect with the demographic of c++ and java

It is an unfortunate generalization, since I have one person in mind who is very mature in software development and spends a lot of time in c++ due to job and experience circumstances.

But since it's Linus, it might be a valid generalization for us to watch out for, regardless of whether it is accurate.


I wonder if the "filter" is because OOP is the default way of teaching programming at the moment. If we changed to FP being the default, then only the great programmers would become OOP programmers, and the mediocre ones would be stuck in FP.


i think so, largely yes. large industry immersion for a long time has (still) led, in my view, to teaching people wage work skills as opposed to, say, trade or craft skills, if that makes sense.

i don't think they'd necessarily become OOP programmers if fp was the default, but i do think we would see a similar trend of fp becoming a distasteful term. all approaches have edge cases and pain points, all those edge cases and pain points get amplified very loudly when they're not accounted for. a lot of fp requires at least as much subtlety in this regard as anything else.

within the framings of software development, i think it is the maintaining of rotting systems that ultimately causes distaste. it is not hard to imagine someone only ever running into, eg, oop systems that are a mess because that's basically all there was for many companies building many things under burnout schedules with no one with more than 5 years of experience working on them.

the problems that fp easily makes go away, the big wins, particularly things like mutability, just happen to fit very well in the problem spaces a lot of software in "the discourse" is written for. if and when those spaces change, so will favored approaches, whatever they end up being.

all that said, i do not think that everyone writing software needs to be a mega computing wizard. the industry is very broad in terms of what types of skillsets are wanted or needed where. i don't think we support this idea very well as a whole.


> I don't think they'd necessarily become OOP programmers if fp was the default, but i do think we would see a similar trend of fp becoming a distasteful term.

Inevitable! You only have to see how the abuse of GOTO has left traumatized memories and very dogmatic contrary reactions, even though, used well, it does no harm.

> because that's basically all there was for many companies building many things under burnout schedules with no one with more than 5 years of experience working on them.

Not only. There are people who will never learn, even after a 15-year career. It doesn't interest them, or they don't have strong enough fundamentals and will never catch up.


That's a different way to approach the challenge though.

You're hoping that some technique or tool can be so rigid and foolproof that no one can write a bad program in it.

Whereas I'm hoping to find some techniques and tools that let me write programs with better designs.

Nothing fulfills the former, even though lots of people are researching it. On the other hand, I'd say FP fulfills the latter.

So the questions for me become about the spread of quality. Is the worst FP program worse or better than the worst OO program? And what about the best?

I'd personally say the worst of FP is better than the worst of OO, and similarly, the best of FP is better than the best of OO. That makes me prefer it overall as my paradigm of choice.

This does mean you will find FP code bases that are worse than some OO code bases. But if you wanted to pick a paradigm in order to write a well designed program, FP would probably be a better choice. Obviously that's just my opinion, and it's why I personally pick FP over OO.


I love flat APIs with many methods or functions. My article only talks about objects and attributes, not methods or APIs. NumPy's ndarray object has around 15 attributes but over 50 methods. Imagine the reverse, if it had 50 attributes? Shudder.

There's a huge difference between attributes and methods. As the article says attributes are just global variables in the structured program that is the object. Every attribute that you add increases the number of "globals" visible to every method, present and future.

Consider if all your attributes were just booleans. If you have 4 attributes your object can be in one of 16 states. That means every method in theory has to account for each of those 16 states. If instead your object has 10 boolean attributes the object can now be in one of 1024 states. Every method, present and future, needs to work correctly no matter which of the 1024 states the object is in.
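To make that concrete, here's a quick back-of-the-envelope sketch in Python (the attribute names are hypothetical):

    from itertools import product

    # four hypothetical boolean attributes on a single object
    flags = ["is_open", "is_dirty", "is_cached", "is_shared"]

    # every combination is a distinct state every method must tolerate
    states = list(product([False, True], repeat=len(flags)))
    print(len(states))   # 16; with 10 flags it would be 2**10 = 1024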

This is why Functional Programming usually prohibits mutable state. 1024 states is insane enough, but an object which bounces around to different points in that state space over time? That's the stuff of nightmares. Many (most?) big OO programs grow over time into that exact nightmare scenario. Many people hate OO precisely because they've spent years shoveling in the illegal sapphire mine that is a large, badly written OO codebase.

The reason bad OO is so common is some codebases become economically valuable right around when they start falling apart due to poor design, but you can hire more and more people if the software is bringing in more and more money.

Adding an attribute to an object increases the complexity of every method in the object because, as the article says, an attribute is just a global variable to the object. But adding a method is fairly harmless; it does not increase the object's complexity very much at all. Flat APIs are often nicer than nested ones. In fact it's super common to have a deeply nested object tree but then expose a single flat API for external consumption. That's the best of both worlds.
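As a rough sketch of that shape (Server and Connection here are made-up names), the state lives in small sub-objects while consumers see one flat surface:

    class _Connection:                      # small internal sub-object
        def __init__(self, address):
            self.address = address
            self.alive = True

        def drop(self):
            self.alive = False

    class Server:                           # flat API for consumers
        def __init__(self, address):
            self._connection = _Connection(address)

        def disconnect(self):               # delegates into the tree
            self._connection.drop()

        def is_connected(self):
            return self._connection.alive

Callers only ever touch Server's handful of methods; the nested objects stay an implementation detail.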


Shallow or flattened structures are always what I default to.

The example given does look better, but it's far more complicated to actually use.

Focus on flat, denormalized structures as your default; it's simpler and more discoverable.

At some point it does get unwieldy, and then it is worth finding a seam and refactoring out a chunk of logic.

The difference is you're using the code's modularity needs to guide what gets defined where, not some arbitrary idea of what looks neat and organised, because what looks neat and organised is usually not very flexible.


The simplest flexible grouping of data is just a struct. The default approach would be to append members to that struct, and it would in most cases remain backwards-compatible.
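In Python terms, a minimal sketch of that default (the field names are hypothetical): appending a member with a default value keeps every existing call site working.

    from dataclasses import dataclass

    @dataclass
    class Event:
        name: str
        start_ns: int
        end_ns: int
        thread_id: int = 0    # appended later with a default, so
                              # older constructor calls remain valid

    old_style = Event("parse", 10, 20)       # still works unchanged
    new_style = Event("parse", 10, 20, 42)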

It is when you see the need to structure the members, or when subgroups of them are used in very different contexts, that one needs to refactor. This is precisely the right time and not any sooner, because now you know why you need to do all that extra analysis and work!

With OO it gets a little more complicated. If you use OO without needing the features of OO, you could, and maybe should, simplify by just using the struct. If you need to tie actions to data, you also know which use cases this will represent in what contexts, and what to utilize the encapsulation or polymorphism for, i.e. using OO to address higher-level concerns than data and action alone. The added complexity may require its own additional meta-logic to do that, where pure data would've been handled more directly, but without the higher-level interface.

For DDD one could solve most of this on paper before any code is written, because knowing the domain, the design can be made explicit upfront rather than grown "organically" while trying to weed the results later.

The approach really depends on the problem space and the experience of the developer. With experience, there will always be more ideas for improvement, even if we go backwards sometimes.


I suspect OP is confusing number of attributes with number of abstraction layers.

Attributes are not a problem as long as they're listed with intent and not just thrown into a bucket, and there's some rationale for their existence.

It's much harder to trace flow down and up through multiple abstraction and encapsulation layers. And it's harder still to remember the layers for different classes/applications. And it's even harder to do this when all the class names are related and hard to tell apart.


Hierarchy and abstractions have costs, no doubt. I didn't get into that in the article, but the article was only 1500 words. I can imagine a long "yes, but..." section that explores the risks and downsides of small objects. It's possible to write total crap software using only small objects for sure. I don't advocate doing that. That's where the Bonsai metaphor comes in, that's as much about shaping and pruning as growing. Uncontrolled growth of small objects has a name: cancer.


The article talks about object properties, not methods. The problem described is that each property is like a global variable that all methods share. So the string object having 40 methods is not a counter example. As far as I can tell from the docs, Python strings have exactly one property: the sequence of characters that make up the string.

And the api surface isn't strictly shallow either. Strings rely heavily on the Format class, right?

Happy to be corrected.


I wrote the article and you nailed it. In another comment I mention NumPy's ndarray has 15 attributes but 50 methods. Imagine the reverse, an object with 50 attributes? Even if the attributes were just booleans, that object could be in one of 2^50 different states! That's over a quadrillion states.

Yes, if attributes are global variables for the structured program that is the object, then the complexity of that object is very sensitive to the number of attributes, but adding methods is mostly harmless. New methods are just new useful things you can do with that object; adding more methods does not degrade the existing methods, except maybe in terms of discoverability and documentation. An object with 1000 methods would be a challenge to learn.


Maybe instead of computer- or human-oriented programming, we need more human-oriented tools. Let there be lots of autocomplete and documentation.


I think tools are great. I don't get into it in the article, but with both tools and humans you can in fact make sense of complicated structures even if there are dozens of attributes per object. The thing is, it's hard, and that work is throw-away.

If you spend three weeks studying a complex piece of code until it makes sense to you, unless you modify the code with your newfound understanding, the next person has to climb up that same ladder.

So even if the tools are great, I think in many cases it's good to push that understanding back into the artifact, so the artifact improves. A counter-example might be legacy software where it's just too risky to change anything. Then maybe you just push your understanding into really good documentation, or a second parallel implementation.


Yes. Usually there is just a list of names when you initiate autocompletion. But exploration could be easier if, instead of that list (or maybe as a separate mode of completion, like <Dot><F1>), a panel popped up with a well-drawn cheat sheet from which you could navigate and read synopses easily. Most IDEs still do not even draw a separator at superclass boundaries.


Flat APIs are not affected by the 7+-2 limitation. Their entries do not depend on each other, so we don't need to think about all of them at the same time.


I agree. If attributes are global variables to the structured program that is the object, then methods are just functions, and they are far less "dangerous". If you have a modest number of globals, like 3, you can have hundreds of different functions without much fuss. The only issue with tons of functions is really documentation and discoverability: teaching people everything you can do with the object.

It's common to have a nested tree of objects to control state and keep the implementation clean, but then expose a flat API on top of all that. While it's more code, sometimes it's the best of both worlds. In fact, creating objects with many attributes is kind of walking down the primrose path: you think you're making the API easier to use, but you are making the implementation into a quagmire that you may never pull out of.


> Turning to JavaScript, every method in Underscore.js lives in a single gigantic namespace.

This is really not a great example. Code written with heavy use of Underscore/Lodash is notoriously difficult to understand and maintain, because it is difficult to keep a mental model of all the pieces of the library and how they interact.

People mainly use these libraries when they come from a language with an extensive standard library, and want something familiar in JS. Experienced JS developers virtually never use them, for the above reason - it's bad for maintainability.

> This is a nice story, but the reality is that popular languages and libraries tend to expose API surfaces that are broad and shallow, not ones that are narrow and deep.

The unspoken assumption you're making here is that "what is popular, must also be the optimal thing to do". There's absolute mountains of evidence (eg. the tech hype cycle and ingrained cultural problems around NIHing) that suggest that this really isn't true.

Indeed, grabbing for broad and monolithic tools seems to be primarily a cultural thing, and people commonly do it because people commonly do it. It's ingrained culture.

This is especially painfully obvious in JS-land, which has all the tooling that's necessary to work with smaller, easier-to-reason-about, modular tools - but where I'm still spending half of my days in #Node.js trying to explain to people why it's a bad idea to try and replicate the Python standard library in JS. Yet those who take the advice seriously, consistently report that it has improved their development and maintenance process.

Likewise, dependencies in JS are a popular thing to complain about, but virtually all of the complaints are based on assumptions from how other languages work (where they would hold true), but that do not hold true for JS - and people rarely bother to try and learn more about how it actually works in JS. It's ingrained culture, not optimization.

In short, I don't think there's any merit to the claim that, paraphrased, "if everyone is doing it this way, it must be the better way". The referenced article makes a pretty good rational case, and it's a little odd to dismiss it purely based on "that's not what software looks like".

> Modern IDEs offer rich auto-completion with pop-up docs. This makes it much easier to look at the available list of fields and methods and choose the one that's appropriate.

Except it doesn't, because autocompletion is based on words, not on concepts. You still need to know what something is called, at least approximately, to get anything useful out of autocompletion. Otherwise it's just a big list of words. The reduction of concepts to smaller-scoped ones is precisely what helps there!

> Shallow APIs and big namespaces tend to favor easy lookup. Deep APIs where every type only has a handful of methods make you memorize them.

No, not really - it's actually the other way around.

Lookup is actually much easier in smaller-scoped APIs, because you can do a 'refined search' - instead of having to find-by-word the thing you're looking for in a huge list, you can start by selecting the appropriate API subset from a much smaller list, and then drill down into specific methods and such.

Big-scope APIs, on the other hand, require memorization; if you don't remember what a certain concept is called, you have no hope of ever finding it, short of a detour through some StackOverflow posts.


  (start_ns, end_ns) -> Span
  (process_id, thread_id) -> Origin
Nope x10. Before that transformation I got what the structure does instantly, but after it I have no idea what Span and Origin exactly are. You cannot make things simpler by adding nomenclature. Remember that "show me your code" vs "show me your data" thing? This hides your data, and now readers have to go to the definitions and remember 2 more references to get the whole picture. It may look nice on a type diagram, where that's obvious, but not in code.

But that shouldn't defeat the article's meaning. I think that the same rule can be applied to groups of related attributes instead. When you have 5-7 groups per structure, it is okay, but when these groups exceed that limit, that's a signal for rethinking your god object.

Group things:

  self.name = name
  self.category = category
  self.args = kwargs
  self.phase = phase

  self.start_ns = start_ns
  self.end_ns = end_ns
  
  self.process_id = process_id
  self.thread_id = thread_id
And they'll be much easier to grasp. Your brain will do that span/origin grouping in the background, in terms that it definitely knows but for which it may not even have a vocal association. It may not be able to remember 10 different things, but it will remember that thread_id is coupled with a process_id, because you already have that knowledge in long-term memory.


I love that. Chunking by textual grouping is great. Sometimes that is totally enough. One really nice thing about sub-objects though is you can pass them to and from functions like:

    info = get_owner_info(event.owner)
    info.record_duration(event.span)
I usually like that better than:

    info = get_owner_info(event.process_id, event.thread_id)
    info.record_duration(event.start_ns, event.end_ns)
If several attributes are passed around together through several functions, I'm much more likely to consider making a sub-object. It's 100% true that sub-objects and hierarchy have costs of their own. Nothing is free. If the sub-objects are not pulling their conceptual weight they should probably be removed. I used the Bonsai metaphor because those trees require constant trimming and shaping. There's a name for unchecked growth, budding sub-objects willy-nilly: cancer.

The bigger thing though is pushing functionality down into the sub-objects. For example if my server has these three attributes:

    self.connection_address
    self.connection_start_time
    self.connection_status
I'd be highly likely to create a Connection sub-object because I'd feel very confident that I'd want it to sprout methods like:

    self.connection.is_alive()
    self.connection.drop()
    self.connection.get_duration()
    self.connection.stats.get_total_bytes()
    self.connection.stats.get_mbits()
My article only talks about objects and attributes and doesn't say too much about methods. I think having lots of methods on an object is often fine. NumPy's ndarray object has around 15 attributes but over 50 methods. I don't have a problem with that, especially for such a key object.

There's a big difference between attributes and methods. Adding an attribute to an object increases the complexity of every other method in the object, and of every future method yet to be added. As the article says, an attribute is just a global variable to the object, and adding even a few globals can tip the scales to where the object is a confusing mess. But adding a method is fairly harmless; it doesn't really make the other methods more complex.


>Chunking by textual grouping is great. Sometimes that is totally enough

Agreed, but I much prefer object-first vs. verb-first:

owner_info_get(...)

Having 600 methods that start with "get" and 450 that start with "set" doesn't help docs or autocomplete. I remember the Win32 API books were organized alphabetically, and it would have been comical if it weren't so useless.


'args' being one of the arguments rings alarm bells beyond anything written in the article.

Python libraries often make it unclear which arguments are possible. You have to hope it's in the docs or go digging around in the code.

It's extremely common and a horrible part of the python community.

Why is this standard?


I totally agree, but in this case the args precisely are key/value pairs that appear in the chrome://tracing GUI under the heading “Args”! So it actually works pretty nicely. But overuse and misuse of args and kwargs is a huge problem in Python, I agree.


Uh... Poorly composed bags of stateful attributes that are globally shared all over the place don't become OK, or easy to understand or work with, just because each of them has seven such attributes or fewer.


The Bonsai metaphor in the article agrees with your point, you don’t want poorly composed bags shared all over the place. You want a carefully and continuously groomed tree of objects that’s well designed and well cared for.


The biggest trap of software development is that it’s easy, trivial in fact, to write software that you yourself cannot understand, and in turn no one else can understand.

An interesting counterpoint: https://www.linusakesson.net/programming/kernighans-lever/in...

I've never had problems with too many fields or even global variables. A recent "weekend project" I did of medium complexity (video decoder) has around 50 globals and zero actual "objects". That might be a bit atypical since it's mostly a set of nested loops and not as "branchy" as usual "business code"; but on the contrary, I've had problems with the ultra-deep callstacks and massive indirection that this sort of "micro-structuring" tends to produce. The micro-complexity goes down, but the macro-complexity goes up and it becomes harder to see the state of the whole system at once, which is extremely important for debugging. Following data as it flows between multiple objects and jumps around between functions is much more difficult than scrolling slowly through a long section of mostly straight-line code.

I wonder if being able to see more of your program at once, as APL programmers often espouse, is really the key to effective programming. You may only be able to hold "between 5 and 9 things in your brain at any one time", but to be able to see the other 40+ with a single glance at the code seems much more important to me. To use my example of a video decoder above, the existing implementations I could find were all multiple files and multiple functions in each file, and to study how they worked was rather tedious. When I wrote my version in a single file with very few functions and objects, it seemed much simpler, and my sense of understanding increased. It was a similar feeling of epiphany as when I first saw https://news.ycombinator.com/item?id=8558822


Agreed. When I program on larger codebases these days, this is my number one desire: I wish I could see the structure of the entire program at a glance, and seamlessly navigate it. Unfortunately, there are no tools for that (that I know of). In the absence of such tooling, each extra layer of indirection makes the program fractally harder to understand and follow.

The same applies to runtime. For now, the best trick I have is building the codebase, then launching the program in a debugger, setting breakpoints around areas of interest, and then stepping it, exploring the call stack and the flow of data. But debugger UIs aren't exactly friendly when you're exploring, and not actually debugging.

(Sometimes, but very rarely, I can get away with running a tracer on a program. E.g. for Common Lisp, I wrote myself a small tracing profiler[0], and half the time I use it to explore the control flow rather than chasing performance issues. Fortunately, "time travelling debuggers" are getting more popular as a concept, so maybe good tooling will become more available in the future.)

--

[0] - https://github.com/TeMPOraL/tracer


It would be great if it were that simple. The advice from the article ends up punting the complexity into a hierarchy of objects/types. Which would be fine if you never needed to be in several places of that hierarchy at once. Except in practice, you almost always do.

I've worked on a large C++ codebase that went with the "lots of small objects, strong typing everywhere" approach. Classes rarely had more than 5 fields or 5 methods. Data was aggressively transformed and narrowed so that, within any given function, you had in your hands a bunch of objects that contained only the things you needed.

The experience was me slowly going insane from the sheer amount of jumping around files and the need to keep in my head 30 different strongly-typed aliases to std::string.

I find myself leaning towards a belief that there is such thing as excessive subdivision, excessive separation. If you treat 7 +/- 2 as the size of your brain's L1 cache, then what you see in front of you is an L2 cache. Constantly swapping things between L1 and L2 is expensive, but nowhere near as expensive as between L1 and RAM (all the other files in your project you aren't looking at this very second).

--

Not to mention, objects don't communicate lifetime hierarchy well. If you split your Foo class into Foo, Bar, Baz and Quux, just to keep the size of the interface below 7, then when I see a Bar and Quux appearing somewhere in the code, it's not obvious to me that these are part of the same thing and always used together. Perhaps in a perfect world I shouldn't care, but most of the time I really do. This is probably a more general problem of OOP - structuring your program into a large graph of small objects, in conditions when you have near-zero visibility into how the entire graph looks, and you're forced to deal with only small pieces of it.

That picture in the article labeled "Complexity grows without bound"? That's exactly how OOP feels to me when you follow the proper OOP practices.

--

Edit: 'TheOtherHobbes expressed it perfectly here: https://news.ycombinator.com/item?id=24167120. Attributes are not the problem. But jumping up and down the ladder of abstraction, breaking in and out of encapsulation layers - that's what's mentally taxing.


This is where data-oriented design really shines, without even considering the excellent performance. It affords a purity in design very similar to what's achievable with functional programming. Granted, what you describe very much sounds data-oriented, with the exception of a high degree of coupling, and that would squarely place it in OOP territory.

>The experience was me slowly going insane from the sheer amount of jumping around files and the need to keep in my head 30 different strongly-typed aliases to std::string.

As for that problem, I've been meaning to try this person's approach ever since reading it, since I'm pretty sure they're a genius.[0]

Likewise, doing a project in a small number of giant files can be beautiful. Sokol exemplifies this.[1]

[0] https://news.ycombinator.com/item?id=23800729

[1] https://github.com/floooh/sokol/blob/master/sokol_gfx.h


I don't advocate punting the complexity into a hierarchy of objects/types. Limiting yourself to seven attributes per object is not a panacea; you can create an utter shite design with small objects. We've all seen that.

Let's turn it around. Do you feel that the more attributes in a class the better? So 10 is better than 5? And 20 is better than 10? I suspect not. I suspect you also feel there is a balance. I'm just saying that seven is an interesting number in the balance based on how our brains work.

You want to make sure every additional object is pulling its conceptual weight. You want to make sure you minimize the accidental complexity while accommodating the necessary complexity.

Balance in all things. Neither OOP nor FP is a cure-all where, if you just do those things, you can turn off your brain and go on auto-pilot. Brain Oriented Programming is the same way: it's a guiding principle, something to keep in mind, but it's not so profound that it wipes out the need for good design judgment, engineering skills, or common sense.


> If you treat 7 +/- 2 as the size of your brain's L1 cache, then what you see in front of you is an L2 cache. Constantly swapping things between L1 and L2 is expensive, but nowhere near as expensive as between L1 and RAM (all the other files in your project you aren't looking at this very second).

As soon as a file grows beyond a hundred lines or so, most of it is off-screen, so it's not that different from putting it into another file (it's no longer in the L2 cache, to use your analogy).

It's a choice between jumping around in a single big file, or jumping around between several smaller files organized in a folder tree structure. I have the folder tree open to the side of the editor while I work on the current document so that's also in my L2 vision cache.

Personally I prefer that to one big file that I have to keep scrolling/jumping around in. Of course, for a large file I can open an outline view instead to help me jump around in the file. But I prefer files that are short enough so I don't need that.


A file growing too big is indeed a problem, but it's partially mitigated by the fact that things in a single file are all directly related to each other (if they aren't, it's refactoring time). An outline view or even code folding also helps. On top of that, I can scroll within a single file quickly using a single keystroke.

Still, if you can keep a tree of all your source files entirely visible on screen, then your codebase is rather small. The kind of codebases I'm talking about, the tree doesn't fit. You have to actively search it to find the files you need. And each time someone decides a class isn't "small" anymore, the tree gains 2 or more extra files.


That digit span test is incredibly irritating. I've done it 10 times by now and I can't get more than 9 digits.

On a more philosophical note, I've wondered if there are some outstanding math or even societal problems that could've been solved already if humans were capable of holding onto 13 or 15 things at a time rather than just 7ish. It always feels like we're going to be limited in some way if a problem is incapable of being broken down or collapsed.


Yeah, very few people can get more than 9 digits consistently; that's the whole point really, 7 plus or minus 2.

Interestingly, chimps do have better working memory than we do. Check out this crazy video: https://youtu.be/nTgeLEWr614?t=8


Maybe it's the opposite. Just as there are likely evolutionary advantages to some of our cognitive biases (better to mistake a rock for a bear than the opposite), it might be worth considering whether holding on to more than seven things has disadvantages.


Great article, a lot of interesting thoughts.

Another useful way of thinking about the entities we deal with, variables and functions, is their properties. For example, it's one thing if you have 7 references, but things get more complex if they are all nullable. Nullability adds a new dimension, and you may well be dealing with 14 concepts rather than 7 (`v` refers to an object of type `V`, but what if it's null?).

Similarly you may have variable `i` that's an index in some array `a`, however the possibility of `i` being out of bounds is an additional concept for the brain.

In other words, every extra "what if" adds complexity. In this regard, high-level dynamic languages are at a big disadvantage, since any entity has the possibility of having an unexpected type with an unexpected value.

Languages that allow reducing the scope of possibilities, on the other hand, are at an advantage: you declare things with precise types and, e.g., as non-nullable wherever possible; you use enums wherever possible; you define variables as computed whenever possible (Swift is sweet!); you avoid unnecessary side effects with more functional-style code; etc. All of that reduces the number of concepts for the brain to deal with.
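A small Python sketch of the same idea (the types are invented for illustration); each precise, non-nullable declaration removes a "what if" from the reader's load:

    from dataclasses import dataclass
    from enum import Enum
    from typing import Optional

    class Phase(Enum):          # an enum: only these values exist
        BEGIN = "B"
        END = "E"

    @dataclass(frozen=True)     # immutable: no hidden mutation to track
    class Span:
        start_ns: int           # annotated non-nullable: no "what if None?"
        end_ns: int
        phase: Phase

    # a nullable field is one extra "what if" the reader must carry
    parent: Optional[Span] = None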


You just explained why I first fell in love with F#. I've been trying to figure out how to do that for years.

I didn't particularly care about functional style programming although I have learned that it has its uses, but the fact that I can offload some of these questions to the compiler rather than keep them all in my head helps me immensely.


The premise of the article is that the human brain's 7+/-2 short-term memory rule should govern how software is designed.

That's the silliest thing I've read in a long time.

Unlike a sequence of random numerals, software is composed of meaningful components and roles -- modules, processes, screens, functions -- each with a well-defined and distinct job to do. Forcing these into a synthetic hierarchy with a fan-out no greater than 7+/-2 makes sense only if the code reader is trying to memorize each layer as if they were no more than random symbols -- meaningless and without context.

That's the least sensible strategy for software design I can imagine. If easy memorization is the objective of S/W design (which is debatable at best) then it's far better to shape the components into names with implied relations that are already familiar - like design patterns, or better yet, the actors in a compelling novel. Then the natural interplay of opponents and allies and obstacles and desires will make for a much more coherent and memorable narrative that will be far more intuitive than 7+/-2 randomly chosen labels ever could be.


I never mentioned "memorization" of code in the article, and I don't feel that "memorizability" is at all a goal. In my mind the goal is learning and understanding and familiarity and navigating the landscape of the design, and these things are easier when things are nicely factored into bite-sized pieces.

Speaking of bites, it's very much like eating. You only put a certain amount into your mouth, then you chew and swallow that; if you try to stuff a giant hamburger into your mouth all at once you might choke. If you are forced to puzzle over one giant monolithic object after another it will be very difficult to learn the system, but if the system is nicely factored it will be easier. That doesn't sound at all controversial to me.

What do you feel is a good range for the number of attributes in an object? Do you feel an object with 100 attributes is just as easy to understand as one with 5 attributes? That seems unlikely to me.


100% correct article. The compiler/runtime does not care if all of your variables are global and all functions modify all of them. Almost all program structure is there for humans and not for computers.


I get the feeling that many programmers look at programming as the act of imparting a superior program (their vision) from a superior machine (their brain) into an inferior machine (the computer).

I see an ape banging on a particle accelerator.

Still, we know a lot about how the ape’s brain works. You’d think we’d have designed programming languages and even syntax highlighting schemes that empirically lead to faster development and more correct programs. Surely we could devise a paradigm that is empirically better for sense-making, but here we are still arguing about semicolons and editor preference.

Programming is so tribal, religious, prone to fads and poor outcomes that it’s cringe inducing when SWE act as if they’re superior to other people or professions. Everyone who hasn’t had to pay programmers for an outcome thinks they must be geniuses, everyone who has thinks they’re single celled organisms.


This seems like the tip of the iceberg towards some interesting ideas. Have you written anything up online?


I started work on a book a few years back but never published anything. I’m guessing I should.


> The key thing to realize is a single object with a lot of attributes is itself not Object Oriented. Instead, it’s a 1970s style Structured Program in disguise. The attributes of the object are the global variables of the program, and the object’s methods are the program’s functions. Every function can freely access every global variable which is what causes many of the problems.

Yes!


Nice article.

But, beware that grouping things also has its costs. It can grow the complexity of the software both vertically and horizontally.

See also http://lucteo.ro/2019/01/19/golden-mean-in-software-engineer... and also http://lucteo.ro/2019/02/16/clean-code-ne-well-engineered/


I believe it was Bjarne Stroustrup who said that programming is the art of writing libraries and then using them. After two decades of programming, this still holds true for me. The best way to handle complexity is to write and verify layers with a well defined API, to be used in the layer above it, much in the same way the protocols in the TCP/IP stack are built on top of each other. Once a layer is verified, you can flush the complexity it encapsulates from your brain and focus on the layer above it.
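As a toy illustration (the function names are made up), once the lower layer is tested and trusted, the upper layer is written only against its small API:

    # layer 1: verified once, then treated as a black box
    def checksum(payload: bytes) -> int:
        return sum(payload) % 256

    # layer 2: built only on layer 1's interface, never its internals
    def make_packet(payload: bytes) -> bytes:
        return payload + bytes([checksum(payload)])

    assert make_packet(b"hi")[-1] == checksum(b"hi")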


Nice article, but I wonder what the author would say about working with ORMs. When objects represent database tables, it's not as easy to extract attributes into objects. Every extract, rename, or move is a database migration, where you probably want to keep the old columns around for a bit in case things go wrong. With the author's philosophy in mind, it feels quite easy to bury yourself in that process instead of adding direct business value.


I really have not worked with databases a ton. But I do think a huge thing I didn't address in the article is how readily you are even able to make changes.

I'm really talking about a system where you are free to change things. Once stuff is exposed to the outside world you might relinquish your ability to keep factoring and factoring. Although that's a reason to only expose an API and not your actual object structure.

I'm not sure how this all relates to databases, but I suspect in many cases there you simply can't change existing things; too much would break. So yeah, I think that's a different ball game to some degree.


That's a good question.

With database normalization you do end up with more tables that have fewer columns anyway.

But the ORM boundary has always been problematic and normalization won't always get you down to seven columns, so the question remains.


Hmmm...

"Look! I've reduced complexity; decreasing the number of object members by increasing the number of classes...

This only works if I don't need to think about the new objects much.


It's a bad idea to add objects/classes/structs for no reason. If they are not pulling their conceptual weight they should be removed. Hierarchy has costs; flat is sometimes better.

But as I wrote in the article attributes are like global variables to an object. Every boolean attribute you add to an object doubles the number of possible states. An object with 10 boolean variables can have 1024 states.

I find it's better to tamp down on the number of attributes by carefully introducing select sub-objects that have value beyond just reducing the number of attributes.

What has your experience been? How many attributes makes you start to wonder if a sub-object would be appropriate?


Agree in principle.

Nice article - I like the idea of designing code (or looking at code) with the perspective of how it gets parsed by a human mind.

Great code _looks_ super simple, effortless, until you try to write something similar to it and realise that there is a lot of effort in factoring it to be that simple.

That said - I don't think minimising object members is the main point of attack.

I don't find the number of _members_ to be that taxing to working memory. It seems to be naturally chunked: the members I'm working with and everything else. I don't need to keep all the members in mind. It's like stuff in a drawer - it's in one file so I can rummage to find the bits I need and keep them "in hand"; going a small factor over the working memory limit is fine.

(I'm not saying there isn't a retrieval cost, more members increase retrieval cost, potential confusion, etc. etc.)

What seems to be a lot more taxing is keeping reference to relationships to other objects. Stuff NOT in the current file.

Having to keep the relationships between a number of other objects/classes in mind is more taxing, because those need to be in working memory - and putting them there either means they are already in long-term memory (e.g. with a known codebase or library), or means parsing them out by navigating through the codebase, tracing the relationships and understanding the intents.

Ironically super-factored code can be a lot of effort to "get" at first, while intermediate quality code reads pretty easily.

I find my mental map of a nicely factored object is tied more to its intent/purpose or "meaning" than to the raw number of its members.

Your PerfEvent example, though, I'd parse the same way.


I can relate very much. In my company I am working on a complex distributed component as a single dev. I have created exactly what the article mentioned: an overly complex structure, which I can hardly understand and surely nobody else will.

How can one train to write well structured, easily understandable code? It is not just the code, but also the overall architecture and data flow.


write down the system requirements on paper; really study the problem before trying to solve it. for hard problems you really need to allocate 50% of your time to planning and system design.

focus on modularity, and try to keep processing and interpretation as separated as possible. it's not the code that needs to be simple to understand, it's the diagrams representing the system. if you can recursively break down the system into modules that make sense on their own and don't mix different concerns (like what I said about processing vs interpretation: sometimes you don't have an explicit code dependency between modules, but you have too many dependent implicit interpretations that the code relies on), you should be in much better shape. the code is not so important there. for example, a crypto hashing package might contain very tricky code, but if the surface area of its API is small enough and you understand what part hashing plays in your system, having it as a blackbox is not a problem.

that said, with hard problems it's hard to gain enough insight until you actually make a first implementation and see where your model is lacking. and this tends to take more time than you often have.


> Thinking in general works very much like vision.

Well, that's a sweeping statement. No reason to read any further. One simple argument against it: thinking about problems is supported by many more memories, and of a different kind, than e.g. face recognition, and it's much slower, too.


I believe how thinking works has many parallels with how visual perception works, because they both use the same underlying hardware. Chunking and hierarchy are super primitive and very likely are abilities that permeate the brain.

Kurzweil (I know, I know) in his book How To Create A Mind writes about how we read:

    For example, to recognize a written word there
    might be several pattern recognizers for each
    different letter stroke: diagonal, horizontal,
    vertical or curved. The output of these recognizers
    would feed into higher level pattern recognizers,
    which look for the pattern of strokes which form
    a letter. Finally a word-level recognizer uses the
    output of the letter recognizers. All the while
    signals feed both "forward" and "backward". For
    example, if a letter is obscured, but the remaining
    letters strongly indicate a certain word, the word-level
    recognizer might suggest to the letter-recognizer which
    letter to look for, and the letter-level would suggest
    which strokes to look for. Kurzweil also discusses how
    listening to speech requires similar hierarchical
    pattern recognizers.
https://en.wikipedia.org/wiki/How_to_Create_a_Mind


I like the idea behind "brain oriented" code but I'm skeptical of Python's suitability for the task. A language that gives you explicit types, algebraic data types, and can separate effects from pure code is a crucial first step towards taming complexity IMHO.


7 attributes could be too much. Those attributes belong to a class, not an object. If those attributes are binary, there can already be 128 possible states within that class. If you have to think about their interactions, you are already well beyond 9 elements.


> "To reduce the number of attributes we introduce two new classes or structs"

So practically we are shifting the complexity to higher abstraction levels.

Let's be honest - programming is complex.


And where is the problem? Abstraction in programming is one of the most powerful tools we have. Perhaps the author should have warned about the anti-pattern of premature abstraction, but it was a very light treatment of the subject and the real value was the "object properties as global variables" insight.


Exactly, abstraction has consequences too: an explosion of classes, packages, modules, and libraries, and thus hierarchy, wiring, dependencies, lost versioning traces when moving things around, etc. The author assumed we are now saved from the hassle by introducing a new abstraction in a naive example. It is not that easy.


The example was a real-world example I encountered the morning I wrote the article. The article is 1500 words; I'd love to do a 5000-word version sometime with more in-depth examples.

As for the explosion of things, there's accidental complexity and necessary complexity. Complicated things are going to have complexity somewhere. The question is where to put it.

The article is arguing for a balanced, carefully groomed tree of objects. In other comments I clarify that I'm talking about attributes, not methods or APIs. Flat APIs can be wonderful. Attributes are a different story, since they are global variables that complicate every single method in the object, past and future.


That is clear. Thanks.


Disoriented brain programming. :D



