A Theory of Software Architecture (go.ro)
507 points by nreece on Oct 28, 2020 | 241 comments



I spend a lot of time thinking about these sorts of topics (actually, I just taught a four-hour session yesterday that used most of the terms in this article), working with newer, less experienced developers, and trying to figure out how to distill the essence of "architecture" down to something simple that everyone can start with.

This is what I’ve started telling people:

Use mostly functions, try to make most of them pure.

I think that can get people (even new devs) 80% of the benefits (testability, composability, loose coupling, and the ability to reason about code) of more complicated, prescriptive architectures (Hexagonal, Onion, Ports & Adapters, Clean, etc) with a minimal amount of ramp up.
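To make that concrete, here is a minimal sketch in Python (the names are purely illustrative):

    # impure: the result depends on hidden, mutable state
    counter = 0

    def next_label():
        global counter
        counter += 1
        return "item-%d" % counter

    # pure: the result depends only on the input, so it is trivial
    # to test, compose, and reason about in isolation
    def label_for(n):
        return "item-%d" % n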

Of course, this isn't the solution to every problem (it obviously depends on the domain you're working in; for me it's web dev and backends), but I think it's a good way for people to start.

Edit: Here is a great talk demonstrating that by following a simple functional approach, your code can naturally fall into a “pit of success”: https://youtu.be/US8QG9I1XW0


> Use mostly functions, try to make most of them pure.

This reminds me of:

> Eat food, not too much, mostly plants
>
> -- Michael Pollan

My new mantra: Write software, not too much, mostly functions.


"Not too much" is interesting. If I understand you correctly, you can write "too much software". I can think of at least three ways - bad architecture, too little abstraction forcing repetition, and just bad writing.

Did you have something else in mind here?


NIH or Resume-Driven Development might be another way.

From personal experience: I worked with a team years ago where the product owners wanted a datetime picker or parser for an app we had built. A senior dev on the team decided it would be cool if it included some lite NLP. When the product owners heard from him how easy it would be to add, they were on board.

He started with a popular existing Python library. But there was a bug with one corner case that was causing problems. So he took the initiative to spend a few extra days on the story to write his own simple NLP date parser.

A couple months later, early on Jan 1st, the new feature wished our ops team Happy New Year by taking down our application.

I happened to open an issue for the library's bug on GitHub after learning about it from the other dev. The owner promptly responded to share a simple workaround for the issue. But by that time we already had too much software on our hands.


I believe “Not too much” is a simple reminder that more code equals more bugs. So, try to write less code whenever possible.


I think "Too much software" comes from product and engineers not being willing to say 'no' to feature requests and product bloat, rather than any sort of strictly technical thing.

If your product or tool lacks a clear focus and goal, then the code base will reflect that, and will grow and sprawl endlessly.


I have been net negative lines of code for the month more than once, while productively adding more features and fixing bugs. There is a lot of code out there that need not exist at all.


I think the more likely problem is actually having too much abstraction.


Either can be a problem. Too much abstraction and you're writing FooFactoryFactoryFactory. Too little, and you're repeating yourself. Somewhere between the extremes is a sweet spot.

The problem is that, as the project continues over time, the sweet spot moves...


Nah. It's always about 4 layers: hardware/services/data stores/etc., wrappers/models/components, business logic / application, and ops/deploy.

If you nest your models/components deep enough that they're forming new abstraction layers, you're nesting them too deep. Back up and use mixin-style stuff instead.

If your business logic or application are nesting pretty much at all, then you haven't succeeded at making good choices in your models/components.


Agreed, and I’d add that code duplication is hardly a big problem. It’s kind of like if I legitimately see a lot of code duplication, then I have enough data points to do the right thing. On the other hand, abstracting early to avoid code duplication is worse.


Hmm... I think you might be right. Or, at least, the problems that come with code duplication are easier to resolve than those that show up with a shit abstraction.

Maybe... duplication is a code smell, but a shit abstraction is a code problem...?


I would definitely add "too many features". You can do a really good job building something way too large and you've built too much.


In a similar vein, I look forward to adding fewer lines and removing more in my git commits.


I am in a sort of similar situation, meaning I try to transfer to the team what I have learned over decades of development and what is now intuitive for me.

I think the most important principle is to "keep it simple". I mean, even if you have absolutely no idea how to put things together, just trying to limit LoC and trying not to do anything fancy is probably going to get you into a better spot than any other principle alone.

"Keep it simple" has the advantage that it in itself is quite simple. Even if you are just starting you can pretty much tell simple from complicated code. You might not yet be able to make your code simple but if you honestly try you are also equipped to judge your results in an intuitive way which is essential to improve. The only way to improve is to be dissatisfied with your product and the only way to do that is to be able to judge it at least in some way. Usually the way you are dissatisfied is going to shape the direction in which you are likely to improve.

That is not the case with other principles, which are oftentimes easy to state and sometimes easy to show at the small scale of a very simple example, but give little guidance on how to use and combine them with other principles at a large scale.

For example, the principle of using patterns to structure your code (we are talking GoF, PoEAA, etc.), which is frequently taught to people, is right in itself (i.e. whenever possible we may want to use established ideas on how to solve particular problems) but is frequently misrepresented by adepts who try to overload the application with constructs which, even if correctly implemented, might have been replaced by a simpler construct. It frequently requires a lot of experience and good judgment to tell which pattern could be used in a particular situation, and it gives you absolutely no hint of how much is enough, giving novice developers the impression that the more patterns you cram in, the better developer you are.

I also think that if you put "keep it simple" as your main principle, in your search on how to "keep things simple" you are probably going to discover other principles and also understand why, in the process.

I have many times seen a codebase that has been thoroughly obfuscated by "pattern wannabes". I have also seen codebases made by teams that had very little knowledge of programming (like barely being able to program their way out of a paper bag). Of the two, I very much prefer code written by people who don't try to be too fancy.

It takes twice as intelligent a person to read code as to write it, so if some intelligent people try to be smart with their code but are misguided, it frequently results in a codebase that is very hard to untangle.

With code that is naively written, the problems tend to be simpler both to understand and to resolve.

A couple of months ago I had a discussion with a dev who was supposed to implement a circuit breaker on one of the services (dunno why on one service only), and the result was a dozen pages of Java code that was chock full of callbacks, suppliers, optionals and whatnot. But something did not sit right with me, because I could not for the life of me see what the code was actually doing, and given the problem it was supposed to solve I did not expect so much code.

So I sat down, and over two hours I simplified the entire dozen pages down to a single try / catch with an if in it.
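For what it's worth, a minimal sketch of that shape, in Python rather than Java, with hypothetical names (a real breaker would also want a reset timeout):

    class CircuitOpen(Exception):
        pass

    OPEN_THRESHOLD = 5
    failures = 0

    def call_service(do_request):
        # fail fast once too many consecutive calls have failed
        global failures
        if failures >= OPEN_THRESHOLD:   # the "if": the breaker is open
            raise CircuitOpen()
        try:
            result = do_request()        # the guarded call
        except Exception:
            failures += 1                # the "catch": count and re-raise
            raise
        failures = 0                     # a success closes the breaker again
        return result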


Interesting. It aligns with my line of thinking. With OOP, the most interesting objects are always stateless and the only "state" present is used for dependency injection.

It seems like in the industry the cost of having state has always been overlooked.

Funnily enough, in frontend software this pain has surfaced a lot, but in the backend it keeps going unnoticed, at least in the world I live in (Ruby, JavaScript).


I understand that the point of his example is to introduce the reader to a variant of the very classic multitier architecture with layers.

https://en.wikipedia.org/wiki/Multitier_architecture

But I prefer the first version of the example, because while it's named "complex code" I think it's the simplest version. There is no need to decouple this simple and straightforward function into three functions that cannot be reused independently. Splitting a function into smaller functions makes sense when the code is complex, but it's important not to overdo it.


I don't think this is a very good example of clean architecture; this is just a refactoring that extracts three methods. There's no dependency inversion used here, which is one of the key tools you would use to create a clean architecture. The use case itself depends directly on the functionality in requests, which means the business logic is coupled to the concrete details of the requests API.
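For illustration, a sketch of what that inversion could look like here (the injected fetch_json boundary and the URL are my own invention, not the article's code):

    import requests

    def build_url(word):                       # pure helper
        return "http://example.com/define?word=" + word

    def pluck_definition(data):                # pure helper
        return data["Definition"]

    def find_definition(word, fetch_json):     # use case depends on an abstraction
        url = build_url(word)
        data = fetch_json(url)                 # I/O behind an injected boundary
        return pluck_definition(data)

    def requests_fetch_json(url):              # the concrete detail, at the edge
        return requests.get(url).json()

    # production wiring: find_definition("lambda", requests_fetch_json)
    # tests can pass a stub instead of touching the network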

Clean architecture also differs from n-tier in that, although there are layers, they don't have to be physically distinct, and the APIs between them are not typically heavyweight gateways designed to work across network boundaries, etc. The key design activity in CA is reducing the coupling between abstract business logic and the concrete code that executes it. It's actually quite a simple approach to describe, but more difficult in practice.

There's nothing particularly wrong with the first function; functions certainly don't only have to "do one thing" in clean architecture.


I find the Multitier Arch (MA) has much better names.

The arch proposed by the article has "Application Business Rules" (ABRs) and "Enterprise Business Rules" (EBRs). Now, in a small start-up, what are the "enterprise rules"? Is "enterprise" not merely a buzzword in this context? And how ABRs and EBRs differ is not well explained.

In MA this is much clearer: the Data Access Layer (DAL) contains DAOs (the persistence side of the model in MVC), Business Layer (BL) contains the business/domain logic (the other side of the M in MVC and/or the business logic that may end up in the C of MVC; aka services in Rails), the Application Layer (AL) contains what would be typical "controller logic" (authorization/redirection/data gathering for the presentation) in MVC and the Presentation Layer (PL) contains the V from MVC.

> [...] I think it's important to not overly do it.

Yup. So maybe one can do without a BL at first, put the business logic in the DAL, and not mind having a little bit of it creeping into the AL (controller).


I've always been confused by the Clean Architecture and even somewhat dismissive, because the terms used are so "not Clean" (from my perspective biased towards startup/indie/freelance). Multitier Arch is already 10 times more appealing just by using more appropriate words.


This reminds me of Brian Will's "Object-Oriented Programming is Bad", where he makes the case that most decoupling tends to be more confusing than long-form code that's got sufficient comments.

https://youtu.be/QM1iUe6IofM?t=2235


I think that

    fn x() {
        doStuff();
        moreStuff();
        forgotSomething();
    }
is pretty bad code, but that's probably because I consider procedures with no arguments and no return value a sign that something is poorly factored. However,

    fn x(y) {
        foo = doStuff(y);
        bar = moreStuff(foo);
        if (isSomething(bar)) {
            return theRest(bar);
        } else {
            return theBest(bar);
        }
    }
can be a good way to separate the why from the how and clearly communicate what's going on, especially with conditionals in the mix.

It's important, however, that these helper functions are not haphazardly strewn around the code base and accessible to things that don't need them. Depending on the language/context, I'd reach for nested function definitions or public/private keywords (or a combination), because definitely, it can be very hard to approach a big file with a bunch of (often poorly-named) functions that are defined at the same hierarchical level but not meant to be used at the same level.


Haven't you just described functional vs procedural programming?


No, he described implicit arguments and mutation vs. explicit arguments and mutation.

It's independent of OOP vs. functional, though it's easier to have implicit arguments and mutation in OOP, so you see it more often there, especially if people somehow ended up believing that OOP means you need to create a class for everything and make all state an object member instead of e.g. a variable on the stack.


Yes! I often see code where a method sets an instance variable, then calls another method that does some work based on that instance variable, which is then not used again until the next time that first method is called. The instance variable is redundant, and the control flow is obfuscated. I think the principle is to only expose something when you really have to (harhar), and that the 70s suspicion of unstructured use of global scope should be equally directed at class scope.
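A small sketch of the difference (hypothetical names):

    # implicit: state smuggled through an instance variable
    class Report:
        def generate(self, rows):
            self._rows = rows                 # set here...
            self._summarize()                 # ...consumed in there
            return self._summary

        def _summarize(self):
            self._summary = len(self._rows)   # hidden input, hidden output

    # explicit: the same work with visible data flow
    class ExplicitReport:
        def generate(self, rows):
            return self._summarize(rows)

        def _summarize(self, rows):
            return len(rows)                  # plain input, plain output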


That's called temporal coupling, where some code only works if some previous code set some variable earlier in time. And it's a sign that you're building an implicit state machine.


It's not explicitly expressed (the functions could still execute side effects and e.g. mutate things), but yes, that is pretty much the style that pure functional programming enforces!


The former example was much easier for me to understand than the latter, even if it wasn't factored well. But I'm assuming that the functions aren't mucking about with global state or something equally distasteful.


Well, if we assume that both are doing the same work, then the explicitly named variables y, foo, and bar of my second example would, in the first example, have to be global variables, or at least things that are defined in the scope of the definition/call of x.

Given that, I have to say that I, err, and I'm sorry if this sounds a little arrogant, but I don't really believe you when you say it's easier to understand. It might be easier on the eyes, sure, but to really understand how the first one works, you'd have to look into all the defined helper functions and see which outer-scope variables they touch (global/module/class scope), and also, probably, have some knowledge of where x is called. In a style closer to the second example, there are fewer outside dependencies.

Of course, this is all very abstract. I'm not saying there aren't cases where a few variables in global (or class) scope that get mutated by bare functions can't work at all, but as a rule I don't consider it a good style.


I don't really fully understand how the kernel scheduler works, but I know in general how it works, and so I can write software using it. In this sense, a simple implicit abstraction is easier for me to understand than one where I'm peeking under the covers, so to speak.

Even just seeing a few variables or objects passed around, I still don't know exactly what it's doing, or hiding. I can more quickly understand the high level idea with the former example without the details potentially confusing me.


I see what you mean.

I think it depends on the type of code and how you interact with it. A library that requires its consumer to keep track of a bunch of data and pass it in and out at the right times, when it could just as well handle that internally, is not a very good library. But if all the code in the above example is from an application, where a normal change might involve touching all of the mentioned functions, then making it explicit where each variable is used is a win.


> But I'm assuming that the functions aren't mucking about with global state or something equally distasteful.

There's not much else for them to do, since they don't take arguments, but to muck about with global state and/or have side effects.


> But I'm assuming that the functions aren't mucking about with global state or something equally distasteful.

It seems fairly obvious to me that either the former example is waaaay simpler or it is doing something really distasteful.


Some designs don't require your own code to pass around state between functions, because each function can handle state in its own way without dependence on other functions. I've written a lot of such simple scripts where I just need to execute a series of tasks that aren't necessarily related to one another, but do follow each other.


I hate unnecessary functions with a passion and it's refreshing to hear his opinion.

Sometimes a big, fat block of code is just easier to understand.


> Sometimes a big, fat block of code is just easier to understand.

John Carmack once made the same point: http://number-none.com/blow/blog/programming/2014/09/26/carm...


> I don’t think that purely functional programming writ large is a pragmatic development plan, because it makes for very obscure code

As a functional programming enthusiast, I agree with this.

Recently I worked on rewriting a relatively large Backbone web application in React/Redux (for those who don't know, Redux is a library that encourages writing JS in a more functional style).

While moving all app logic to a functional style makes it safer and easily testable, it is definitely not friendly to most programmers who maintain it. We've had lots of bugs caused by new people coming in, not understanding the functional style, and making bad changes such as:

- Doing side effects from functions that are assumed to be pure.

- Writing a lot of logic into functions that are directly doing side effects, instead of refactoring the decision making into reducers.

While I still love mostly-pure functional code, I would only recommend it for smaller projects, where one or a few developers with a strong grasp of the style can strictly review every new PR to ensure new people don't mess up.


You're not wrong, but I don't really think this is the fault of FP.

I had the exact same experience of people not understanding new paradigms with procedural (programmers experienced with assembly using GOTO for everything), OOP (programmers used to procedural code using only static methods), MVC (putting everything in the controllers and ignoring views/models/helpers), and MVVM (modifying state themselves instead of using the MVVM mechanism).

All those things were always "obscure" for newcomers since it was different from what they learned in college, but after a while they became second nature.

I think the answer is not to avoid those paradigms because they're hard, but rather to teach people how to work with them. It's expensive, but it's the only way forward IMO.


> Doing side effects from functions that are assumed to be pure.

That completely goes away as soon as your language has purity declared in the type system.

It's just not available for Javascript development, like most nice FP features.


That's because JS is not a functional language; you are enforcing a functional style by discipline. Additionally, a huge portion of the benefits of the functional style is lost in untyped languages. If you're not using TypeScript you're dealing with a lot of unnecessary bugs.

If you guys moved to a fully functional paradigm where the language enforces the functional style, you would see greater benefits.

Unfortunately, for the front end the ecosystem for functional languages outside of TS/JS is not that great. But as an easy way to actually see the benefits, I recommend writing a little app in Elm. With something like Elm, the functional style is enforced by the compiler. You will see that 90% of the bugs you typically deal with in JS/TS disappear. One type of bug that will disappear is runtime errors: runtime exceptions are practically impossible in Elm.

I find a lot of JS programmers haven't fully grokked the functional style. Example: You'll find JS programmers who talk about how much they like functional programming but don't understand why for loops don't exist in functional programming.


I haven't read that in a while, but the argument was around optimisation rather than how easy it is to understand.


The way I read the article, I believe his main point was to make reasoning about state easier, even at a small cost to performance, making code easier to test and debug.


Not really, Carmack explicitly states that this is not a performance optimisation in the article:

> In no way, shape, or form am I making a case that avoiding function calls alone directly helps performance.

Rather, he argues that (among other things) this is a way to make current bugs more visible and to avoid future bugs by disallowing calling functions that should be inlined.


Very insightful. Can I read more of John Carmack's letters somewhere?


Took me a while to find something useful, but here you are:

https://fabiensanglard.net/fd_proxy/doom3/pdfs/


Wow that's a lot of interviews and letters. Cheers


If you have a well-documented system, OWNED by a software architect, then breaking it down is better.

If you don't have that, at least big fat blocks of code are self-documenting...


Thank you for this.

It has been a strongly held opinion of mine for a very long time, but I haven't been able to state it as eloquently as that.


I love his series. Eye opener for beginner coders.


Agreed. IMO it's an eye opener for experienced coders too.


If that is a simple 10-line script, I agree. But if that is part of an application, you will not be able to properly test it - so you either have to do time-consuming manual tests or time-consuming integration tests. Splitting it into different parts allows you to write unit tests and smaller integration tests. Then, when you only change the "pure" parts, you only have to verify the unit tests. That gives a HUGE productivity boost in bigger applications.


You only want to test the public API, because otherwise you cannot change the implementation without adapting your unit tests to implementation details. So I prefer the first code example and do not mind some mocking (unless the extracted functions make sense in the public API).


If you take that idea/ideal to its logical end, then you can only do system tests / acceptance tests (whatever you want to call them). Your tests must only be able to do what end users do - everything else would be relying on implementation details. No unit tests at all!

...unless we take a more compositional approach and define our application to consist of many "parts" each of which has a public API and an implementation. Well, in that case I would argue that each function can be considered as such a part, having a public API (the definition of its inputs and outputs) and an implementation that can be changed while keeping the public API stable.

Over time I reached a conclusion: whenever I have to run some code to be sure it does what it should, I write a test instead. And very often that is a unit test. If I'm highly confident that the code will do what I expect, then I rely on very high-level system tests, unless I fear that someone might later break the code by accident without the compiler (or other existing tests) being able to detect it.


A large application has lots of implementation leaking into the API, doesn't it?


For this reason, when mocks are involved (that is, just about any external API), I think Listing 2 is the ideal - extract a small function designed around the desired interface, a facade that's easily mockable and won't change with the library implementation.


Like the author wrote, the example is only an example; in the real world those functions are way bigger. Such a small function would be fine, but most functions are far bigger and often call other functions that again do some I/O.


I usually look at this in the form of "layers". If you have a lot of duplicated logic, it's a lot nicer to abstract those concerns into a "layer" of functions / classes.

What's important is to not have leaky abstractions, though. The moment you need to start reading into those functions is the moment you realize you would have been better off leaving the code as is.

It's very hard (at least for me) to come up with those layers, and it requires a ton of iteration. That's why I also prefer the top example: you need to work with the lower-level code first, and go through enough of it, before you can start spotting patterns and creating layers; otherwise you're bound to mess things up quite a bit.


As someone who's been coding for 25 years, I agree with you.

I want code to be "clean and simple". That's how I judge my own code and everyone else's. And the first example is indeed cleaner and simpler.

In my theory, you are only allowed to introduce more complexity (functions, frameworks, etc), when it makes it overall simpler and cleaner than before.


You don't think

    def find_definition(word):           # Listing 3
        url = build_url(word)
        data = requests.get(url).json()  # I/O
        return pluck_definition(data)
is more readable than the original code that inlines the definitions of both helper functions?


The original code is 9 lines, and it's very clear what it does.

Now you have a 'build_url' method, which is a bad name to start with. It's actually a 'build_def_request_url' or something like that.

Imagine you add another method find_synonyms(word). My quick brain would think I can use that same build_url() method. Wrong, because one needs a build_def_request_url() and the other a build_synonyms_request_url().

Let's say I need to debug and need to see what URL it's calling. In the original code it's right there; here I have to go on a goose chase to find my answer. One thing that wasn't refactored is putting the hardcoded URL into a constant. That one I would do.

Another complexity is adapting one of the helper functions. If I adapt it, which callers could I break? Who is using this function?

Another part is that I would rewrite pluck_definition(data) as a more generic get_value(data, u'Definition'). That way my other potential find_synonyms() could also use it, and it basically adds no extra lines of code except for an extra parameter.

I know this is an easy example to explain things with, but the original code would need to be way more complex to justify splitting it up into multiple functions that, in this case, can serve nothing but that original function.

Just my feeling towards this code.


Agreed. Abstractions are more mentally expensive than a single concrete idea but less expensive than many concrete ideas. When something doesn't need to be abstract, especially in a language where refactoring is easy, it should not be abstract.


Or, to put it more pragmatically: the first example is a sure way towards seven slightly different mechanisms to call an API. Per year. On a team with seven devs.


Not doing the first example is a sure way towards either copying that function and changing it a bit because you need a slightly different call, or a function with a pile of branches in it because it needs to operate in several modes.

Both ways could go wrong. I don't care what your architectural choice is, you can still mess it up.


If so, then please add 2 comment lines above each "chunk", for faster human reading, and everyone will be happy:

// build URL

// find "definition"


Multi-tier is different compared to, say, hexagonal or clean, in that dependencies for the edges of the system point at interfaces in the "core".

In traditional multi-tier, the UI depends on business logic, business logic depends on persistence, etc.

The idea is to protect the core; persistence and UI concerns adapt to how the core wants to interact with the world.
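A tiny sketch of that dependency direction (illustrative names only):

    from abc import ABC, abstractmethod
    import requests

    # The core owns the interface ("port") it needs...
    class DefinitionSource(ABC):
        @abstractmethod
        def lookup(self, word): ...

    def define_word(word, source):      # business logic, no I/O details
        return source.lookup(word).strip().lower()

    # ...and the edge adapts to the core, not the other way around.
    class HttpDefinitionSource(DefinitionSource):
        def lookup(self, word):
            return requests.get("http://example.com/" + word).text  # hypothetical URL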


I agree that it may be overkill for such a simple function, but the author (at least the original author, Brandon Rhodes) is drawing an analogy to more complex applications.

He uses such a simple example so that his idea, which applies to much larger, more complex codebases, can be expressed in a presentation/article, not to argue that all such simple functions need this level of decoupling.

https://en.wikipedia.org/wiki/Straw_man#Steelmanning


There are people who cannot use an abstraction unless they understand every line of code behind it. Abstractions are blockers, rather than aids. This lowers the value of abstraction for those individuals.

Others find it tiresome to read every line of code and develop intuition around abstractions instead. The "find_definition" function will be all that is parsed.

The funny thing is people shift between the two as the scope of the project increases.


In a real project - millions of lines of code - there is no chance to understand it all. You have to trust that the abstraction does what it says without caring how it works. You only break the abstraction and dig in when you have reason to suspect that it somehow isn't obeying your quick mental model of what it does - normally this means you suspect the bug you are working on is down in that abstraction, but sometimes it is just that the name is bad and you need to figure out what it is really doing.


Often this is done in the name of "testability" by the "(unit) test everything" zealots. I've met some of them who haven't seen a function they wouldn't split up to test different branches or what have you independently. The resulting code is frequently a nightmare of function aliases, mocks and other problematic artifices.


This is good advice and a good writeup, but I take exception to one (boldface!) line:

Coupling kills software

I hear this a lot but I think it’s highly overstated.

The “clear” final version is still strongly coupled -- you can’t call find_definition without it directly calling the build_url helper! The key aspect isn’t the coupling, it’s that the high-level imperative function is calling a simple, low-level, testable helper.

Coupling can be bad, sure, but not if it’s kept under control. And excessive decoupling can make programs much harder to understand and debug! Have you ever worked on an app where every reference to another object is injected via a DI framework, and where all the significant calls that actually do stuff are farmed out to asynchronous messages and callbacks, all in the name of decoupling and testability? That can make it really hard to debug problems. A good balance is what’s needed, not decoupling over all other concerns.


> every reference to another object is injected via a DI framework

And it's injected with an interface, not a concrete class, so you have to go on a giant easter egg hunt to even figure out which class is being injected... only to eventually discover that there's only one class that implements that interface.


But then, despite that, the class gets a generated proxy that does basically nothing (or is causing the problem), and it is really hard to determine anything except by tracing it at runtime to find out what is actually being run. People do some crazy things with Spring and interface injection in Java, and it can get really opaque quickly; but also, often it really is just one class implementing an interface, with no tests using the interface at all.


For the one project I worked on that was Spring-based, my joke was that Spring was a really effective way to convert compile-time errors into run-time errors :)


> Spring was a really effective way to convert compile-time errors into run-time errors :)

This is not a joke. Spring invalidates refactoring tools and a lot of static analysis. It makes working on Java feel like you are working on Python or something. The traceability of errors is very low. You can limit yourself to a subset of Spring's functionality (eg no component scanning, wiring up the beans manually) but it will not be idiomatic.


Generally there's no need to create interfaces in Spring when there will only be one implementing class. For testing, mocks can be created directly from the actual classes.


And now you're into the Liskov substitution principle. All of your implementations should honor the same contract even if they do it in different ways.


> only to eventually discover that there's only one class that implements that interface

I mean, in this case it's literally:

* right click on "someInjectedService"

* select "Go To Implementations"

and you'll be there.

Generally I do relate to the complaint though. DI makes your project feel like a "system" as opposed to a "program" (not that this is a bad thing or unnecessary).


That's the simple case. When it's a big chain of injected dependencies and you have to untangle the whole thing, it gets annoying quickly.


It's not a problem if you're using a proper IDE. Rider and IDEA will take you right to the implementation (or will display a list if there are multiple implementations) on Ctrl+Alt+B. This works on any of the parent classes/interfaces, or any inherited field or method.


It’s less of a problem, but it’s still a problem, because it forces people to keep more context in their heads in addition to what they’re really working on. Even experienced developers with the latest IDEs are slowed down when the culture encourages layers of gratuitous complexity, and the gains offered by advanced tooling are often cancelled out by people seeing that tooling as reducing the counter-pressure against adding more.


Agreed, it’s not an insurmountable problem, but it’s still a hassle that gets in the way. Something that should be one click becomes 2 or 3 clicks and only works 85% of the time. For example, there might be one real implementation and one (or more) test stubs and that slows you down.


Yeah this is what I was thinking. I pity the people who navigate java projects without ctrl+alt+b. What an existence that must be.


I pity the people who navigate java projects.


> It's not a problem if you're using a proper IDE.

From another POV it's IDE vendor lock-in: make the basic programming exercise so challenging that the users have to buy your special tool to even attempt it.


We just did a major reorg, and everyone who promoted that locked-in tool got let go. There were other reasons for the reorg, but take this as a warning: if the tool you promote isn't productive, you lose your job.


Without additional context, that seems a bit extreme. People often promote tools because they have some experience and have not (yet) had a poor experience with them.


These were senior technical architects who I doubt wrote more than a dozen lines of code in the tool, nothing real. They had great pie-in-the-sky ideas, and the tools did useful things, but they forced everyone to use them for many years and kept defending them while evidence kept building up that using the tools made projects take 5 times longer than just writing in C.

If they had accepted the evidence and backed off the tool until it could deliver its better-than-C quality (which it did) at the same speed of development, they might still be around. However, in the end we need to ship code to make money, and our competitors were catching up to us, which isn't acceptable to the business.

Let this be a lesson when you want to adopt new tools: they might be better, but they might have some unexpected downside that means they are not useful.


A decade ago I was writing a big C# application using the Ninject DI framework, and this is identical to my experience.


I agree it can sometimes be a hunt if someone used some crazy implementation of the strategy pattern. However, if you are following SOLID, you shouldn't depend on a concrete class.

I'm dealing with a project at the moment where they haven't done DI and everything depends on concrete classes, and I just can't write tests (without huge setup methods) for anything non-trivial, and I have these large constructors initialising things in every class.


This is a common misconception about "coupling". Decoupling is making something truly independent, not making the dependency injectable. Making it injectable is the last resort when you can't make it independent.

The real thing you should be thinking of is, look at those libraries and their documentation / READMEs (depending on library size, smaller actually preferred). Can I make my code modules more like that? Will the API boundaries I define make sense?


> Coupling kills software

> I hear this a lot but I think it’s highly overstated.

Coupling is like trying to get to your destination by swimming through molasses – each move you make impacts atoms far away from you because they are tightly bound to the ones near you, and those atoms exert a force back.


Move most of your code to pure functions; that is the number one mantra for solving most scaling problems in software development. Even in an OOP paradigm, it makes sense to make almost all objects purely functional, with limited use of internal state.

The second mantra is to think a thousand times before you name something. I actually keep a list of names (..Manager, ..Aggregator, etc.), gleaned from various sources, to name my classes. Earlier, I used to name most of my classes <Function/Operation>Manager.

Edit : Here is a question, that actually prompted me to keep a list.

https://stackoverflow.com/questions/1866794/naming-classes-h...


> (..Manager, ..aggregator,etc.)

Names ending in Manager are usually a poor choice.

See Peter Coad's "-er-er" principle:

> The “-er-er” principle. Challenge any class name that ends in “-er.” If it has no parts, change the name of the class to what each object is managing. If it has parts, put as much work in the parts that the parts know enough to do themselves.

See also: http://www.carlopescio.com/2011/04/your-coding-conventions-a...


I think the commenter generally views things as code acting on entities. If so, that code is suited to being called an xxxManager, or xxxService, or xxxCoordinator, or xxxController. Of course we have returned to a place in history by doing so, of creating big balls of mud as complexity increases.

Peter Coad advocated against this in favor of modeling the problem domain under consideration using an object oriented approach.

In an object oriented domain model there is no place for such external “controllers”, but I think the commenter doesn’t propose this approach.


I have an object called TransportMaster, with concrete implementations X_TransportMaster, Y_TransportMaster, Z_TransportMaster. The application instance may contain any number of any mixture of these objects, and they may be created and destroyed at any time by the user. Their use within the application is also subject to the user's whims.

Why would a TransportMasterManager, which provides factory methods, enumerations, a collection of the current set of TM's, and knowledge of the current user choice for which TM's is being used for what, have "no place" in this design?


>modeling the problem domain under consideration using an object oriented approach

Really curious. Do you have any material that explains this way of design?

I work on mostly web apps. End of the day, it's really about moving data and transforming data. So most of my programs have no choice but to deal with data, and so, my whole design process revolves around gathering, storing, operating upon and transferring data.


I don't know Peter Coad, but the approach reminds me of domain driven design (DDD). Most business logic would be in objects named entities, but DDD has also services, since there might be logic that affects multiple entities (actually special entities called "aggregate root").


Peter Coad did write a number of books, which are pretty old and likely out of print. How to actually model a complex system into an object oriented domain model, is something I have long thought to teach. However, sadly there seems little appetite in a world currently dominated by procedural code acting on data, modeled using functional decomposition or through a relational data model. Of course our industry will eventually relearn the forgotten lessons and methods of the past, rename them as something new, and adopt them as the new silver bullet for software development.


Perhaps you can explain this part from the post you linked which is perplexing me:

> AppDomainAffine, therefore, would be a more appropriate name. Unusual perhaps, but that's because of the common drift toward the mechanics of things and away from concepts (because the mechanics are usually much easier to get for techies)

how is 'AppDomainAffine' not a concept?


I read that as saying that AppDomainAffine is a concept, and that's why it's an unusual name, because a lot of naming these days focuses on mechanics instead of concepts (c.f. MarshalByRefObject)


Ah okay, that makes sense. I was thinking it was the opposite of that, which made no sense.


One of my favorite classes in my code base is called something FooBarBazListenerListener


> think a thousand times before you name something

I gave up .... I write mostly in languages that allow you to deterministically rename things with refactoring tools, so I frequently rename important classes 5 or 6 times before I'm done.


Completely agree. I find that the right name takes ages to come about. Worse, it may require a lot of architectural refactoring that, many times, has nothing to do with the one entity you are trying to name. Instead, it is connected with the entire workflow you are designing. Nothing worse than spending ages coming up with the right name for a class, only to find out the entire class is not needed and you got the workflow all wrong :-) which I have done many a time, to be fair


This.

Not only renaming, but re-bundling entities, moving layers, changing abstractions – refactoring is crucial during and immediately after development. As you implement your idea, you will inevitably find a better way to express it, and it's crucial to be able to re-do these things as many times as possible to reach the best possible result (if it's not a throwaway prototype, of course): no maintainer, including yourself a month later, will have a picture as full and clear as you right after finishing the first iteration.


I've been around long enough to remember a time before refactor->rename was a thing. Now naming can evolve as the class/variable evolves so there's so much less initial cognitive overhead worrying about a name than there once was.


Looks more like a list of names to avoid, for me. Manager, handler, controller, and service add little to no information. I try to find more specific verbs instead.

Distributor is a better one sometimes. For example, ConnectionDistributor instead of ConnectionManager if it accepts connections and distributes them to a thread pool.



The thing is, a proper REST API is a Kingdom of Nouns. The core issue is that very few were doing object-oriented programming and design the way it was intended. Giving me some leeway here: Kay's original idea was a programming model similar to that of the internet, but in memory.

If one's position is that striving for a distributed model in memory is the wrong approach because of the inherent complexity of distributed systems, that's fine, but I rarely hear this argument. Typically, OOP/OOD is criticized based on patterns that aren't proper.


Agree on the functional bit.

Naming things: I try to not think about it for more than 10 seconds, and go with the best I've got by then. I find myself renaming things sometimes, and I'm eager to do this when a better name comes to me.


I use the thesaurus that comes with macOS whenever I think of a name and it just doesn't feel right. Then I typically find a more fitting name after spending 20 seconds looking at near synonyms.


I did try that. That is how I ended up with 50 "Manager" classes in my app.

At that point, it is a cognitive burden to handle so many "managers".


Shakespeare the software architect: "First kill all the Managers".


Before we shuffle off this mortal stack.


Yes, this. Naming is an intuitive thing; you can't force it, and forcing it will get in the way. Moreover, it's fluid and won't matter until later; things could change.

Just name it whatever and come back to it when it starts to matter more and you've probably thought of something better by then.


Renaming things is a luxury only enjoyed by people who don't have other people using their code downstream. Once people have used it, renaming things becomes a breaking change others in your organization will oppose.


Or hopefully your code reviewer suggests names that are better if your choice doesn't make sense


That is true. I am also the code reviewer at this time, but I do perform code reviews.


Pure functions can also easily be memoized, which can speed up processing.
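For example, in Python this is a decorator away, assuming the function really is pure:

    from functools import lru_cache

    @lru_cache(maxsize=None)   # safe only because the function is pure
    def slow_square(n):
        return n * n           # imagine something expensive here

    slow_square(12)            # computed once
    slow_square(12)            # served from the cache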


Another pet peeve of mine is variable names suffixed with "Data" and "Info".


Yes, but... sometimes you have data about data (aka metadata) or about a thing, in which case "...Info" is the best thing I know of. Say FileInfo or ProgramInfo (e.g. a CNC machine program).


Worst variable name I’ve ever seen was “data2”. There was no longer a “data” in that code but presumably there had been at one point.


Kill "Manager" of that list :). Adds to the naming burden (aka quality) by removing the fallback :).

I will adopt the list idea!!!


It's an old talk, but it zeroes in on a point that is tremendously overlooked in discussions about OOP/OOD.

https://www.infoq.com/presentations/Making-Roles-Explicit-Ud...


Pure functions are a good solution for an onion in a request / response pattern (which a lot of these scenarios are). However, for an onion architecture representing something with state (e.g. a shadow DOM implementation), sometimes an OO inner core is the right solution.


Would love to see your list of names if you’re willing to share!



It's not just pure functions. Two things break modularity: free variables and mutation.

The problem with OOP is that no method is truly pure, no method is a combinator.

    class Thing:
        def __init__(self):
            self.memberVar = 0

        def addOne(self):
            return self.memberVar + 1

The above is an example of your typical class. addOne is not modular because it cannot be used outside the context of Thing.

You can use static functions but the static keyword defeats the purpose of a class and makes the class equivalent to a namespace containing functions.

    class Thing:
        @staticmethod
        def add(x):
            return x + 1
The above is pointless. Just do the below:

    # thing.py (a plain module serving as a namespace)
    def add(x):
        return x + 1
The point of OOP is for methods to operate on internal state. If you remove this feature from OOP you're left with something that is identical to namespaces and functions.

Use namespaces and functions when all you have are pure functions, and use classes when you need internal state... there is literally no point to OOP if you aren't using internal state in your classes.


Scaling is not the only way to achieve high performance. In many cases performance can be achieved with stateful, imperative, yet efficiently executed code that economizes on the use of resources on a single node. In many cases, such as mobile, desktop, or on-prem deployments, this is the only way to "scale", since you do not have the luxury of increasing the number of nodes on demand.


I think the parent comment is talking about scaling in terms of software size, not performance.


I meant scaling of software development. As in, how to make sense of large amounts of code.


We have built software for a long time and we have learned a lot. There are many good ways (patterns) to reuse when you need to solve a particular problem. Uncle Bob speaks the truth.

For me, the challenge has never been with the architecture or our combined technical knowledge.

What I have observed as the main challenge is that most technologists start solving the problem before they know what the problem is.


This, so very much. It helps to spend time understanding a situation before assuming you have appropriate ideas about changing it. Management consultants (good ones may be rare, but they exist) have some pretty complex process models just to get a handle on the problem.


Personally, I have found that actually understanding the situation means I write code that solves it.


An equally large challenge is that solving a problem with software (and sometimes hardware) tends to lead to new problems that are experienced as part of the original problem.

Another equally large challenge is that the person/people with the problem to be solved doesn't actually understand the problem and is incapable of describing it in a way that makes it a priori understandable.


If the people asking you to solve a problem don't know what problem they want solved then there isn't much you can do except try to solve something and see if they complain.

You could argue that you should talk more with the stakeholders, but most of them don't have the skills required to accurately identify their own problems. Instead they need to see something running and then notice when things are missing, which is why we start building stuff without knowing what problem we ultimately are supposed to solve.


Business analysts appeared in the wild as a consequence of stakeholders not knowing or not having the necessary skills. BAs are experts at eliciting user requirements and transforming them into specifications and goals you can use to solve a problem.

Unless you know your goal you will never be able to deliver on your customer's expectations.


BAs work on business software, which is typically written to help support processes and workflows that lean toward structure, definition, repeatability, and so forth.

There's lots of software in the world (all creative software, for example) that doesn't share these attributes, and all the skills of a BA are useless.

The scope of software is larger than webdev and larger than business.


Completely agree, and for that you probably have a clear picture of what you want.

I whimsically develop programs for myself for things I could manually sort out in 5-10 minutes if done sequentially.


I don't think the example here is the best. There's a case to be made for extracting pure functions and organizing them like this, but I don't think this code makes it. The benefit of pure functions, IMO, is primarily that the code becomes easy to reason about when it doesn't depend on state. But any app that does anything will have state, and the question is how you manage it. One guideline could be that individual code units should reduce the amount of state you need to worry about at higher levels of abstraction.

In the example, there is hardly any code that does anything different depending on state. There's no state being managed, so there isn't actually any architectural problem being solved here. Should the API go down or change its format, the code breaks. The pure pluck_definition() will still fail to parse the JSON if the format changes. The pure build_url() will stop working if the API changes its URL format. They will pass unit tests, but fail in practice.

An actual problem to be solved here is to abstract away the details of the REST API, formatting, and network errors. One way to do this is to pack all of that into a component with a well-defined interface. You can still do the stateful/non-stateful split within the component if you want, but at the application level you need to apply that heuristic recursively at different levels of abstraction.
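For instance, a rough sketch of such a component (the class, endpoint, and error type here are my own illustration):

    import requests

    class LookupFailed(Exception):
        pass

    class DictionaryClient:
        # hides the URL format, the JSON shape, and network errors
        # behind one interface
        def __init__(self, base_url):
            self.base_url = base_url

        def define(self, word):
            try:
                data = requests.get(self.base_url + "?word=" + word).json()
                return data["Definition"]
            except (requests.RequestException, KeyError, ValueError) as exc:
                raise LookupFailed(word) from exc   # one error type at the boundary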


There is absolutely a problem here. Having worked in disasters of a code base, the architectural pattern in the first example is probably fine... until the software grows. The first function truly is a thing-do-er, which violates the SRP. Then it will easily become a ball of mud.

Why is this so bad? Not just because it's expensive, though that is bad; the largest issue with working in a ball-of-mud architecture is that the code becomes so fragile and interdependent that changing any one thing can easily break many other things. This leads to a culture of fear of change, which grows tech debt. Then one day someone steps up and decides to actually refactor this ball of mud to have some semblance of logic to it, what a noble soul. That person is then subject to a barrage of bugs and issues from the refactor and is at the mercy of their supervisor.

Dealing with state and other side-effect-like issues is certainly something to consider in architecture, but it is a different argument entirely.


Nice and all, but it feels quite pompous to tout this as a grand unified theory of anything. This is a highly specific little corner of software architecture, and it is neither grand, unified, nor even a theory.


I support this a bit; I would have different idols in mind, but I won't fault someone for theirs. Once something is learnt, it really doesn't matter.

The pain is the title...


Yes, sure, a little unfair and nitpicky, but still, the title baited me into clicking and left me unimpressed. I wonder if my impression would have been better if the content had matched the expectations.


Frankly, if you really thought a single article could deliver a "grand unified theory" of anything, that's on you.

People pick catchy titles; it's a key part of getting noticed these days.


And we have a word for it, clickbait. It’s okay that it exists. It’s also ok for us to call it out when we see it.


I have a past as a physicist, so I am a little sensitive to that phrasing. Optimally, a catchy title matches relevant content; you'd probably get a lower bounce rate that way too...


A few years ago I was on a team and the dev lead had a practice of sharing Gary Bernhardt's Boundaries talk every time there was some degree of rotation or churn on the team. Almost part of the onboarding. As a functional programming aficionado, I didn't need the sales pitch, but it was the presentation that was gripping.

Keep in mind, Ruby isn't a functional language, but here was this presenter describing essentially how you write good Haskell code, but in Ruby! A language that makes it so easy to mutate in place that it's known for libraries that "monkey patch" base classes and can change the definition of operators and functions at runtime!

Fantastic talk, fantastic presentation. I now share it with my new engineering colleagues too.

Here's the talk where he talks about imperative shell, functional core, "Boundaries": https://www.youtube.com/watch?v=yTkzNHF6rMs


The talk is fantastic. The interesting thing about it is that he advocates using a functional style in Ruby and quietly suggests that in the end the coder should "probably use Erlang". It turned out that José Valim was working on Elixir at that time, which basically became exactly what Gary was asking for (a Ruby-ish Erlang). It's a shame that Gary doesn't use Elixir.


What is a functional language - something that encourages programming in a functional style, or actively outlaws imperative programming?

IMO it's pleasant to use Ruby in a functional style, much more pleasant than Python owing to the ease of chaining calls that take lambda arguments (blocks). Every time I try to do something similar to a big chained Enumerator stream in Ruby in Python using a list comprehension, I have to invert my thinking - as far as I'm concerned, those things are written backwards and inside out.


I think Gary shared a Twitter related project that was written in this way, but do you have any examples of other projects by any chance?


Ah, I am blocked by Gary on Twitter and there is some irony that his Boundaries talk culminates in a Twitter client. (Gary if you're reading this I enjoyed your feed!)

I'm not aware of any good OSS examples, either.

I think React and Redux are in a way an implementation of these concepts. React is a view layer that sends messages and each component is (ideally) a pure function of state, and sometimes its own history. It's so, so easy to unit test the interactions.

Hooks make it a little harder to reason about, and it's unfortunately easy to write hooks that perform IO and you suddenly end up in a miasma of difficult-to-test code where you have to return to using a mocking library to "replace" IO functions with fake versions, and then you're on a slippery slope again toward mixing interaction and mutation in one layer.


"most projects in elixir". Some good open-source examples are oban, hex.pm, papercups.io


Along the same lines as the article, I've started thinking about all internal code as ETL.

There is some faucet, a transformation, and then a sink. Only at the sink does external state get mutated.

It helps because the transformations can be closer to pure functions, and you know there are no state changes until you've hit the sink/loader.
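A minimal sketch of that shape (all names here are invented):

    import requests

    def fetch_rows(url):                      # faucet: the only input I/O
        return requests.get(url).json()

    def normalize(rows):                      # pure transformation
        return [{**r, "name": r["name"].strip().lower()} for r in rows]

    def active_only(rows):                    # pure transformation
        return [r for r in rows if r.get("active")]

    def save(db, rows):                       # sink: the only mutation
        db.insert_many(rows)

    def run(url, db):                         # glue: faucet -> transforms -> sink
        save(db, active_only(normalize(fetch_rows(url))))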

Not sure yet - I haven't concluded it's the right way for me. It has saved a few headaches here and there, but I'm sure I've unknowingly caused more elsewhere, just yet unseen.

Also, for the author, my favorite word: idempotent - given the same inputs, you always get the same output.


Idempotent means feeding the output back in as the input will result in the same output. It might require purity, but is conceptually distinct.

For example, boolean negation is a pure function, as it depends only on its input and has no side-effects, but it's not idempotent, since: not(not(var)) != not(var)
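To make the distinction concrete in Python:

    # abs() is pure and also idempotent: applying it twice changes nothing
    assert abs(abs(-3)) == abs(-3)

    # boolean negation is pure but not idempotent
    neg = lambda b: not b
    assert neg(neg(True)) != neg(True)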


The fundamental problem of writing about architecture: the concerns about decoupling and layering are only relevant in complex software, but an article needs to present simple examples to be readable.

In this case, the first listing is simple and easily readable.


[Meta]

The introduction to the article relies on so much context and assumed knowledge that I almost didn't read past the third paragraph.

Who are Uncle Bob, Gary Bernhardt, and Mr. Brandon Rhodes? I have no idea, and I don't need to know who they are to read the rest of the article. (Which itself is extremely well-written.)

IMO: Have an introduction that doesn't rely on unneeded context. It's appropriate to credit people; but do it in a way where a reader unfamiliar with the context doesn't assume that they need to know the context to read further.


The author could stand to flip the sentences around a bit, but the introduction is basically trying to say "these three thoughtful guys collectively formulated some useful ideas about software architecture and in this post I will present those ideas".

It is important to attribute ideas to their originators (or at least the source from which one has learned them), and it is good that the OP is doing that.

I agree it could have been done a little more gracefully, though, with just a minor tweak to use phrasing that doesn't seem to assume the reader is familiar with those dudes.


Is it just me, or does it feel a bit "dishonest" to have """data = requests.get(url).json()""" in example 3 but use two lines for that in example 1?


I prefer the first example. It's more straightforward and avoids accidentally creating dependencies (for example, some other code calling pluck_definition) that will make it harder to modify when you need to add features or the API changes.

Testing pluck_definition by itself is completely pointless since it does nothing on its own. This is the “test public interfaces, not private implementation” principle.

Similarly, build_url and pluck_definition need to be coupled because they both depend on the specifics of a third-party API. Keeping them together makes it much clearer what the expected output is, so that if something breaks you know which URL to check and what the response should look like.

I also dispute that it's hard to unit test the first function - you would simply mock the API response and then you have a great test. Much better than having separate tests for tiny helper functions that do nothing on their own.
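For instance, with unittest.mock (the module name "dictionary" and the response shape are invented here):

    from unittest import mock
    from dictionary import find_definition

    @mock.patch("dictionary.requests.get")
    def test_find_definition(mock_get):
        mock_get.return_value.json.return_value = {"definition": "a greeting"}
        assert find_definition("hello") == "a greeting"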

Now maybe this example is just too simple and the presented architecture makes more sense on a larger codebase, but if so, then the article is poorly written. Examples need to be realistic enough not to obfuscate.


> I also dispute that it's hard to unit test the first function - you would simply mock the API response and then you have a great test.

How many times have you seen a test pass or fail when it shouldn’t have because the mocks didn’t match the actual code? It’s a useful technique but it has drawbacks which are not easily prevented.


You would still need to use mocking to test the imperative shell


This, here, is what I meant!


We use mocks a lot on my team, and I don't think we've ever seen what you describe. Maybe you're using mocks incorrectly?


You might also want to consider the possibility that you have problems but haven't noticed them yet, or that your experience is not a global truth applicable to all projects.

If you think about the problem more, ask yourself how it's possible to mock a remote HTTP call without the risk of this problem. I've seen people write frameworks which periodically save and cache responses, but unless you regularly perform some kind of validation there's no way to avoid it.


To me, a software architecture is an implementation of the decisions made (amongst competing choices) in meeting its requirements, both functional and non-functional.

I simply don't find the article to provide a grand unified theory across different genres of software at all. What worries me is that newcomers to the industry will think it does, until they move on to the next newly discovered silver-bullet unified theory.


While I agree there is no silver bullet, most applications will have a lot of similar nonfunctional requirements - API, data processing, and I/O. Having a common basis to go to - and then evolving it - is a good practice. Cowboying or reinventing the wheel is not a good practice.


It's a good theory for writing general code that can be easily tested and modified in the future.

This is the first thing you need if you want to meet any business requirement at all.


I find it amusing how these software architecture gurus always demonstrate their teachings with a cookie-cutter CRUD app. There is software that does things other than make REST calls... Show me how you'd implement a basic MS Paint clone and we can talk!


If you wanted to be able to test your MS Paint clone, you might implement your drawing core without being dependent on a window existing. An off-the-cuff approach might be to have the core define a canvas interface with the rendering primitives it desires, and provide a suite of functions that will translate user input actions to canvas rendering primitives.

A purely functional approach would involve a core of functions taking in a current state and returning an updated state and a list of canvas actions to take.

Functions might include "select ellipse tool" and "set pen width to 5" and "click on canvas at point (3,15)", and return canvas primitives like "render ellipse from (3,15) to (47,99) with stroke width 5, stroke colour black, and no fill".

Unit testing of the core should obviously be quite simple to do, being totally independent of the GUI system.
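A rough sketch of such a core in Python (the state shape and event names are invented):

    def update(state, event):                     # pure: (state, event) in,
        kind = event[0]                           # (new state, primitives) out
        if kind == "select_tool":
            return {**state, "tool": event[1]}, []
        if kind == "set_pen_width":
            return {**state, "pen_width": event[1]}, []
        if kind == "click" and state["tool"] == "ellipse":
            if state["anchor"] is None:           # first corner of the bounding box
                return {**state, "anchor": event[1]}, []
            prim = ("ellipse", state["anchor"], event[1], state["pen_width"])
            return {**state, "anchor": None}, [prim]
        return state, []

    state = {"tool": None, "pen_width": 5, "anchor": None}
    state, _ = update(state, ("select_tool", "ellipse"))
    state, _ = update(state, ("click", (3, 15)))
    state, prims = update(state, ("click", (47, 99)))   # prims: one ellipse primitive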


This is the feeling I get most of the time in these 'let me explain software architecture in one post' articles. You see, I agree with pretty much everything in the article and write code like that all the time, because yes, it does work. But you learn stuff like that automatically after a couple of years. Still, this is just one simple function; the actual hard part is scaling that to 1k+ functions while still maintaining something clean and understandable. Sure, you can draw this in an onion diagram, but actually executing that in real life is something else, and I almost never see practical examples of that in online articles - simply because it isn't possible, I guess, only in actual code or in more elaborate books.


This happens in a lot of tech talks too. Also a problem with explaining design patterns.

I wish there was a talk that took something like Microsoft Word, deconstructed it, and explained how someone could program it from first principles.


I don't think that would be particularly useful. In my experience, most of the problems you need to solve with a big application like Word are either very particular to the problem space (e.g. word processors, 3D game engines, etc) or very particular to that specific codebase (i.e. how engineers dealt with the decisions made earlier by previous engineers).

The Word codebase is no work of art, I guarantee. Like any codebase of any size, it's more of a ball of duct tape and baling wire.


> Show me how you’d implement a basic MS Paint clone and we can talk!

A MS Paint clone is one of those apps that fits rather nicely on a Clean Architecture.

You have the domain model (raster image), you have the application layer (image processing operations over the raster image), and you have the IO/service layer (image exporters/importers, GUI, platform-specific features, etc).


Are you actually a programmer?

You sound more like an academic with no real life experience on large software.

You do realize that within those "neatly separated layers" of yours there will be an insane amount of complexity which will be impossible to separate into layers?


> Are you actually a programmer?

Why, yes I am.

> You sound more like an academic with no real life experience on large software.

Well, I do work as a software engineer for a FANG, so that does fit the bill I guess.

> You do realize that within those "neatly separated layers of yours" there will be an insane amount of complexity which will be impossible to separate in layers?

Well, there really isn't, you know?

I mean, unless you don't know what you're doing.

In my line of work, the hardest problem is to rein in complexity. It's very easy to mess up and end up with an insane amount of complexity, especially if you don't have a clue about what you're doing.

In fact, as the saying goes, incompetent developers make complex things, while competent developers make simple things.

It seems you make a living struggling with complex things that end up piling on a lot of complexity. Perhaps you would do yourself a favor by improving your skillset and learning how not to follow that path.


Trust me, I'm considered a very good developer, and simplification is something I always put as a top priority.

You, on the other hand, still believe you can make everything totally manageable forever, always, and in any situation.

Your academic tone betrays you: you obviously believe you can always split things into neat little parts, then write books about how even in production this is so easy and always "just a bit of layering and it will work".

Anyone who has worked on real production systems over the years knows that unless your company has infinite money for developers and infinite time, you will sooner or later get parts onto which complexity starts getting layered.

But hey, "just add a few layers".

Also, being a FANG developer means nothing. I've heard stories about FANG company developers; they are not very FANG-like, to say the least.


What is it that makes you think you couldn't implement MS Paint in this architecture? I've seen all sorts of things built with this sort of architecture, ranging from dungeon crawlers to distributed virtual machine orchestrators to file system browsers.


E.g. I don't see how the text tool would work. Where's the transient state stored while the text is being entered but not yet finalized? Similarly, how would drawing a line work? That is, drawing a long squiggle that is continuously updated while drawing, but will still be undone/redone with a single Undo/Redo operation.

I simply don't see how this fits the functional core + imperative shell pattern.


The imperative shell could be something like

    while not is_closed(state):
        # blocking call: wait for the next mouse/keyboard events
        mouse_activity, kbd_activity = get_mouse_and_kbd()
        state, is_modified = update(state, mouse_activity, kbd_activity)
        if is_modified:
            draw(state)
There is no reason that `update` and every function it calls cannot be pure. The state will have to include undo and redo histories, an indicator of the active mode, so that you might be in one mode while you're entering text, a different mode while you're dragging out a line or a shape, etc. It will include information about what drawing tool you have selected and what colors you've chosen for your ink and eraser. All that data will be used by the draw procedure to render the correct view of the state.

The one issue here is that once your state gets to a certain level of complexity, as it certainly does in MS Paint, it's not going to be performant without immutable "persistent" data structures, which are either not available or not widely used in Python/JS/Ruby/C.

If you're thinking I haven't achieved the goal here because the draw procedure is big and complicated and imperative, then maybe we need to replace

    draw(state)
with

    view = render(state)
    draw(view)
where render is pure and does almost all the work. But I'll leave thinking about what render looks like as an exercise for the reader (or to be described by an HN user who knows more about graphics programming than I do).


See the sibling comment for specific details.

From an explanatory POV: I suspect you're thinking that the mutable text state should be "core", but it's actually not; it's part of the "shell". Think about a RESTful frontend providing a UI to a (mutable) database holding the state: the database is conceptually the "core" of the program, but in the parlance of the architectures discussed here it would be in the "shell".

It is a bit confusing, unfortunately, and it's one of the things I dislike about this classification scheme.


I think the idea described by the OP is how you're supposed to organize code when programming in Haskell. Having most of your application logic in pure functions makes it easy to test and reason about.

While this is possible to do, it is easier said than done. Organizing the code this way requires a lot of time dedicated to thinking/rewriting, which is often not available in the usual corporate environment with strict deadlines.


I think it only requires a change of mindset for a developer. Truly learning TDD will usually lead to functional code, even if you do not write your tests first.

E.g. I can't imagine what would be hard to achieve for an MS Paint clone using this approach. There are always things with side-effects (namely I/O), agreed, but in this CRUD example the integration bits are coupled in a way where you need a single trivial integration test. A similar approach can be applied to a GUI app, so I am not sure why the GP feels it can't be. Any concrete things where you think you can't apply it?

Sure, you'll have many more small functions, but I think the core takeaway should be that you can always structure code in a way where integration functions are simply statement lists (e.g. no control-flow logic in them). That will usually require the rest of your code to be functional.
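For example (reusing the article's build_url and pluck_definition; the printing is made up):

    import requests

    def handle_lookup(word):                  # integration: a plain statement list
        url = build_url(word)                 # pure
        data = requests.get(url).json()       # I/O
        definition = pluck_definition(data)   # pure
        print(definition)                     # I/O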

And yeah, it's harder to adapt existing code to this pattern, but introducing new code following the functional pattern into an existing code base is trivial.


> Any concrete things where you think you can't apply it?

I'd say that you can apply it almost everywhere; however, it is not friendly to most people who will maintain the code, and that is a problem, since most software will have many maintainers over its lifetime.

This reply to another comment I made on this thread should elaborate a bit more: https://news.ycombinator.com/item?id=24917764


Oh sure, there's always going to be some friction! So let me go on a tangent here.

While the current education system is geared toward procedural programming for the most part (I imagine most theoretical computer science curriculums only focus on functional programming and lambda calculus too, but even then, only very late and very theoretically), the question is more whether it is a better approach when applied "universally" (with non-functional languages, it's unlikely to be really pure).

If we deem that it is, functional proponents like me (and you, it sounds like) should push for it to get better coverage in universities than e.g. OOP, even for OOP languages. Most of academia is outside the software industry, so should we educate them or not? And if the answer is yes, how best to do that?

I do have a worry that some of it is also incomprehensible to some people, or that the barrier to entry is higher. Is such purity more reserved for those that also like mathematical abstractions?

Now, the biggest problem I have with colleagues reviewing my code is that it seems too simple, and they would have introduced another 2 layers of indirection/abstraction, but they can't really say that anything is wrong with my approach. It's really hard to get them to jump out of their "OOP bubble".

The code is easy to maintain, but there is a big risk that someone will pop in and just turn it into one big side-effect mess that will be hard to maintain. But then again, that's what they would have done anyway; this has at least some chance of not becoming that :)


The only mantra about code organization I subscribe to is that it should be as simple as possible. Sometimes it means using several layers of abstraction like in this article. Usually it means just writing the damn thing in the most straightforward way because simple implementations are easy to adapt for future changes.


"Most things aren't rocket science."


With this architecture, it feels like the very outermost level - the framework and drivers - would ideally have little or no application-specific logic in it whatsoever, and would exist simply to glue together the various functional components with I/O and distribute them to whatever computational resources are required for their execution.

This, to me, feels very very similar to a model-based approach, where the outermost level is a modeling framework that does nothing more than route data between different functional components and IO components.

I have a strong hunch that this outermost layer does not need to consist of anything more than a suitable framework plus some configuration data specifying (a) the mapping from functional components to hardware resources and (b) the data flow between functional components.

If this is indeed the case, then extracting individual components or subsets of the system for testing should simply be a matter of providing a suitable transformation of the configuration data and re-executing the framework.
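As a sketch, such configuration might be little more than (all names hypothetical):

    config = {
        # (a) mapping from functional components to hardware resources
        "placement": {"parse": "worker-pool-1", "score": "gpu-node-0"},
        # (b) data flow between functional components
        "dataflow": [("input", "parse"), ("parse", "score"), ("score", "output")],
    }

Extracting a subset for testing would then amount to rewriting this dict (say, pointing "score" at a stub) and re-running the framework.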


Sometimes, the complexity of the world you're modelling requires there to be many, many `build_url`s and `pluck_definition`s piping into and out of each other.

I've found these organizing principles to be quite useful.

1. identify the key that your application is processing (in this case, it's `word`)

2. make it so that as many functions as possible take this "key" as their only argument.

3. whenever you are fetching data about a model, it is either directly or transitively in terms of this key.

4. push model-fetching (technically the I part of I/O) as far to the leaves of your program-tree as possible, hiding them behind "model client" classes, that are passed in to your job at construction.

5. Perform all upserts idempotently, atomically, and in the bottom-right corner of the execution tree. This makes it easy to reason about mutations, and also easy to omit when you're wanting to do "dry runs" or read-only local runs.

6. (4) means you will occasionally fetch the exact same model more than once in the scope of one execution. This is OK. You can optimize this later with execution-tree-scoped caching.

With this approach, the user-directed I/O (imperative shell) is at the root of your tree and very narrowly defined to be the key of computation. You can build a run-loop on top of this, or a CLI tool, or an RPC service. Its narrowness is kind to all kinds of interfaces.

The functional core is everything that isn't the leaves of the execution tree. These functions essentially compose model client calls, and only take the keys as inputs.

The leaves are a collection of reusable one-to-five-liner RPC requests / database queries. If you like, you can wrap these in a class that caches the queries, as described in (6).

The tests for leaves mock nothing if they target a DB. They mock the RPC service if they target an RPC service.

The tests for the functional core functions mock the model clients (leaf functions).

With this structure, as long as your tests are brittle enough to break when an RPC contract or data model changes, you do not need integration tests.

I do this in Python, but it would work just as well in any language with structs/interfaces. I've observed the same benefits proclaimed in the article, although my approach is a bit different.
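A condensed sketch of points (2)-(5), with every name hypothetical:

    class UserClient:                         # "model client": holds the leaf queries
        def __init__(self, db):
            self.db = db
        def email(self, user_id):             # one-liner fetch, keyed by user_id
            return self.db.fetch_email(user_id)

    def welcome_body(users, user_id):         # core: takes only the key
        return f"Welcome, {users.email(user_id)}!"

    def send_welcome(users, mailer, user_id, dry_run=False):
        body = welcome_body(users, user_id)
        if not dry_run:                       # mutation comes last and is easy to omit
            mailer.send(users.email(user_id), body)   # a second fetch; fine, per (6)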

If people are interested, I could write more on this, with examples. Let me know!


Shameless plug of my 2012 blog post where I demonstrate the Clean Architecture for Go applications: https://manuel.kiessling.net/2012/09/28/applying-the-clean-a...


Thank you! You have gone much deeper.


Way off the mark. "Software architecture" is about how you make the disparate parts of a software system work together towards a common goal. Most of these parts aren't code: hardware, programmers, tech support, HR, licensing and legal, etc.


I take it from this comment and others that some places call their engineering VPs/senior management "software architects" these days.


Software architecture involves HR? You can just change around definitions all you want, but don't expect the rest of the world to agree.


There is no consensus about the definition of software architecture. http://beza1e1.tuxen.de/definitions_software_architecture.ht...


None of those includes human resources like programmers as a part of software architecture.


Obviously it does. E.g., banks standardizing on Java because it's the only thing they can hire for, not for any technical reasons.


I think you're confusing a software architect with a systems or enterprise architect.

It's true, though, that software architect is a vague term in our unregulated industry. Many companies never have software complex enough to need an architect; and many struggle through lack of a global plan, with lots of local decision-making, and get slower and slower over time.


Looks like 'Software ate the world'.


Completely agree. Software does what it's told - that's easy; driving consensus across people is the hard part.


If it were easy, why not just implement the solution directly instead of spending time in meetings discussing how to implement it? The reason we spend time caring about software architecture is that writing software is hard, and anything that can make it simpler is very helpful - like structuring its architecture beforehand.


So applied Onion Architecture. I agree with this part.

However, the functional part of the discussion, while I agree with its benefits, is highly dependent on the domain within the onion.


Generally, to those of us who apply the functional approach everywhere, it comes naturally whatever the problem.

There are idiomatic ways to program in particular languages (even in Python or JavaScript) which are strictly against the functional approach, even though nothing in those languages prohibits it. It gets trickier with external dependencies which are "forced" on you too.

Do you have a concrete "domain" example that you think will be hard to turn functional (which basically means turn the integration points into minimal functions doing just the integration bits — basically, any side-effects are limited to those integration functions)?


Multithreaded embedded systems. Yes, the functional people are absolutely correct that shared mutable state is evil. If that's the world you live in, though, you need to deal with it effectively. You need to have thread 1 able to respond to an external event by changing shared state that thread 2 sees, but in a controlled way so that thread 2 never sees an inconsistent state.

Now, there is a place for pure functions in that environment. But making decisions based on state can run so thoroughly through the code that "functional core" leaves very little core.


"Multithreaded embedded systems" is not saying much.

My initial hunch is that you've got a lot of existing code that passes and mutates big global state objects around. Still, even your description clearly highlights an issue that is there regardless of whether you want to push for a "functional core": "decisions based on state ... run so thoroughly through the code" _will_ come back and bite you.

It also highlights an easy way to decouple those into simpler functional parts: identify which part of the state is really needed in each place, pass only that in, and have it return an updated state. Basically, the only change you are making is turning implicit parameters into explicit function arguments and return values. Turning the entire codebase around will be tricky, of course, but perhaps you can switch over chunks of it as the units are readied.

Of course, if you've got a lot of data that would be expensive to copy around, you might want to keep some of the logic for mutating that non-functional, but you could still decouple that thus making it have a functional core too.
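A before/after sketch of that move (all names here are made up):

    import threading

    lock = threading.Lock()

    class Shared:                             # stand-in for the shared state object
        mode, output = "armed", None

    def handle(event):                        # placeholder for the real work
        return f"handled {event}"

    # before: decision logic reads and mutates shared state directly
    def on_event(shared, event):
        if shared.mode == "armed":
            shared.output = handle(event)     # mutation buried in the logic

    # after: the decision is a pure function of only the state it needs
    def next_output(mode, output, event):
        return handle(event) if mode == "armed" else output

    # imperative shell: the one place shared state is swapped, under the lock
    def on_event_v2(shared, event):
        with lock:
            shared.output = next_output(shared.mode, shared.output, event)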


Here's a router for television signals. It's got 100 different video sources, 80 different destinations, and "layers" (you can route audio differently from video, though you usually route them together).

You have 6 or so different sources of control (different panel systems, automation systems that use serial interfaces, other automation that uses Ethernet). Each of those is a different thread.

All those different sources of control need to see the same image of what's connected to what. So when one thread makes a change, it has to change for all the threads.

You could think about separating that state into parts, but does that really gain you anything? If you've got 80 variables that behave identically instead of one 80-element array, are you really ahead?


A shadow DOM. A state manager ;). A protocol implementation which needs state. There are cases where the domain has state. Like I said, for request/response cases - which are 90% of everything we program nowadays - this state is typically loaded from somewhere else. I am also not particularly arguing for OOP here. It is just the absolutism that is an issue.


A DOM is a huge hierarchical data structure and not much else (sure, it has function pointers too, but that's pretty much it).

You can easily turn things into substructures and have functions only work on those: whether you pass by reference or value is up to you and your choice of language, but even when passing by reference (to avoid memory copying), you can write functional code.

Again, it sure is non-idiomatic for most languages, but that does not mean it's impossible or even hard.

As far as "absolutism", generally, aiming for minimal statefulness will help you write more maintainable code (citation missing :).


What about an "update(state, event) -> NewState" design for state machines and stateful protocols?
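For a toy protocol, that might look like (states and events made up):

    def update(state, event):                 # pure transition function
        transitions = {
            ("closed", "connect"): "handshake",
            ("handshake", "ack"):  "open",
            ("open", "close"):     "closed",
        }
        return transitions.get((state, event), state)   # unknown event: no change

    assert update("closed", "connect") == "handshake"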


Frankly, find_definition should additionally be modified to take a function that consumes a URL and returns JSON data. This way it's easily unit-testable.


That would be dependency injection. It is a valid way to design around side-effects.

Another (equivalent) way to test it is to mock out requests.get and response.json using a mocking library: instead of performing real requests, do what the test wants (return correct data, return unexpected data, or throw an exception).
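A sketch of the injected version (reusing the article's helpers; the default argument keeps existing call sites unchanged):

    import requests

    def find_definition(word, get_json=lambda url: requests.get(url).json()):
        url = build_url(word)                 # pure
        data = get_json(url)                  # the injected I/O boundary
        return pluck_definition(data)         # pure

    # a unit test then needs no network, e.g.:
    # find_definition("hello", get_json=lambda url: {"definition": "a greeting"})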


"Functional Core / Imperative Shell" is also why I love a combination postgresql/nodejs stack. The transactional sql or plpgsql surrounding the state makes dangerous operations much safer and clean. Then as you get farther from risky state changes, you get the productive flexibility of less strict javascript.

You also get the option of putting complex operations in plpgsql functions when you want them executed atomically, without any side effects, corruption, or cleanup requirements when one of the steps fails. Your default failure mode becomes "You got an error, everything was automatically reverted, you can safely tweak your request and try again".

Also, operations being performed in a monolithic DB means that the different pieces of data are much more likely to be loaded in adjacent memory caches and ready to be joined, combined, and processed, providing significant efficiency and speed gains.


Running code in a database is basically the opposite of the software architecture covered in the article.


I tend to write in a modular, layered fashion (I call it the "layer cake" pattern), and I like to try breaking up large methods to reduce CC (cyclomatic complexity). For example, I might break a large switch statement up into "sets" of handlers.

The main motivation for this, is because I use what I call “evolutionary design.” I tend to refine design as I progress through development, as opposed to having it substantially complete at the start (A lot of classic developers probably just defecated masonry at the very thought of that, but it WFM). Having a finer granularity goes a long way, in supporting this methodology.

It also helps a lot for refactoring, improvements, and testing. The overall quality of my products is drastically enhanced by modularity.


But it's all rules of thumb and common sense, largely. When are we going to get an axiomatic theory of software engineering/architecture which shall allow us to argue about and compare different architectures for a solution and arrive at an ideal one?


I don't think you'll be able to do that.

What you can do is define a starting-point app in the desired architecture. Then create a list of functional changes to go from the starting point to the desired state.

Then you can use various stats for each architecture on how complex the changes were.

Make the functional starting point, the functional list of changes, and the stats a standard.


The author makes an observation that has been growing on me in the last few years; many ORM models really prevent you from using an architecture like this, because your “pure core” actually has to have logic for DB operations like saving/filtering.

It’s possible to build a DDD Repository in Django or Rails, but really a lot of work. I think the frameworks like sqlalchemy and NHibernate seem to do things a bit better by tracking “dirty” models, hiding the “ORM-ness”, and letting the higher levels control when to flush/write to the DB. But the more you abstract this stuff, the more you lose the auto generated sugar that makes frameworks like Django so productive.


Absolutely.

[Uncle Bob is clear](https://blog.cleancoder.com/uncle-bob/2012/08/13/the-clean-a...) that the inside circles must not know about the outside world.

That is, your business logic is not allowed to know what a DB operation is, nor should it do any DB operations behind the scenes.

> Frameworks and Drivers.

> The outermost layer is generally composed of frameworks and tools such as the Database, the Web Framework, etc. Generally you don’t write much code in this layer other than glue code that communicates to the next circle inwards.

> This layer is where all the details go. The Web is a detail. The database is a detail. We keep these things on the outside where they can do little harm.


The tension I observe is: when should you use a highly productive and well-known framework like Django or Rails, and when should you build your own DDD/Clean architecture?

This discussion has gone on for a long time, e.g. see the back-and-forth around https://dhh.dk/2014/test-induced-design-damage.html after Weirich demonstrated what sort of thing is required to use the Hexagonal Architecture with Rails.

I believe that if your system is sufficiently complex, you'll start to see the benefits of a more structured architecture, but it's a net drag on productivity for small projects. These days perhaps you chop your monolith into microservices before reaching the ROI point for Clean/Hexagonal?

I could believe that a "framework-first" architecture is more productive while you have <100kloc. Or maybe it's <10kloc, I don't know. I'm certainly seeing the architectural strain with a Django-first architecture in the current 150kloc monolith I'm working on, and chopping into a few services makes sense there for other reasons, so it might obviate the need for a bigger refactor onto Clean/Hexagonal.


Software architecture, as I understand it, applies to a particular software product/solution. Architecture is your opportunity to choose which problems will be easy and which problems will be hard. For example, a microservices architecture makes HA easy but global consistency and data locality/caching hard. A centralized database architecture is the mirror image.

The article discusses architecture in the abstract, which sounds more like a programming model or paradigm (OOP, functional, actor model, imperative), or maybe a pattern.

Wouldn't it be weird to discuss building architecture in the abstract, without a notion of the site to build on or the purpose of the building?


I had discovered this recently too and asked a relevant StackOverflow question about how people deal with lots of I/O.

https://softwareengineering.stackexchange.com/questions/4160...

The premise is that the imperative shell can become pretty riddled with decisions if your application contains a lot of I/O.

Some like the "Free Monad" solution, but I found that to introduce too much coupling.


It seems like a joke to claim something is a "grand unified theory of ...".


I think software architecture is the wrong word for this. It's module structure; architecture to me includes all the surrounding bits concerning the "ilities" (Availability, Interoperability, Modifiability, Usability, Testability, Security, Performance). I say this because getting the module structure right is important, but it is not the only factor in a successful application.

My favorites in regard to the topic of module structure are David Parnas and Juval Lowy


"Write code. Not too much. Mostly functions."


I always really liked Avdi Grimm's "Confident Code" talk: https://www.youtube.com/watch?v=T8J0j2xJFgQ

It has similar conclusions, but does it from the frame of the "story of the code" rather than abstractions like "functional core" vs "imperative shell"


When I see a subroutine with a verb in its name I think of side-effects. In my book a pure function should be named after the result it returns, so in this case I would use the name `definition' instead of `find_definition' and `definition_url' instead of `build_url'. For predicates I try to avoid an "is" prefix when a simple adjective is sufficient.


Map, reduce, filter, fold, project, transform, group_by, bind, apply - any functional API you care to look at is all verbs.

Functions do work. That doesn't mean they have to have side-effects; if they're functional, they do work on the input and produce output. Doing is a verb. It's natural.

Using a noun as a function name is at best justified when you have a situation where you want to hide whether data is being calculated on demand, or fetched from some storage or lookup table. Some languages bake this in, in the form of properties - attributes of a structure which look like fields, but are actually functions. Those things have nouns as names.


It depends on the culture and the programming language. The advantage of using noun phrases for pure functions is that

1. only by reading the name of the function you know it's a pure function

2. a call to a pure function in an expression reads more naturally since operands are values and values are nouns.


Too bad you're heavily downvoted. The principle "function name is a noun if it returns something, and a verb if it doesn't" is super useful - just by looking at its name you know immediately if it has side-effects. Since I learned it, I apply it all the time in programs I write alone, but I almost always see pushback in a team, because people are unfortunately used to seeing `getX`, `fetchY`, etc. as method names.


I guess a verb phrase comes up naturally when you are eager to implement a new function and think of all the steps required to calculate the result. However, a noun phrase is a more proper abstraction, since the implementation may change to simply return a cached value (see also the Uniform Access Principle).

Anyway, here is my naming strategy:

1. Boolean pure function

Use an adjective phrase where the adjective is the last word, for instance UrlValid or DefinitionFound.

2. Non-boolean pure function

Use a noun phrase where the noun is the last word, for instance CurrentDefinition or DefinitionUrl.

3. Non-pure "function"

Use a verb phrase where the verb is the first word, for instance PrintError or ReadInput.


I don’t think this is true at all. The most functional, side-effect-free functions are often just verbs (map, reduce, concat, etc).


My theory is software should be architected with debugability, deployability, and developer friendliness in mind. If you can't deploy your solution on a predictable schedule, with multiple teams working on it in parallel, and keep stakeholders happy, then you are on a fast path to irrelevance no matter how "sophisticated" the architecture is.


> data = response.json()

That isn't IO, that's encoding. Don't you want to verify you are receiving JSON (whatever interface you are assuming is available) in pluck_definition? How far down do you want to go with this?


I think what I would mostly need is:

- dependency inversion: try to push decisions as late as possible by using this technique + interfaces

- separation of pure functions vs side-effect functions

- static typing that can help me with structs and interfaces with optional values

It ends up with a lot of typing but I think it is worth the trouble. Down the road it is easier to maintain and it already saves you time anyway.


This is great - thanks for sharing


Stop calling this theory. This isn't theory. This is just rules of thumb and an opinion.

You want actual theory? Here's theory:

http://www4.di.uminho.pt/~jno/ps/pdbc.pdf


It's theory in the analytical/interpretive sense, not the empirical scientific sense.


The paper I presented is not theory in the scientific sense. It's theory in the mathematical, proof-based sense. There are really two uses of the word "theory" in this context: science-based theories proven by statistical experiments, and logic-based theories created from a set of elements and assumed axioms.

Things like this shouldn't be titled with "Theory." Call it your "opinion" or a "design pattern." Theory is associated with things like the theory of gravity (science) or game theory (math). This title is wildly inappropriate and shows a lack of understanding of what constitutes a theory.

The fact that this post is voted up shows how much the public misunderstands "theory" and how they mistake things like this which is actually just someone's qualitative opinion with the integrity of an actual theory.

Using big words and drawing diagrams does not lend any formal legitimacy to your opinion.


Another "academic level" demonstration that doesn't work in practice.

It's so simple to say "look how awesome this approach is" on a 50 line program, "just use pure functions everywhere".

In real life things get very complex because there are 200 working parts interconnected. And not because "it's bad design". But because that is the requirement. Soon you get pure functions with tons of parameters or parameters that are complicated classes/structs themselves because the work that needs to be performed is very complicated.

Then you get to the issue of "this function gets the entire class as the input param but it only accesses a small number of members in that class".

"Yo bro just split the class into multiple smaller ones". Sorry bro can't do, those separate classes will need to be acccessed as a whole sooner or later in another part.

People act as if there will ever be a thing such as a "perfect programming design". There won't because things will always evolve & change. Real life programs are simply too complex.


> In real life things get very complex because there are 200 working parts interconnected. And not because "it's bad design". But because that is the requirement.

These sorts of things are always better discussed in terms of specific cases instead of generalities, but I think one of the key parts of good software engineering is factoring the requirements into the simplest design possible.

And even when time or other considerations prohibit creating an extensive design ahead of time, practicing a few well-chosen heuristics like "use composition", "keep functions and methods short", "use clearly defined and consistent terminology in your abstractions" can go a long way to making code less unruly and easier and cheaper to refactor down the road.


Sure, real life programs are very complex. But that doesn't stop you from testing and refactoring whatever local point you need to touch, so that you can get the job done easier.

My hunch is that moving pure functions and pure objects out of the spaghetti will gradually eat away at the spaghetti.



