Developing Service Oriented Architectures (lethain.com)
115 points by hachiya on March 20, 2014 | 34 comments



If you're considering splitting your monolithic application into an SOA, you should seriously consider the testing and debugging implications.

If your call graph goes more than one level deep, then integration/functional testing becomes much more complicated. You have to bring up all of the downstream services in order to test functionality which crosses that boundary. You also have to worry a lot more about different versions of services talking to each other and how to test/manage that. The flip side is that the services will be much smaller, so leaf nodes in the call graph can reach a higher level of test coverage than a monolithic service.
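One common way to keep tests tractable when the call graph is deep is to test each service against in-memory fakes of its immediate downstream dependencies. A minimal sketch (all class and method names here are hypothetical, not from the article):

```python
class FakeUserService:
    """In-memory fake with the same interface as the real downstream
    user service, so tests don't need that service to be running."""
    def __init__(self, users):
        self.users = users

    def get_user(self, user_id):
        return self.users[user_id]


class OrderService:
    """Service under test; it depends only on the user-service interface,
    not on a live network endpoint."""
    def __init__(self, user_service):
        self.user_service = user_service

    def describe_order(self, order):
        user = self.user_service.get_user(order["user_id"])
        return f"{user['name']} ordered {order['item']}"


# In a test, inject the fake instead of a real network client.
fake = FakeUserService({1: {"name": "Ada"}})
orders = OrderService(fake)
print(orders.describe_order({"user_id": 1, "item": "widget"}))
```

This only exercises one level of the call graph at a time; crossing the boundary for real still requires bringing the downstream services up.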

Debugging and performance testing become more complicated because when something is wrong you now have to look at multiple services (upstream and downstream) to figure out where the cause of a bug or performance issue lies. You also run into the versioning issue from above: a new class of bug caused by mismatched versions, which either have tweaked interfaces or underlying assumptions that have changed in one service but not the other (because the other hasn't been deployed and those assumptions are in shared code).

The bright side for debugging and performance is that once you know which service is causing the issue, it's much easier to find what inside the service is causing it. There's a lot less going on, so it's easier to reason about the state of the servers.


It depends how you do SOA. We try to publish "events" rather than "call" another service and expect responses. We try to decide as much as possible in the service that publishes an event, so that information doesn't have to be returned. Other services act on that information.

Your kind of SOA sounds more like distributed RPC, which indeed is complicated.
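The event-publishing style described above can be sketched with a toy in-memory bus. A real system would use a broker like RabbitMQ; the event names and fields here are hypothetical:

```python
from collections import defaultdict


class EventBus:
    """Toy stand-in for a message broker: publishers fire events and
    never wait for a response; subscribers act on them independently."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self.subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # Fire-and-forget: nothing is returned to the publisher.
        for handler in self.subscribers[event_type]:
            handler(payload)


bus = EventBus()
sent_emails = []

# The publisher decided everything up front: the event carries all the
# data the mailer needs, so no information flows back.
bus.subscribe("order.placed", lambda e: sent_emails.append(e["email"]))
bus.publish("order.placed", {"order_id": 7, "email": "ada@example.com"})
print(sent_emails)
```

Contrast with distributed RPC, where the caller would block on `get_user(...)`-style calls and have to handle timeouts and partial failures inline.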


Yeah, if you can get away with that model, things are simpler. The best first step to take into SOA is offloading work that doesn't need a user response to a pool of workers (often by publishing to a message bus, as mentioned elsewhere in the thread). I've implemented systems like that using Rabbit and Redis and it worked fairly well.

However, some kinds of requests are fundamentally about integrating the results of a bunch of different services into a response to send to the user. In that case you somehow need to gather the results of your rpcs/events in one place to integrate them. An example is Google search where the normal results, ads, and various specialized results/knowledge graph data need to be integrated to present to the user.
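The scatter-gather shape described above can be sketched with stand-in coroutines; in a real system each would be a network call to a separate service (the function names are hypothetical):

```python
import asyncio


async def web_results(query):
    # Stand-in for a call to the core search service.
    return [f"page about {query}"]


async def ad_results(query):
    # Stand-in for a call to the ads service.
    return [f"ad for {query}"]


async def knowledge_graph(query):
    # Stand-in for a call to the knowledge-graph service.
    return {"entity": query}


async def search(query):
    # Fan out to all downstream services concurrently, then integrate
    # their results in one place before responding to the user.
    web, ads, kg = await asyncio.gather(
        web_results(query), ad_results(query), knowledge_graph(query)
    )
    return {"web": web, "ads": ads, "knowledge": kg}


result = asyncio.run(search("soa"))
print(result)
```

The integration point is exactly where the pure event model falls short: someone has to wait for all the responses.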

Another consideration is how much you want to be able to isolate services. If you have a user/auth service as in the article, which completely encapsulates the database and other resources needed for data about users, then you'll end up with a lot of calls into that service. It's a disadvantage for all the reasons in my original comment, but it's great from the perspective of being able to isolate failures and build resilient systems.


Ok, yes, in the case where you have to have all the information on one page. Another way is of course to get that information via an AJAX call, or to open an SSE/WebSocket connection to listen for events from the event bus. But there are of course cases where that's not feasible.

And in the case of auth systems what we typically do is to have a separate app for logins/authentication, then do simple SSO or domain cookie sharing and let each sub system handle the authorization.

My point is that not all SOA has to be as complicated as the article's. But if you go that way, yes, then all your points apply.


Jay Kreps' epic blog post on the Log should be required reading: http://engineering.linkedin.com/distributed-systems/log-what...

We do things very differently from what's discussed in the OP (and are heavily influenced by the Krepsian school of SOA). I'd write more, but I'm on a train with flaky internet.


Awesome link. Epic is the right descriptor. You could do much, much worse if you were looking to understand data integration in practice.

The links section at the bottom alone is a tremendous resource. I've stumbled across most of these over a period of years, and if you started by just skimming over this article and the links, you would save yourself many hard lessons.


For a slightly different slant on SOAs, I'd thoroughly recommend watching Fred George's talk from Oredev last year on the technical side of implementing micro-service architectures:

https://vimeo.com/79866979

along with his talk on Programmer Anarchy, which is more about the resulting team / working patterns:

https://vimeo.com/79866978

It's moved playing with much finer-grained service architectures way up my to-do list.


That first video is really interesting. It seems very reminiscent of a distributed tuple space architecture [1].

[1] http://en.wikipedia.org/wiki/Tuple_space


I've loved both of those talks - thanks for sharing them!


Other big benefits you get from well-defined interfaces include security (you can do application-level firewalling, statistics, and anomaly detection really easily) and testing (play back known traffic, literally generate every possible message and see what happens, etc.).
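The "generate every possible message" idea is only feasible when the interface's vocabulary is small and closed, which tight service interfaces often are. A toy sketch (the actions, resources, and handler are all hypothetical):

```python
from itertools import product

# A tiny, closed message vocabulary, as a well-defined interface allows.
ACTIONS = ["create", "delete"]
RESOURCES = ["user", "order"]


def handle(message):
    """Toy service endpoint: rejects anything outside the vocabulary."""
    if message["action"] not in ACTIONS or message["resource"] not in RESOURCES:
        return {"status": "error"}
    return {"status": "ok", "did": f"{message['action']} {message['resource']}"}


# Replay the full message space, including invalid combinations, and
# check the handler never does anything unexpected.
for action, resource in product(ACTIONS + ["bogus"], RESOURCES + ["bogus"]):
    reply = handle({"action": action, "resource": resource})
    expected = "ok" if action in ACTIONS and resource in RESOURCES else "error"
    assert reply["status"] == expected
print("all 9 messages handled as expected")
```

With a monolith's internal call surface this kind of exhaustive enumeration is rarely possible.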

Also, HTTP can be a poor choice in security terms because of its complexity baggage (cookies, headers, methods, DNS, SSL, etc.). Alternatives such as MQs can be worth considering, especially at later points in growth, since they can handle complex topologies with ease.

I believe the author could benefit from distinguishing between stateful and stateless in his description of dumb and smart API clients, since state is the main source of the assumptions here.


So after going back and forth with our startup on SOA or not, I have felt like separating these services out of (in our case) a monolithic Rails application has made sense. I would like to use Sinatra for one service, rails for the web interface, and possibly Python for the last service. I understand this adds a lot of overhead by creating an interface and authentication for each service.

For me the logical division of services seems to make sense, especially when using "the right tool for the right job".

I may be over optimizing, but I think it will pay off at a later date with more developers and the need to scale each service separately. Maybe someone can point out issues with this thinking (besides those addressed in the blog post).


To me the crux of the issue is the interface. If you can define a very clean interface without having to do a lot of contortions to get the data you want where you need it, then extracting a service can be relatively low overhead. But what often happens is that at the 30,000-foot view it looks like a service makes sense, but when you get into the details you realize the separation cannot be as clean as you first envisioned.


I'd disagree with this. In my experience, every time it has seemed like the interface was too complex to support services correctly, it's been because the split between services was at the wrong level of abstraction. SOA tends to work best when you define discrete chunks of functionality and each service is only responsible for that chunk. Just like developing testable code, you want to make those chunks as small as possible without losing your mind at the sheer number of services. For example, having an order service and a shipping service, as opposed to just an order service that handles everything, is more likely to make sense IMO.


Your argument doesn't seem to address my point. You're saying if the interface isn't good you're at the wrong level of abstraction. Okay. How does that contradict the idea that the interface is everything for creating a successful SOA?


I'm saying that it's rare that SOA isn't a proper fit for a web application and if it seems like a poor fit you likely haven't abstracted your interfaces to the right level. Your monolithic app is going to suffer from poor design just as much as an SOA based one. Your point seemed to be that certain applications could be well designed and still not a proper fit for a SOA. I think if an app is well designed it will by default be a proper fit for SOA. If you were not implying that I apologize. I guess my position is that nearly any complex web application would benefit from SOA.


My point was really nothing to do with whether an application is a fit for SOA. It's more about how to design an SOA. Your point about a poorly designed SOA being an equally poorly designed app is well taken, but I think it's more work to design a good SOA than a good monolithic app. And this is where the effort of designing the interface comes in.

In a monolithic app, interfaces can be more fluid because you can have automated tests, static analysis, and compile-time checks verifying that a given change works. That means you can prototype and iterate faster while the business requirements may still be churning considerably. To realize the benefits of an SOA you need a much more stable interface and some way to handle validation and correctness of the wire protocol. If you do it right and come up with a stable interface, you gain the benefits of decoupling system administration, scalability, and even development to a great extent. But if you do it wrong, you end up with a lot more work for an equivalent business-logic architecture, and if you don't have any scaling issues then the cost-benefit is likely not to be there.
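One lightweight way to handle wire-protocol validation is to version the schema and check it at the service boundary, so a mismatched deploy fails loudly instead of silently corrupting state downstream. A sketch, with hypothetical field names and versions:

```python
# Hypothetical schema declaration for one message type, version 2.
SCHEMA_V2 = {"version": 2, "required": {"order_id", "amount", "currency"}}


def validate(message, schema):
    """Reject messages whose declared version or fields don't match
    the schema this service was deployed against."""
    if message.get("version") != schema["version"]:
        raise ValueError(
            f"expected schema v{schema['version']}, got v{message.get('version')}"
        )
    missing = schema["required"] - message.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return True


# A matching message passes; a v1 producer talking to a v2 consumer is
# rejected at the interface rather than deep inside the business logic.
ok = validate(
    {"version": 2, "order_id": 1, "amount": 5, "currency": "USD"}, SCHEMA_V2
)
try:
    validate({"version": 1, "order_id": 1, "amount": 5}, SCHEMA_V2)
except ValueError as e:
    print("rejected:", e)
```

Real systems typically use a schema language (JSON Schema, Protocol Buffers, Avro) for this rather than hand-rolled checks.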

My rule of thumb about whether an SOA is a good idea to pursue at a given point in time is much more related to the team size than the nature of the app.


If you use a message queue system like RabbitMQ and simply publish messages as JSON, you get interfaces, authentication, etc. for free. We have 17 different services internally, in four different languages, all communicating over AMQP with JSON messages.
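The cross-language interoperability here comes down to the message being nothing more than a routing key plus a JSON body, so any language with a JSON parser can participate. A broker-free sketch of that convention (the routing key and payload fields are hypothetical):

```python
import json


def encode(routing_key, payload):
    """Serialize a message as it would be published to an AMQP exchange:
    a routing key plus a UTF-8 JSON body."""
    return routing_key, json.dumps(payload).encode("utf-8")


def decode(body):
    """Deserialize on the consuming side, whatever language that is."""
    return json.loads(body.decode("utf-8"))


key, body = encode("user.signed_up", {"user_id": 42, "plan": "free"})
assert decode(body) == {"user_id": 42, "plan": "free"}
print(key, decode(body))
```

In a real deployment the `encode` result would go to something like pika's `basic_publish`, and consumers in Ruby, Go, etc. would run the equivalent of `decode` on delivery.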

(Disclosure: I'm owner of CloudAMQP - RabbitMQ as a Service, www.cloudamqp.com)


It's almost as if authentication is a service that doesn't need to be replicated across services.


"I understand this adds a lot of overhead by creating an interface and authentication for each service."

A new interface for each service, sure, but if your services only need to be deployed internally, some sort of authentication layer could be overkill.


"Maybe someone can point out issues with this thinking (besides those addressed in the blog post)."

Howsabout the one that was addressed in the blog post, namely that SOA is going to slow down your process?

Premature optimization indeed.


#2 is exactly how I pitched a migration to a micro-service architecture to my CEO yesterday. Feeling rather pleased with myself right now.


Last job wanted SOA but wouldn't deploy Rabbit (or any other message bus) because "it's another thing to look after". Ended up as a simple REST-alike webservice. Which I guess includes "service" in the description...


What's wrong with that? No need to overcomplicate stuff. That's a perfectly good solution depending on what reliability you want.


Oh, I have no objection to simple RESTian APIs. But they're very much not SOA.


What you've just described is a proper micro service architecture. Which is basically a more minimalist and opinionated form of SOA.

Nothing wrong with wanting to avoid dependencies. Good programmers try to avoid dependencies and coupling, especially to third party products, all the time.

Service buses like Rabbit are cool n' all but sometimes you don't need the complexity.


No, it wasn't SOA or micro-services at all. There was one RESTian API which handled everything in a monolithic Perl app running in one Apache. No services anywhere to be seen.


So.. SOA.

Is anyone doing SOA+ESB in a non-Enterprise environment?

Is an ESB actually useful beyond the idea that it is supposed to let "non skilled" people "develop services" (which I'm somewhat cynical about)?


We don't call it an ESB in our non-Enterprise environment. It's just called "service bus" or "message bus". Which is really just a fancy application-level API around services, queues, and messages (see http://particular.net/NServiceBus). It's incredibly useful, dare I say essential, for abstracting away those infrastructure level concerns when developing complex distributed applications.

We use it pretty extensively in our SaaS product to achieve loose coupling, yet high cohesion, between our various software components.


We use SOA a lot; we have a RabbitMQ server/cluster in the middle through which all services communicate. We have applications written in Ruby, Go, Clojure, and Node.js, all publishing and subscribing to JSON messages.

Benefits over monolithic: new languages/techniques can be introduced at any stage and used for what they're best at. Applications can be isolated and only do one thing, making the system much more reliable. Your web frontend doesn't have to go down because you have a bug in the mailer app.

Benefits over HTTP: not everything is web-facing; workers etc. don't need a full web stack. Services can be offline without interrupting other services. You get load balancing and work queuing for free. Performance: 5,000 msgs/s with disk persistence is no problem for a low-end RabbitMQ server. Great inspection capabilities with RabbitMQ's management interface.

Tip: get away from an "RPC" mindset; think "events" instead. Let other services subscribe to the events they should act on. Decide as much as possible up front, in the application that publishes events, so that information doesn't have to be returned.

Disclosure: I'm the owner of CloudAMQP (www.cloudamqp.com), RabbitMQ as a Service. But I have used SOA and RabbitMQ a lot longer, for both small and large projects.


I've used Kafka, ActiveMQ, MSMQ and Tibco on long running projects.

An ESB can be incredibly powerful when you give a small developer team access to it and wait to see what they dig out of the data. We pulled realtime data about sales that no one had access to, and it was enlightening for everyone.

When non-programmers use something like Tibco, the results are abysmal: a slow, overly complicated, highly coupled mess that only one person, the creator, will understand. When most people say ESB, this is what they mean, and it is a tax on the company.


How will you manage communication between services in an SOA? If you exclusively use point-to-point, welcome to spaghetti code.

Our philosophy is to use point-to-point when you absolutely need another service's input to complete your task. Otherwise, put a message on the bus. See the Jay Kreps blog post I linked below.


It's not spaghetti as long as your interfaces are well-defined, is it?

I can see the value in a service locator/registry, but that doesn't seem like a substantial piece of the ESB pitch.

When do you have a fire-and-forget case like you seem to be talking about? I can guess a few secondary concerns, like logging, but most of the time if you're calling another service it's because you need a response from that service. At least for the apps I'm used to writing.


I would say "spaghettiness" is pretty much topology, not whether those interfaces are well defined or not, which is pretty much an orthogonal property.

i.e. You can have a poorly defined ESB and a big pile of spaghetti with well defined interfaces. Unfortunately I have seen both!


..."in a non-Enterprise environment" - yes...well..expanding into an enterprise; using BizTalk server (Microsoft).

I design and create SOA solutions through services exposed by/through BizTalk, and build an ESB solution to orchestrate the services, which acts as a substrate for "composable" applications (again, using BizTalk). The apps that use the services are web-based, or whatever client apps the organization wants to use; many are integrations between systems, such as line-of-business systems, using various approaches that include EDI, etc.



