Since Martin Fowler described the concepts of microservices in 2014, the adoption of microservices in both large and small enterprises has been growing. Process automation platform Camunda estimated in 2018 that 63% of enterprises are using microservices and another 28% are considering migrating some applications to microservices.
Despite the enthusiasm around microservices, the O’Reilly Radar survey revealed that only 10% of its readers report complete success with their microservices adoption. In fact, 45% report failing to realize the full benefits expected from microservices architectures, admitting only to “some success”, and 8% confess outright failure. With the growing number of tell-all blog posts from companies making the reverse journey from microservices back to monoliths, no wonder Gartner put microservices at the bottom of the trough of disillusionment this year.
In short, if companies are adopting microservices en masse, many are also failing at it. This is a reality check for companies seeking to further their digital transformation. Those companies are not incompetent: most successfully navigated major architecture transitions, from centralized mainframes to distributed client/server systems to the Internet to the cloud. It is just that the next transition, Cloud Native, is hard, and intrinsically so.
The good news is that in the last decade, successful pioneers have identified clear cloud-native best practices that considerably decrease the risk of failure. Hint: digital leaders rely on a DevOps Value Stream Delivery Platform (DevOps VSDP) for the grunt work. Hey, it is still hard. But you can make it easier.
What makes Cloud Native especially difficult? How to get the benefits of Cloud Native while minimizing its tradeoffs? In what follows, we tell you about the complexity of cloud-native architectures, the anti-patterns responsible for failures, and the best practices implemented by digital leaders.
“It’s the economy, stupid!” The sentence is attributed to James Carville, a strategist in Bill Clinton’s 1992 presidential campaign. It was intended to be one of three messages that campaign workers would focus on in their outward communication. The increased focus would in turn raise the chances of success of the campaign. Implementing microservices architectures is no different. There are five messages that IT and business leaders should keep in mind when migrating to microservices:
Let’s expand on these messages one by one.
First message: It is all about the business! Many companies fail to understand the business needs that microservices architectures address. There are several. Let’s discuss the two that are core to the microservices approach.
The first business need is agile software delivery. On the one hand, software is increasingly becoming a key part of products and services. The Internet and the cloud enabled new digital services distributed through new digital channels. Nimbler competitors have exploited those channels to compete against established players and deliver innovative services faster. Customers’ expectations have changed as a result of these trends: they are used to customized products, services, features delivered where they want them, in the channel of their choice, at any time of the day. They may defect to competitors because of the absence or presence of a single feature.
More than ever, understanding customers, detecting customer needs, delivering new digital services and features in a complex, fast-changing business environment requires that companies be agile. They need to innovate faster. They need to react faster to changes. Because software is in every service, they must deliver software much more rapidly, frequently, and reliably.
The second business need is the ability to scale up and down with demand. First of all, this ensures that customers enjoy reliable quality of service at any level of demand. From a quality of service perspective, scaling up is the key capability. There is also a cost dimension to it though. Digital products and services run on infrastructure. That infrastructure has a capacity and a cost. Scaling down to return unused capacity back to infrastructure providers is key to keeping costs in check. Remember that you don’t just want to provide always-available stellar services. You want to do that in a cost-optimized way.
Scaling drives both quality of service and cost efficiency
Those two business needs directly map to the key architectural drivers for microservices adoption. Before microservices became popular, companies were stuck with applications made of one single deployable module, appropriately termed the monolith. Monolithic applications generally start as small applications maintained by a small team of developers. Those developers have a mental model that encompasses the whole software; communication and coordination costs are low; and development velocity (e.g., the time to add a new feature) is high.
Invariably though, that speed slows down. The team grows. The application grows and no longer fits into any single person’s head. Communication and coordination costs explode — together with software defects. Because monoliths are, well, monolithic, any new feature requires rebuilding, retesting, and redeploying the whole artifact — further slowdowns, increased downtime, higher risk of deployment failures.
Robert C. Martin, one signatory of the influential Agile Manifesto, recommended modular architectures to increase velocity:
That kind of flexibility and decoupling always speeds you up. If there’s one thing we’ve learned about coupling over the last fifty years it is that nothing is better at slowing you down.
Well, precisely. Microservices architectures are decoupled, modular architectures. A microservice is an independently deployable module. Applications are architected as microservices that cooperate to implement business requirements. Those microservices are smaller than the functionally equivalent monolith and should be designed to be maintained by a single, independent team. A team responsible for a microservice should need only minimal coordination with other teams. Communication costs are back to being manageable.
So microservices architectures provide the agility required by businesses to compete in the digital age. What about the other business need: scalability? First, microservices make horizontal scaling easier. Because microservices are small(er) and independently deployable, instances can be added — or deprovisioned — quickly.
Second, you have fine-grained control over which parts you want to scale up or down. Conversely, with a monolith, you would need to scale the whole application even if only a few parts of it are heavily used. It is not just that this is cost-inefficient and wastes resources; scaling a monolith is difficult in the first place, and you have only two options.
The first option, vertical scaling, is simple enough but has upper limits dictated by the hardware (e.g., memory, CPU). Additionally, large monoliths that hold hundreds of functions can take 20–40 minutes to restart, increasing service downtime. The other option, horizontal scaling, is complex (involving load balancers, session replication, database replication, and more) and suffers from cost inefficiencies, unless your monolith is already modular with well-architected boundaries between loosely coupled internal modules, that is, unless you are architecturally already very close to microservices.
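The cost argument for fine-grained scaling is easy to make concrete. The numbers below are purely hypothetical, but the arithmetic holds for any workload where a demand spike hits only one part of the application:

```python
# Hypothetical monthly costs: a monolith instance (every feature bundled)
# costs $100; carved out, the hot "search" part costs $20 per instance
# and the rest of the application $80.
demand_multiplier = 4  # only search traffic quadrupled

# With a monolith, every extra copy carries every feature.
monolith_bill = 100 * demand_multiplier

# With microservices, only the search service is replicated.
microservices_bill = 80 + 20 * demand_multiplier

print(monolith_bill, microservices_bill)  # 400 160
```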
So microservices provide the agility and cost-efficient scalability that many businesses need, thanks to their independent deployability and fine-grained scalability. So why do so many companies fail to reap the benefits of microservices architectures? Some companies adopt microservices in spite of their architectural drivers being better addressed by other approaches. They discover that they did not need microservices in the first place.
Other companies pick microservices for the right business reasons. Some of those, however, fall outright into common anti-patterns that prevent them from enjoying the expected business outcomes. Others realize the target business outcomes in some measure but fail to maximize them by not adopting the best practices that digital leaders have identified. This happens in part because they fail to properly assess the complexity of distributed systems, and to address it.
This reality is encapsulated in our last four messages. Onto the next one!
Early last year, Craig Box, Kubernetes/Istio Advocacy Lead at Google, announced that the Istio control plane was migrating away from microservices back to a single, monolithic binary (emphasis is ours):
Microservices are a great pattern when they map services to disparate teams that deliver them, or when the value of independent rollout and the value of independent scale are greater than the cost of orchestration. We regularly talk to customers and teams running Istio in the real world, and they told us that none of these were the case for the Istio control plane. So, in Istio 1.5, we’ve changed how Istio is packaged, consolidating the control plane functionality into a single binary called istiod.
In his announcement post, Box goes on to explain that the fine-grained scalability and independent deployability of microservices did not correspond to Istio’s actual business needs:
Microservices empower you to scale components independently. In Istio 1.5, control plane costs are dominated by a single feature […] Every other feature has a marginal cost, which means there is very little value to having those features in separately scalable microservices.
Microservices empower you to decouple versions and release different components at different times. All the components of the control plane have always been released at the same version, at the same time. We have never tested or supported running different versions of (for example) Citadel and Pilot.
In fact, as a related IEEE publication explained, architecting the Istio control plane as five independently deployable microservices caused maintenance and performance overhead for typical Istio users.
Botify, an enterprise SEO platform, recounts a similar tale of switching back to a monolith. They did not need the scalability brought about by microservices. In fact, microservices caused a loss in performance:
The right solution for the right problems: it is important to remember our use cases and who Botify is serving […] Responding in milliseconds for metadata concerning some ten thousand customers is not a task requiring a highly scalable microservice architecture. Quite the opposite, our backend-to-backend communications were slowing down these light retrieval processes and making these requests take more time.
Lessons learned: think hard about whether your business needs the scalability and agility that comes with microservices.
Pick microservices for the right reasons / Implement best practices
Microservices form a distributed system and you have to understand and manage the extra complexity that comes with that. That’s our third message. Let’s elaborate on that now.
Turing Award winner Fred Brooks explained in a widely commented paper that there are two types of complexity in software engineering: essential complexity and accidental complexity. Essential complexity originates from the problem and is difficult to reduce without changing the characteristics of the problem. If a program must do X and then Y, there is no way to simplify that. Accidental complexity, on the other hand, originates from the solution and can be improved by better, more optimized solutions.
The essential complexity linked to microservices is that they form a dynamic distributed system. What was one cohesive large application (the monolith) may now be 15 independently deployable, loosely coupled, smaller microservices. You now have to program against partial inter-process communication failures and exceedingly high latency between services.
With instances of services dynamically brought up and down, you need either location transparency or discovery services. You need to minimize calls between microservices to decrease latency or you end up with a less responsive application.
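To make the discovery requirement concrete, here is a minimal in-memory sketch; the class and method names are ours, not from any particular product, and real deployments rely on systems such as Consul, etcd, or Kubernetes DNS:

```python
import random

class ServiceRegistry:
    """Minimal in-memory service registry (illustration only)."""

    def __init__(self):
        self._instances = {}  # service name -> set of "host:port" addresses

    def register(self, service, address):
        self._instances.setdefault(service, set()).add(address)

    def deregister(self, service, address):
        self._instances.get(service, set()).discard(address)

    def resolve(self, service):
        # Pick any live instance: callers never hard-code addresses,
        # so instances can come and go freely.
        instances = self._instances.get(service)
        if not instances:
            raise LookupError(f"no live instance of {service!r}")
        return random.choice(sorted(instances))

registry = ServiceRegistry()
registry.register("orders", "10.0.0.5:8080")
registry.register("orders", "10.0.0.6:8080")
registry.deregister("orders", "10.0.0.5:8080")   # instance went down
print(registry.resolve("orders"))  # 10.0.0.6:8080
```

Because consumers ask the registry at call time instead of baking addresses into configuration, instances can be added or removed without redeploying their callers.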
You still need to implement transactions and queries that span multiple services. But now your database is distributed across microservices. How much of ACID can you keep without sacrificing performance? You also must secure your distributed application against new classes of threats. The list goes on. And none of the items are going away.
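One widely used answer to cross-service transactions is the saga pattern: a sequence of local transactions, each paired with a compensating action that undoes it if a later step fails. Below is a minimal orchestrated sketch with all names hypothetical:

```python
def fail(msg):
    raise RuntimeError(msg)

def run_saga(steps):
    """Run (action, compensation) pairs in order; if an action fails,
    undo the completed steps in reverse order (orchestrated saga sketch)."""
    compensations = []
    try:
        for action, compensate in steps:
            action()
            compensations.append(compensate)
    except Exception:
        for compensate in reversed(compensations):
            compensate()
        return False
    return True

log = []
ok = run_saga([
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (lambda: fail("payment declined"),    lambda: log.append("refund payment")),
])
print(ok, log)  # False ['reserve stock', 'release stock']
```

Note the trade-off: between a failure and the completion of its compensations, other services can observe the intermediate state. That is eventual consistency (BASE) in place of ACID.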
The accidental complexity comes from the different solutions that we apply to palliate the previous issues. Should you have 15 microservices or 55 or 7? Which responsibilities do you assign to each microservice? How do you scale them? How do you test and debug a distributed system? How do you detect and recover from failure? How do you ensure zero downtime?
AWS today has over 200 products in its portfolio. Your 15 microservices may require that you learn, configure, and operate another 20 vendor tools to manage failure, scalability, deployment, and more. Recruiting, upskilling and reskilling product teams is a pain point constantly mentioned by companies large and small.
It is not just the technologies. It is the patterns too. In fact, it is the patterns first and foremost. AWS App Mesh is Amazon’s version of the service mesh pattern. But you need to understand service meshes first. Why do they exist in the first place? What problem do they solve? Do you need them for your specific use case? How to configure and operate them?
You need to do all that while keeping an eye on overall costs and quality. Vendor lock-in (another hard-hitting pain point expressed by companies) may affect your ability to optimize your cloud spend. While your independent teams can optimize locally for costs, you also have to optimize globally (e.g., avoid work duplication). We haven’t even mentioned yet how to orderly migrate from one architecture to the other in a way that is transparent to users.
In short, do not underestimate the challenges of adopting microservices. They are just as real as the benefits. When you take the easy road, you fall into anti-patterns that destroy value rather than deliver business outcomes.
Software architects Srini Penchikala and Marcio Esteves described four categories of anti-patterns that companies must avoid at all costs: Monolithic Hell, Death Star, Jenga Tower, and Square Wheel. There are others.
To pick a common one, the Distributed Monolith anti-pattern (part of the Monolithic Hell family) occurs when a monolithic application is broken down into multiple single-instance services, but most services remain tightly coupled.
Distributed monolith anti-pattern
Source: Reactive microsystems – Evolution of Microservices at Scale, by Jonas Boner
The previous illustration showcases a monolith that has been broken down into three microliths. To quote Jonas Boner, CEO and founder of the serverless and cloud platform Lightbend:
A microlith is defined as a single-instance service in which synchronous method calls have been turned into synchronous REST calls and blocking database access remains blocking. This creates an architecture that is maintaining the strong coupling we wanted to move away from but with higher latency added by interprocess communication (IPC). The problem with a single instance is that by definition it cannot be scalable or available. A single monolithic thing, whatever it might be (a human, or a software process), can’t be scaled out and can’t stay available if it fails or dies.
In other words, a distributed monolith, as opposed to single-process monoliths, is an application distributed over several processes but built like a monolith. You get to have separate teams maintaining smaller pieces of the monolith but the motivating architectural drivers that we mentioned at the beginning of this article are absent. Microlithic systems are neither scalable nor resilient. The entire system often has to be deployed together. You get the complexity of distributed systems, the disadvantages of monoliths — and little to show in exchange.
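The resilience cost of that tight coupling can be quantified with simple arithmetic: when every request traverses a chain of synchronous calls, per-hop success rates multiply. The 99.5% figure below is illustrative:

```python
# If each service answers successfully 99.5% of the time, a request that
# must traverse N tightly coupled services in sequence succeeds only
# 0.995 ** N of the time.
def chain_availability(per_service: float, hops: int) -> float:
    return per_service ** hops

print(f"{chain_availability(0.995, 1):.4f}")   # 0.9950
print(f"{chain_availability(0.995, 10):.4f}")  # 0.9511
```

Ten microliths chained synchronously are noticeably less available than the single-process monolith they replaced, which is exactly Boner's point.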
So how do digital leaders do it? What are the best practices that correctly address the challenges created by microservices architectures?
First of all, rather than migrating your monolith in one go, start small. Extract one reasonably small microservice from your monolith. This is what Sam Newman, author of the book Monolith To Microservices, recommends:
The analogy I’ve always tried to use with adopting microservices is that it’s not like flicking a switch. It’s not off or an on state. It’s much more like a dial. If you’re interested in trying microservices out, then just create one service. Create one service. Integrate it with your existing monolithic system. That’s turning the dial a little bit. Just try one and see how one works. Because the vast amount of the problems associated with a microservice architecture are not really evident until you’re in production. It’s super important that even if you think microservices are right for you, that you don’t jump on it. You dip your toe in the water first and see how that goes before you turn that dial up.
Many companies have successfully and progressively completed their migration to microservices with the Strangler pattern. Keep your monolith, start small with a select few microservices (as few as one), and put a strangler facade between your monolith and your new services. The strangler facade redirects some requests to your new services while continuing to route the rest to your existing monolith. As you add more and more services, your legacy system handles fewer and fewer requests. At the end of your migration, all your requests go through your microservices, and you can retire your monolith for good (or keep it for legacy clients). Your customers will be none the wiser. Note, however, that the strangler pattern requires that requests to your back-end system can be intercepted by the strangler facade.
Migrate progressively from a monolith to microservices
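As a sketch of the routing logic a strangler facade performs, it can start as little more than a prefix table; the paths and backend names below are invented for illustration:

```python
# Invented route table: URL prefixes already migrated go to the new
# microservices; everything else still hits the monolith.
MIGRATED_PREFIXES = ("/billing", "/notifications")

def strangler_route(path: str) -> str:
    """Decide which backend serves a request (strangler facade sketch)."""
    if path.startswith(MIGRATED_PREFIXES):
        return "microservices"
    return "monolith"

print(strangler_route("/billing/invoice/42"))  # microservices
print(strangler_route("/orders/7"))            # monolith
```

Turning the dial up is then a one-line change: add a prefix to the table once the corresponding service is live, and remove it to roll back.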
This is but one of the many patterns that ensure the successful implementation of microservices architectures. Event-driven architectures minimize the coupling between participants in a distributed system. Event producers do not know which event consumers are listening for an event, let alone where they are located. Event consumers are equally unaware of event producers. Components of an architecture that communicate through events are thus loosely coupled. We mentioned before that this loose coupling is instrumental to fine-grained scalability, a key architectural driver for microservices.
This of course creates other issues that do not appear with synchronous, request-driven models. Events may be sent and never received. Event consumers may be down or unreachable. Events can be duplicated. The good news is that other patterns have emerged to address these issues. To cut a long story short, these are the most important patterns, covering the whole software delivery lifecycle, that you must be aware of and that will drive the performance of your microservices implementation:
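Duplicate deliveries, for instance, are commonly handled with the idempotent-consumer pattern: track the ids of processed events and ignore redeliveries. A minimal sketch follows (names are ours; a production system would persist the id set):

```python
class IdempotentConsumer:
    """Deduplicate events by id so at-least-once delivery is safe (sketch)."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()

    def on_event(self, event_id, payload):
        if event_id in self._seen:
            return False  # duplicate delivery: ignore it
        self._seen.add(event_id)
        self._handler(payload)
        return True

totals = []
consumer = IdempotentConsumer(totals.append)
consumer.on_event("evt-1", 100)
consumer.on_event("evt-1", 100)  # broker redelivered the same event
consumer.on_event("evt-2", 50)
print(totals)  # [100, 50]
```

With this in place, the broker is free to redeliver aggressively (solving the lost-event problem) without the consumer double-counting anything.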
DevOps VSDPs implement cloud native’s best practices
This is what digital leaders do. This is how they deliver features to production several times a day and detect and recover from failure in a few minutes, with no downtime perceived by the end user. They optimize cloud spend while maintaining quality of service by automatically scaling up and down without monopolizing site reliability or DevOps engineers. Every one of these patterns exists for a reason. This is, for instance, what Leif Barton, Senior Solutions Architect at NGINX, had to say about observability, a key monitoring pattern (emphasis is ours):
Monitoring, tracing, and stuff of that nature are of course critically important when you start moving into [microservices], because communication throughout the application is not a function call […] It’s a network call, which may or may not be locally on the system that you’re talking to at the moment. Being able to monitor, track, and trace communications throughout an application is absolutely critical. Without that, you’re essentially trying to find a black cat in a dark room while wearing a blindfold.
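The simplest form of the tracing Barton describes is a correlation id minted at the edge and propagated on every hop, so every log line from one user request can be stitched back together. A sketch, with the header name and helper purely illustrative (real systems use OpenTelemetry and the W3C traceparent header):

```python
import uuid

def handle_request(headers):
    """Reuse the caller's trace id if present, otherwise mint one, and
    forward it downstream (distributed-tracing sketch)."""
    trace_id = headers.get("X-Trace-Id") or uuid.uuid4().hex
    # ...log trace_id with every event this service emits...
    downstream_headers = {**headers, "X-Trace-Id": trace_id}
    # ...call the next service with downstream_headers...
    return downstream_headers

first = handle_request({})       # edge service mints a trace id
second = handle_request(first)   # downstream hop reuses the same id
print(first["X-Trace-Id"] == second["X-Trace-Id"])  # True
```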
So yes, to get to a level of performance akin to that of digital leaders, you need to do all these things. Many companies are underestimating all that it takes to deliver strong business outcomes through stellar software delivery. How do companies master all this?
They use a DevOps Value Stream Delivery Platform (abbreviated as DevOps VSDP). CodeNOW, our DevOps VSDP, was born in a subsidiary of Société Générale to solve precisely the problems that large companies face when scaling their digital transformation.
As one lead architect at Komercni Banka, one of the largest banks in the Czech Republic, put it:
I’ve seen too many devs crying when trying to deliver a HA/DR [High Availability/Disaster Recovery]-ready SAGA pattern using Java/Spring Boot with literally no framework/infra support — and being able to move from ACID to BASE is one of the cornerstones of Cloud Native Architecture.
Amazon, Google, IBM, and Oracle each have their own custom, in-house DevOps VSDP. They honed their platforms much like CodeNOW did, over years of experimentation, trial and error, and gradual inclusion of best practices. Like CodeNOW, those companies are members of the Cloud Native Computing Foundation, which experiments with and drives the adoption of cloud-native best practices.
DevOps VSDPs unify and integrate multiple point solutions that most companies are already using to deliver software. DevOps VSDPs provide developers with a single self-service portal from which they can code, build, test, deploy, operate, and monitor their software. DevOps VSDPs provide an end-to-end view of the software delivery process with their integrated DevOps pipeline. With DevOps VSDPs, you can continuously monitor your delivery KPIs and ensure that you are — and remain — on track to deliver elite performance. All of that should matter to you. But what we want to emphasize is something else.
Beyond the world of software patterns, one key challenge mentioned by companies adopting microservices is the upskilling and reskilling of their staff. The vast majority of companies do not have the people or the capital to build their own DevOps VSDP. The people they do have cannot afford to learn 20 vendor-specific tools just to deliver software. The limited financial resources that they struggle to allocate should go primarily to analyzing customer needs, innovating, building competitive advantages, and addressing business-specific concerns, rather than being gobbled up by menial, automatable software delivery tasks.
Microservices are harder than you think. We made our DevOps VSDP CodeNOW so that you can train the staff you already have in the concepts, principles, and patterns that drive elite performance in software delivery. CodeNOW provides a sandbox environment in which Dev, Ops, DevOps, and SRE teams can learn on a real cloud. We know from our experience creating CodeNOW how difficult it is for developers to deliver high-availability, disaster-recovery-friendly, event-driven architectures on a laptop or with a single VM available.
We can’t make distributed systems easier. Nobody can. They have their essential complexity. So we worked very hard to reduce the accidental complexity that originates from the technologies that we use. CodeNOW’s DevOps VSDP is cloud-agnostic. CodeNOW may very well be the only DevOps platform out there that has no vendor lock-in. CodeNOW users only use fully open-source stacks that are already used by millions — many of which they already know.
As we said, CodeNOW users learn concepts, principles, and patterns. Concepts, principles, and patterns last, technologies come and go. We want you to focus on your business. Talk to your customers. Craft the digital services that address your customers’ needs, innovate, distribute your products to new channels. That is what your business is about. We handle the software delivery for you. We help you move from customer needs to delivered features.