too stupid to be having this conversation

Talking About Trade-offs

Jim Hughes - 7 min read

You can't make it more than 1 paragraph/slide/minute into a piece of media discussing microservices without encountering admonitions to "carefully consider the trade-offs." The author then promptly fails to mention or explain any trade-offs and goes back to filibustering around whatever point they were trying to make, as if being direct and complete would make their position technically indefensible.

Software Architecture does not mean what you think it means

I don't think these authors are being disingenuous; the trade-offs involved are context-specific, so how could they ever offer something here that would be above reproach? Here's the problem: without some treatment of the trade-offs you navigated, from the perspective of your particular project, it's impossible to gather any useful takeaways.

The recommendation or pattern you are trying to share is not "architecture." It is the output of architecture (which is more a process than a tangible artifact). How we collect, analyze, and reconcile competing requirements to craft a system that meets our needs within a set of constraints is architecture, and it's what elevates rote programming to software engineering.

By leaving out how you arrived at the result and all the surrounding context, you strip your message of the most instructive parts and stuff them into the hand-wave "consider the trade-offs." It's like reading a math textbook only to have a profound theorem's proof excluded as an "exercise to the reader." I won't go so far as to call this gate-keeping, but suffice it to say that those in the know (who likely do not require your guidance) will "get it," and those who could benefit from your experience will leave with the wrong takeaways.

It's the sort of education that breeds charlatanism, where real technical decision making and architecture get completely elided by resume-driven development, or stalled out by someone sagely uttering "consider the trade-offs" and thinking it profound. Of course we will consider the trade-offs...that's what we're doing, that's engineering! Anodyne opinions like this can only be refuted at great personal cost, and so entire organizations entertain these arguments in costly ways (analysis paralysis, anyone?) even though the person who proffered the statement sometimes has no deeper meaning or insight behind it.

If the intent is to spread good ideas and sound architecture, then we need to also accompany our ideas with examples of how real-world constraints and trade-offs manifested, and how our approach balanced those concerns to open a new way forward.

This does not mean belaboring every talk and blog post with a universal framework for decision-making. It would suffice to share the context and experience that led you to speak out, and allow the reader to judge how that does (or does not) apply to them. Being able to [describe CQRS] is trivia; knowing that [when faced with problem X, constraint Y, and requirement Z, CQRS led to the following virtues and vices] is wisdom.

You need non-functional requirements in your life

"Requirements gathering" smacks of archaic software engineering practices that predate the post-Agile era...but that's exactly what you're doing when you design systems. Requirements are more than just bewildering TODO lists of conflicting features, authored by despotic business leaders. Nor do you need to do requirements gathering all-at-once, waterfall-style. Once you break the association of "requirements" with this unhealthy modality you can recoup the most useful tool in our arsenal.

The first step in resuscitating "requirements" is to recognize they fall into three categories:

  • Functional Requirements: what you classically think of when you hear "requirements." Things the software must be able to do, and how it should behave and react to runtime stimuli. For example, "users should receive an alert when their order ships."
  • Non-functional Requirements: also known as "software quality attributes," these are qualifications of the functional requirements and describe the operation of the system. Things like timing requirements ("performance") and other "ilities" like maintainability, availability, scalability, operability, etc. See a more complete list here. For example, "users should receive password reset emails within 30 seconds of requesting a reset."
  • Constraints: design decisions with zero degrees of freedom. These can take the form of implicit standards in your organization (e.g. "all services are written in Java"), edicts (e.g. "everyone must publish an OpenAPI spec"), or previous decisions that are ruled out of scope for the current project (e.g. "we are using FlatBuffers instead of Protocol Buffers").
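As a rough illustration, the three categories can be written down as plain data. This is only a sketch: the `Kind`, `Requirement`, and `by_kind` names and the `max_seconds` field are made up for this example, and the statements are the ones quoted in the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    FUNCTIONAL = "functional"
    NON_FUNCTIONAL = "non-functional"
    CONSTRAINT = "constraint"


@dataclass(frozen=True)
class Requirement:
    kind: Kind
    statement: str
    # Hypothetical fields: which quality attribute a non-functional
    # requirement qualifies, and a measurable threshold if it has one.
    quality_attribute: Optional[str] = None
    max_seconds: Optional[float] = None

    def is_met(self, observed_seconds: float) -> bool:
        """Check an observed timing against the requirement's threshold."""
        if self.max_seconds is None:
            raise ValueError("requirement has no measurable threshold")
        return observed_seconds <= self.max_seconds


REQUIREMENTS = [
    Requirement(Kind.FUNCTIONAL,
                "Users should receive an alert when their order ships."),
    Requirement(Kind.NON_FUNCTIONAL,
                "Users should receive password reset emails within 30 "
                "seconds of requesting a reset.",
                quality_attribute="performance",
                max_seconds=30.0),
    Requirement(Kind.CONSTRAINT, "All services are written in Java."),
]


def by_kind(requirements, kind):
    """Return only the requirements of the given category."""
    return [r for r in requirements if r.kind is kind]
```

The point of the sketch is that a non-functional requirement is the one you can measure a running system against, which is what makes it useful when weighing one design against another.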

Constraints and functional requirements are not very transformative, but non-functional requirements give you a way to discuss trade-offs...especially because some non-functional requirements appear to be foils of each other. A classic trade-off is security vs. usability: requiring a user to confirm significant configuration changes with 2FA is less usable, but more secure. Non-functional requirements offer firm definitions for these concepts and patterns for achieving them, and they allow you to be complete in your accounting of what you're giving up for the sake of the other.

Security is the system's ability to protect data from unauthorized access.
Usability is a measure of how easy it is for a user to accomplish a desired task. This includes learning system features, minimizing the impact of errors, adapting the system to user needs, and increasing confidence and satisfaction.

It is not necessary to detail all functional and non-functional requirements up front to approach architecture this way. You may only know a few requirements, or have a notion of what you want absent the challenges that may make it untenable ("I want consistency, availability, AND partition tolerance...oh wait..."). It is natural to have requirements emerge during the implementation process, and to have their relative importance shift as the boundaries of what's possible (and what's necessary) become clear.

Non-functional requirements do not present a solution to changing requirements (and the need to be "agile"). Instead, they provide:

  1. A framework to conduct systematic searches through the space of possible requirements and complications.
  2. A language that codifies the essential abstractions and common ground needed to productively discuss software design, without getting waylaid by details of a particular instantiation, talking past one another, and excessive reliance on nonce terms.

...in other words, exactly what's needed to help your audience "consider the trade-offs."

Making sense of microservices

Returning to the example that motivated this article, let's talk about microservice trade-offs.

It's easy to accidentally attribute a lot of virtues to microservices related to modularity (single responsibility principle, interface segregation principle), but that does not capture the essence of microservices. It is perfectly reasonable to organize your code using strong namespacing and interfaces, granting you much the same modularity in a single process. Microservices take this a step further by achieving (enforcing?) that modularity through process isolation of subcomponents. With this in mind, we can now summarize the essential strengths and weaknesses of microservices, as viewed through non-functional requirements.
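To make the single-process option concrete, here is a minimal sketch of modularity via interfaces alone. The `Notifier`, `InMemoryNotifier`, and `OrderService` names are hypothetical, chosen to echo the order-shipping example from earlier.

```python
from typing import Protocol


class Notifier(Protocol):
    """The only thing the order code is allowed to know about notifications."""

    def notify(self, user_id: str, message: str) -> None: ...


class InMemoryNotifier:
    """Stand-in implementation; a real one might send email or push alerts."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def notify(self, user_id: str, message: str) -> None:
        self.sent.append((user_id, message))


class OrderService:
    """Depends on the Notifier interface, never on a concrete class."""

    def __init__(self, notifier: Notifier) -> None:
        self._notifier = notifier

    def ship(self, order_id: str, user_id: str) -> None:
        self._notifier.notify(user_id, f"Order {order_id} has shipped.")
```

Nothing here needs a network hop. What a process boundary would add is enforcement: inside one process, nothing stops a hurried caller from reaching past the interface into `InMemoryNotifier.sent`.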

Strengths

  • Scalability: each process can be scaled (horizontally or vertically) independently of the others. That is not to say a "monolithic" process cannot be scaled horizontally, but that typically requires sharding, which microservices may be able to avoid altogether.
  • Conceptual Integrity: the single responsibility principle ensures that the process has a small surface area, and theoretically can implement a single coherent concept better than multiple competing features. This eases maintainability, but also makes the system easier to reason about in production.
  • Reusability: processes that publish a well-defined API can be reused in almost any context: different teams, different companies, different programming languages, different use-cases.
  • Reliability: separate processes build "bulkheads" into your application that isolate failures. With sufficiently defensive IPC (retries, back-offs, deadlines, etc.), even faulting systems may still be able to deliver partial-to-full functionality.
  • Organizational Autonomy: each process can be built in some modicum of isolation from the others, coordinating via the shared API and contracts. This isn't exactly perfect (in my experience, a lot of growing pains experienced by MSFT/AMZN/GOOG are a direct result of this), but it does enable a product team to sustain productivity even as it scales into the hundreds of engineers.
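The "defensive IPC" mentioned under Reliability can be sketched in a few lines. This is one possible shape, not a recommendation of specific values: the function name, the defaults, and the choice of `ConnectionError` as the retryable failure are all assumptions made for the example.

```python
import random
import time


class DeadlineExceeded(Exception):
    """Raised when the overall time budget for a call is used up."""


def call_with_retries(operation, *, deadline_seconds=2.0, base_delay=0.05,
                      max_delay=0.5, sleep=time.sleep, clock=time.monotonic):
    """Call operation() until it succeeds or the overall deadline would pass.

    Only ConnectionError is treated as retryable; anything else propagates.
    """
    deadline = clock() + deadline_seconds
    attempt = 0
    while True:
        try:
            return operation()
        except ConnectionError as exc:
            # Exponential back-off with full jitter, capped at max_delay.
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            attempt += 1
            # Give up rather than sleep past the caller's deadline.
            if clock() + delay >= deadline:
                raise DeadlineExceeded(
                    f"gave up after {attempt} attempts") from exc
            sleep(delay)
```

The `sleep` and `clock` parameters exist only so the behavior can be exercised without real waiting. Note the trade-off hiding even in this small helper: every retry buys reliability for the caller at the cost of extra load on a dependency that may already be struggling.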

Weaknesses

  • Operability: each process is another mouth to feed. It needs deployment orchestration, testing, monitoring, an oncall rotation, and more. As the number of processes grows, it becomes almost impossible to hold a complete mental model of the system in your head.
  • Observability: user-facing features and user journeys take place across a constellation of services, communicating using generic APIs. It is more challenging to observe the flow of a single request, and it's more difficult for the system to make reasonable service-level decisions (request prioritization, load shedding).
  • Supportability: when something goes wrong, it will be more challenging to reconstruct the context and end-to-end flow that precipitated the failure. Troubleshooting and reproducing issues are made more complicated by the number of moving pieces that must be started and manipulated to construct a hermetic repro.
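One common mitigation for the observability and supportability costs is to tag every request with an identifier at the edge and forward it on each downstream call, so the end-to-end flow can be stitched back together from logs. The sketch below uses plain functions as stand-ins for services; the `X-Request-ID` header name, the service names, and the list used as a log sink are assumptions for illustration only.

```python
import uuid

# Assumed header name for this sketch; use whatever your systems agree on.
HEADER = "X-Request-ID"


def ensure_request_id(headers):
    """Return a copy of headers carrying a request ID, minting one if absent."""
    out = dict(headers)
    out.setdefault(HEADER, uuid.uuid4().hex)
    return out


def log(service, headers, message, sink):
    """Record a structured log entry that includes the request ID."""
    sink.append({"service": service,
                 "request_id": headers[HEADER],
                 "message": message})


def inventory_service(headers, sink):
    """Downstream stand-in: logs using whatever ID it was handed."""
    log("inventory", headers, "reserved stock", sink)


def order_service(headers, sink):
    """Edge stand-in: mints the ID if needed, then forwards the same headers."""
    headers = ensure_request_id(headers)
    log("orders", headers, "order accepted", sink)
    inventory_service(headers, sink)
    return headers[HEADER]
```

Filtering the collected entries by `request_id` then recovers one request's path across services. It is also one more convention that every team has to implement correctly, which is the operability cost showing up again.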

All of the above can be improved with additional architectural patterns...each with their own quality attributes. By using the language of non-functional requirements, you can methodically navigate these "trade-offs" and make better, contextual decisions...as opposed to always chasing the latest tech trend.

S.O.S. (save our systems)

I'm lampooning microservices quite a bit in this article, when in reality that was just the topic of the most recent tech talk that made me want to drive my car into the ocean. There are also plenty of authors who understand this problem and treat it appropriately. However, I have seen too much puffery and too many projects doomed by quixotic, resume-driven architectures to believe there isn't a gap in how we train and educate software engineers at large.

Using the language of requirements and constraints, you can ground your thinkpieces in the realities of bringing software into the world, you can hold more productive design discussions, and you can accelerate the training of the next generation of software engineers.

Architecture is the confluence of people, processes, and structures used to design systems; do not forget to talk about the people and processes, as they determine more about the structures than you think.