How I became a software architect… (or not)

How did I get involved in software architecture & design? Do I think of myself as a software architect? Here are some insights from my career that have helped me and ultimately led me to YouTube, producing videos about software architecture and design. Here's a bit about my journey in software.

YouTube

Check out my YouTube channel, where I post all kinds of content accompanying my posts, including this video showing everything in this post.

Operations

I’ve always been involved in IT infrastructure and operations. Before working professionally as a software developer, I was interested in how software runs. Setting up my own servers at home and running Linux was a hobby. When I started working professionally as a software developer in the late ’90s, I was responsible for writing e-commerce software and managing the web servers it ran on, along with mail servers, DNS, etc.

That foundational knowledge of operations carried throughout my career, where I became more and more involved as the industry evolved from physical servers to on-prem VMs and then to the cloud.

You could say I was doing DevOps before the “devops” movement existed: heavily involved in both writing software systems and managing the infrastructure that ran them. DNS, database replication, web servers, load balancing; the list goes on.

This has been invaluable, especially when working in small teams and startups.

Business

I’ve always written software for line-of-business types of large systems: E-Commerce, Distribution, Manufacturing, Accounting, and Transportation.

While they are all very different, they’re all lines of business with some overlap. I’ve always said the best developers I know working in these systems have good developer skills but even better business knowledge.

Understanding how businesses work and how money flows. How do they make money? What are the core parts of each domain where they generate revenue? How do they incur costs? Every business has core processes, the main workflows that everything else supports.

It sounds simple, but it’s understanding the business.

If you’re working in these types of systems, understanding the fundamentals of your domain will serve you well.

Ah-ha!

That segues well into a big ah-ha! in my career: finding Domain-Driven Design. Because I lived in various domains, this book was a game-changer in multiple ways. The first was that it was a gateway. DDD led me to CQRS. CQRS led me to Event Sourcing. Event Sourcing segued me into Event-Driven Architecture.

People like Greg Young and Udi Dahan, the blogs they wrote, and the conference talks they presented were eye-opening and shaped many of the ways I think today.

DDD also validated some thoughts I had and forced me to think deeply about design. A lot of people, still today, are hung up on the “tactical patterns” of Domain-driven Design. Aggregates, Entities, Value Objects, Repositories, etc. While they have value, they are a means to an end. The real value in DDD for me was thinking about boundaries on many different levels.

A lot of how I think now is rooted in coupling and cohesion. If you boil things down, they generally return to those two and the trade-offs you make for them.

Working with large systems means thinking about cohesion and defining boundaries. Coupling is inevitable, but how you manage it is vital.

Software Architecture and Design

Being involved in greenfield projects working in small companies and startups has forced me to think about software architecture and design. How to decompose large systems. However, I’ve also been thrown into the fire several times to work on a large existing (brownfield) project without any support. Zero. A large system that is running a company, and you’re the one that’s got to deal with it. Fix bugs, add features, and deal with the infrastructure of how it’s running. There is no developer documentation. Nobody to ask. Nothing. It sounds stressful, and it was. But coming out of those situations gave me experience in navigating large codebases and seeing a lot of issues with these types of systems.

When you’re working on a greenfield project, especially for a startup, you don’t have time for bikeshedding. Make the best decision you can, given the information you have, and move forward. Too often, developers can get caught up in over-analyzing and bikeshedding over irrelevant things.

Software architecture, to me, is about making decisions that give you options in the future but with little cost. I talk about this in my post: What is Software Architecture?

Systems need to evolve. Technology changes, businesses change, industries change, and your system needs to change with them.

Do I think of myself as a software architect? No, not really. It’s always been part of my role: thinking about decomposing systems, how a system should function, how the business works, and how to model it. I understand there are folks with the title of “Software Architect,” and that’s a role that can exist in certain situations, but I think everyone on a team can play a part in architecture and design, even if that’s just understanding why certain decisions were made.

YouTube

So, how did I end up creating videos on YouTube about software architecture and design? The primary reason, and why I still do it today, is that I think there is a lack of good content on the topic.

There’s too much emphasis on “how” and not enough on “why”.

I create videos that I would want to watch. They aren’t for everyone, and that’s fine. They aren’t for people who want the “show me the codez!” videos.

I absolutely show code in some videos, but it’s for illustration, to explain a concept. Not surprisingly, although I show code in C#, a large number of viewers don’t use C# at all.

I said there’s a lack of good content because there’s also a lot of content that is, to me, misleading, clickbait, and incorrect. Does that mean everything I post is correct? No. That’s why I only post videos/blogs that I’m confident in based on my experience. My lane is software architecture and design, and I post content around it based on my 20+ years of experience.

Hopefully, what I post resonates and gives you an “aha” moment, just like the folks I mentioned above gave me.

Good luck on your journey!

Join!

Developer-level members of my Patreon or YouTube channel get access to a private Discord server to chat with other developers about Software Architecture and Design and access to source code for any working demo application I post on my blog or YouTube. Check out my Patreon or YouTube Membership for more info.

Follow @CodeOpinion on Twitter

Software Architecture & Design

Get all my latest YouTube Videos and Blog Posts on Software Architecture & Design

Distributed isn’t Microservices, In-Process isn’t a Monolith

Amazon Prime Video moved one of its monitoring services from “microservices” to a “monolith”. I’m using quotes because that’s how they termed it in their post, and I think they did themselves a disservice by framing it that way. Almost every blog post or video covering this has missed the mark. All they did was refactor. This has nothing to do with microservices or a monolith.

Original Architecture

You can check out the original blog post from Amazon Prime Video, but here’s a quick summary that sets the stage for what they originally had as an architecture and what they moved to. There are a lot of hot takes about what they did, but most are way off base. So first, what was their original architecture?

Amazon Prime Video Distributed

An audio/video stream goes to a Media Conversion service, which extracts frames from the video and puts them in an S3 bucket. Then a workflow of step functions pulls those frames (the data) from S3 to analyze them. The components doing the analysis are called detectors.

They stated that this was fine initially, but as the workload increased, the architecture was no longer viable; they never intended nor designed it to run at a high scale.

A major issue wasn’t execution but cost. Because everything is distributed, each step function has to pull the data from S3 to analyze it. That data transfer adds latency to processing times, but the bigger problem was that it’s monetarily costly.

New Architecture

The architecture they moved to seems logical to me. Instead of using lambdas to analyze the frames, they moved the detectors to run within the same process in an ECS task. The audio/video stream data goes directly to this ECS task, which means S3 is no longer used at all. There’s no pulling data from S3; the frames are already in memory.

Amazon Prime Video Monolith

Nothing is distributed anymore. The media converter and the detectors (to analyze the frames) are all within the same ECS task (container).
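The cost difference between the two shapes comes down to how many times the frame data has to move. Here's a minimal sketch of that idea (not Amazon's actual code; the frame data, detectors, and transfer counter are all illustrative): in the distributed version every detector re-downloads each frame from S3, while in the single-process version the frames stay in memory.

```python
# Hypothetical sketch contrasting the two shapes. Names are illustrative.
frames = [b"frame-1", b"frame-2", b"frame-3"]
transfers = 0  # counts simulated S3 downloads (the billable data movement)

def download_from_s3(key):
    """Simulated S3 GET; each call represents a paid data transfer."""
    global transfers
    transfers += 1
    return frames[key]

def detector_a(frame):
    return len(frame)

def detector_b(frame):
    return frame[:5]

detectors = [detector_a, detector_b]

# Distributed: each detector (step function) pulls every frame from S3.
transfers = 0
for detect in detectors:
    for key in range(len(frames)):
        detect(download_from_s3(key))
distributed_transfers = transfers  # 2 detectors x 3 frames = 6 downloads

# Single process: frames are already in memory; no transfers at all.
transfers = 0
for frame in frames:
    for detect in detectors:
        detect(frame)
in_process_transfers = transfers  # 0 downloads
```

The work done is identical in both loops; only the number of data transfers changes, which is exactly the cost Amazon Prime Video was paying for.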

Hot Takes

Unfortunately, the blog post stated that they moved from microservices to a monolith. There were all kinds of hot takes that seem to have missed the point or just didn’t read the original post.

Clickbait

I’m not sure if people do this purely for clicks/views, or if they just don’t understand what the architecture change was, or possibly didn’t even read the original post.

What Amazon Prime Video did was change the physical aspect of their deployment. That’s it. It’s not that serverless sucks or they don’t understand what it is. It’s that given their specific use case, at a higher volume/scale, it wasn’t cost-effective.

They moved their code to be composed into a single process, which eliminated the cost of distributing the workload and data. That’s it. As stated in the post, a lot of the code was reused, which allowed them to do this refactor quickly. It wasn’t a rewrite.

Microservices to Monolith?

Amazon Prime Video said they moved from microservices to a monolith. But that’s not what they did. They moved from a service… to a service.

What’s a service? It’s the authority of a set of business capabilities. That didn’t change. What changed was the physical boundaries.

Physical boundaries aren’t logical boundaries.

A logical boundary defines what the capabilities are. Their logical boundary is still the same, but their physical boundaries changed.

I think the confusion lies in thinking that the media conversion is its own “service” and the step function detectors that do the analyzing are their own “service,” but that’s not the case. The service is everything together.

Using lambdas or having different components distributed doesn’t suddenly make it microservices. And just because you have all components executing in the same process/container doesn’t make it a monolith.

If you have a logical boundary in some source repo, it could be built into one container/process that is an HTTP API, and another container/process that executes as a background worker service. You could also scale out and deploy multiple instances of both.
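To make that concrete, here's a minimal sketch (all names hypothetical) of one logical boundary whose shared capability code is packaged behind two different entry points, one for an HTTP API deployment unit and one for a background worker:

```python
# Hypothetical sketch: one logical boundary, two deployment units.
# The boundary's capability code is shared; only the entry point differs.

# --- shared code for the logical boundary (say, "Ordering") ---
def place_order(order_id):
    """A capability the boundary owns, used by every deployment unit."""
    return f"order {order_id} placed"

# --- entry point 1: the HTTP API container (simulated request handler) ---
def api_handler(request):
    return {"status": 200, "body": place_order(request["order_id"])}

# --- entry point 2: the background worker container (simulated message handler) ---
def worker_handle(message):
    return place_order(message["order_id"])
```

Both “containers” exercise the same logical boundary; how many deployment units you build from it is purely a packaging decision.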

One more time for those in the back. Physical boundaries aren’t logical boundaries. Check out my post The Pendulum swings! Microservices to Monoliths for more on this.

Refactored

So what did they really do? Did Amazon Prime Video move from microservices to a monolith? No.

They refactored. That’s it.

They evolved their architecture based on changing load and cost. They realized that at the scale they needed, it wasn’t cost-effective to distribute the workload with step functions, which then required distributing the data via S3.

They evolved and refactored to move it all within an ECS task so that everything was in memory.

Makes sense.

The pendulum swings! Microservices to Monoliths

We moved from monoliths to microservices a decade ago, and now there’s a swing back to either consolidating microservices or moving to a modular monolith. This isn’t surprising. Why is it happening? I’ll explain why logical boundaries don’t have to be physical boundaries. Once we make this distinction, it opens up many possibilities.

Physical Boundaries

One of the good things that came from microservices is defining boundaries. Good fences make good neighbors, as the saying goes. Defining boundaries is a good thing, but how exactly are we defining them? Microservices, as defined by Adrian Cockcroft:

Loosely Coupled service oriented architecture with bounded contexts

This means that we have services defined by a bounded context, which comes from Domain Driven Design, and these services are loosely coupled between them. Typically, this would imply an event-driven or message-driven architecture.

I typically define services (not the micro part) as the authority of a set of business capabilities.

What does a service do? What are the capabilities the service provides? For all its baggage, at least microservices brought the idea of defining boundaries.

Unfortunately, microservices, as developers have implemented them, have forced boundaries to be physical boundaries. But when talking about what capabilities a service provides, I’m talking about logical boundaries. They aren’t the same thing. Physical boundaries are not logical boundaries.

Forcing this idea can cause a lot of issues and isn’t flexible. Typically, a logical boundary (what a service provides) would also live in its own source repository (git repo), which is built into a deployment artifact that runs in some environment as a process, container, etc.

Or you might have a mono repo where all the source is under a single repository, but the end result is the same: each logical boundary is built into its own deployment artifact.

If your logical boundaries are also physical boundaries, you likely end up with service-to-service communication that happens over the network.

This could be HTTP, gRPC, or any other type of synchronous request/response over network calls. What’s the problem with that? A lot. Check out my post REST APIs for Microservices? Beware! where I explain some of the pitfalls, including latency, failures, and more. This is ultimately a distributed monolith. Check out the fallacies of distributed computing, which are still very relevant. Going back to Adrian’s definition, he mentioned loosely coupled, which this is not. Making RPC/network calls does not make anything loosely coupled.

Logical Boundaries

So what’s the difference between a logical boundary and a physical boundary? The best way to describe this is to think about the full scope of a system. You likely have many different aspects to it: some frontend/client/UI, a backend, a database, and other infrastructure, like maybe a cache.

A logical boundary is the vertical slice across all these layers. A logical boundary owns everything related to the capabilities it provides. That includes the UI, the backend API, the database it is persisting to, and any other infrastructure it owns.

There are different ways of thinking about boundaries. A great illustration of this is the 4+1 Architectural View Model by Philippe Kruchten.

There are different ways you can look at a system. As I’ve been mentioning, there is a logical view, a development view (source code, repo), and a physical view (deployment).

A logical view can be the same as a physical view, but the point is that they don’t have to be. Meaning, a logical boundary doesn’t have to be deployed independently.

Composition

Once you realize this, you can see that you can compose things differently depending on your needs.

You could have multiple logical boundaries within a single mono repo, built as a single unit of deployment. That single unit of deployment could be a single process.

That means that if we were communicating synchronously between logical boundaries, we could be doing so in-process and not be making network calls. It’s just functions calling functions in-process.

I’m not suggesting you compose all your logical boundaries into a single deployable unit. I’m illustrating that logical boundaries aren’t physical boundaries. They don’t have to be one-to-one. You have options.
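A minimal sketch of that composition (boundary names and functions are hypothetical): two logical boundaries living in one process, where what would otherwise be a network call between containers is just a function call.

```python
# Hypothetical sketch: two logical boundaries composed into a single process.
# Between separate containers this would be an HTTP/gRPC call; here it's
# simply one function calling another, in-process.

# --- Catalog boundary: owns pricing ---
def get_price(sku):
    prices = {"widget": 10.0}
    return prices[sku]

# --- Ordering boundary: consumes Catalog's capability in-process ---
def order_total(sku, quantity):
    return get_price(sku) * quantity
```

If Catalog later needed to be deployed independently, `order_total` is the seam where a network client would be substituted; the logical boundaries themselves wouldn't change.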

One logical boundary could have a single source repo that, when built, creates two different deployment units, say two containers.

One logical boundary could be across multiple source repos that each get built into their own separate containers.

You could have a single logical boundary that is spread across multiple source repos that gets built into a single container.

A better illustration would be two different logical boundaries for a mobile app with a backend. Service A could provide a backend and a mobile front end. That same front end might be composed with another logical boundary to build the APK for Android.

This applies to infrastructure, too. Each service should own its own data, but that doesn’t mean it needs to own the infrastructure. You can share infrastructure without sharing data. A single database instance could host different schemas, each owned by a logical boundary.

Could you have “noisy neighbors,” where one service consumes too many resources on your single database instance? Yes. At that point, you could separate it out onto a different physical instance. The point is you don’t have to right from the get-go.
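Here's a small sketch of shared infrastructure without shared data. It uses SQLite (which has no schemas like Postgres does, so `ATTACH` stands in for them), and the boundary and table names are hypothetical: one connection plays the role of the single database instance, and each logical boundary owns its own attached area.

```python
import sqlite3

# Hypothetical sketch: one database *instance* (a single SQLite connection
# standing in for, say, one Postgres server) hosting a separate "schema"
# per logical boundary. Neither boundary touches the other's tables.

conn = sqlite3.connect(":memory:")                        # the shared instance
conn.execute("ATTACH DATABASE ':memory:' AS ordering")    # Ordering's area
conn.execute("ATTACH DATABASE ':memory:' AS shipping")    # Shipping's area

# Each boundary owns and migrates its own tables.
conn.execute("CREATE TABLE ordering.orders (id INTEGER, sku TEXT)")
conn.execute("CREATE TABLE shipping.shipments (id INTEGER, order_id INTEGER)")

conn.execute("INSERT INTO ordering.orders VALUES (1, 'widget')")
conn.execute("INSERT INTO shipping.shipments VALUES (1, 1)")

orders = conn.execute("SELECT COUNT(*) FROM ordering.orders").fetchone()[0]
```

If Ordering became a noisy neighbor, its schema could move to its own physical instance later; the ownership boundaries are already in place.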

Jumping back to Adrian’s definition, he mentioned loosely coupled. If we leverage an event-driven architecture or messaging, we can remove direct RPC or in-process calls. Even within a single process, we can use messaging to loosely couple logical boundaries.
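A minimal sketch of that idea (the bus, event names, and boundaries are all illustrative): an in-memory publish/subscribe bus inside a single process, where the publishing boundary never references the subscribing boundary's code.

```python
from collections import defaultdict

# Hypothetical sketch: loose coupling between logical boundaries inside one
# process via an in-memory message bus instead of direct calls.
handlers = defaultdict(list)  # event name -> subscriber callbacks

def subscribe(event, handler):
    handlers[event].append(handler)

def publish(event, payload):
    for handler in handlers[event]:
        handler(payload)

# --- Shipping boundary subscribes; it knows nothing about Ordering's code ---
shipments = []
subscribe("OrderPlaced", lambda event: shipments.append(event["order_id"]))

# --- Ordering boundary publishes; it doesn't know who is listening ---
publish("OrderPlaced", {"order_id": 42})
```

Because the coupling is only to the event contract, swapping this in-memory bus for an external broker later changes the physical deployment without changing either boundary.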

We keep the option of changing the physical aspect by deploying logical boundaries independently if we need to for various reasons (deployment cadence, etc.).

Microservices to Monoliths

Logical boundaries aren’t physical boundaries. They don’t have to be one-to-one. You can choose to compose logical boundaries together into a variety of different physical boundaries. Don’t limit yourself by forcing this restriction. Could you need a logical boundary to be a physical boundary? Sure. Then do it when it’s required. You have a lot of options when you don’t force this one-to-one constraint.