Lessons learned from a year of running microfrontends

Around March 2019, a small group of us came together to draft an idea: make product teams independent in their frontend development. At that point, we had a monolithic app that took about an hour to walk through the whole CI pipeline, and the frontend sources were part of it. That meant even a tiny frontend change had to be scheduled, which in turn meant a regular loss of momentum in every team.

Attacking the slow CI directly did make things better: with a fair amount of effort, we cut the deploy time noticeably. However, adding just one more team to the organization would have eaten that gain and then some. So, just as microservice architecture makes teams independent at the cost of running CI pipelines in parallel, we expected microfrontends to give a similar benefit.

Not until recently was a reasonably detailed guide to microfrontend architecture published. We did some research, but most of the solutions we found were either too opinionated or didn't resonate with what we wanted to create: a piece of software infrastructure and tooling that would help us scale the number of teams without having to rearchitect the frontend within the next couple of years. Fast forward one year, and it seems we got lucky.

This is a collection of observations made along the way.

Code splitting and lazy loading vs. microfrontends

A wrong way to approach microfrontends is to see them as a collection of very small JS and CSS assets. This line of thinking ends at webpack: version 4 was already great at code splitting and lazy loading of JS and CSS chunks. However, lazy loading is about improving the TTI; it serves the end user first and foremost.

While the end user is probably the most important part of the whole product, microfrontends are an architectural choice. Users don't see the architecture; people are unlikely to say, "Wow, Amazon's product search microservice is so fast, I really enjoy using it!" Architectural choices are mostly driven by a desire to make the system robust and easy to extend; notice how the focus here is much more on developer experience and software engineering.

We do, however, use code splitting and lazy loading in almost every microfrontend. It improves the TTI a bit: some areas of the UI are accessed far more often than others, so why make the user download everything up front and keep the UI hanging for longer than necessary?
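At its core, lazy loading means deferring a chunk's download until it is first needed. A minimal, framework-agnostic sketch of the caching pattern behind it (names and module paths are illustrative, not our actual code) might look like:

```typescript
// Minimal lazy-loader: defers invoking the loader until first use and caches
// the in-flight promise so repeated calls reuse the same download.
type Loader<T> = () => Promise<T>;

function lazy<T>(load: Loader<T>): Loader<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (!cached) {
      // With a bundler like webpack, a dynamic import() inside `load`
      // becomes a separate chunk fetched over the network at this point.
      cached = load();
    }
    return cached;
  };
}

// Example: a rarely opened settings panel ships in its own chunk.
// (A fake loader stands in for `() => import("./settings-panel")`.)
const loadSettings = lazy(async () => ({ render: () => "settings" }));
```

React's `React.lazy` and webpack's dynamic `import()` implement the same idea with code-split chunks; the point is that the cost is paid only when the user actually reaches that area of the UI.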

This distinction is, I believe, the most important thing for a team to consider before going down the microfrontend architecture route.

One microfrontend = one git repo

We don’t do a monorepo. In our case, one microfrontend is strictly equal to one git repository, for a number of reasons.

To be frank, it’s partially like that because we haven’t figured out how to marry a monorepo with many parallel deployments, and this question was never really on the agenda. Maybe at some point we’ll move towards a monorepo, but there’s some historical evidence that such endeavors have a low chance of success.

Some advantages of many repos:

Of course there are some tradeoffs:

One distinct context = one microfrontend, approximately

At some point we converged on creating a microfrontend for every large context the user can be in. Here, “context” may mean a persona, or it may mean a user journey: if you are familiar with Atlassian products, Jira and Confluence might be two separate contexts, but within Jira, the issue navigator and the backlog view could also be two separate contexts. The rule of thumb is: if one team handles one context, it should be one microfrontend; if one team handles two separate contexts, it’s a team-level decision whether to have two microfrontends or to put everything into one. Whatever is asking to be split will be split eventually anyway.
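To make the mapping concrete, one way to encode “one context = one microfrontend” is a registry keyed by route prefix, with one entry per context. This is a sketch with invented names, not our production code:

```typescript
// Hypothetical registry: each large user context maps to exactly one
// microfrontend (and, per Conway's law, typically one owning team).
interface MicrofrontendEntry {
  name: string;        // the microfrontend that owns this context
  routePrefix: string; // the URL subtree the context occupies
}

const registry: MicrofrontendEntry[] = [
  { name: "issue-navigator", routePrefix: "/issues" },
  { name: "backlog",         routePrefix: "/backlog" },
];

// Resolve the current path to the microfrontend that should be mounted.
function resolve(path: string): MicrofrontendEntry | undefined {
  return registry.find(
    (e) => path === e.routePrefix || path.startsWith(e.routePrefix + "/")
  );
}
```

The prefix match keeps everything under a context’s URL subtree owned by a single microfrontend, which is exactly the granularity argued for above.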

The famous website about microfrontends describes a case of three distinct parts on the same page as three microfrontends. While it works as an illustration of the concept, I noticed that many people take it at face value and believe microfrontends are about carousels and carts written in different browser frameworks. We didn’t take that direction and instead opted for larger pieces of UI. A single button or a single search bar would probably be too tiny to bear the overhead of requesting it at runtime, possibly with duplicate dependencies.

All that said, we do run a couple of microfrontends that are really large, because the contexts they represent are large: the domain and the corresponding set of product features are big as well. The teams behind these microfrontends are, not surprisingly, larger than average. Overall, my observation is that Conway’s law is fully at work here, and we consciously approach such deviations at the organizational level first, letting the software architecture follow naturally.

Share dependencies across microfrontends (or not)

If we have ten microfrontends, each using the React runtime, we potentially deliver the same vendor code to the end user ten times. In the case of React DOM, that’s a lot of code. A natural response is to deliver React separately, once, and let microfrontends rely on it, expecting React DOM to be available at any point during code execution.

There’s a bunch of good in this approach:

And a few risks:

We ended up sharing some dependencies, picking the libraries that either have a slow release cycle or are used in most or all microfrontends.
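With webpack 4, one common way to implement this kind of sharing is the `externals` option: the microfrontend’s bundle omits the library and resolves imports to a global that the container has loaded once. A sketch (file name and globals illustrative; this is not our exact config):

```typescript
// webpack.config.ts (sketch): mark shared, slow-moving libraries as externals
// so this microfrontend expects them to be provided on the page by the
// container instead of bundling its own copy.
export default {
  externals: {
    react: "React",         // `import "react"` resolves to window.React
    "react-dom": "ReactDOM" // `import "react-dom"` resolves to window.ReactDOM
  },
};
```

The trade-off is exactly the risk described above: every microfrontend now depends on the container loading a compatible version of the shared library before any microfrontend code runs.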

Performance budgeting is tricky but not useless

If we have a “frontend container” that takes care of mounting microfrontends and routing traffic to them, and microfrontends that are mounted and unmounted as the user traverses the product UI, performance budgeting gets tricky. In our case, the container sources are quite large for historical reasons, so whatever budget a team settles on for its microfrontend, it is inevitably handicapped by the container’s overhead.
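The container/microfrontend split described above implies a small contract: the container owns routing, and each microfrontend only knows how to mount into and unmount from a node it is handed. A minimal sketch (all names invented) of that contract:

```typescript
// Hypothetical contract between the frontend container and a microfrontend.
interface Microfrontend<El> {
  mount(el: El): void;   // render this context's UI into the given node
  unmount(el: El): void; // tear it down when the user leaves the context
}

// The container swaps microfrontends as the user navigates between contexts.
class Container<El> {
  private current?: Microfrontend<El>;
  constructor(private readonly root: El) {}

  activate(next: Microfrontend<El>): void {
    this.current?.unmount(this.root); // clean up the previous context first
    next.mount(this.root);
    this.current = next;
  }
}
```

Keeping the contract this narrow is what lets the container be owned by a separate platform-layer team, as suggested below.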

I do believe it’s like that in all flavors of microfrontend architecture. Similar to how a load balancer and service discovery add latency to a network round trip in microservice architecture, microfrontends start loading only after some prep work is done. Which, by the way, is solvable in a way similar to microservices: if we ensure, say, a 90KB initial load for a microfrontend, we can abstract the frontend container away and expect another (platform-layer) team to take care of it.
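A per-microfrontend budget like the 90KB figure above can also be enforced mechanically at build time. With webpack, for instance, a sketch would be (the numbers match the example in the text; the config is illustrative):

```typescript
// webpack.config.ts (sketch): fail the build when this microfrontend's
// entrypoint exceeds its agreed budget, so the regression is caught in CI
// rather than discovered by users.
export default {
  performance: {
    hints: "error",               // turn budget violations into build errors
    maxEntrypointSize: 90 * 1024, // ~90KB initial load per microfrontend
    maxAssetSize: 90 * 1024,      // and no single asset above the budget
  },
};
```

This gives each team a hard, automated signal for its own slice, which is exactly the upside (and the blind spot) discussed next: the check says nothing about the product's end-to-end performance.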

The upside of this approach is that each team has a clear performance budget and corresponding metrics that are fully in their control. The downside is that we lose the holistic view of network and runtime performance. If no single team takes care of the user experience end to end, and each team has its own SLO for its own microfrontend that doesn’t reflect overall UI performance, we’re in trouble long-term. Which, by the way, is an issue we haven’t figured out how to solve yet.

Having said all that, microfrontend architecture is absolutely not a silver bullet, and it will not cure issues with usability or user experience. There might be some correlation, but I doubt moving towards microfrontends can directly improve the quality and usability of your product. It might, however, unlock velocity in teams that is otherwise lost in unproductive activities like manual testing or waiting for their turn to deploy. As with any other architectural change, do your analysis before committing, consider the scale of the org, and prepare for a short-term loss in developer velocity in return for a long-term, accumulating benefit.