By R2 in Design systems — Oct 3, 2021

How (not) to grow a UI library while building a product

If we’re building a plane while we’re flying it, building a UI library is like building wings and fuselage before putting them together, but they must also fit all other planes.

When we started working on a UI library (we called it “design system” at first lol), we thought we were going to fix the nastiest problem of design and frontend collaboration: building things fast.

Back in January 2020, as we officially started the team, we came together and drafted a mission statement, which was about enabling teams to build fast and end up with more cohesive product UI virtually for free. We already had a UI lib, supported by an internal community of frontend engineers, with a few components in it, and some of those were already being used in the product. There were some basic ones, like button and link, and some more complex, such as an overlay sheet container that slides in from outside of the screen. And we had a backlog of feature requests and bug reports waiting to be resolved and fixed.

Refactoring and building, all at the same time

The backlog had a (predictable) effect on the team’s agenda: we started working on the things that were on the list. Some were hard, some easy, some were urgent and others not really, and (mind you!) we assigned a priority to every one of them. Smart.

In hindsight, I understand now that the majority of these requests, including the ones I had put on the list myself, were about a “faster horse”. If there was a date picker already, people would say they needed a date picker that could also pick date ranges, in a way that would be more user-friendly and convenient than picking two dates in two separate date pickers. We’d take on this kind of work, and there were two reasons for that.

First of all, being a young team is a special case. It’s similar to being a young company, and it’s very different from being an established one that’s managed to exist for a few years and live through a couple of highs and lows. The early days of a company (and of a team) are all about earning that one next customer, all for that bump in revenue. Survival. For a design system team, the early days are all about adoption of the design system: by the teams, and directly in the product.

So, as a team, we felt the urge to deliver. It didn’t have to be some crazy awesome change that would accelerate the business; any small net positive change would suffice. We planned in weeks. This affected our choices, effectively encouraging us to search for a local optimum. So, “faster horse”.

For example, for a good part of 2020, we’d go into the product and actually change it so that more frontend code would use our UI library and not some one-time custom implementation of components. This would cost us weeks, sometimes months, because the product was complex, and we occasionally messed with other teams’ areas of the product. Looking back, I wish I had avoided that trap, but nope, I didn’t, and neither did the team.

The second reason: we felt too attached to the existing UI lib. You could call it “escalation of commitment” and be right. Remember, it was gradually built and maintained by a handful of volunteers even before we started the team. This also made it hard to make fundamental changes: the architecture was already solidified in the lib, and any change that would walk through all levels would require a lot of effort, bumping up the chance of breaking changes. And because the library was already actively used, we intuitively chose to avoid breaking changes as much as possible.

Which then led to a radical and very important step: refactoring the UI lib in a way that would make breaking changes less scary.

Side-quest: rearchitecture of the UI lib

The original UI library had one big flaw: it was distributed as a single package. Literally, one npm package that was called “design system”. Closer to the end of our first year, we took on a quest to rearchitect it in a way that each UI component (and a few other things) would be distributed separately from others. They would still be codependent sometimes, but each would be versioned independently from others.

With the old distribution approach, the problem was this: if I have a microfrontend that uses UI components abc and xyz, and I file a bug for the component xyz and get it fixed, but right before that abc was updated with some breaking changes, I can’t just upgrade xyz without upgrading abc. Say, I’m on design-system@1.0.0, and then the sequence of versions would be: design-system@2.0.0 with a major change in abc and then design-system@2.1.0 with a minor change in xyz. The only way for me to get an updated xyz is to upgrade to design-system@2.1.0. If, by any chance, I also used the component abc (which I was encouraged to!), I would have to deal with breaking changes in my code. Uncool.
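To make the upgrade trap concrete, here is a minimal TypeScript sketch. Everything in it is an assumption made for illustration: the `Release` shape, the `breakingChangesToAbsorb` helper, and the version numbers of the split packages are hypothetical and not taken from our actual library. It only models the scenario described above: which breaking changes a consumer has to swallow in order to pick up a fix in xyz.

```typescript
// Hypothetical model of a release: which package was published,
// which component changed, and whether the change was breaking.
type Release = {
  pkg: string;
  version: string;
  changed: string;
  breaking: boolean;
};

// Old approach: one package, so every component shares one version line.
const monolith: Release[] = [
  { pkg: "design-system", version: "2.0.0", changed: "abc", breaking: true },
  { pkg: "design-system", version: "2.1.0", changed: "xyz", breaking: false },
];

// New approach: each component is its own package with its own versions.
// (Package names and versions here are made up for the example.)
const split: Release[] = [
  { pkg: "abc", version: "2.0.0", changed: "abc", breaking: true },
  { pkg: "xyz", version: "1.1.0", changed: "xyz", breaking: false },
];

// Given a release history in chronological order, list the components whose
// breaking changes a consumer must absorb to get the first fix for `wanted`.
function breakingChangesToAbsorb(history: Release[], wanted: string): string[] {
  const fixIndex = history.findIndex((r) => r.changed === wanted);
  if (fixIndex === -1) return [];
  const fix = history[fixIndex];
  return history
    .slice(0, fixIndex + 1)
    // Only releases of the package being upgraded come along for the ride.
    .filter((r) => r.pkg === fix.pkg && r.breaking)
    .map((r) => r.changed);
}

const withMonolith = breakingChangesToAbsorb(monolith, "xyz"); // ["abc"]
const withSplit = breakingChangesToAbsorb(split, "xyz"); // []
console.log(withMonolith, withSplit);
```

With the single package, the xyz fix drags the breaking change in abc along with it. With separate packages, the same fix arrives on its own.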

At some point, Ivan came along and noticed what was going on. He pushed for the rearchitecture, got it through, and so we did it. No changes in how each individual UI component is implemented, no new features; we just made the UI lib easier to ship by publishing each individual UI component as a separate npm package, with its own version history and dependencies.

After this change, the teams stopped suffering from breaking changes. Folks finally started to focus on using the UI library and getting value out of it, instead of spending time trying to justify to the team why upgrading dependencies costs so much.

Although, even after that, there was some work to do.

Fundamental particles of a design system, aka design tokens

For a very long time, our UI library was architected as a collection of SCSS variables plus a collection of UI components that are mostly independent from each other and that make use of those variables. This is already good, but it could be better. So we zoomed in real hard, all the way down to the level of design tokens.

Design tokens are hard! Imagine that it’s Tuesday and you’re planning to go to work. You’re deciding whether to put on a blue shirt or a red one. What if I told you that, based on this choice, you might or might not end up earning a big cash bonus at the end of that same year? Sure, the system is fundamentally flawed, but the key word here is “system”: one choice can cause lots of downstream changes, sometimes in unpredictable ways. Butterfly effect.

I personally like the underlying idea of design tokens: define a handful of abstract base values, then derive more and more complex values from them, so that you can turn one of them up a bit and get a completely different set of parameters at the high level of complexity. Like, add a bit less oxygen and a bit more nitrogen to the atmosphere of a planet and get a completely different fauna. (There was a computer game about that back in the 90s, can’t recall the name.)
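Here is a tiny TypeScript sketch of that idea. The token names (`spaceUnit`, `fontSize`), the scale factors, and the `deriveTokens` function are all assumptions made up for this example, not our actual tokens. It only shows the mechanism: a couple of base values at the bottom, and everything else computed from them.

```typescript
// Hypothetical base tokens: the few abstract values everything stems from.
type BaseTokens = {
  spaceUnit: number; // px
  fontSize: number; // px
};

// Derive higher-level tokens from the base ones. The multipliers are
// arbitrary choices for the sake of the example.
function deriveTokens(base: BaseTokens) {
  const space = {
    s: base.spaceUnit,
    m: base.spaceUnit * 2,
    l: base.spaceUnit * 4,
  };
  const lineHeight = base.fontSize * 1.5;
  // Component-level tokens never mention raw numbers, only other tokens.
  const button = {
    paddingX: space.m,
    paddingY: space.s,
    minHeight: lineHeight + space.s * 2,
  };
  return { space, lineHeight, button };
}

const compact = deriveTokens({ spaceUnit: 4, fontSize: 16 });
const roomy = deriveTokens({ spaceUnit: 6, fontSize: 16 });

console.log(compact.button.minHeight); // 32
console.log(roomy.button.minHeight); // 36
```

Turning `spaceUnit` from 4 to 6 changes the spacing scale and, through it, the button, without anyone touching the button tokens directly. That is the nice part, and also the butterfly-effect part.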

We started refactoring the UI library and design system from the ground up, by first defining a set of design tokens, but this story is yet to be told.

Still, the main idea is simple: instead of going for “faster horse”, we opted for remaking the way the horse is defined in the first place.

Took on the wrong problem? Change the plan!

Initially, I had the idea of giving product teams a tool that would enable them to ship faster, at least when it came to UI development. Turned out, I hadn’t thought of a much more important problem.

While the “technical problem, technical solution” approach is great, there might be a fundamentally different topic that needs work. It certainly was true for the team I was part of, where we dutifully worked on the design system and UI library, only to figure out eventually that designers and engineers didn’t understand each other. They used different vocabulary (is it link or anchor?) and thought about different things when they used the same words. I still believe there’s a solution that is repeatable and that is not gonna cost infinity, but until someone discovers it, we gotta work together somehow.

From the design system point of view, the lesson here is that how well different teams and different functions in the organization (in this case, designers and engineers) can collaborate is going to define how successful the design system is going to be. In our case, we lived in a dual reality where designers would say the design system is nonexistent, while engineers would say they’re actively using the design system. They were using the same expression to refer to different things: a pattern library and a UI component library, respectively.

So while we, the design system team, were busy fixing the issues in the UI library, our product wasn’t fully successful because it wasn’t fully adopted. We really were fixing the wrong problem, and it took us just a bit longer than a year to figure this out and refocus.

The solution is messy and fuzzy, but it all starts from speaking the same language, as in, using the same words to refer to the same concepts. The less ambiguous the communication between the design system team and the rest of the organization is, the easier it is to make the right changes and to convince people that, hey, these changes are the right ones.

That doesn’t come easy, especially when the existing tools aren’t adequate for that level of conversation. And this is where, I believe, focusing on the few lowest levels works well. A few abstract parameters that define everything go into design tokens. And a few atomic building blocks, as simple as a rectangle and a one-column layout container, make it easier to build on top of them, and implicitly set expectations low enough that people feel free to experiment with more complex concepts.

What I would do differently next time

I know that the second-system effect is cruel and that there’s never a next time (as in, each situation is unique), but I would probably do a few things differently.

Full rewrite?

One, I would deliberately spend time (talking about weeks to months here) evaluating the option of going for a full rewrite. I know, I know, no manager ever is going to approve that, but it’s a matter of packaging. Sunk costs are hard to recognize and hard to let go of, and that’s where a lot of managers (especially seasoned ones, especially at late stages) actually don’t take risks, because the likelihood of losing lots of reputation points is too high.

A lot has been said about rewrites, but hey, software deteriorates over time, and there can be a situation where it simply doesn’t make sense to keep maintaining it. I’m not saying rewrite is the only option, but when the risk is low and the effort is small, and people support the promise of a better version, why not?

As for the implementation details of that, I’m not sure. Usually, for a rewrite, the specs are easy to come up with: just reverse-engineer the existing solution and its behavior, serialize it into a doc, done. So should an in-house team do that, or is it okay to outsource? I found that outsourcing projects with crisp specs isn’t super risky from the point of view of one-time delivery (getting the project to “done”), but the future maintenance is rather hard to predict. Would the people who are supposed to maintain and develop future versions be capable of maintaining it? Will they be willing to? Ensuring that their expectations are fulfilled is almost equivalent to getting them to do it directly in the first place. I’ve come to realize that making this kind of choice requires managerial knowledge that is very specific to this situation, so if you are in a similar situation, you might want to look for a person who has been through a similar experience once or twice.

Explicitly distinguish value creation and refactoring work

If a full rewrite is not possible, or even if it is actually possible and got funded, dropping the maintenance of any tool that is being actively used is not cool, especially if there’s an incoming stream of bug reports and feature requests. There always is one, by the way.

Here, I’m thinking of splitting the effort into two buckets: maintaining the existing solution and refactoring it.

Previously, we mixed the two on the team. Each person would be involved in either of the two at any point in time, and on average they’d spend about 25% of their time on paper and way of half of that in reality dealing with maintenance work, which mostly meant encouraging teams to come fix ‘em bugs together with us and, when this didn’t work out, actually fixing those directly. This helped keep the existing UI lib growing. It also helped us get more feedback on what’s working and what isn’t, really in the system sense, which is still really valuable as we talk about rethinking the architecture of the UI lib and the corresponding part of the design system (including design tools) from the ground up.

However, going more radical and putting a cap on how much time and effort we spend maintaining the legacy seems to me like the most reasonable approach. Makes me wonder how teams behind hyper-popular open-source libs (for example, React) deal with the maintenance vs. major refactor dilemma.

But what about UI lib tech stack? (I don’t know)

Here for some tech stuff, huh? I think that there’s plenty of information on the internets about crafting your next UI lib. My experience is running a team that’s building one in a rapidly growing digital product organization. My experience is very hard to replicate in another situation, even in another company that follows the same growth pattern.

So, just like the title of this section says, I don’t really know what stack is best for a UI library. Definitely the one that the team already knows and can put to good use, but beyond that? This is the kind of path you gotta walk yourself.

Eventually it became clear to us that the UI library is a reflection of the design system in code. It can exist as a separate thing, but it so happens that we have one team that is working on both the design system and the component library, and so far it makes sense. There might be a situation where our design system becomes so big and abstract that there’s no direct connection between it and the UI library, and if that happens, I’ll probably be writing a separate post about that.