Mono-repo or multi-repo? A straw-man question

24 Nov 2021

Dependency Hell
Isolation VS. Coherence
Accidental Dependencies
A Possible Solution
Criticisms
Afterword

There has been a lot of discussion about mono-repo versus multi-repo in source code management. In my opinion, it is a straw-man question. The real questions are:

How to isolate the impact of changes?
How easy is it to make synchronized changes to multiple projects?

Multi-repo is a nightmare for making synchronized cross-project changes. For programmers, it is annoying to open several PRs (pull requests) for the same set of changes. For continuous integration, the infrastructure has to implement cross-repo transaction support for PRs.

If multiple projects are located in one repo, only one PR is required for the same set of changes. No complex cross-repo transaction support is needed. Google mentioned the benefits of global refactoring when rolling out new compiler features.

Without impact boundaries, a mono-repo will be a disaster. Suppose that a system is composed of a backend, a web app and a mobile app. The mobile app and the web app are independent of each other, while both depend on the backend. The code for all the sub-projects is in the same repo. Now, without an isolation mechanism in the mono-repo, each PR to the web app will also trigger the mobile app CI, which does not make any sense. It is a productivity killer and a waste of computing resources!

A PR is basically a transaction against the source code management system. Different levels of isolation enable different levels of parallelism when merging PRs in a mono-repo. No isolation means all merges will be serial, which can easily become a bottleneck in huge mono-repos.
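To make the link between isolation and merge parallelism concrete, here is a minimal sketch. It assumes a hypothetical layout in which every project lives in its own top-level directory, and it deliberately ignores dependencies between projects; the names `footprint` and `can_merge_in_parallel` are invented for illustration.

```python
def footprint(changed_files, isolated=True):
    """Return the set of isolation units a PR touches.

    With isolation, the unit is the top-level project directory
    (an assumed repo layout). Without isolation, the whole repo
    is a single unit, so every PR touches the same thing.
    """
    if not isolated:
        return {"<repo>"}
    return {path.split("/", 1)[0] for path in changed_files}


def can_merge_in_parallel(footprint_a, footprint_b):
    """Two PRs can be merged concurrently if their footprints are disjoint."""
    return footprint_a.isdisjoint(footprint_b)


web_pr = ["webapp/src/app.ts"]
mobile_pr = ["mobileapp/Main.kt"]

# Project-level isolation: the two PRs do not block each other.
print(can_merge_in_parallel(footprint(web_pr), footprint(mobile_pr)))

# No isolation: every pair of PRs conflicts, so merges are serialized.
print(can_merge_in_parallel(footprint(web_pr, isolated=False),
                            footprint(mobile_pr, isolated=False)))
```

A real merge queue would also have to account for projects that depend on the changed ones, which this sketch leaves out.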

Isolation also enables the source code management system to run only the subset of the CI that corresponds to the changed source code. No isolation means each change will trigger the whole CI. For a big company, that means long waiting times and a huge waste of computing resources.
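As a rough illustration of running only the relevant subset of the CI, the sketch below walks a dependency graph to find the projects affected by a change. It uses the backend / web app / mobile app example from above; the directory names, the `DEPENDS_ON` table and the function name are assumptions made for this sketch, not part of any particular build tool.

```python
from collections import defaultdict

# Hypothetical layout: each project lives in its own top-level directory.
# DEPENDS_ON maps a project to the projects it depends on directly.
DEPENDS_ON = {
    "backend": [],
    "webapp": ["backend"],
    "mobileapp": ["backend"],
}


def affected_projects(changed_files, depends_on=DEPENDS_ON):
    """Return the projects whose CI should run for the given changed files."""
    # Invert the graph: project -> projects that depend on it.
    dependents = defaultdict(set)
    for project, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(project)

    # Start from the projects touched directly, then follow dependents.
    pending = [path.split("/", 1)[0] for path in changed_files]
    pending = [p for p in pending if p in depends_on]
    affected = set()
    while pending:
        project = pending.pop()
        if project not in affected:
            affected.add(project)
            pending.extend(dependents[project])
    return affected


# A web app change only needs the web app CI ...
print(affected_projects(["webapp/src/index.ts"]))
# ... while a backend change needs all three.
print(affected_projects(["backend/api/users.py"]))
```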

Therefore, we conclude that a way to define and manage isolation is necessary for mono-repos.

Dependency Hell

But how do we define and manage isolation in mono-repos? The question essentially boils down to how projects can depend on each other. Different kinds of dependency lead to different granularities of isolation. For example, there can be:

Synchronized source code dependency

e.g., webapp -> backend, mobileapp -> backend

Branch-based source code dependency

e.g., webapp -> backend @ v1.2.x, mobileapp -> backend @ v1.3.x

Artifact-based dependency

e.g., webapp -> backend.jar @ v1.2, mobileapp -> backend.jar @ v1.3

The first one provides the coarsest granularity of isolation, while the last one provides the finest. As a result, the first will be the slowest and most resource-consuming, while the last will be the fastest and most resource-friendly.

However, the last two lead to a notorious problem: dependency hell! In fact, mono-repos are favored in companies for two reasons:

simplicity of making synchronized changes to any subset of files
avoiding the dependency hell caused by versioning

To boost the performance of PR transaction processing and reduce resource consumption in CI, the only solution is to make the granularity of isolation finer. This means that the real obstacle to improving the performance of mono-repos is dependency hell.

Theoretically, dependency hell is only a problem if there are diamond dependencies like the following:

       ┌─────┐
   ┌───┤  A  ├───┐
   │   └─────┘   │
   │             │
   │             │
   ▼             ▼
 ┌─────┐     ┌─────┐
 │  B  │     │  C  │
 └─┬───┘     └───┬─┘
   │             │
   │             │
   │    ┌─────┐  │
   └───►│  D  │◄─┘
        └─────┘

Unfortunately, such dependencies are common in the real world. Unlike cyclic dependencies, which should be the exception, such acyclic dependencies are justified use cases.

Now, if B and C depend on different versions of D, two things can happen, and both are bad:

If the major versions do not agree, the dependency manager will assume that there is a major incompatibility and refuse to build the system from the components.
If only the minor versions disagree, the dependency manager will usually pick the latest minor version. But that makes the final system prone to accidental or unknown incompatibilities between minor versions, which is a big risk!
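The two outcomes can be sketched with a toy resolver. This is a simplified illustration of the behaviour described above, not the algorithm of any specific package manager; the function name `resolve` and the `(major, minor)` tuple encoding are assumptions of the sketch.

```python
def resolve(requirements):
    """Pick one version of a shared library from the versions requested.

    requirements: list of (major, minor) tuples, one per dependent.
    Raises ValueError if the major versions disagree; otherwise
    returns the latest requested minor version.
    """
    majors = {major for major, _ in requirements}
    if len(majors) > 1:
        raise ValueError(f"incompatible major versions: {sorted(majors)}")
    return max(requirements)


# B wants D 1.2 and C wants D 1.3: the resolver silently picks 1.3,
# so B now runs against a version it was never tested with.
print(resolve([(1, 2), (1, 3)]))

# B wants D 1.2 and C wants D 2.0: the build is refused.
try:
    resolve([(1, 2), (2, 0)])
except ValueError as error:
    print(error)
```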

Isolation VS. Coherence

Mono-repos differ in how the CI is run for PRs. If all PRs run the same CI, the coherence of the dependent parts of the repo is high, but at the price of isolation. In contrast, if the isolation level is high, a change in one project might break another project, which reduces coherence.

However, while higher coherence is better, it is far from being the same as correctness. The system can still fail at runtime. This is because coherence in the current practice of software engineering is ensured by testing, which, as put by Edsger W. Dijkstra, can be used to show the presence of bugs, but never their absence!

Meanwhile, from the architectural principle of high cohesion, low coupling, it makes sense to run the same CI for closely coupled projects, and increase isolation level for loosely coupled projects.

But how can we specify and manage the dependencies of loosely coupled projects without running into dependency hell?

Accidental Dependencies

One of the pain points in managing the dependency of loosely coupled projects is accidental dependencies: a project may accidentally depend on the implementation details of a library.

Unlike carefully crafted public APIs, whose contracts are supposed to be stable, the internal implementation is expected to change from time to time: the name of a field, method or class may change, the visibility of definitions may change, method signatures may change, virtual methods may become final, etc.

The unexpected leaking of internal implementation details makes the dependency brittle. It also hinders improvement on the implementation of a library.
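A small, made-up example of such a leak: the two classes below stand for two releases of the same hypothetical library, and `seen_inputs` stands for consumer code that reaches into an internal field. All names are invented for illustration.

```python
class Tokenizer:
    """Library, first release. Only tokenize() is the intended API."""

    def __init__(self):
        self._cache = {}  # internal detail, not part of the contract

    def tokenize(self, text):
        if text not in self._cache:
            self._cache[text] = text.split()
        return self._cache[text]


class RefactoredTokenizer:
    """Same library after an internal refactoring; tokenize() is unchanged."""

    def __init__(self):
        self._memo = {}  # the internal field was renamed

    def tokenize(self, text):
        if text not in self._memo:
            self._memo[text] = text.split()
        return self._memo[text]


def seen_inputs(tokenizer):
    """Consumer code that accidentally depends on the internal cache."""
    return sorted(tokenizer._cache)


old = Tokenizer()
old.tokenize("a b")
print(seen_inputs(old))  # works against the first release

new = RefactoredTokenizer()
new.tokenize("a b")
try:
    seen_inputs(new)
except AttributeError as error:
    print("consumer broke:", error)
```

The public contract never changed, yet the consumer breaks, which is exactly why the library author can no longer refactor freely.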

A Possible Solution

The problem of dependency hell is hopeless if it is based on versioning, semantic or not, due to the existence of diamond dependencies. It means that for dependencies that are not strictly a tree, we need to find solutions other than versioning.

An interesting idea I'm still pursuing is service-based semantic dependency. The rough idea is that we separate a service from its implementation(s). Each service definition is associated with an acceptance test. A library only specifies the service it provides and the services it depends on, instead of concrete service implementations. For each service it depends on, an application is free to choose any validated, conformant provider from a service registry. As there is no versioning, it is impossible to have version conflicts.
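Since the idea is still a rough one, the following is only a sketch of how its pieces might fit together: a service definition, an acceptance test attached to it, and a registry that admits only providers that pass the test. Every name here (`KeyValueStore`, `ServiceRegistry`, `DictStore`, `BrokenStore`) is hypothetical.

```python
from typing import Optional, Protocol


class KeyValueStore(Protocol):
    """Service definition: the only thing a consumer may rely on."""

    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> Optional[str]: ...


def kv_acceptance_test(make_store) -> None:
    """Acceptance test associated with the KeyValueStore service."""
    store = make_store()
    assert store.get("missing") is None
    store.put("k", "v1")
    store.put("k", "v2")
    assert store.get("k") == "v2"


class ServiceRegistry:
    def __init__(self):
        self._acceptance = {}  # service name -> acceptance test
        self._providers = {}   # service name -> {provider name: factory}

    def define(self, service, acceptance_test):
        self._acceptance[service] = acceptance_test
        self._providers[service] = {}

    def publish(self, service, provider, factory):
        # A provider is admitted only if it passes the acceptance test.
        self._acceptance[service](factory)
        self._providers[service][provider] = factory

    def providers(self, service):
        return sorted(self._providers[service])

    def create(self, service, provider):
        return self._providers[service][provider]()


class DictStore:
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class BrokenStore(DictStore):
    def put(self, key, value):
        self._data.setdefault(key, value)  # non-conformant: never overwrites


registry = ServiceRegistry()
registry.define("KeyValueStore", kv_acceptance_test)
registry.publish("KeyValueStore", "dict-store", DictStore)
try:
    registry.publish("KeyValueStore", "broken-store", BrokenStore)
except AssertionError:
    print("broken-store rejected")
print(registry.providers("KeyValueStore"))
```

The application names a service and picks a provider; no version number appears anywhere in the sketch.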

I can see the following benefits of the service-based approach to dependencies:

It encourages standardization of common services. The same service can have multiple competitive implementations. It will be much easier to discover providers of the same service.
The acceptance test set of a service will be enriched over time and can be reused to validate new implementations. For companies and the open source community, the service definitions will be a valuable asset.
It protects the service provider from accidental dependencies on internals of the implementation by consumers. The separation of a service from its implementation also forces library authors to craft better APIs.
It avoids namespace pollution. No global names exist except service names. No name in the concrete implementation should matter at either compile time or runtime. This requires language design support to better handle names and modules. The popular programming languages and platforms, such as C++, Java/JVM and C#/.NET, fail to satisfy this requirement.
The design will lead to truly secure programming languages. Existing programming languages usually provide a standard library which includes APIs that perform side effects, such as reading and writing files, accessing the network, etc. The service-based approach mandates that such APIs be abstracted as interfaces, so that they can easily be replaced with a different implementation in an application.

The service-based approach enables a new vision in source code management: service definitions will be the key for managing isolation/dependency. The usability and stability of services will be a major quality metric of system architecture in a huge code base.

Criticisms

But the idea also faces several questions:

Who owns the service definitions?
Is it possible to define a service independent of its implementations?
Can service definitions be changed?

It is difficult to define who owns the service definition: the service consumer or the service provider(s). Joint ownership only complicates its creation and maintenance. If no one owns the service definitions, then they are not going to exist.

For this question, I think either the service consumer or the provider can be the owner. The most important thing is a mechanism that keeps the interface separate from the implementation and enforces the conformance of the implementation. Eventually, better-specified services may become more popular, while a library without an independent service specification will be poorly received, or may even be impossible to publish.

Meanwhile, it is difficult to define a service API for a framework that is independent of its implementation. The same holds for libraries whose APIs are based on domain-specific languages. The reason is that for frameworks and DSLs the interface matters more than the functionality. For these libraries, I think it is fine that the service has only one implementation. Separating the interface from the implementation adds overhead, but it helps in the long run by forcing library authors to craft better APIs and by preventing unexpected dependencies from end users.

For frameworks, there is a better solution. Library authors can split the library into two libraries: (1) a stable core, and (2) a changeable and user-friendly API. The first should contain the main functionality and aim for binary compatibility. The second should be a thin wrapper layer intended to make the API friendly to programmers. This way, different versions of the API can be released as different services. The API layer is supposed to be small, so it is harmless to have multiple variants reachable from a single application.

Even if we ignore the overhead of defining the service APIs, there is also the criticism that we may not be able to define the service APIs before we create the actual implementation. I do not dispute that. However, it is fine for a service definition to be created after the implementation.

Another criticism is: can service definitions be changed? If so, will they fall into the same trap as versioning? I think a proper design would require that service definitions may only be augmented. If some APIs really have to be removed, a new service should be created instead, as with JUnit4 and JUnit5. As they are different services, it is fine to include both of them in the same application. It is also possible for one implementation to support multiple services, e.g., both JUnit4 and JUnit5. The application can then choose that implementation to satisfy the requirements for both JUnit4 and JUnit5, so that only one implementation is used at runtime.
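The selection step can be sketched as a simple set check. The provider names and the `PROVIDERS` table below are invented for illustration, and "JUnit4" and "JUnit5" are used only as service names, as in the text above.

```python
# Hypothetical catalogue: provider name -> services it implements.
PROVIDERS = {
    "legacy-runner": {"JUnit4"},
    "modern-runner": {"JUnit5"},
    "unified-runner": {"JUnit4", "JUnit5"},
}


def providers_covering(required, providers=PROVIDERS):
    """Return the providers that implement every required service."""
    return sorted(
        name for name, services in providers.items() if required <= services
    )


# An application that needs both services can run on a single implementation.
print(providers_covering({"JUnit4", "JUnit5"}))
# An application that needs only one service has more choice.
print(providers_covering({"JUnit4"}))
```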

Afterword

Speaking from real-world experience, a practical solution is to split the test set into categories:

Gate tests
Daily tests

The gate tests run on each PR. If they pass, a PR can be merged.

The daily tests run on a daily basis. They are more extensive and can detect incompatibilities relatively early. Make sure that developers can trigger a selected subset of the daily tests on a PR.

If the daily test set becomes too big, we can further split it into weekly, monthly, and release tests.

This way, we can achieve both productivity and eventual reliability.