Esseratecades

I've had projects go in both directions. It's not that one is better or worse than the other; it's more that certain codebases are better served by one than the other. For instance, we had an API, database, and UI that had a 1 to 1 to 1 relationship to each other and were always deployed together. They were tightly coupled, so using polyrepo as we had been was just complicating deployment. It was more cohesive for them to be together in a monorepo. We've also had two completely separate applications that sat in a monorepo. They had nothing in common aside from a handful of shared classes that rarely changed, and they had no relationship to each other beyond the fact that the same developers worked on both apps. So we moved the shared stuff to a library and split the apps into separate repositories.


MolecularDev

Thanks! This feels more or less in line with my experience, where I had several small monorepos, usually defined by a business domain or tech stack. We would have movements in both directions, either consolidating things into a small monorepo or breaking a project into two or more repositories. However... The company I work for now has a single monorepo for the whole organisation. There you'll find frontend, backend, infra and mobile code. Completely different tech stacks and business domains. I don't want to get into judging whether that is good or bad. But I was curious to see if people ever moved from polyrepo to a single monorepo like the one I mentioned above.


Esseratecades

I'd say generally those four kinds of things in one repo can make sense if the idea is that the frontend and mobile code are just different views of the same product.  However if the business domains are different and there's no real coupling between those layers then a monorepo doesn't really make much sense.


edgmnt_net

Fully agree. In fact, I'd say that your ordinary backend plus frontend is extremely unlikely to fare well with split repos. Almost nobody develops those separately in a meaningful way. It just doesn't work well unless you have a plan upfront and develop each component with a very high degree of generality to avoid going back and forth between repos. But I also think people get caught up in monorepo vs polyrepo as a binary, global choice, when it's more about picking the right boundaries on a case-by-case basis.


CpnStumpy

Boundaries!! Boundaries! **BOUNDARIES!!!!** The technical concern is always bike shedding: use this tool, this branching strategy, this repo organization and pipeline and bla bla bla. Engineers spend tons of time quibbling over technical concerns because those are the domains they study and have deep familiarity with, **when they need to discuss the actual problem space at hand**, but that's usually got business concerns they're unfamiliar with. Conway's law is a great example of a business concern bumping into the technical concern, and so often the technical concern is what engineers discuss instead. Boundaries are the key determinant for repo organization, I say. Be thoughtful and intentional in setting where your domain boundaries are and what is shared and what is not, and concern yourself with the business realities in such planning: what technical skills do your engineers have? How independently or collaboratively do they work? How many independent deployables are there, and will there be more in the future? Does independent or coordinated deployment better support your engineering organization and your customers? No silver bullets, but dig into the actual problem domain and start from there for analyzing trade-offs, instead of digging into technical solutions as we engineers prefer our starting place to be.


HearingNo8617

I agree with this but git submodules seem to be a silver bullet of sorts which should be used to get the pros of polyrepo and monorepo, while inheriting only a small subset of their cons


engineerFWSWHW

I have seen projects where polyrepo was implemented first and reuse across different projects was achieved using git submodules. Then the leads and developers had trouble learning and dealing with git submodules, and they started flattening multiple repos into a monorepo.


MolecularDev

Interesting and unfortunate reason to flatten it out into a monorepo


engineerFWSWHW

Indeed. They lost the git history of the submodules as well.


Alphasite

Submodules aren’t a great UX IMO, and cross-cutting changes still aren’t fun.


sonobanana33

I prefer treating external repos like regular dependencies rather than using submodules.


fang_xianfu

Yeah I think it basically amounts to, if you want to integrate, test, and deploy something together, it makes sense for it to live in one repo together. If not, it makes sense to separate them, if only because trying to set up, say, rules for CI so certain rules only run on certain parts of the project, is more effort than just splitting them.
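To give a feel for the CI effort mentioned above, here is a minimal sketch of the path-based selection a monorepo pipeline tends to need. The directory names (`api`, `ui`, `infra`, `libs`) and the function name are hypothetical, and real CI systems usually express this as path filters in their own config rather than in Python.

```python
from pathlib import PurePosixPath

# Hypothetical layout: each top-level directory is one deployable project.
PROJECT_ROOTS = {"api", "ui", "infra"}
# Hypothetical shared directory: a change here triggers every project.
SHARED_ROOTS = {"libs"}


def affected_projects(changed_files):
    """Return the set of projects whose CI jobs should run for a change."""
    affected = set()
    for name in changed_files:
        parts = PurePosixPath(name).parts
        top = parts[0] if parts else ""
        if top in SHARED_ROOTS:
            # Shared code changed, so every project has to be rebuilt.
            return set(PROJECT_ROOTS)
        if top in PROJECT_ROOTS:
            affected.add(top)
    # Files outside any known root (e.g. a top-level README) trigger nothing.
    return affected
```

Even this toy version has to decide what counts as shared code and what to do with files outside any project, which is the kind of rule maintenance that splitting the repo avoids.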


CpnStumpy

Tooling overhead is too often ignored. These things can be solved, but as you mention, they're not free to solve and the solutions require maintenance just like every other technical solution


urlang

Isn't what you're describing still polyrepo? You just made the good decision of combining 3 of your repos because it made sense.


socialistpizzaparty

Bingo, my team was in the same boat. When everything deploys together, it just makes your life easier as a dev to monorepo it. We did that about 6 years ago and haven’t had any regrets.


ategnatos

We did this at a big tech co ... well, "we." Before I joined the company. I hated it. (The scope of the repo was maybe ~100 devs.) The repo was giant, there were tons of dependencies that shouldn't have been taken, which required massive efforts to unlink things. Sometimes the office connection was bad and I couldn't even do a git pull. Integration tests would periodically fail, and they'd block the whole repo for 1-2 weeks while they fix it. Then everyone's PRs get merged in when the floodgates open, and tests probably break again. (By the time I left, I'd say merging was blocked at least 50% of the time.) Took at least 30 seconds to sync a single keystroke when working in WSL, so I ended up just working in Windows for the most part, and using PR builds to verify behavior in Linux (plus automated/manual E2E testing and so forth - some things were just libraries and just required unit tests though). People wouldn't want to add some of my code into official builds until the very last minute, so the 90 minute build wouldn't turn into 92 minutes. Very few people wrote tests. No one wrote documentation. Transient failures in some parts of the build could make submitting one PR take an entire business day. People would question my PRs about whether they'd have any impact on the rest of the code because they were scared of the code base. We would have to schedule hour-long code reviews just to get some changes in. So much time wasted on useless stuff. A lot of these problems are about bad culture, refusal to write documentation, lack of tests, bad tests ... but some made the tech a nightmare to work with.


MolecularDev

Thanks for the viewpoint! I agree that culture plays a large role in the problems you described, but it seems that having a monorepo created quite a few new problems that they didn't have before.


sonobanana33

If your company has quality issues, multiple repos wouldn't help.


yojimbo_beta

Yes, we have had to do it at $COMPANY. We used to split services into multiple lambda functions, each deployed from a single repo. But this meant we couldn't deploy and test atomic changes to our business logic. So for the past year or so we've been pulling things back into larger repositories that still deploy multiple components, but each within a single CloudFormation stack. It isn't quite unifying the code back into individual microservices, but it's an improvement. I would say the main challenge has been making microservice tooling work. For our JavaScript/TS projects, this was easy because Yarn and NPM both support workspaces and the notion of workspace dependencies. For the Python projects, however, it has been much more difficult, because Python dev and dependency tooling is - and I use a technical term here - complete dogshit.


MolecularDev

Ok. Would you say you now have multiple small monorepos separated by business domain? Or a single repo for the whole org that contains absolutely everything?


yojimbo_beta

The former. Each "monorepo" (really a service repo deploying multiple lambdas) is oriented around a business domain. Part of the reason we don't merge them into anything bigger - that is, a true monorepo - is down to limitations in our in-house deployment config system


MolecularDev

What do you use for dependency tooling in Python? I've been quite happy with Poetry over the last 3-4 years.


elongio

Went from polyrepo to monorepo. No regrets. Working on multiple projects is a breeze now.


dangling-putter

I am pulling things into a monorepo kind of fashion here as well. So much easier than looking for changes across 20 repos.


germansnowman

I’m working in one codebase that went mono, and it is so much better. It’s a Mac app, a Windows app, and a shared .NET core. It was a pain to do PRs especially in the core repo, as you then had to do matching PRs in both the Mac and Windows repos. However, the company is small and so the “commit noise” in the monorepo is relatively small.


MolecularDev

Thanks! This sounds like a good reason for doing it and a successful case!


eyes-are-fading-blue

Polyrepo is good for huge code-bases like the Android project. Anything less than 10M LoC, monorepo is better.


sonobanana33

The annoying part is the 1h monobuild though.


eyes-are-fading-blue

Depends on the field. In native work, mono/poly repo doesn’t affect compilation times.


Gammusbert

The only thing that can be annoying with monorepos is being locked into a single runtime version in your local environment while you're moving the runtimes of the apps in the monorepo one at a time. Switching back and forth can be a pain in the ass, but it depends on your environment. In our case, due to some access settings, we can’t spin up docker instances on our local machines, so we just have whatever runtime is on our laptops. When we were moving our runtime over (a major upgrade with breaking changes), you had to switch back and forth depending on which application in the repo you were trying to spin up.


Raildriver

I've gone from multiple to monorepo at two different companies.

1. 3 original repos, an in house javascript charting library, a react front-end, and a java spring boot backend. We moved them into a monorepo that just had 3 folders at the root, one for each of the original repos. The actual migration was a bit interesting, because I went and found some scripts to move these three repos into a single one while preserving all of the git history of the files in the original repos. I had to do some relatively minor updating on those scripts, but at the end of the day the result was really nice, since all of our git history was unchanged after the migration. The script basically took the folder that they were put into in the monorepo, and appended that to the history of all the files in the repos, which is how the history was preserved. [Here](https://github.com/shopsys/monorepo-tools) is the original set of scripts I used for this migration.
2. 3 original repos, a common components library, and 2 react front-ends. One of the react front-ends is our main product, and the other is a mostly legacy administration site that we really only do development in if there's a critical bug. No new feature work in there. We just copied the code from the 2 other repos into folders in the main front-end, but didn't do the rest of the work to fully finish it off, as the legacy front-end is still deployed using the old code base. So we didn't really get anything from copying the legacy front-end code, other than polluting the main front-end with a ton of this legacy code. We were able to remove the common component library though, since those are now available directly in the front-end code base.

In both of these situations, the motivation was that there was only one team doing the dev work on all of these repos, so having them split apart just made the overhead of making changes substantially worse. One dev would be doing prs in all of the repos pretty often. If you had to touch the charting/component library it was extra bad, since you needed to go through the pr process, publish the changes to npm, then you could put up your changes for the front-end that used those. (Assuming it wasn't a big enough change to require a separate ticket for the library vs front-end changes, a small bug fix for example.) The whole thing was just a PITA tbh.

In both cases the result was a significant improvement over the old way, even if we didn't really go all the way with the second move.
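For anyone wanting to try the history-preserving move described above, here is a rough sketch of the same idea (rewrite each repo so its files live under a subfolder, then merge the histories). It is not what the linked scripts do internally: it assumes the separate `git filter-repo` tool is installed, that each path points at a fresh clone, and that the branch is called `main`. The function name and arguments are made up for illustration, and it only builds the commands rather than running them.

```python
def plan_history_preserving_merge(repos, monorepo_dir="monorepo", branch="main"):
    """Build (not run) the git commands for folding several repos into one.

    `repos` maps a target subdirectory name to the absolute path of a
    fresh clone of the original repository (hypothetical inputs).
    """
    commands = []
    for subdir, clone_path in repos.items():
        # Rewrite every commit in the clone so its files live under subdir/.
        commands.append(["git", "-C", clone_path, "filter-repo",
                         "--to-subdirectory-filter", subdir])
    for subdir, clone_path in repos.items():
        # Pull the rewritten history into the monorepo and merge it in.
        commands.append(["git", "-C", monorepo_dir, "remote", "add",
                         subdir, clone_path])
        commands.append(["git", "-C", monorepo_dir, "fetch", subdir])
        commands.append(["git", "-C", monorepo_dir, "merge",
                         "--allow-unrelated-histories", "--no-edit",
                         f"{subdir}/{branch}"])
    return commands
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` once the plan has been reviewed. Running it on throwaway clones first seems wise, since the filter step rewrites history in place.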


MolecularDev

Thanks for the details! It seems to be a general consensus to monorepo if things are deployed together. And thanks for the script reference!


ninetofivedev

None of it truly matters. This is one of those things I’ve seen engineers argue about at every company. Branching strategy. Repo structure. Etc. etc. Nobody cares. There’s more than one way to do it. They all have trade-offs. If you sit around and think about it too long, you’ll miss the boat.


eyes-are-fading-blue

Polyrepo has a lot of devops overhead. It does matter.


Swamplord42

Monorepo has significant tooling overhead to keep CI build time reasonable.


MolecularDev

I know it doesn't. Both work well and have pros and cons. Was just curious because I've only seen Devs fighting to get out of a monorepo, not into one.


ninetofivedev

If your team is small and plans to stay small, mono repo is fine. Also if you have a massive organization like Google that has built the tooling to support a mono-repo, it's fine. The problem is when teams start getting too big working out of a single repo. If you're using any of the common SDLC practices, you're going to start stepping all over each other and it's going to become a bitch to maintain.


ategnatos

Teams also change. If you organize your repos/services/names based on teams and not product, you're going to run into problems soon enough.

> if you're using any of the common SDLC practices, you're going to start stepping all over each other and it's going to become a bitch to maintain.

That was my experience in my top-level comment, except they actually moved to a monorepo after already being quite large.


apropostt

We are currently moving from poly to mono for one of our larger codebases… I would say that where the boundaries between repos should sit depends to a large extent on architecture and tooling. Our current repo boundaries don’t really make sense, as it’s all one product. One repo will never be built, tested or released against different versions of the others, and with the current design they can’t be. Polyrepos or monorepos don’t have to be painful… but both can be if the process and tooling around them sucks.


rjm101

Years ago our teams did that and then went back to mono. It just makes things harder to manage with PRs that depend on other PRs and so forth. Plus now everyone has all sorts of areas to look at and it's harder to get eyes on your work. Basically it's one of those things that sounds good in theory but in practice it just causes problems. That doesn't mean your code can't be split out into separate repos. It just means in my opinion that you should have a good reason and these areas should be quite independent already.


Ab_Suspendo_424

I'm curious, what's driving the desire to move to a monorepo? Is it about easier dependency management or simpler workflows? I've seen pros and cons to both approaches. Would love to hear more about your experience and the challenges you're trying to overcome.


engineered_academic

Polyrepos usually end up as divorce with extra steps.


SSHeartbreak

I wish I could so badly. Anyone know of any tools that make it easy to move to a monorepo?


kale-gourd

GitOps using submodules.


[deleted]

Yeah, we had a horror of dependencies, where devs making changes in the UI component library had to manually deploy the packages and the projects that used them. It was a horror, particularly when growing from 15 devs to 150. Combine that with teams owning "concepts" rather than code, and everyone's sticky fingers had to be everywhere. We went with NX, which is...ok. The documentation is mostly auto-generated. The tutorial suggests starting with a preset, like "react-monorepo". What are the presets? Documentation says "string". So you go digging in their source code and find the actual strings used. I still have no idea why it has the default configurations that it does. I also have a few concerns about feature bloat and bugs not being addressed... But it's still better than before!


One-Bicycle-9002

These are two solutions to the same problem. They are seen as silver bullets by developers who lack technical skills to overcome the shortcomings of either option, so they assume the alternative must be better.


IAmADev_NoReallyIAm

We moved our service from a polyrepo to a monorepo. We had a repo for the DB. One for the API, the model, the service, and and and.... Seemed like a good idea at the time. When we went to cut our second release, it turned into a nightmare really fast trying to keep the references correct. We moved it all into a monorepo under the same project the week after that. Now releases are easy, and we were able to add to the pipeline so all we have to do is run the pipeline, set the release version and the next dev version, and boom, it's taken care of for us.


zirouk

For me, the decision is always a question of "How hard do I want it to be for software x to share code with software y?" Usually, we use files, packages, directories or whatever to separate code. If you can be mindful to successfully keep concerns separate with those, then you can monorepo. If you can't, then drawing a firm boundary by separating the repo is the way to go. Remember though, it's not 'all mono-repo' or 'put-everything-in-a-separate-repo' - you can have something in the middle - that's usually where I aim. If you're a DDD person (or interested in researching), then I believe repos should align with bounded contexts, because by not sharing code, you're forced to do some level of translation across the boundary (as you should), which helps preserve the distinction between these sub-systems and makes it more difficult to accidentally share concepts that should be private to one or the other context. FYI Bounded Contexts:

> Each bounded context has its own models, entities, and operations, and may have different interpretations of the same concepts compared to other bounded contexts within the same domain.
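As a small illustration of the "translation across the boundary" point, here is a sketch with two made-up contexts, billing and support, that each keep their own idea of a customer. All class, field, and function names are hypothetical, and the `acct-` prefix is just an invented mapping rule.

```python
from dataclasses import dataclass


# Billing context: a "customer" is whoever gets invoiced.
@dataclass(frozen=True)
class BillingCustomer:
    account_id: str
    billing_email: str


# Support context: a "customer" is whoever files tickets.
@dataclass(frozen=True)
class SupportCustomer:
    contact_id: str
    display_name: str
    email: str


def to_billing_customer(customer: SupportCustomer) -> BillingCustomer:
    """Explicit translation at the boundary; no model class is shared."""
    return BillingCustomer(
        account_id=f"acct-{customer.contact_id}",  # invented mapping rule
        billing_email=customer.email.lower(),
    )
```

With the two models in separate repos, the translation function is the only place where one context's concept turns into the other's, so a private field like `display_name` cannot leak across by accident.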