Feb 18, 2026 by José Carrasquel Vera
https://cylab.be/blog/484/git-subtrees-subrepos-the-simple-way
This post explains how to leverage git-subtree to have existing repositories as a subdirectory (subtree) of your repository.
Suppose we want to test or develop a tool that is meant to be deployed in a Kubernetes environment. Suppose furthermore that our organisation cannot provide more than 3 VMs for our environment.
Kubespray leverages Ansible to deploy production-ready Kubernetes clusters. Since we hope our tool will end up in production, we go all the way and use Kubespray to deploy Kubernetes.
We follow this guide where one of the first steps is to clone their repo. This is a convenient way of moving forward, but we want to keep our work in a single repository.
So the first idea that comes to mind is to simply copy the content of the Kubespray repo at v2.30.0 and continue the guide. But what if we want to update to a newer version of Kubespray in the future? We could repeat the copy for the new tag but, as we will see, we will need to add an inventory for our infrastructure and play around with different configs, so every update would mean reapplying those changes by hand. Clearly, this way of working is not optimal.
While thinking about a better solution, I recalled reading something related to “subrepos” in the Git book. In fact, what I had read was the section about Submodules.
It often happens that while working on one project, you need to use another project from within it. Perhaps it’s a library that a third party developed or that you’re developing separately and using in multiple parent projects.
…
Git addresses this issue using submodules. Submodules allow you to keep a Git repository as a subdirectory of another Git repository. This lets you clone another repository into your project and keep your commits separate.
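This looks promising, but after playing a bit with submodules I understood better what the last paragraph meant. Indeed, submodules are a way to manage another repository as a subdirectory of your repository. This means, in particular, that commits made inside that subdirectory belong to the other repository and stay separate from the commits of our own repository.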
In practice this means that I would have to fork the Kubespray repo into my original remote. This is a bit better but it’s not ideal.
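To make the "separate commits" point concrete, here is a minimal sketch that runs entirely on local throwaway repositories. The `upstream` repository and its `file.txt` are hypothetical stand-ins for Kubespray, not part of the original setup. The `protocol.file.allow=always` setting is only there because recent Git versions refuse to clone submodules from a local path by default.

```shell
#!/bin/sh
set -eu
# Throwaway identity so the commits work on a machine without Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)

# Hypothetical stand-in for the external project.
git init -q "$work/upstream"
echo "hello" > "$work/upstream/file.txt"
git -C "$work/upstream" add file.txt
git -C "$work/upstream" commit -q -m "upstream commit"

# Our own repository, with the external project added as a submodule.
git init -q "$work/project"
git -C "$work/project" -c protocol.file.allow=always \
    submodule --quiet add "$work/upstream" kubespray
git -C "$work/project" commit -q -m "Add upstream as a submodule"

# The parent repository records only a pointer to an upstream commit
# (mode 160000, type "commit"), not the files themselves.
entry=$(git -C "$work/project" ls-tree HEAD kubespray)
echo "$entry"
```

Because the parent only stores that pointer, any change made under `kubespray/` has to be committed and pushed to a remote of the submodule itself, which is why a fork would be needed.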
At this point, I decided to do a DuckDuckGo search. The Search Assistant ( :/ ) mentioned a git subtree command.
$ git subtree
git: 'subtree' is not a git command. See 'git --help'.
Digging into the Git Reference, I found nothing related to subtrees. Great, another LLM hallucination… After further investigation, I realized that perhaps I was the one hallucinating, or even my git command. The truth is that I was unaware that Git has tools that are not necessarily distributed with Git itself. Moreover, the tool is not documented on Git’s website, but it does come with its own manual page.
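A quick way to tell whether the subcommand is installed is to look for the `git-subtree` executable. This is a minimal sketch, assuming that it lives either in Git's exec path (where distribution packages usually put it) or somewhere on `PATH`. The variable name `subtree_status` is just for illustration.

```shell
#!/bin/sh
# Git resolves "git subtree" to an executable named git-subtree,
# looked up in its exec path first and then on PATH.
if [ -x "$(git --exec-path)/git-subtree" ] || command -v git-subtree >/dev/null 2>&1; then
    subtree_status="available"
else
    subtree_status="missing"
fi
echo "git-subtree is $subtree_status"
```

If it reports `missing`, the command has to be installed separately, as I had to do on my machine.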
$ sudo dnf install git-subtree
...
Complete!
$ git subtree --help
Subtrees allow subprojects to be included within a subdirectory of the main project, optionally including the subproject’s entire history.
For example, you could include the source code for a library as a subdirectory of your application.
Subtrees are not to be confused with submodules, which are meant for the same task. Unlike submodules, subtrees do not need any special constructions (like .gitmodules files or gitlinks) be present in your repository, and do not force end-users of your repository to do anything special or to understand how subtrees work. A subtree is just a subdirectory that can be committed to, branched, and merged along with your project in any way you want.
This looks much better and it turns out to be exactly what we need.
We’ve developed our app, so our history looks like this:
$ git log --name-only --graph
* commit 5cd8d092d9457df89eba38b769b8c7ea07252eeb (HEAD -> main)
| Author: José Carrasquel Vera
| Date: Wed Feb 18 16:03:11 2026 +0100
|
| Squashed commits of our-tool development
|
| our-tool/Dockerfile
| our-tool/app.py
| our-tool/requirements.txt
|
* commit 95709cee07c590d104d4b11730d59d44e27c5160
  Author: José Carrasquel Vera
  Date: Wed Feb 18 16:00:26 2026 +0100

  Added initial readme

  README.md
Now we bring over Kubespray v2.29.0 into our repo (yes, I’m a super fast developer):
$ git subtree add --prefix=kubespray --squash https://github.com/kubernetes-sigs/kubespray.git v2.29.0
git fetch https://github.com/kubernetes-sigs/kubespray.git v2.29.0
remote: Enumerating objects: 80360, done.
remote: Total 80360 (delta 0), reused 0 (delta 0), pack-reused 80360 (from 1)
Receiving objects: 100% (80360/80360), 25.63 MiB | 33.70 MiB/s, done.
Resolving deltas: 100% (44905/44905), done.
From https://github.com/kubernetes-sigs/kubespray
* tag v2.29.0 -> FETCH_HEAD
Added dir 'kubespray'
Let’s try to understand what was done:
$ git log --graph
* commit 7883f802414c205c469c1a64cb12ce846e8e2c1b (HEAD -> main)
|\ Merge: 5cd8d092d 10a336cad
| | Author: José Carrasquel Vera
| | Date: Wed Feb 18 16:19:55 2026 +0100
| |
| | Merge commit '10a336cad8d9ba91f9b4a374dced4b6deba7eb24' as 'kubespray'
| |
| * commit 10a336cad8d9ba91f9b4a374dced4b6deba7eb24
| Author: José Carrasquel Vera
| Date: Wed Feb 18 16:19:55 2026 +0100
|
| Squashed 'kubespray/' content from commit 9991412b4
|
| git-subtree-dir: kubespray
| git-subtree-split: 9991412b4597d6eaf37f86e5f20f9f903a731c08
|
* commit 5cd8d092d9457df89eba38b769b8c7ea07252eeb
| Author: José Carrasquel Vera
| Date: Wed Feb 18 16:03:11 2026 +0100
|
| Squashed commits of our-tool development
|
* commit 95709cee07c590d104d4b11730d59d44e27c5160
  Author: José Carrasquel Vera
  Date: Wed Feb 18 16:00:26 2026 +0100

  Added initial readme
Commit 10a336c was created by squashing the commit pointed at by the tag v2.29.0 (9991412) and all its ancestors in the Kubespray repo into a single commit whose files live under the kubespray subdirectory. HEAD was then merged with 10a336c to form 7883f80. Without the --squash option, we would see the whole commit history leading up to v2.29.0 in Kubespray’s repo instead of the single squash commit.
Now we adapt the inventory to our infrastructure.
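The same structure can be reproduced offline on a toy scale. The sketch below is not the Kubespray setup. It uses a hypothetical local `upstream` repository with two commits and a tag `v1`, adds it under `kubespray/` with `--squash`, and then reads the message of the squash commit, which is the second parent of the merge. The script exits early if `git-subtree` is not installed.

```shell
#!/bin/sh
set -eu
# Skip the demo if the subtree command is not installed.
if ! [ -x "$(git --exec-path)/git-subtree" ]; then
    echo "git-subtree not installed; skipping"
    exit 0
fi
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)

# Hypothetical upstream with two commits and a tag.
git init -q "$work/upstream"
cd "$work/upstream"
echo one > a.txt
git add a.txt
git commit -q -m "first"
echo two > b.txt
git add b.txt
git commit -q -m "second"
git tag v1
upstream_tip=$(git rev-parse HEAD)

# Our project needs at least one commit before a subtree can be added.
git init -q "$work/project"
cd "$work/project"
echo readme > README.md
git add README.md
git commit -q -m "Added initial readme"

git subtree add --prefix=kubespray --squash "$work/upstream" v1

# HEAD is now the merge commit; HEAD^2 is the squash commit.
squash_msg=$(git log -1 --format=%B HEAD^2)
echo "$squash_msg"
```

The `git-subtree-split` trailer in the printed message should hold the full hash of the upstream commit behind `v1`. A later squashed update appears to rely on these trailers to work out which upstream commit the subdirectory currently corresponds to.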
$ git show --name-only
commit 7c63ea59ad96c89a36a02a7820e3a4f0baee1485 (HEAD -> main)
Author: José Carrasquel Vera
Date: Wed Feb 18 16:40:18 2026 +0100
Adapted our inventory
kubespray/inventory/sample/inventory.ini
And life continues… if you push to the remote, everything will be there. Now let’s upgrade Kubespray to v2.30.0.
$ git subtree pull --prefix=kubespray --squash https://github.com/kubernetes-sigs/kubespray.git v2.30.0
$ git log --graph --oneline
* 09c02a623 (HEAD -> main) Merge commit 'd7727a259d30dffdd4acabbba17380c0f1b51a68'
|\
| * d7727a259 Squashed 'kubespray/' changes from 9991412b4..f4ccdb5e7
* | 7c63ea59a Adapted our inventory
* | 7883f8024 Merge commit '10a336cad8d9ba91f9b4a374dced4b6deba7eb24' as 'kubespray'
|\|
| * 10a336cad Squashed 'kubespray/' content from commit 9991412b4
* 5cd8d092d Squashed commits of our-tool development
* 95709cee0 Added initial readme
What happened is clear from the git log output: d7727a259 was created by squashing the changes from v2.29.0 to v2.30.0, and it was then merged with HEAD to form 09c02a623. Note that we could have had merge conflicts, depending on what exactly was changed in 7c63ea59a.
Even with basic Git knowledge, working like this is quite simple and intuitive. The only new concept is bringing in commits from elsewhere and treating them as a subtree. So what we have is something like two almost parallel commit trees: the regular one of your project, and a second one containing commits coming from the external repo. The parallelism is broken by the merges that join the two trees.
If you would like to gain further intuition about what is happening, I recommend reading the Git Internals chapter of the Git Book.
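To see how a local customisation survives an upgrade when it does not touch the same lines as upstream, here is a self-contained sketch on toy repositories. The `upstream` repository, its `inventory.ini` and `VERSION` files, and the tags `v1` and `v2` are all hypothetical. `GIT_MERGE_AUTOEDIT=no` is set only so that the merge never opens an editor.

```shell
#!/bin/sh
set -eu
# Skip the demo if the subtree command is not installed.
if ! [ -x "$(git --exec-path)/git-subtree" ]; then
    echo "git-subtree not installed; skipping"
    exit 0
fi
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
export GIT_MERGE_AUTOEDIT=no

work=$(mktemp -d)

# Hypothetical upstream with two tagged releases.
git init -q "$work/upstream"
cd "$work/upstream"
printf 'host1\n' > inventory.ini
echo v1 > VERSION
git add .
git commit -q -m "release 1"
git tag v1
echo v2 > VERSION
git commit -q -am "release 2"
git tag v2

# Our project, starting from the first release.
git init -q "$work/project"
cd "$work/project"
echo readme > README.md
git add README.md
git commit -q -m "Added initial readme"
git subtree add --prefix=kubespray --squash "$work/upstream" v1

# Local customisation inside the subtree.
printf 'host1\nour-vm\n' > kubespray/inventory.ini
git commit -q -am "Adapted our inventory"

# Upgrade: fetch v2, squash the changes since v1, merge them in.
git subtree pull --prefix=kubespray --squash "$work/upstream" v2

version=$(cat kubespray/VERSION)
inventory_tail=$(tail -n 1 kubespray/inventory.ini)
echo "VERSION=$version, last inventory line=$inventory_tail"
```

Here upstream only changed `VERSION` while we only changed `inventory.ini`, so the merge is clean. If both sides had edited the same lines, the pull would stop with an ordinary merge conflict to resolve.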
This blog post is licensed under
CC BY-SA 4.0