Why Git?
For many people in KDE, svn no longer meets their needs for teamwork-based development. The ability to easily merge branches, to work radically off-line and a fairly impressive list covering various benefits of git are often cited as reasons to make the move. Personally I don't think it's a panacea as git comes with its own set of issues, though it is getting better and better every time I upgrade to a new version of it. I think it is quite a bit better than svn is for how we tend to work, though. That hardly matters, though, because our developers are voting with their feet.
Not only are some projects already hosted on gitorious.org, including some big ones like Amarok, a growing number of KDE developers are using git-svn so they can use git for their work and simply sync svn up with their changes periodically. The result is that our world is getting more and more fractured, with some work happening in git repositories and some on svn.kde.org. This is not optimal for the health of our community as there is a real benefit from having each other's code within "close reach" of everyone else's and only having to learn a limited number of tools to work with. When it was just svn being used (and before that cvs), we benefited from a kind of network effect and a lower barrier to entry. We're in danger of eroding both of those things.
So unless we say "no, you can't use git for anything KDE", which seems a bit draconian and probably not only unfair but risky (in terms of losing contributors), we need to get this house in order again.
Benefits like having a better revision control system and being closer to Qt's mainline development are just bonuses next to that.
What Should Git Be For KDE?
When it comes to the migration itself, I think it is pretty important that we avoid some evident pitfalls.
First, we need good documentation on how to use git so that when the time comes we can all move over and waste as little time retooling as possible. It will help alleviate the apprehension felt by some KDE developers as they are being moved to something new, and we all know how much people dislike change. :) This has been started on Techbase and seems to be making really good progress. So, with some more effort (and possibly helpers), that looks like one issue that is (being) taken care of.
Next, git must not be allowed to cause "developing in secret bunkers" to start to occur in KDE. Due to its highly decentralized and off-line features, git (and other systems like it) can lead to people working on their own, self-isolated from the group for extended periods of time. This really isn't very healthy, but since it's a social problem it needs a social solution. Just being aware of it and reminding people who seem to be dropping out of the picture for extended periods of time should hopefully be enough.
Git must also not be allowed to make developing for or following development of KDE software harder. A single, canonical mainline must be kept to keep that bar low. We really don't have the need for long lived personal branches that deviate significantly from mainline and which people then have to pick between when getting KDE source code. This seems very common (and for good reasons, in my opinion) in the Linux kernel project, but it isn't a necessity when using git. Git enables a lot of workflows, and I think we should be careful about straying too far afield from the workflows which work for us today. I can't wait for feature branches to be easy to do in Plasma, but I really don't want to see the notmart-plasma and the aseigo-plasma. ;)
Between some good decision making on the part of the people doing the git migration (so we don't, for example, end up with a mess of modules with no clear purpose) and some social engineering we should stand to profit handsomely from the git migration. Seeing how we already have the kde-developers group on Gitorious (so we can all commit to the whole code base once it is in git, just as we can now in svn) and the plans the migration team have put together, I think we'll do fine.
But first we need to do the actual migration...
What Are We Waiting For?
The Move To Git page has a nice list of things that are yet to be done. Some things left are business topics such as KDE e.V. and Gitorious entering into an SLA. Some are technical (such as getting rid of svn externals, of which there are two blockers still there, one of which I'll be getting rid of in the next week or two), some are social. The team meets every week on irc and has been making fine progress.
So we aren't quite there yet, but we're a lot closer than we were a month or two ago. I'm really hoping we can complete the migration before 4.5 is out and do a great job of helping our community move over to the new system. That will allow us to do some git hand holding at Akademy and develop the 4.6 release with some shiny new tools at our disposal. As such, this made my "Key Quests for 2010" list.
(This article is part of the "Key Quests for KDE in 2010" series)

29 comments:
I don't like the move to git. If development is done outside of trunk, we lose much of the momentum of working together. If I look into the future I see everyone working alone in their own branches, merging only when stuff is somewhat finished :( If it's really finished, then there is no opportunity for others to comment on the work or influence the direction. This has other implications too, like taking maintainership of code from others and so on... let's see...
my2ct
First, from the blog entry itself:
"Git must also not be allowed to make developing for or following development of KDE software harder. A single, canonical mainline must be kept to keep that bar low."
Then the next question is "what about feature branches?"
Well, we already do a lot of that in KDE, and it's painful and even completely impractical to follow them. With git, it's actually plausible to follow multiple branches by easily switching between them, even being able to stash your own local changes before doing so.
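As a rough illustration of the stash-then-switch workflow mentioned above, here is a minimal sketch that runs in a throwaway repository. The branch names (mainline, feature-x), file names and the committer identity are made up for the example; only the git commands themselves are real.

```shell
#!/bin/sh
# Sketch: park uncommitted edits, look at a feature branch, come back, restore.
set -e

# Throwaway identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline   # hypothetical branch name

echo "stable code" > applet.txt
git add applet.txt
git commit -q -m "Initial mainline commit"

# A feature branch with its own commit.
git checkout -q -b feature-x
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Start feature x"

# Back on mainline, make a local change that is not committed yet.
git checkout -q mainline
echo "local experiment" >> applet.txt

# Park the change, visit the feature branch, then return and restore it.
git stash -q
git checkout -q feature-x
ls                      # feature.txt is visible here
git checkout -q mainline
git stash pop -q

tail -n 1 applet.txt    # the local change is back in the working tree
```

The stash step is optional when the local edits do not touch files that differ between the two branches; it is shown here because it is the case people tend to worry about.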
I don't think feature branches are anything to be feared, and as long as everything folds down into a single mainline we should be good. I expect only larger features to end up being developed in their own feature branches, and the life span of a feature branch should be relatively low. This really isn't all that different than how we work already, at least in projects like Plasma and from what I understand Amarok as well.
Coming from an svn-centric viewpoint, branches are scary. From a git perspective, branches keep things sane. Right now it is very hard for me to keep track of everything that is being developed and we're doing far too much hand-managing of patches. This is almost completely down to svn being horrible with merging and managing multiple branches.
Now, it is completely possible to use git in a way that everyone does work alone in their own branch on their own system and doesn't collaborate with anyone. That isn't hardwired into git, however. It's a social problem, and it has social solutions.
Which is why I wrote what I did about us ensuring that git doesn't make it harder to follow development.
As for taking maintainership and what not, that sounds like fear talking because I really don't see how that would be possible at all with git in ways that it isn't already possible right now with svn.
Keep in mind the large number of people in KDE who already develop using git-svn and it becomes pretty apparent that we don't tend to develop these social problems with git. As long as we keep our eyes open about it, we won't.
I switched to git-svn in order to get familiar with git, which is a bit harder to approach than svn was coming from cvs. Indeed it's not 100% perfect but it's pretty interesting!
Thanks for the articles!
One thing I am really, but really worried about is the size of KDE. Since KDE is *so* big, making a small change in kdeedu/marble will mean I have to pull the whole kdeedu project. With all its history.
I personally cannot even clone the Qt repository, since it seems my ISP is blocking it. I am really worried about things like kdelibs+kdebase, with their 10 years of history.
I am still under the assumption that each module (kdelibs, kdebase, kdepim, etc) will get its own git repository. Now the question is: how can I move a project between modules? Remember how kmail moved from kdenetwork to kdepim? How can I move my phonon plugin to Qt?
Don't get me wrong, git kicks ass, I really really like this tool. But we need to understand these issues and have real solutions to them before we commit to this big change.
You say that many people are using git-svn and not pushing their work up to svn for a while, but then you put moving from svn to git as a solution to this. How does dropping svn solve that problem, when people could still wait a while to push from their git to the project's git?
(BTW, (a) I'm a KDE user, not KDE developer, and (b) I love git, especially its branch/merge abilities.)
Good article, Aaron. I think this summed up our current situation (and future plans) very nicely.
I would like to add, especially @Chris, that there really is no need to worry so much about it. When we (Amarok) did the transition, we were afraid that it could become quite complicated and chaotic.
As it turned out, it was much easier than we had thought. And what has impressed us especially: We have many more contributors today than we ever had before with SVN. Even for newbies it's easier to learn Git than we had expected.
If you are interested, I have summed up some of our experiences with the migration here:
How We Made Amarok 2.2.1
Regards, Mark.
I'm not convinced that it's good to move to Git. While it surely has nice features, it raises the entry barrier significantly for occasional contributors. At least this is my personal experience from doing a few patches for Amarok. And yes, I'm aware that Git works differently than SVN. I read lots of docs. Way more than I ever did for SVN. Still, more than once I came to the point where I thought I should throw it all away and not care about patches anymore.
SVN is dead simple compared to Git. With Git you have to be extremely careful what you do. Otherwise you end up in situations where things go neither forward nor back. Then you either have to start all over again or find the secret command which makes things work again. With SVN you don't really need to care when you want to add some debug output to track down a bug or do a local change which is not intended for general use. You update, things get merged and that's it. With Git you need branching, stashing and what not.
@Mark: I somehow can't believe that people came to Amarok _because of_ Git, I'd say they came to Amarok _despite_ Git, because Amarok is a cool project. (Of course I won't deny that there are weird people out there who choose a project because of the version control system it uses ;) And SVN is surely easier to learn than Git.
I am not familiar with git, so forgive me if this is a stupid question. Will moving to git make it possible to have, say, a trunk branch that remains open for development while a second branch is made for an SC release? Currently, as I am sure most people reading this are aware, during, say, a hard feature freeze trunk is closed to new features. Would it be possible, or even desirable, to avoid this with git, and have one branch intended for release which is frozen and a second that remains open? If it is possible it may make things too complicated, so I don't know if it is even a good idea. I am just curious how git will affect the release process, I guess.
redm: I came to konversation "because of" git and gitorious. I have been using KDE for years and always wanted to contribute, but asking for an svn account with no clear purpose had always been a blocker for me.
Gitorious with merge requests and git with local branches etc. were the solution for me.
@redm: Start to actually use git instead of reading about it. Otherwise you won't understand or appreciate it. Let me predict that once git solves a problem for you which is a nightmare with svn, I'm sure you can't live without it any more! ;-)
@aavci: Well, speaking for me, all projects where I contributed patches, except KDE, I had no account on their version control system. Still this didn't keep me from contributing. Why should it? And even with KDE, where I have an SVN account, I can't just commit where I like. Maybe that's why I can't really follow this argument. As for local branches, they are partly not necessary with SVN. Sure, they are nice to have. But for me as an occasional contributor it doesn't outweigh the Git overhead.
@shentey: I did use Git! I do see the benefits. But I also see that it's way more complicated than SVN. This is exactly the point I'm trying to make. When you use it on a daily basis you will eventually get a hang of it and the overhead will be worth it. However if you only use it every now and then it's not really. A VCS is a necessary tool for me and as such should support me and not get in the way. Git simply gets too much in the way for me. I want to invest my time into making patches and not into mastering some tool. A solution might be a dead simple frontend to the myriad of Git commands, which make the common things easy. But currently I don't know of any such frontend.
@redm:
We should be honored that you called Amarok a "cool project", and that's very nice of you.
However, it may be cool or not, but even if it was that cool, that still doesn't explain the extreme growth rate of 3rd party contributions we have seen after the migration to Git.
Please consider that having Git clones actually makes it much easier to contribute things, as you can happily run "your own version of Amarok", and when you feel like it, you just request a merge.
The article I linked to also shows some stats of the sheer number of "Merge Requests" we received. I think that this number speaks for itself.
Regards, Mark.
@elcuco: "Since KDE is *so* big, making a small change in kdeedu/marble will mean I have to pull the whole kdeedu project. With all its history."
it sounds scary, doesn't it? here's some reading that shows just how efficient a git repo can be, history and all:
http://www.contextualdevelopment.com/logbook/git/large-projects
http://blogs.gnome.org/simos/2009/04/18/git-clones-vs-shallow-git-clones/
Qt 4.6.0 from the release tarball takes 472MB on disk here; so an svn checkout would take twice that, or 944MB. the git checkout of Qt i have takes 942MB on disk, and that's with the full history.
"how can I move a project between modules?"
good question; git can pull directories out of a repository to create a new repository (complete with history), and it supports commit playback as well. i don't have the answer to this one off the top of my head, but i'll bet one of the local git gurus can come up with an elegant recipe for this workflow, and this should join the documentation on techbase.
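For the "pull a directory out of a repository" half of that answer, one possible sketch uses git filter-branch with --subdirectory-filter on a fresh clone. This is not a complete recipe for moving a project between modules (it does not cover grafting the history into the destination repository), and the directory names, file contents and committer identity below are invented for the demonstration.

```shell
#!/bin/sh
# Sketch: extract one subdirectory, with its history, into its own repository.
set -e

export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"
# Newer git versions print a long warning before filter-branch runs; skip it.
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
cd "$work"

# Hypothetical module repository holding two applications.
git init -q module
cd module
mkdir kmail otherapp
echo "mail v1" > kmail/main.txt
echo "other v1" > otherapp/main.txt
git add .
git commit -q -m "Import kmail and otherapp"
echo "mail v2" > kmail/main.txt
git commit -q -a -m "Improve kmail"
echo "other v2" > otherapp/main.txt
git commit -q -a -m "Improve otherapp"
cd ..

# Work on a clone so the original module stays untouched.
git clone -q module kmail-only
cd kmail-only

# Rewrite history so that only kmail/ remains, promoted to the top level.
git filter-branch --subdirectory-filter kmail HEAD > /dev/null 2>&1

git log --oneline   # only the commits that touched kmail/ are left
ls                  # main.txt now sits at the repository root
```

Because filter-branch rewrites commits, the extracted repository has new commit ids; anyone who had cloned an earlier extraction would need to re-clone.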
@rfunk: "You say that many people are using git-svn and not pushing their work up to svn for a while, but then you put moving from svn to git as a solution to this."
what i said (in a comment here) was that people are already using git-svn and that hasn't significantly changed the interval people push nor has it made it harder to follow what people are doing.
i also said (in the blog entry) that people are using git-svn already, and besides fracturing our community process somewhat, it hasn't caused any of the possible social ills one can develop with git if done poorly.
@redm: "SVN is dead simple compared to Git. With Git you have to be extremely careful what you do."
yes, svn is very limited compared to git in terms of complexity and possibilities. the trick is to keep yourself to what you understand with git. if you go trying all kinds of complex stuff with git, you can, indeed, get yourself into trouble. it's usually fixable, but yes, you have to not do silly things.
my advice there is to just stick to what you really understand and know with git.
just because you -can- do something with git doesn't mean you have to, or even should.
"With SVN you don't really need to care when you want to add some debug output to track down a bug or do a local change which is not intended for general use. You update, things get merged and that's it. With Git you need branching, stashing and what not."
that has not been my experience in the least. i have worked on projects with git using the exact same workflows i use with svn, which is to say one central repository with branching only for releases and no stashing or anything.
even if we ignore the ability to do nice things like "stash my current changes, and then update" or "put this in a branch and let's go back to mainline for a bit here" with git, git still has benefits over svn here because you can do things like commit just pieces of changes, allowing one to put in some debug, then commit the other changes that were made while keeping the debug local. but even that isn't necessary.
it really sounds like you chose a very difficult way of doing something very simple. this is why we need good documentation, so that people using git don't feel that they must do contortions when a simple straightforward path is right there.
@Todd: "Would it be possible, or even desirable, to avoid this with git, have one branch intended for release which is frozen and a second that remains open?"
this is something we've discussed, and git does make this easier (though it's also possible in svn, just that the merging is more painful). right now with svn, we already do open up trunk for features as soon as we hit the RC stage. we don't do it sooner (e.g. immediately after feature freeze) for purely social reasons (to encourage people to remain focused on quality improvements instead of just jumping to the next feature set). even that can be worked around with branches (even in svn).
so this isn't a git-vs-svn thing as much as it is "how do we want our workflow to go". i think there is real benefit to a "always open trunk", but it needs to be done with care to ensure that quality doesn't suffer as a result.
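To make the "frozen release branch plus open trunk" idea concrete, here is a small sketch in a scratch repository. The branch names (trunk, release-4.5), files and identity are assumptions for the example and say nothing about how KDE would actually name or manage its branches.

```shell
#!/bin/sh
# Sketch: a release branch takes only fixes while trunk stays open for features.
set -e

export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk   # hypothetical branch name

echo "base" > core.txt
git add core.txt
git commit -q -m "Base"

# Feature freeze: cut the release branch from trunk.
git branch release-4.5

# Trunk stays open: a new feature lands there only.
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Add new feature"

# A bug fix goes into the frozen release branch.
git checkout -q release-4.5
echo "base, bug fixed" > core.txt
git commit -q -a -m "Fix bug in core"

# Carry the fix forward so trunk does not lose it.
git checkout -q trunk
git merge -q -m "Merge fixes from release-4.5" release-4.5

git log --oneline --graph
```

Merging in this direction (release into trunk) keeps new features out of the release while making sure every fix also reaches trunk.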
@redm: "As for local branches, they are partly not necessary with SVN."
they aren't necessary with git, either.
@markey: "The article I linked to also shows some stats of the sheer number of "Merge Requests" we received."
review board has given us some of the same sort of bump; i wonder how much of this, therefore, is due to git itself (and its decentralized-makes-the-bar-lower nature) and how much is due to simply having a well documented (because it's part of the tools themselves!) way to contribute.
before review board, it was pretty murky to outsiders how to post patches for comment and what not. yes, the obvious common sense answer is "send an email to the mailing list", but that was apparently too much on the side of "arcane knowledge".
I think bzr is much easier to use than git and much more powerful than svn. You can have a very simple work flow with it.
But I think there is no way around git because of the existing projects in git and bzrgit is too slow.
I'm certainly not a git expert, but in kde-svn I'm used to organizing my checkouts like this:
svn co -N svn+ssh://svn.kde.org/home/kde/trunk KDESVN
cd KDESVN
svn up -N KDE
svn up -N playground
svn up -N extragear
svn up -N kdereview
cd KDE
svn up kdelibs
...
Would anything like this be possible with git? I always see some .git-files on gitorious but never a way to download stuff in such a hierarchical way.
The User
There is one thing about this I am curious about: why gitorious.org? Why move the repository to a 3rd party? Would something like git.kde.org not be better?
Seems to me KDE could keep better and tighter control itself, rather than using a 3rd party.
Also, since gitorious hosts lots of other projects, how will this impact the performance available to a huge project like KDE? How will load issues be handled, so that we don't get problems like the massive issues SourceForge suffers?
@morty: the current plan is to have git.kde.org point to a kde-specific page on gitorious, a bit like qt.gitorious.org. we're currently negotiating with nokia to ensure that when we move gitorious will have enough bandwidth and everything else we need from them. the gitorious guys seem to spend all their time in the trolltech office so they're a very close "3rd party" ;)
if somehow that doesn't work out, we can always set up our own gitorious server and take our code over there, at any time; the distributed nature of git makes that especially easy. :)
I also think that having kde hosted on the same site as other projects is a good thing; maybe some of the people in those other projects will wander over and start hacking on kde too :) we'll still have our own kde portal but we'll be more visible to the outside world too.
@jonathan: that's one of the issues we need to solve before moving to git. the chances of getting a solution within git itself are pretty slim; most likely we'll end up with an xml description of the hierarchy that scripts can use. people who aren't using kdesvnbuild can either write their own little scripts to use that or.. well hopefully we'll get a volunteer to write a basic example that can do the equivalent of "svn co" and "svn up" as needed.
thoughts?
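A very rough sketch of what such a helper script might look like follows. Everything about the manifest is hypothetical: the element and attribute names, the example.org URLs and the module list are invented here, and the sed-based parsing only copes with this exact one-line-per-module layout. The script prints the commands it would run instead of running them.

```shell
#!/bin/sh
# Sketch: turn a (hypothetical) XML module list into clone/update commands.
set -e

root=$(mktemp -d)        # where the checkout hierarchy would live
manifest=$(mktemp)

# Made-up manifest format; the real one, if any, may look quite different.
cat > "$manifest" <<'EOF'
<modules>
  <module path="KDE/kdelibs" url="git://git.example.org/kdelibs.git"/>
  <module path="extragear/amarok" url="git://git.example.org/amarok.git"/>
</modules>
EOF

# Pretend kdelibs was cloned earlier, to show the "update" branch.
mkdir -p "$root/KDE/kdelibs/.git"

# Print (not run) the command that would fetch or refresh each module.
plan() {
    sed -n 's/.*<module path="\([^"]*\)" url="\([^"]*\)".*/\1 \2/p' "$1" |
    while read -r path url; do
        if [ -d "$2/$path/.git" ]; then
            echo "git -C $2/$path pull"          # like "svn up"
        else
            echo "git clone $url $2/$path"       # like "svn co"
        fi
    done
}

plan "$manifest" "$root"
```

A real tool would want a proper XML parser and error handling; the point here is only that the hierarchy can live in a small description file outside of git itself.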
A distributed VCS can be useful but it is not for every kind of (software) project which needs revision control.
A centralized VCS is one central place. That's _very_ useful if you have low entrance barriers for (new) committers, cause it guarantees for very low hierarchies among people and a low learning curve (there is just one place to look up).
A distributed VCS has n! points (where n is the number of people's repositories) I have to care about if I really want to get into the thing and collaborate with all the others. As n! is rather unpleasant, a hierarchy of people's repositories will quickly evolve, as you can see in the Linux kernel development model.
So a distributed VCS replaces the centralized hierarchy of computers (client - server) and, if done right, a flat hierarchy among people by a nonexistent hierarchy among computers but a very strong hierarchy among people. And this is very odd.
You don't necessarily need to use git that way but every software feature of any software has implications on its user's behaviour. So git and any other distributed VCS inherits strong forces towards hierarchy among people just because it is a distributed VCS.
So you actively have to work permanently against this inherent force inside git in order to avoid this. And especially in some years, when git is not that new sexy toy anymore, people will actually notice it and feel uncomfortable.
So I think it is better to look for a VCS that actually minimizes the burden on its users and just doesn't replace one burden (nasty workflows on merging in svn) by another burden.
Why not look for a centralized VCS with better merging for large scale projects such as KDE?
So there aren't real tree structures in git? I hate those (especially decentralized) unstructured .git-files.
Look at Gluon (a small KDE project): there are already some personal clones trying to implement extra stuff. You get duplicate files, duplicate builds, merging problems... Distribution can be useful, but it should be possible to use it in different ways: it could allow some improvements, or it could be useful if somebody works outside of trunk on a new plugin, as long as he still shares his unfinished versions (today we have playground; I guess git could do something like this in a more flexible way). The really-long-term clones with duplicate stuff are simply ugly.
For me git seems to be as complicated as cvs... But unfortunately we will have to follow Amarok and KOffice.
OpenID was me...
Jonathan
I support the move to Git.
Keep up the good work.
I quite understand the move away from SVN, as it sucks at merging!
I see someone is thinking about a 'centralized' VCS with better merging, and I will recommend Clearcase. Yes, I am kidding :-) But as a DVCS can do almost everything a centralized one can, why not a DVCS?
My question is why git? I have to admit that I haven't used git or other DVCSs a lot, but git is too complicated and the only advantage I can see is the speed, which doesn't matter in most cases. I like Bazaar's concept (esp. that a directory is treated the same as a file), and I like Mercurial too. (However, I haven't seen one that can give me a version tree like in Clearcase.) I have to say that the open source community is somewhat biased, and git is an example: more and more projects are moving towards it, but what I see they need is a DVCS, not necessarily git. I am not saying git is bad, but it does have the advantage of being accepted and used in the kernel, doesn't it?
So how large would a git clone of the KDE trunk be?
Currently an svn checkout of the modules I'm interested in (most of them) is around 600MB. That consumes a noticeable amount of my monthly bandwidth allocation (7GB), and it takes hours.
I'm guessing a git clone with all the history will be considerably larger.
@arnomane
There will only be one central place that you have to care about after a git migration: the mainline repo. That is what people will base their changes on, and what they will want to get them merged into. I expect this to be something like git.kde.org/kdelibs/kdelibs, for instance. Quite obvious.
But yes, one should be conscious about how one is working. And this aspect of git should also be well documented.
i am a total noob when it comes to command line vcs usage. I wanted to try out git and used the cheat sheet that zrusin posted. Gotta love that one!
http://zrusin.blogspot.com/2007/09/git-cheat-sheet.html
It felt so easy and natural! I guess it's like aaron says: you don't need to use every single feature of git. If you stick to the basics, it's very easy to learn, and you still get instant speed benefits :)