Practical knowledge of git
Some time ago, I spent about a year teaching myself and newcomers at my company how to use git properly and efficiently. Git is a distributed VCS and it suits work under time pressure well, but the “distributed” part is not the easiest thing to grasp in such a demanding setting. While teaching, I noticed that there is much more to teach than just how to use a new source code management tool.
I’m not gonna explain the tool itself and the tricks around it. I’d like to focus on the problems around it. The following may serve as your checklist for practical knowledge of git. Ready for the exam? Then read on.
Distributed VCS?
Working alone, you write new code or change existing code and save it as a so-called “commit” with a comment attached. And that’s all… until you collaborate. “OK, let’s collaborate then”, you say.
Here comes the well-known VCS called Subversion, which teaches some bad habits. It’s easy to learn and master, but it often seems that not everybody sees the problems with it. I admit that Subversion was cool for a long time, but git has improved on those times.
The power of branching
“SVN is innocent!”, you say. SVN is good until you work in a team. Your team may consist of one person (only you!), but let’s say there are still a couple of features to build that can’t be done one after another, because here comes the Client with his Wishes… one after another.
If there are many features to do and a couple of people (more than “you”!) in the team, then you’re screwed if you don’t use branches. Branches in SVN? Good luck with that.
The power of branches comes with git simply because of its simple and slick architecture. It’s all about commits. A branch is just a named chain of commits: the name points at the newest commit, and every commit knows its parent. And just remember: for everyday purposes, one commit is one changeset.
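To see how little a branch really is, here is a minimal sketch, assuming a throwaway repository in a temp directory. The file name, the “feature” branch and the demo identity are made up for illustration.

```shell
#!/bin/sh
set -e
# Placeholder identity so commits work on a machine without git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make the first branch name predictable

echo one > file.txt
git add file.txt
git commit -q -m "First commit"
echo two >> file.txt
git commit -q -am "Second commit"

# Creating a branch only records a new name; no files are copied.
git branch feature

master_id=$(git rev-parse master)
feature_id=$(git rev-parse feature)
parent_id=$(git rev-parse master~1)
commit_count=$(git rev-list --count master)

echo "master  -> $master_id"
echo "feature -> $feature_id"
echo "parent  -> $parent_id"
echo "commits reachable from master: $commit_count"
```

Both names print the same commit ID, and the rest of the history is found by following parents from there.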
My team uses git! But not efficiently…
I’ve run into a few issues while working in a team that didn’t know git from the start. The most annoying situations are the ones that happen every day and would never happen with a correct understanding and proper use of the tool. But when there is no real understanding, all kinds of questions come up, and that’s not surprising.
Working with a code repository is not just coding alone. It’s about coding with other people without making their work harder. Organizing the work and the workflow is how you reduce frequently occurring problems like these:
- “I’m not sure who wrote that code… I have to ask everyone who worked on that feature.” OMG STOP! “git blame”
- “When did you make a change in file {filename}?” STOP! “git log -p -- {filename}” (the older “git whatchanged -p {filename}” shows much the same)
- “Whew, lots of changes. I’ll add all files with “git add *”.” STOP! “git add -u” (and then “git status” and a manual add of the untracked files)
- “I have conflicts in the code. Someone else has changed the same fragment. I look at the changes, diff around aaand… the conflict exists only because someone reformatted the code, which added no value.” Actually, it has negative value: my time wasted on dealing with the conflict.
- “Someone else pushed just a minute ago and now my push is rejected. I can’t rebase, so I’ll just do a simple “git pull” and immediately push the code.” Nooo, STOOOP! In the future it will make it hard to “git revert” or “git bisect” properly. Please, learn how to rebase. (Hint: there’s a “--rebase” option when pulling.)
- “I don’t know how to revert pushed commits, so I’ll make a manual commit which changes those things back.” Well, no. Test things before pushing. And if the bad thing still happened, then just “git revert” and push it. (No, don’t offer me “push --force”, you’ll break the work of someone else on the team.)
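The rejected-push situation from the list can be played out locally. This is only a sketch under assumptions: “alice”, “bob” and “central.git” are hypothetical names, everything lives in a temp directory, and the two commits touch different files so the rebase has no conflicts.

```shell
#!/bin/sh
set -e
# Placeholder identity so commits work on a machine without git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

base=$(mktemp -d)
cd "$base"

# "alice" starts the project; central.git plays the shared repository.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"
git clone -q --bare . "$base/central.git"
git remote add origin "$base/central.git"

# "bob" clones, commits and pushes first.
git clone -q "$base/central.git" "$base/bob"
cd "$base/bob"
echo bob > bob.txt
git add bob.txt
git commit -q -m "Add bob.txt"
git push -q origin master

# Meanwhile alice commits too, so her push is no longer a fast-forward.
cd "$base/alice"
echo alice > alice.txt
git add alice.txt
git commit -q -m "Add alice.txt"
if git push -q origin master 2>/dev/null; then
  first_push=accepted
else
  first_push=rejected
fi

# Replay alice's commit on top of bob's instead of creating a merge commit.
git pull -q --rebase origin master
git push -q origin master

merge_commits=$(git rev-list --merges --count master)
total_commits=$(git rev-list --count master)
newest_subject=$(git log -1 --format=%s master)
echo "first push: $first_push, merges: $merge_commits, commits: $total_commits"
```

After the rebase the history is a straight line with alice’s commit on top, which is what keeps later “git revert” and “git bisect” sessions simple.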
So those were organizational problems.
I know this stuff… well, mostly
There are areas of git which you think you know but do not entirely understand. Want to test your git wisdom? I’ve gathered some doubts while working with colleagues who were learning git, and I’ve put them into the following questions:
- how is “master” different from other branches? Why is it even called “master”?
- when “origin master”, when “origin/master”? What’s the difference?
- why, what for and when do you use that… rebase thing? “git rebase” or “git pull --rebase”? What do both do?
- why fetch before a rebase?
- “git reset” - when --soft, when --hard? When neither?
- why “git stash apply” and not “git stash pop”?
- why do some branches (listed by “git branch -a”) have “remotes/” in their names?
- how do I delete a branch? (difference between local and remote deletion)
- how do I change the name of a branch? (it’s a trap!)
- why doesn’t “git submodule update” work the way I expect?
- how do I remove a file from the “staged for commit” list? (“git rm --cached …” for a newly added file, “git reset HEAD …” for an already tracked one)
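Some of these are easiest to answer by trying them out. Here is a small sketch for the “git reset” and unstaging questions, assuming a throwaway repository; “notes.txt” and “scratch.txt” are hypothetical file names.

```shell
#!/bin/sh
set -e
# Placeholder identity so commits work on a machine without git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo v1 > notes.txt
git add notes.txt
git commit -q -m "v1"
echo v2 > notes.txt
git commit -q -am "v2"

# --soft: only the branch moves back; index and working tree still hold v2.
git reset -q --soft HEAD~1
soft_staged=$(git diff --cached --name-only)
soft_content=$(cat notes.txt)

# Neither flag (the default, --mixed): the index is reset as well,
# but the working tree still holds v2.
git commit -q -m "v2 again"
git reset -q HEAD~1
mixed_staged=$(git diff --cached --name-only)
mixed_unstaged=$(git diff --name-only)
mixed_content=$(cat notes.txt)

# --hard: branch, index and working tree all go back to v1.
git commit -q -am "v2 once more"
git reset -q --hard HEAD~1
hard_content=$(cat notes.txt)
hard_status=$(git status --porcelain)

# Unstaging a newly added file: it leaves the index but stays on disk.
echo tmp > scratch.txt
git add scratch.txt
git rm -q --cached scratch.txt
scratch_status=$(git status --porcelain)

echo "soft: staged=$soft_staged file=$soft_content"
echo "mixed: staged=$mixed_staged unstaged=$mixed_unstaged file=$mixed_content"
echo "hard: file=$hard_content"
echo "scratch: $scratch_status"
```

The pattern to notice: each mode resets one more layer than the previous one, and only “--hard” touches the files you see on disk.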
Most of this stuff is about understanding git and, believe it or not, there are answers out there in the internets. But it’s better to understand the overall architecture and just know how git works; then all the other things come naturally. A good place to start is the introductory talk about git by its creator, Linus Torvalds (listed in Resources).
But there are also doubts that are about getting a feel for the project and its workflow rather than understanding git:
- why fetch before a rebase?
- so, when should I use “git merge” without rebase?
- when should I push my (personal) branches?
I won’t answer these here; the answers are expected to be project-specific.
Finally, there are tools you didn’t expect:
- statistics about your (or someone else’s, or everybody’s) commits (yeah, you can see numbers instead of just “git log”)
- “git stash” - when your boss/manager comes in and wants to see what currently works
- making a branch out of a branch. Really, can you do such a thing? Yeah. A branch is just a name for a commit (oops, I just told you something important about git’s architecture).
- “git reflog” - when you have messed up your commits.
- more origins? “upstream”? WTF?
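A few of these tools fit into one short session. This is a sketch under assumptions: a throwaway repository, a hypothetical “log.txt” file and made-up branch names. “git shortlog” stands in for “statistics” here; it is one option, not the only one.

```shell
#!/bin/sh
set -e
# Placeholder identity so commits work on a machine without git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
for n in 1 2 3; do
  echo "line $n" >> log.txt
  git add log.txt
  git commit -q -m "Add line $n"
done

# Numbers instead of a plain "git log": commits per author.
# HEAD is given explicitly because shortlog reads stdin otherwise
# when it is not run from a terminal.
commits_by_author=$(git shortlog -s -n HEAD | awk '{print $1}')

# A branch made out of a branch: both names point at the same commit.
git checkout -q -b feature
git checkout -q -b feature-spike
feature_id=$(git rev-parse feature)
spike_id=$(git rev-parse feature-spike)
git checkout -q master

# Messed up: two commits thrown away by accident...
before_mess=$(git rev-parse HEAD)
git reset -q --hard HEAD~2
after_mess=$(git rev-list --count HEAD)
# ...and brought back: the reflog remembers where HEAD was before.
git reflog -n 3
git reset -q --hard 'HEAD@{1}'
recovered=$(git rev-parse HEAD)

# Boss walks in: park unfinished work, show the clean state, take it back.
echo "half-done idea" >> log.txt
git stash -q
status_while_stashed=$(git status --porcelain)
git stash pop -q
status_after_pop=$(git status --porcelain)
```

The reflog part is the one worth practicing before you need it: a “lost” commit usually stays reachable through “HEAD@{n}” for a good while.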
Good habits
- use “git status” and “git log --graph” often, especially before pushing commits to the central repo.
- “git pull”? Do it with “--rebase”!
- write accurate commit messages. It’s a little bit frustrating to read something like “fix bug #723”. That’s OK when your project is connected to some software on the web (then #723 is clickable); otherwise, oh dear. Make the messages valuable enough to show the difference between commits!
- starting a new feature? Create a branch! “git checkout -b {branch_name}”. You can make as many dirty commits in there as you want.
- keep your branch around for a little while after it has been accepted and merged into the main branch. It may be useful to have that dirty history at hand to look into later (especially when hunting bugs).
- if possible, no more than one bugfix in one commit. Features - same story, one commit per feature. That makes it easier to “git revert {commit_id}”.
- make as few changes as possible. Don’t reformat code while making a bugfix or a new feature. When someone rebases, they don’t want to waste time analyzing the differences between two conflicting versions where one is a major change and the other is just reformatting, hidden in a commit that is described as something else.
- distinguish your pushed branches by giving them names with your “name.surname/” prefix, or anything else that sets them apart from other people’s branches.
- no one should ever just “git push” without specifying the remote name (like “origin”) and the branch name (like “master” or whatever). Never ever do that.
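Several of these habits can be strung together in one run. Again only a sketch under assumptions: “central.git” plays the central repo, and “jane.doe/discount” and “calc.txt” are hypothetical names picked for illustration.

```shell
#!/bin/sh
set -e
# Placeholder identity so commits work on a machine without git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

base=$(mktemp -d)
cd "$base"
git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
echo "total = a + b" > calc.txt
git add calc.txt
git commit -q -m "Add sum formula to calc.txt"
git clone -q --bare . "$base/central.git"
git remote add origin "$base/central.git"

# New feature? New branch, with a prefix that shows whose it is.
git checkout -q -b jane.doe/discount
echo "total = total - discount" >> calc.txt
git commit -q -am "Subtract discount from total"
echo "total = total * 0" >> calc.txt
git commit -q -am "Multiply total by zero (turns out to be a bug)"
bad_commit=$(git rev-parse HEAD)

# One change per commit means a revert removes exactly that change.
git revert --no-edit "$bad_commit" >/dev/null
last_line=$(tail -n 1 calc.txt)

# Look before you push.
git status --short
git log --graph --oneline --all

# Push with an explicit remote and an explicit branch name.
git push -q origin jane.doe/discount
remote_tip=$(git ls-remote origin refs/heads/jane.doe/discount | cut -f1)
local_tip=$(git rev-parse jane.doe/discount)
```

Because the bad line lived in its own commit, the revert leaves the discount line alone, and the explicit push sends only the branch you named.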
Summary
I think that’s all there is to working efficiently with git. If you have doubts about anything here, take a break now and then and read about the things you don’t really understand. AND try to explain them to others - it will help both you and your colleagues. Or will it make a mess?
Resources
- Tech Talk: Linus Torvalds on git
- GitHub Training
- git ready - recipes!
- Please. Stop Using Git. - Matthew McCullough - a sarcastic talk about git’s strong features
- Git for Grown-ups - an interesting take on custom workflow based on git