Git: Managing local commits and branches

At Encodo, we've got a relatively long history with Git. We've been using it exclusively for our internal source control since 2010.1

Git Workflows

When we started with Git at Encodo, we were quite cautious. We didn't change what had already worked for us with Perforce.2 That is: all developers checked in to a central repository on a mainline or release branch. We usually worked with the mainline and never used personal or feature branches.

Realizing the limitations of this system, we next adopted an early incarnation of GitFlow, complete with command-line support for it. A little while later, we switched to our own streamlined version of GitFlow without a dev branch, which we published in an earlier version of the Encodo Git Handbook.3

We're just now testing the waters of Pull Requests instead of direct commits to master and feature branches. Before we can make this move, though, we need to raise our developers' comfort level with creating branches and manipulating commits. We need to take the magic and fear out of Git -- "but that's a pushed commit!"4 -- and learn to view Git as a toolbox that we can make work for us rather than a mysterious process to whose whims we must comply.5

General Rules

Before we get started, let's lay down some ground rules for working with Git and source control, in general.

  • Use branches
  • Don't use too many branches at once
  • Make small pull requests
  • Use no more than a few unpushed commits
  • Get regular reviews

As you can see, the rules describe a process of incremental changes. If you stick to them, you'll have much less need for the techniques described below. In case of emergency, though, let's demystify some of what Git does.

If you haven't done so already, you should really take a look at some documentation of how Git actually works. There are two sources I can recommend:

  • The all-around excellent and extremely detailed Official Git Documentation. It's well written and well supplied with diagrams, but quite long.
  • The Encodo Git Handbook summarizes the details of Git that we think are important and sets forth best practices and a development process.

Examples

All examples and screenshots are illustrated with the SmartGit log UI.

Before you do any of the manipulation shown below, **always make sure your working tree has been cleared**. That means there are no pending changes in it. Use the `stash` command to put pending changes to the side.
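
For example, a minimal stash round-trip looks like this:

```sh
git stash        # put pending changes aside, leaving a clean working tree
# ...do the manipulation described below...
git stash pop    # re-apply the stashed changes when you're done
```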

Moving branches

In SmartGit, you can grab any local branch marker and drag it to a new location. SmartGit will ask what you want to do with the dropped branch marker, but you'll almost always just want to set it to the commit on which you dropped it.

This is a good way of easily fixing the following situation:

  1. You make a bunch of commits on the master branch
  2. You get someone to review these local commits
  3. They approve the commits, but suggest that you make a pull request instead of pushing to master. A good reason for this might be that both the developer and the face-to-face reviewer think another reviewer should provide a final stamp of approval (i.e. the other reviewer is the expert in an affected area)

In this case, the developer has already moved their local master branch to a newer commit. What to do?

Create a pull-request branch

Create and check out a pull-request branch (e.g. mvb/serviceImprovements).

image

image

Set master to the origin/master

Move the local master branch back to origin/master. You can do this in two ways:

  • Check out the master branch and then reset to the origin/master branch or...
  • Just drag the local master branch to the origin/master commit.
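
If you'd rather do the same thing on the command line, the steps look roughly like this (using the branch name from the example above):

```sh
# create the pull-request branch at the current commit and switch to it
git checkout -b mvb/serviceImprovements

# move the local master branch back to origin/master
git branch -f master origin/master

# ...or, equivalently, check out master and reset it
git checkout master
git reset --hard origin/master
```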

image

Final: branches are where they belong

In the end, you've got a local repository that looks as if you'd made the commits on the pull-request branch in the first place. The master branch no longer has any commits to push.

image

Moving & joining commits

SmartGit supports drag&drop move for local commits. Just grab a commit and drop it to where you'd like to have it in the list. This will often work without error. In some cases, like when you have a lot of commits addressing the same areas in the same files, SmartGit will detect a merge conflict and will be unable to move the commit automatically. In these cases, I recommend that you either:

  • Give up. It's probably not that important that the commits are perfect.
  • Use the techniques outlined in the long example below instead.

You can also "join" -- also called "squash" in Git parlance -- any adjoining commits into a single commit. A common pattern you'll see is for a developer to make changes in response to a reviewer's comments and save them in a new commit. The developer can then move that commit down next to the original commit from which the changes stemmed and join the commits to "repair" the original commit after review. You can at the same time edit the commit message to include the reviewer's name. Nice, right?

Here's a quick example:

Initial: three commits

We have three commits, but the most recent one should be squashed with the first one.

image

Move a commit

Select the most recent commit and drag it to just above the commit with which you want to join it. This operation might fail.6

image

Squash selected commits

Select the two commits (it can be more) and squash/join them. This operation will not fail.

image

Final: two commits

When you're done, you should see two commits: the original one has now been "repaired" with the additional changes you made during the review. The second one is untouched and remains the top commit.

image

Diffing commits

You can squash/join commits when you merge or you can squash/join commits when you cherry-pick. If you've got a bunch of commits that you want to combine, cherry-pick those commits but don't commit them.

You can also use this technique to see what has changed between two branches. There are a lot of ways to do this, and a lot of guides will show you how to execute commands on the command line to do this.

In particular, Git allows you to easily display the list of commits between two other commits as well as show the combined differences of all of those commits in a patch format. The patch format isn't very easy to use for diffing from a GUI client, though. Most of our users know how to use the command line, but use SmartGit almost exclusively nonetheless -- because it's faster and more intuitive.
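
For reference, the command-line equivalents look something like this (feature/service is a made-up branch name):

```sh
# list the commits on the feature branch that aren't on master
git log master..feature/service

# show the combined differences as one patch
git diff master...feature/service

# or apply those commits to the working tree without committing them
git checkout master
git cherry-pick -n master..feature/service
```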

So, imagine you've made several commits to a feature or release branch and want to see what would be merged to the master branch. It would be nice to see the changes in the workspace as a potential commit on master so you can visually compare the changes as you would a new commit.

Here's a short, visual guide on how to do that.

Select commits to cherry-pick

Check out the target branch (master in this example) and then select the commits you want to diff against it.

image

Do not commit

When you cherry-pick, leave the changes to accumulate in the working tree. If you commit them, you won't be able to diff en bloc as you'd like.

image

Final: working tree

The working tree now contains the differences in the cherry-picked commits.

image

Now you can diff files to your heart's content to verify the changes.

Working-tree files

Once you have changes in the working tree that are already a part of other commits, you might be tempted to think you have to revert the changes because they're already committed, right?

You of course don't have to do that. You can let the original commits die on the vine and make new ones, as you see fit.

Suppose that, after looking at the differences between your working branch and the master branch, you decide you want to integrate them. You can do this in several ways.

  1. You could clear the working tree7, then merge the other branch to master to integrate those changes in the original commits.
  2. Or you could create one or more new commits out of the files in the workspace and commit those to master. You would do this if the original commits had errors or incomplete comments or had the wrong files in them.
  3. Or you could clear the working tree and re-apply the original commits by cherry-picking and committing them. Now you have copies of those commits and you can edit the messages to your heart's content.

Even if you don't merge the original commits as in option (1) above, and you create new commits with options (2) and (3), you can still merge the branch so that Git is aware that all work from that branch has been included in master. You don't have to worry about applying the same work twice. Git will normally detect that the changes to be applied are exactly the same and will merge automatically. If not, you can safely just resolve any merge conflicts by selecting the master side.8

An example of reorganizing commits

Abandon hope, all ye who enter here. If you follow the rules outlined above, you will never get into the situation described in this section. That said...when you do screw something up locally, this section might give you some idea of how to get out of it. Before you do anything else, though, you should consider how you will avoid repeating the mistake that got you here. You can only do things like this with local commits or commits on private branches.

The situation in this example is as follows:

  • The user has made some local commits and reviewed them, but did not push them.
  • Other commits were made, including several merge commits from other pull requests.
  • The new commits still have to be reviewed, but the reviewer can no longer sign the commits because they are rendered immutable by the merge commits that were applied afterward.
  • It's difficult to review these commits face-to-face and absolutely unconscionable to create a pull request out of the current local state of the master branch.
  • The local commits are too confusing for a reviewer to follow.

The original mess

So, let's get started. The situation to clean up is shown in the log-view below.

image

Pin the local commits

Branches in Git are cheap. Local ones even more so. Create a local branch to pin the local commits you're interested in so that they stay in the view. The log view automatically hides commits that aren't referenced by either a branch or a tag.9

image

Choose your commits

Step one: find the commits that you want to save/re-order/merge.

image

The diagram below shows the situation without arrows. There are 17 commits we want, interspersed with 3 merge commits that we don't want.10

image

Reset local master

Check out the master branch and reset it back to the origin.

image

Cherry-pick commits

Cherry-pick and commit the local commits that you want to apply to master. This will make copies of the commits on pin.
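
In raw Git terms, the story so far is roughly (the commit hashes are placeholders):

```sh
git branch pin                    # pin the interesting local commits
git checkout master
git reset --hard origin/master    # move master back to the origin
git cherry-pick <sha1> <sha2> ... # copy the wanted commits from pin onto master
```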

image

Master branch with 17 commits

When you're done, everything should look nice and neat, with 17 local commits on the master branch. You're now ready to get a review for the handful of commits that haven't had them yet.11

image

Delete the temporary branch

You now have copies of the commits on your master branch, so you no longer care about the pin branch or any of the commits it was holding in the view. Delete it.

image

That pesky merge

Without the pin, the old mess is no longer displayed in the log view. Now I'm just missing the merge from the pull request/release branch. I just realized, though: if I merge on top of the other commits, I can no longer edit those commits in any way. When I review those commits and the reviewer wants me to fix something, my hands will be just as tied as they were in the original situation.

image

Inserting a commit

If the tools above worked once, they'll work again. You do not have to go back to the beginning, nor do you have to dig unreferenced commits out of the Git reflog.

Instead, you can create the pin branch again, this time to pin your lovely, clean commits in place while you reset the master branch (as before) and apply the merge as the first commit.

image

Rebase pin onto master

Now we have a local master branch with a single merge commit that is not on the origin. We also have a pin branch with 17 commits that are not on the origin.

Though we could use cherry-pick to copy the individual commits from pin to master, we'll instead rebase the commits. The rebase operation is more robust and was made for these situations.12
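
Sketched as commands, this step and the previous one look roughly like this (the release-branch name is a placeholder):

```sh
git branch pin                      # pin the clean commits again
git checkout master
git reset --hard origin/master
git merge --no-ff release/vX.X.X    # apply the merge as the first commit on master

git checkout pin
git rebase master                   # replay the 17 commits on top of the merge
# resolve any conflicts, then: git rebase --continue
```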

image

pin is ready

We're almost done. The pin branch starts at origin/master, includes the merge commit from the pull request and then the 17 commits on top of that. These 17 commits can be edited, squashed and changed as required by the review.

image

Fast-forward master

Now you can switch to the master branch, merge the pin branch (you can fast-forward merge) and then delete the pin branch. You're done!
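
In command form, the finish is just:

```sh
git checkout master
git merge --ff-only pin   # fast-forward master to the tip of pin
git branch -d pin         # the temporary branch is no longer needed
```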

image

Conclusion

I hope that helps take some of the magic out of Git and helps you learn to make it work for you rather than vice versa. With just a few simple tools -- along with some confidence that you're not going to lose any work -- you can do pretty much anything with local commits.13

h/t to Dani and Fabi for providing helpful feedback.


If you look closely, you can even see two immediately subsequent merges where I merged the branch and committed it. I realized there was a compile error and undid the commit, added the fixes and re-committed. However, the re-commit was no longer a merge commit so Git "forgot" that the pull-request branch had been merged. So I had to merge it again in order to recapture that information.

This is going to happen to everyone who works more than casually with Git, so isn't it nice to know that you can fix it? No-one has to know.


  1. Over five years counts as a long time in this business.

  2. I haven't looked at their product palette in a while. They look to have gotten considerably more enterprise-oriented. The product palette is now split up between the Helix platform, Helix versioning services, Helix Gitswarm and more.

  3. But which we've removed from the most recent version, 3.0.

  4. This is often delivered in a hushed tone with a note of fervent belief that having pushed a commit to the central repository makes it holy. A commit pushed to the central repository on master or a release branch is immutable, but everything else can be changed. This is the reason we're considering a move to pull requests: it would make sure that commits become immutable only when they are ready rather than as a side-effect of wanting to share code with another developer.

  5. In all cases, when you manipulate commits -- especially merge commits -- you should minimally verify that everything still builds and optimally make sure that tests run green.

  6. If the commits over which you're moving contain changes that conflict with the ones in the commit to be moved, Git will not be able to move that commit without help. In that case, you'll either have to (A) give up or (B) use the more advanced techniques shown in the final example in this blog.

  7. That is, in fact, what I did when preparing this article. Since I'm not afraid of Git, I manipulated my local workspace, safe in the knowledge that I could just revert any changes I made without losing work.

  8. How do we know this? Because we just elected to create our own commits for those changes. Any merge conflicts that arise are due to the commits you expressly didn't want conflicting with the ones that you do, which you've already committed to master.

  9. You can elect to show all commits, but that would then show a few too many unwanted commits lying around as you cherry-pick, merge and rebase to massage the commits to the way you'd like them. Using a temporary branch tells SmartGit which commits you're interested in showing in the view.

  10. Actually, we do want to merge all changes from the pull-request branch but we don't want to do it in the three awkward commits that we used as we were working. While it was important at the time that the pull-request be merged in order to test, we want to do it in one smooth merge-commit in the final version.

  11. You may be thinking: what if I want to push the commits that have been reviewed to master and create a pull request for the remaining commits? Then you should take a look in the section above, called Moving branches, where we do exactly that.

  12. Why? As you saw above, when you cherry-pick, you have to be careful to get the right commits and apply them in the right order. The situation we currently have is exactly what rebase was made for. The rebase command will get the correct commits and apply them in the correct order to the master branch. If there are merge conflicts, you can resolve them with the client and the rebase automatically picks up where you left off. If you elect to cherry-pick the commits instead and the 8th out of 17 commits fails to merge properly, it's up to you to pick up where you left off after solving the merge conflict. The rebase is the better choice in this instance.

  13. Here comes the caveat: within reason. If you've got merge commits that you have to keep because they cost a lot of blood, sweat and tears to create and validate, then don't cavalierly throw them away. Be practical about the "prettiness" of your commits. If you really would like commit #9 to be between commits #4 and #5, but SmartGit keeps telling you that there is a conflict when trying to move that commit, then reconsider how important that move is. Generally, you should just forget about it because there's only so much time you should spend massaging commits. This article is about making Git work for you, but don't get obsessive about it.

Limited drive-space chronicles #2: Why is Visual Studio installed on my machine?

If you're like us at Encodo, you moved to SSDs years ago...and never looked back. However, SSDs are generally smaller because the price (still) ramps up quickly as you increase size. We've almost standardized on 512GB, but some of us still have 256GB drives.

Unfortunately, the knowledge that we all have giant hard drives started a trend among software manufacturers to just install everything, just in case you might need it. This practice didn't really cause problems when we were still using the by-then terabyte-sized HDs. But now we are, once again, more sensitive to unnecessary installations.

If you're a Windows .NET developer, you'll feel the pinch more quickly as you've got a relatively heavyweight Visual Studio installation (or three...) as well as Windows 8.1 itself, which weighs in at about 60GB after all service packs have been installed.

Once you throw some customer data and projects and test databases on your drive, you might find that you need, once again, to free up some space on your drive.

I wrote a similar post last year and those tips & tricks still apply as well.

System Cleanup is back

One additional tip I have is to use Win + S to search for "Free up disk space by deleting unnecessary files"1 and run that application in "clean up system files" mode: the latest version will throw out as much Windows Update detritus as it can, which can clean up gigabytes of space.

image
image
image

Remove Old Visual Studios

The other measure you can take is to remove programs that you don't use anymore: for .NET developers that means you should finally toss out Visual Studio 2010 -- and possibly even 2013, if you've made the move to the new and improved 2015 already.2 Removing these versions also has the added benefit that extensions and add-ons will no longer try to install themselves into these older Visual Studios anymore.

However, even if you do remove VS2010, for example, you might find that it just magically reappears again. Now, I'm not surprised when I see older runtimes and redistributables in my list of installed programs -- it makes sense to keep these for applications that rely on them -- but when I see the entire VS2010 SP1 has magically reappeared, I'm confused.

image

Imagine my surprise when I installed SQL Server Management Studio 2016 -- the November 2015 Preview -- and saw the following installation item:

image

If you do remove this item again, then SQL Server Management Studio will no longer run (no surprise there, now that we know that it's what installed it). If you're just doing cleanup and don't know about this dependency3, though, you might accidentally break tools. So be careful: if you're too aggressive, you'll end up having to re-install some stuff.4



  1. The reason I write that "it's back" is that for a couple of versions of Windows, Microsoft made it an optional download/feature instead of installing it by default.

  2. Be careful about removing Visual Studio 2013 if you have web projects that still rely on targets installed with VS2013 but not included in VS2015. I uninstalled 2013 on my laptop and noticed a warning about an MS target that the compiler could no longer find.

  3. The fact that Windows still can't tell you about dependencies is a story for another day. We should have had a package manager on Windows years ago. And, no, while Choco is a lovely addition, it's not quite the full-fledged package manager that aptitude is on Ubuntu.

  4. Speaking from experience. Could you tell?

Improving NUnit integration with testing harnesses

image

These days nobody who's anybody in the software-development world is writing software without tests. Just writing them doesn't help make the software better, though. You also need to be able to execute tests -- reliably and quickly and repeatably.

To do that, you'll have to get yourself a test runner, which is a different tool from the compiler or the runtime. That is, just because your tests compile (satisfy all of the language rules) and could be executed doesn't mean that you're done writing them yet.

Testing framework requirements

Every testing framework has its own rules for how the test runner selects methods for execution as tests. The standard configuration options are:

  • Which classes should be considered as test fixtures?
  • Which methods are considered tests?
  • Where do parameters for these methods come from?
  • Is there startup/teardown code to execute for the test or fixture?

Each testing framework will offer different ways of configuring your code so that the test runner can find and execute setup/test/teardown code. To write NUnit tests, you decorate classes, methods and parameters with C# attributes.

The standard scenario is relatively easy to execute -- run all methods with a Test attribute in a class with a TestFixture attribute on it.

Test-runner Requirements

There are legitimate questions for which even the best specification does not provide answers.

When you consider multiple base classes and generic type arguments, each of which may also have NUnit attributes, things get a bit less clear. In that case, not only do you have to know what NUnit offers as possibilities but also whether the test runner that you're using also understands and implements the NUnit specification in the same way. Not only that, but there are legitimate questions for which even the best specification does not provide answers.

At Encodo, we use Visual Studio 2015 with ReSharper 9.2 and we use the ReSharper test runner. We're still looking into using the built-in VS test runner -- the continuous-testing integration in the editor is intriguing1 -- but it's quite weak when compared to the ReSharper one.

So, not only do we have to consider what the NUnit documentation says is possible, but we must also know how the R# test runner interprets the NUnit attributes and what it supports.

Getting More Complicated

Where is there room for misunderstanding? A few examples:

  • What if there's a TestFixture attribute on an abstract class?
  • How about a TestFixture attribute on a class with generic parameters?
  • Ok, how about a non-abstract class with Tests but no TestFixture attribute?
  • And, finally, a non-abstract class with Tests but no TestFixture attribute, but there are non-abstract descendants that do have a TestFixture attribute?

In our case, the answer to these questions depends on which version of R# you're using. Even though it feels like you configured everything correctly and it logically should work, the test runner sometimes disagrees.

  • Sometimes it shows your tests as expected, but refuses to run them (Inconclusive FTW!)
  • Or other times, it obstinately includes generic base classes that cannot be instantiated into the session, then complains that you didn't execute them. When you try to delete them, it brings them right back on the next build. When you try to run them -- perhaps not noticing that it's those damned base classes -- then it complains that it can't instantiate them. Look of disapproval.

Throw the TeamCity test runner into the mix -- which is ostensibly the same as that from R# but still subtly different -- and you'll have even more fun.

Improving Integration with the R# Test Runner

At any rate, now that you know the general issue, I'd like to share the ground rules we've come up with that avoid all of the issues described above. The text below comes from the issue I created for the impending release of Quino 2.

Environment

  • Windows 8.1 Enterprise
  • Visual Studio 2015
  • ReSharper 9.2

Expected behavior

Non-leaf-node base classes should never appear as nodes in test runners. A user should be able to run tests in descendants directly from a fixture or test in the base class.

Observed behavior

Non-leaf-node base classes are shown in the R# test runner in both versions 9 and 10. A user must navigate to the descendant to run a test. The user can no longer run all descendants or a single descendant directly from the test.

Analysis

Relatively recently, in order to better test a misbehaving test runner and accurately report issues to JetBrains, I standardized all tests to the same pattern:

  • Do not use abstract anywhere (the base classes don't technically need it)
  • Use the TestFixture attribute only on leaf nodes

This worked just fine with ReSharper 8.x but causes strange behavior in both R# 9.x and 10.x. We discovered recently that not only did the test runner act strangely (something that they might fix), but also that the unit-testing integration in the files themselves behaved differently when the base class is abstract (something JetBrains is unlikely to fix).

You can see that R# treats a non-abstract class with tests as a testable entity, even when it doesn't actually have a TestFixture attribute and even requires a generic type argument in order to be instantiated.

Here it's not working well in either the source file or the test runner. In the source file, you can see that it offers to run tests in a category, but not the tests from actual descendants. If you try to run or debug anything from this menu, it shows the fixture with a question-mark icon and marks any tests it manages to display as inconclusive. This is not surprising, since the test fixture, while not abstract, does require a type parameter in order to be instantiated.

image

Here it looks and acts correctly:

image

I've reported this issue to JetBrains, but our testing structure either isn't very common or it hasn't made it to their core test cases, because neither 9 nor 10 handles them as well as the 8.x runner did.

Now that we're also using TeamCity a lot more to not only execute tests but also to collect coverage results, we'll capitulate and just change our patterns to whatever makes R#/TeamCity the happiest.

Solution

  • Make all testing base classes that include at least one Test or Category attribute abstract. Base classes that do not have any testing attributes do not need to be made abstract.

Once more to recap our ground rules for making tests:

  • Include TestFixture only on leafs (classes with no descendants)
  • You can put Category or Test attributes anywhere in the hierarchy, but you then need to declare that class as abstract.
  • Base classes that have no testing attributes do not need to be abstract
  • If you feel you need to execute tests in both a base class and one of its descendants, then you're probably doing something wrong. Make two descendants of the base class instead.
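
Putting the rules together, a minimal sketch of the pattern might look like this (class and method names are invented; only the attribute placement matters):

```csharp
using NUnit.Framework;

// Base class: carries Test/Category attributes, so it is abstract and has no TestFixture.
[Category("Metadata")]
public abstract class MetadataTestsBase
{
  protected abstract object CreateModel();

  [Test]
  public void TestModelIsAssigned()
  {
    Assert.That(CreateModel(), Is.Not.Null);
  }
}

// Leaf class: the only place the TestFixture attribute appears.
[TestFixture]
public class SqlServerMetadataTests : MetadataTestsBase
{
  protected override object CreateModel()
  {
    return new object();
  }
}
```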

When you make the change, you can see the improvement immediately.

image


  1. ReSharper 10.0 also offers continuous testing, but our experiments with the EAP builds and the first RTM build left us underwhelmed and we downgraded to 9.2 until JetBrains manages to release a stable 10.x.

Encodo Git Handbook 3.0

Encodo first published a Git Handbook for employees in September 2011 and last updated it in July of 2012. Since then, we've continued to use Git, refining our practices and tools. Although a lot of the content is still relevant, some parts are quite outdated and the overall organization suffered through several subsequent, unpublished updates.

What did we change from the version 2.0?

  • We removed all references to the Encodo Git Shell. This shell was a custom environment based on Cygwin. It configured the SSH agent, set up environment variables and so on. Since tools for Windows have improved considerably, we no longer need this custom tool. Instead, we've moved to PowerShell and PoshGit to handle all of our Git command-line needs.
  • We removed all references to Enigma. This was a Windows desktop application developed by Encodo to provide an overview, eager-fetching and batch tasks for multiple Git repositories. We stopped development on this when SmartGit included all of the same functionality in versions 5 and 6.
  • We removed all detailed documentation for Git submodules. Encodo stopped using submodules (except for one legacy project) several years ago. We used to use submodules to manage external binary dependencies but have long since moved to NuGet instead.
  • We reorganized the chapters to lead off with a quick overview of Basic Concepts followed by a focus on Best Practices and our recommended Development Process. We also reorganized the Git-command documentation to use a more logical order.

You can download version 3 of the Git Handbook or get the latest copy from here.

Chapter 3, Best Practices, and chapter 4, Development Process, are included in their entirety below.

3 Best Practices

3.1 Focused Commits

Focused commits are required; small commits are highly recommended. Keeping each commit tightly focused on a single task helps in many cases.

  • They are easier to resolve when merge conflicts occur
  • They can be more easily merged/rebased by Git
  • If a commit addresses only one issue, it is easier for a reviewer or reader to decide whether it should be examined.

For example, if you are working on a bug fix and discover that you need to refactor a file as well, or clean up the documentation or formatting, you should finish the bug fix first, commit it and then reformat, document or refactor in a separate commit.

Even if you have made a lot of changes all at once, you can still separate changes into multiple commits to keep those commits focused. Git even allows you to split changes from a single file over multiple commits (the Git Gui provides this functionality as does the index editor in SmartGit).
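
On the command line, splitting a single file's changes over two focused commits looks roughly like this (the file name is a placeholder):

```sh
git add -p Services/OrderService.cs   # stage only the hunks that belong to the bug fix
git commit -m "Fix null handling in order lookup"

git add Services/OrderService.cs      # stage the remaining (formatting) changes
git commit -m "Clean up formatting in OrderService"
```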

3.2 Snapshots

Use the staging area to make quick snapshots without committing changes but still being able to compare them against more recent changes.

For example, suppose you want to refactor the implementation of a class.

  • Make some changes and run the tests; if everything's OK, stage those changes
  • Make more changes; now you can diff these new changes not only against the version in the repository but also against the version in the index (that you staged).
  • If the new version is broken, you can revert to the staged version or at least more easily figure out where you went wrong (because there are fewer changes to examine than if you had to diff against the original)
  • If the new version is ok, you can stage it and continue working
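
In raw Git commands, that snapshot workflow looks roughly like this:

```sh
git add -A           # snapshot: stage the working version that passes the tests
# ...make more changes...
git diff             # compare the new changes against the staged snapshot
git diff --staged    # compare the snapshot against the repository version
git checkout -- .    # if the new version is broken, fall back to the snapshot
```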

3.3 Developing New Code

Where you develop new code depends entirely on the project release plan.

  • Code for releases should be committed to the release branch (if there is one) or to the develop branch if there is no release branch for that release
  • If the new code is a larger feature, then use a feature branch. If you are developing a feature in a hotfix or release branch, you can use the optional base parameter to base the feature on that branch instead of the develop branch, which is the default.

3.4 Merging vs. Rebasing

Follow these rules for which command to use to combine two branches:

  • If both branches have already been pushed, then merge. There is no way around this, as you won't be able to push a non-merged result back to the origin.
  • If you work with branches that are part of the standard branching model (e.g. release, feature, etc.), then merge.
  • If both you and someone else made changes to the same branch (e.g. develop), then rebase. This will be the default behavior during development (see the sketch below).
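
As a rough sketch of the two cases (branch names are examples):

```sh
# someone else also pushed to the branch you're working on: rebase
git pull --rebase

# finishing a feature branch from the standard branching model: merge
git checkout master
git merge --no-ff feature/search
```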

4 Development Process

A branching model is required in order to successfully manage a non-trivial project.

Whereas a trivial project generally has a single branch and few or no tags, a non-trivial project has a stable release (with tags and possible hotfix branches) as well as a development branch (with possible feature branches).

A common branching model in the Git world is called Git Flow. Previous versions of this manual included more specific instructions for using the Git Flow plugin for Git, but experience has shown that a less complex branching model is sufficient and that using standard Git commands is more transparent.

However, since Git Flow is a very widely used branching model, retaining the naming conventions helps new developers more easily understand how a repository is organized.

4.1 Branch Types

The following list shows the branch types as well as the naming convention for each type:

  • master is the main development branch. All other branches should be merged back to this branch (unless the work is to be discarded). Developers may apply commits and create tags directly on this branch.
  • feature/name is a feature branch. Feature branches are for changes that require multiple commits or coordination between multiple developers. When the feature is completed and stable, it is merged to the master branch after which it should be removed. Multiple simultaneous feature branches are allowed.
  • release/vX.X.X is a release branch. Although a project can be released (and tagged) directly from the master branch, some projects require a longer stabilization and testing phase before a release is ready. Using a release branch allows development on the master branch to continue normally without affecting the release candidate. Multiple simultaneous release branches are strongly discouraged.
  • hotfix/vX.X.X is a hotfix branch. Hotfix branches are always created from the release tag for the version in which the hotfix is required. These branches are generally very short-lived. If a hotfix is needed in a feature or release branch, it can be merged there as well (see the optional arrow in the following diagram).

The main difference from the Git Flow branching model is that there is no explicit stable branch. Instead, the last version tag serves the purpose just as well and is less work to maintain. For more information on where to develop code, see 3.3 Developing New Code.
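
In terms of commands, creating the various branch types might look like this (names and version numbers are placeholders):

```sh
git checkout -b feature/search master      # feature branch off the main development branch
git checkout -b release/v1.1.0 master      # release-stabilization branch
git checkout -b hotfix/v1.0.1 v1.0.0       # hotfix branch created from the release tag
```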

4.2 Example

To get a better picture of how these branches are created and merged, the following diagram depicts many of the situations outlined above.

The diagram tells the following story:

  • Development began on the master branch
  • v1.0 was released directly from the master branch
  • Development on feature B began
  • A bug was discovered in v1.0 and the v1.0.1 hotfix branch was created to address it
  • Development on feature A began
  • The bug was fixed, v1.0.1 was released and the fix was merged back to the master branch
  • Development continued on master as well as features A and B
  • Changes from master were merged to feature A (optional merge)
  • Release branch v1.1 was created
  • Development on feature A completed and was merged to the master branch
  • v1.1 was released (without feature A), tagged and merged back to the master branch
  • Changes from master were merged to feature B (optional merge)
  • Development continued on both the master branch and feature B
  • v1.2 was released (with feature A) directly from the master branch

image

Legend:

  • Circles depict commits
  • Blue balloons are the first commit in a branch
  • Grey balloons are a tag
  • Solid arrows are a required merge
  • Dashed arrows are an optional merge

Questions to consider when designing APIs: Part II

In the previous article, we listed a lot of questions that you should continuously ask yourself when you're writing code. Even when you think you're not designing anything, you're actually making decisions that will affect either other team members or future versions of you.

In particular, we'd like to think about how we can reconcile a development process that involves asking so many questions and taking so many facets into consideration with YAGNI.

Designing != Implementing

The implication of this principle is that if you aren't going to need something, then there's no point in even thinking about it. While it's absolutely commendable to adopt a YAGNI attitude, not building something doesn't mean not thinking about it and identifying potential pitfalls.

A feature or design concept can be discussed within a time-box. Allocate a fixed, limited amount of time to determine whether the feature or design concept needs to be incorporated, whether it would be nice to incorporate it or possibly to jettison it if it's too much work and isn't really necessary.

The overwhelming majority of time wasted on a feature is in the implementation, debugging, testing, documentation and maintenance of it, not in the design. Granted, a long design phase can be a time-sink -- especially a "perfect is the enemy of the good" style of design where you're completely blocked from even starting work. With practice, however, you'll learn how to think about a feature or design concept (e.g. extensibility) without letting it ruin your schedule.

If you don't try to anticipate future needs at all while designing your API, you may end up preventing that API from being extended in directions that are both logical and could easily have been anticipated. If the API is not extensible, then it will not be used and may have to be rewritten in the future, losing more time at that point rather than up front. This is, however, only a consideration you must make. It's perfectly acceptable to decide that you currently don't care at all and that a feature will have to be rewritten at some point in the future.

You can't do this kind of cost-benefit analysis and risk-management if you haven't taken time to identify the costs, benefits or risks.

Document your process

At Encodo, we encourage the person who's already spent time thinking about this problem to simply document the drawbacks and concessions and possible ideas in an issue-tracker entry that is linked to the current implementation. This allows future users, maintainers or extenders of the API to be aware of the thought process that underlies a feature. It can also help to avoid misunderstandings about what the intended audience and coverage of an API are.

The idea is to eliminate assumptions. A lot of time can be wasted when maintenance developers make incorrect assumptions about the intent of code.

If you don't have time to do any of this, then you can write a quick note in a task list that you need to more fully document your thoughts on the code you're writing. And you should try to do that soon, while the ideas are still relatively fresh in your mind. If you don't have time to think about what you're doing even to that degree, then you're doing something wrong and need to get organized better.

That is, if you can't think about the code you're writing and don't have time to document your process, even minimally, then you shouldn't be writing that code. Either that, or you implicitly accept that others will have to clean up your mess. And "others" includes future versions of you. (E.g. the you who, six months from now, is muttering, "who wrote this crap?!?")

Be Honest about Hacking

As an example, we can consider how we go from a specific feature in the context of a project to thinking about where the functionality could fit in to a suite of products -- that may or may not yet exist. And remember, we're only thinking about these things. And we're thinking about them for a limited time -- a time-box. You don't want to prevent your project from moving forward, but you also don't want to advance at all costs.

Advancing in an unstructured way is called hacking and, while it can lead to a short-term win, it almost always leads to short-to-medium term deficits. You can still write code that is hacked and looks hacked, if that is the highest current priority, but you're not allowed to forget that you did so. You must officially designate what you're doing as a hot-zone of hacking so that the Hazmat team can clean it up later, if needed.

A working prototype that is hacked together just so it works for the next demonstration is great as long as you don't think that you can take it into production without doing the design and documentation work that you initially skipped.

If you fail to document the deficits that prevent you from taking a prototype to production, then how will you address those deficits? It will cost you much more time and pain to determine the deficits after the fact. Not only that, but unless you do a very good job, it is your users that will most likely be finding deficits -- in the form of bugs.

If your product is just a hacked mess of spaghetti code with no rhyme or reason, another developer will be faster and produce more reliable code by just starting over. Trying to determine the flaws, drawbacks and hacks through intuition and reverse-engineering is slower and more error-prone than just starting with a clean slate. Developers on such a project will not be able to save time -- and money -- by building on what you've already made.

A note on error-handling

Not to be forgotten is a structured approach to error-handling. The more "hacked" the code, the more stringent the error-checking should be. If you haven't had time yet to write or test code sufficiently, then that code shouldn't be making broad decisions about what it thinks are acceptable errors.

Fail early, fail often. Don't try to make a hacked mess of code bullet-proof by catching all errors in an undocumented manner. Doing so is deceptive to testers of the product as well as other developers.

If you're building a demo, make sure the happy path works and stick to it during the demo. If you do have to break this rule, add the hacks to a demo-specific branch of the code that will be discarded later.

Working with a documented project

If, however, the developer can look at your code and sees accompanying notes (either in an issue tracker, as TODOs in the code or some other form of documentation), that developer knows where to start fixing the code to bring it to production quality.

For example, it's acceptable to configure an application in code as long as you do it in a central place and you document that the intent is to move the configuration to an external source when there's time. If a future developer finds code for support for multiple database connections and tests that are set to ignore with a note/issue that says "extend to support multiple databases", that future developer can decide whether to actually implement the feature or whether to just discard it because it has been deprecated as a requirement.

Without documentation or structure or an indication which parts of the code were thought-through and which are considered to be hacked, subsequent developers are forced to make assumptions that may not be accurate. They will either assume that hacked code is OK or that battle-tested code is garbage. If you don't inform other developers of your intent when you're writing the code -- best done with documentation, tests and/or a cleanly designed API -- then it might be discarded or ignored, wasting even more time and money.

If you're on a really tight time-budget and don't have time to document your process correctly, then write a quick note that you think the design is OK or the code is OK, but tell your future self or other developers what they're looking at. It will only take you a few minutes and you'll be glad you did -- and so will they.

Questions to consider when designing APIs: Part I

A big part of an agile programmer's job is API design. In an agile project, the architecture is defined from on high only in broad strokes, leaving the fine details of component design up to the implementer. Even in projects that are specified in much more detail, implementers will still find themselves in situations where they have to design something.

This means that programmers in an agile team have to be capable of weighing the pros and cons of various approaches in order to avoid causing performance, scalability, maintenance or other problems as the API is used and evolves.

When designing an API, we consider some of the following aspects. This is not meant to be a comprehensive list, but should get you thinking about how to think about the code you're about to write.

Reusing Code

  • Will this code be re-used inside the project?
  • How about outside of the project?
  • If the code might be used elsewhere, where does that need lie on the time axis?
  • Do other projects already exist that could use this code?
  • Are there already other implementations that could be used?
  • If there are implementations, then are they insufficient?
  • Or perhaps not sufficiently encapsulated for reuse as written?
  • How likely is it that there will be other projects that need to do the same thing?
  • If another use is likely, when would the other project or projects need your API?

Organizing Code

  • Where should the API live in the code?
  • Is your API local to this class?
  • Is it private?
  • Protected?
  • Are you making it public in an extension method?
  • Or internal?
  • Which namespace should it belong to?
  • Which assembly?

Testing Code

  • What about testability?
  • How can the functionality be tested?

Even if you don't have time to write tests right now, you should still build your code so that it can be tested. It's possible that you won't be the one writing the tests, so you should prepare the code so that others can write them.

It's also possible that a future you will be writing the tests and will hate you for having made it so hard to automate testing.

Managing Dependencies

  • Is multi-threading a consideration?
  • Does the API manage state?
  • What kind of dependencies does the API have?
  • Which dependencies does it really need?
  • Is the API perhaps composed of several aspects?
  • With a core aspect that is extended by others?
  • Can core functionality be extracted to avoid making an API that is too specific?

Documenting Code

  • How do callers use the API?
  • What are the expected values?
  • Are these expectations enforced?
  • What is the error mechanism?
  • What guarantees does the API make?
  • Is the behavior of the API enforced?
  • Is it at least documented?
  • Are known drawbacks documented?

Error-handling

This is a very important one and involves how your application handles situations outside of the design.

  • If you handle externally provided data, then you have to handle extant cases
  • Are you going to log errors?
  • In which format?
  • Is there a standard logging mechanism?
  • How are you going to handle and fix persistent errors?
  • Are you even going to handle weird cases?
  • Or are you going to fail early and fail often?
  • For which errors should your code even be responsible?
  • How does your chosen philosophy (and you should be enforcing contracts) fit with the other code in the project?

Fail fast; enforce contracts

While we're on the subject of error-handling, I want to emphasize that this is one of the most important parts of API design, regardless of which language or environment you use.1

Add preconditions for all method parameters; verify them as non-null and verify ranges. Do not catch all exceptions and log them or -- even worse -- ignore them. This is even more important in environments -- I'm looking at you, client-side web code in general and JavaScript in particular -- where the established philosophy is to run anything and to never rap a programmer on the knuckles for having written really knuckle-headed code.
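
As a minimal sketch of what such preconditions might look like (the method itself is invented for illustration):

```csharp
using System;

public static class StringTools
{
  public static string Truncate(string text, int maxLength)
  {
    // Fail fast: enforce the contract instead of silently accepting bad input.
    if (text == null) { throw new ArgumentNullException(nameof(text)); }
    if (maxLength < 1) { throw new ArgumentOutOfRangeException(nameof(maxLength)); }

    return text.Length <= maxLength ? text : text.Substring(0, maxLength);
  }
}
```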

You haven't tested the code, so you don't know what kind of errors you're going to get. If you ignore everything, then you'll also ignore assertions, contract violations, null-reference exceptions and so on. The code will never be improved if it never makes a noise. It will just stay silently crappy until someone notices a subtle logical error somewhere and must painstakingly track it down to your untested code.

You might say that production code shouldn't throw exceptions. This is true, but we're explicitly not talking about production code here. We're talking about code that has few to no tests and is acknowledged to be incomplete. If you move code like this into production, then it's better to crash than to silently corrupt data or impair the user experience.

A crash will get attention and the code may even be fixed or improved. If you write code that will crash on all but the "happy path" and it never crashes? That's great. Do not program preemptively defensively in fresh code. If you have established code that interfaces with other (possibly external) components and you sometimes get errors that you can't work around in any other way, then it's OK to catch and log those exceptions rather than propagating them. At least you tried.

In the next article, we'll take a look at how all of these questions and considerations can at all be reconciled with YAGNI. Spoiler alert: we think that they can.



  1. I recently read Erlang and code style by Jesper L. Andersen, which seems to have less to do with programming Erlang and much more to do with programming properly. The advice contained in it seems to be only for Erlang programmers, but the idea of strictly enforcing APIs between software components is neither new nor language-specific.