SVN to GitHub Migration on Windows

SVN and GitHub are both valid options for hosting source code, and which one fits best depends on your project context.
However, SVN has been around for quite some time, and many projects are moving to GitHub because of its additional infrastructure, such as forks and pull requests. As a result, more and more people need to perform a migration.

On Linux systems, it is quite easy to use your package manager to install the svn2git tool recommended by GitHub for the migration.
On a Windows machine this is possible as well; you just need a few extra steps to get it done.

1 Migration in a Nutshell

There is a good article by omranic on how to get a git repository from your svn repository: http://omranic.com/svn2git-subgit-win/.
In short:

  1. Install Git: http://git-scm.com/download/win
  2. Install Ruby: http://rubyinstaller.org/downloads/
  3. Update the Ruby Package Manager (gem): Execute “gem update --system” on the command line
  4. Create a text file that maps your svn committers to the corresponding GitHub users (see notes below)
  5. Install svn2git (“gem install svn2git”) and run it to create a local git repository
  6. Link your local repository with GitHub and send it over (push)
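
Before starting, it can help to confirm that the tools from the steps above are reachable from your shell. The following is a minimal sketch, assuming a POSIX-style shell such as the Git Bash that ships with Git for Windows. The helper name missing_tools is made up for this example, and svn is only needed for the author listing described in the next section.

```shell
# Print every tool from the argument list that is not found on the PATH.
# Prints nothing when all tools are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools used in this article; svn2git itself is installed as a Ruby gem
# (gem install svn2git).
missing_tools git ruby gem svn svn2git
```

Any name printed by the last line still needs to be installed or added to your PATH.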

2 Author Mapping

You need a file that maps your svn users to the corresponding git/GitHub users so that the commits are assigned correctly. The file looks like this:

svnuser1=gituser1 <gituser1@mail.com>
svnuser2=gituser2 <gituser2@mail.com>
...

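To find out who has contributed to your svn repository, you can use the following command (on Windows, run it in the Git Bash shell, which provides grep, cut, sed, sort and uniq):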

svn log --quiet http://path/to/root/of/project | grep -E "r[0-9]+ \| .+ \|" | cut -d'|' -f2 | sed 's/^ //' | sort | uniq
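
The list of user names can be turned into a skeleton for the mapping file directly. This is a minimal sketch under a few assumptions: it runs in a POSIX-style shell, the function name authors_template is hypothetical, and the example.com addresses are placeholders that you replace by hand with the real git names and mail addresses.

```shell
# Read "svn log --quiet" output from stdin and print one mapping line
# per committer in the form svnuser=svnuser <svnuser@example.com>.
authors_template() {
  grep -E '^r[0-9]+ \| .+ \|' \
    | cut -d'|' -f2 \
    | sed 's/^ *//; s/ *$//' \
    | sort -u \
    | while IFS= read -r user; do
        printf '%s=%s <%s@example.com>\n' "$user" "$user" "$user"
      done
}

# Usage (the URL is the placeholder from the command above):
#   svn log --quiet http://path/to/root/of/project | authors_template > authormappings.txt
```

Unlike the one-liner above, the sed call here also strips the trailing blank after each user name, so it does not end up in the mapping file.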

3 Run svn2git

Open a command line, switch to the directory in which you want to create your local repository, and run svn2git.
There are a couple of different parameters, depending on whether you want to take over your branches, tags, or anything else.
If you are interested in the trunk only, simply use

svn2git <your/repository/url/trunk/> --authors <path/to/authormappings.txt> -v --rootistrunk
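
Since the conversion can take a while, it may be worth checking first that the mapping file covers every committer. As far as we know, git-svn, which svn2git builds on, stops when it meets an author that is missing from the file. A minimal sketch, assuming the user names arrive one per line on standard input (for example from the svn log command in the previous section) and that the mapping file uses the svnuser=... format shown earlier; unmapped_authors is a hypothetical helper name.

```shell
# Print every SVN user name from stdin that has no entry in the mapping file.
# $1: path to the author mapping file (svnuser=Name <mail> per line).
unmapped_authors() {
  mapping_file=$1
  while IFS= read -r user; do
    [ -n "$user" ] || continue
    # Compare against the text before the first "=", ignoring surrounding blanks.
    awk -F'=' -v u="$user" '
      { key = $1; gsub(/^[ \t]+|[ \t]+$/, "", key) }
      key == u { found = 1 }
      END { exit !found }
    ' "$mapping_file" || printf '%s\n' "$user"
  done
}

# Usage (paths are placeholders):
#   svn log --quiet http://path/to/root/of/project | cut -d'|' -f2 -s \
#     | sed 's/^ *//; s/ *$//' | sort -u | unmapped_authors authormappings.txt
```

An empty result means every committer has a mapping entry.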

Now, grab a coffee, lean back and wait for the tool to finish…

4 Send Content to GitHub

Once your local repository looks fine, you can link it to a remote GitHub repository and push it to the remote host using the following commands (do not forget to adapt GITHUB_USERNAME and REPO_NAME to your own settings).

git remote add origin git@github.com:GITHUB_USERNAME/REPO_NAME.git
git push origin master

5 Trouble Shooting

5.1 GitHub Connection

If you are not able to push your content to GitHub yet, this might be because the connection has not been configured yet.

Open a command line and run
ssh -v git@github.com
Look for a “Permission denied” message at the end of the output, preceded by lines like these:

debug1: Authentications that can continue: publickey
debug1: Next authentication method: publickey
debug1: Trying private key: /.ssh/identity
debug1: Trying private key: /.ssh/id_rsa
debug1: Trying private key: /.ssh/id_dsa
debug1: No more authentication methods to try.
Permission denied (publickey).

In that case, you might need to generate a new ssh key and make it available to your ssh connection.

How to generate the key is well described (in German) by Daniel Hüsken at http://danielhuesken.de/git-fur-windows-installieren-und-ssh-keys-nutzen/
Your ssh-keygen might suggest “//.ssh/id_rsa” as the default location. Note the two slashes. You should change this to a single slash, “/.ssh/id_rsa”, to match git’s local ssh directory.

If you are still not able to connect, but your user's .ssh subdirectory contains your key file, you should take a look at your Git installation directory. It might have happened that the Git installation replaced your ssh installation with its own implementation. Check whether your Git installation directory has a subdirectory named .ssh. If so, copy your id_rsa and id_rsa.pub files into that directory and try again.
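
To see quickly which of the candidate directories actually holds a key pair, a small check like the following can help. It is a minimal sketch for a POSIX-style shell; check_key_dirs is a hypothetical helper, and the Git installation path in the usage line is only an assumed example that differs between machines.

```shell
# Report for each given directory whether it contains the default RSA key pair.
check_key_dirs() {
  for dir in "$@"; do
    if [ -f "$dir/id_rsa" ] && [ -f "$dir/id_rsa.pub" ]; then
      printf 'found    %s\n' "$dir"
    else
      printf 'missing  %s\n' "$dir"
    fi
  done
}

# Usage (second path is an assumed install location, adapt it to your system):
#   check_key_dirs "$HOME/.ssh" "/c/Program Files/Git/.ssh"
```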

5.2 No tracking information for current branch

If you try to pull from or push to the remote repository, you might need to specify which remote branch your local branch tracks.

If you get error messages such as the following ones:

warning: no common commits
remote: Counting objects: 4, done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 4 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (4/4), done.
From github.com:kopl/SPLevo
 * [new branch]      master     -> origin/master
There is no tracking information for the current branch.
Please specify which branch you want to merge with.
See git-pull(1) for details

    git pull <remote> <branch>

If you wish to set tracking information for this branch you can do so with:

    git branch --set-upstream-to=origin/<branch> master

choose the second offered command to link your local master with the remote master:

git branch --set-upstream-to=origin/master master

5.3 Empty Directories

Git does not track empty directories. This might become an issue depending on how your project or code is set up.
Sometimes you have empty directories as temp directories, for convenience, due to default project templates, or for whatever other reason. This is no issue in Subversion, but it is not possible in git (except through dirty and error-prone hacks).

During the migration itself, this is usually not an issue apart from minor warnings that are easy to ignore 😉
However, when the code is checked out again (by a human or a build server), the downstream build process might fail because of missing class path entries or similar.
So be aware of that and keep it in mind when tracking down errors.
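
To know in advance which directories will disappear, you can list the empty directories of your svn working copy before migrating. A minimal sketch, assuming a find implementation that supports -empty (GNU find, as shipped with Git Bash, does); list_empty_dirs is a hypothetical helper name.

```shell
# List empty directories below the given root, skipping VCS metadata folders.
list_empty_dirs() {
  find "$1" -type d -empty ! -path '*/.git/*' ! -path '*/.svn/*' | sort
}

# Usage (path is a placeholder for your working copy):
#   list_empty_dirs /path/to/working/copy
```

Each directory printed here will be missing after a fresh clone, so it is a candidate to check when a build fails.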

5.4 Changed Files right after Eclipse Import

When you are working with Eclipse and see many changed files right after the import, this might be due to one of the following reasons:

Line Ends
Git might automatically convert the line endings in your text files to match your local operating system. To prevent this behaviour, you can disable git's automatic carriage return and line feed conversion by running the following command in a shell or on the command line:

git config --system core.autocrlf false

bin Directories
When Eclipse compiles your projects, it might create bin directories with content. To keep these from being recognized as changes to commit, you can add a .gitignore file to your project containing the single line “/bin”.

If you maintain several projects, you might want to place the .gitignore file in the parent folder. Note that you then need to specify the exclude rule as “*/bin” so that it applies to the subdirectories.
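
Creating the file from the command line could look like the sketch below. It assumes a POSIX-style shell; add_ignore_rule is a hypothetical helper that appends a rule only when the exact line is not in the file yet, so it can be run repeatedly.

```shell
# Append a rule to an ignore file unless the exact line is already present.
# $1: path to the .gitignore file, $2: the rule to add.
add_ignore_rule() {
  file=$1
  rule=$2
  touch "$file"
  grep -q -x -F -e "$rule" "$file" || printf '%s\n' "$rule" >> "$file"
}

# Usage:
#   add_ignore_rule .gitignore '/bin'      # inside a single project
#   add_ignore_rule .gitignore '*/bin'     # in the parent folder of several projects
```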

6 Further Topics

6.1 Jira-GitHub Integration

Jira is a very powerful issue management system, and if you have used it before, you might want to keep it. Integrating it with your code repository is an essential feature you do not want to miss, and setting this up with GitHub is quite easy.

In your Jira system, install the Jira DVCS Plugin.
Then log in to your GitHub account and create credentials for a remote application. Open your profile, select “Applications” and register a new application. This will provide you with a client key and secret, which you need in order to specify a GitHub account in the DVCS Plugin.
Next, go to your Jira Administration, select Plugins -> DVCS Accounts and create a new GitHub account. There you will need the key and secret provided by GitHub. And there you go.
From now on, a commit message must contain the key of a Jira issue for the commit to be linked to the corresponding issue: https://confluence.atlassian.com/display/AOD/Processing+JIRA+issues+with+commit+messages
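
If you want to check a commit message for an issue key before committing, a small test like the following can be used. It is only a sketch: the pattern assumes the common Jira key format of an upper-case project key, a hyphen and a number (such as the made-up PROJ-42), which may not match a customized Jira setup, and has_issue_key is a hypothetical helper name.

```shell
# Succeed if the given commit message contains something that looks like
# a Jira issue key, e.g. PROJ-42 (assumed default key format).
has_issue_key() {
  printf '%s\n' "$1" | grep -q -E '(^|[^A-Za-z0-9])[A-Z][A-Z0-9]+-[0-9]+'
}

# Usage:
#   has_issue_key "PROJ-42 Fix line endings" && echo "issue key found"
```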

6.2 Jenkins-GitHub Integration

For the Jenkins continuous integration server, a GitHub-specific git plugin is provided. This plugin comes with a remote API, and GitHub allows you to configure a web service connection specific to the Jenkins GitHub plugin. The settings of your GitHub repository contain an option named “Webhooks & Services”. Here you can add a new service and select from a list of available ones. Choose the “Jenkins GitHub plugin” service and provide it with the URL of your Jenkins server. Further details about the GitHub plugin configuration can be found on the plugin’s website.

6.3 Line Endings

As line endings are treated differently on Windows (CRLF) and Unix (LF) operating systems, they are always an error-prone and annoying topic.
Git provides capabilities to normalize line endings when code is committed to and/or checked out from the repository. This behaviour can be configured in many different ways.
However, when working in teams with people using different operating systems, it is always a good choice to work with Unix-style line endings (LF).

The following set of practices is recommended to ensure everything runs smoothly in your daily work:

1. Explicit Repository Settings
Git repositories can be configured to use an explicit setting that is independent of what users have configured for their overall system.
Create a .gitattributes file in the root of your repository. Below, you can see an example that treats all files as text files with LF line endings by default. In addition, files with a set of specific file extensions are treated as binary.

# Set the default behavior, in case people don't have core.autocrlf set.
* text eol=lf

# Denote all files that are truly binary and should not be modified.
*.png binary
*.jpg binary
*.gif binary
*.class binary
*.jar binary

Check out the GitHub recommendations on how to update your repository if you need to normalize an existing one: https://help.github.com/articles/dealing-with-line-endings
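
To get an impression of how many files are affected before normalizing, you can list the files that currently contain Windows line endings. A minimal sketch for a POSIX-style shell; list_crlf_files is a hypothetical helper, and it simply looks for carriage return bytes, so binary files may show up in the list as well.

```shell
# List files below the given directory that contain at least one
# carriage return byte, skipping the .git folder.
list_crlf_files() {
  cr=$(printf '\r')
  find "$1" -type f ! -path '*/.git/*' -exec grep -l -e "$cr" {} + | sort
}

# Usage (path is a placeholder for your local repository):
#   list_crlf_files /path/to/local/repository
```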

2. Configure Eclipse
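Make sure to configure your Eclipse workspace for UTF-8 file encoding and Unix-style line endings. Both settings can be found under Window > Preferences > General > Workspace (“Text file encoding” and “New text file line delimiter”).
[Screenshot: Eclipse workspace settings]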

7 Noteworthy Features from Working with GitHub

While migrating to GitHub, we identified several features that are really helpful in daily work.

Working Offline
Great when you work a lot on the road, on an airplane or on a train

Line Level Code Links
Great for pointing people to specific code snippets in a lightweight way, as in “look here!”

Line End Management
When working in teams with people using different operating systems, it is good to enforce a specific line ending. See the GitHub recommendations on how this can be enforced on a per-repository basis:
https://help.github.com/articles/dealing-with-line-endings