I was scouting around GitHub today and noticed that many projects seem to have a huge number of forks… Way more forks than are actually being used. At first this seemed strange, but I quickly realized that this large number of forks is due to people not knowing what forks are for.
Forks are for making your own snapshot of a codebase so that you can make a new version of it with your own special sauce, or so that you can contribute a change in the form of a pull request. Put simply, you need a fork whenever you want to modify the codebase but do not have direct access to do so. New users don’t understand this and end up equating the ‘fork’ button with ‘download’ or ‘bookmark’. Little do they know that you can download code directly from the original repository, and you can bookmark things using GitHub’s stars.
A fork is seldom what people should be reaching for. In fact, it looks like the VAST majority of times a user clicks the fork button, they shouldn’t have. Most of GitHub’s forks are useless and are user error!
Stupid factor calculation
The interesting part comes when we compare stars to forks. First, we adjust the fork count by subtracting the number of contributors to the project. This removes the majority of correctly used forks for a project. Even if one of those contributors made multiple pull requests, that still only ever counts as one fork, because they all come from the same fork URL.
Next, we simply divide the adjusted number of forks by the number of stars and multiply by 100. This gives us a ‘stupid factor’ for any project. The formula is this:
((Forks - Contributors) / Stars) * 100
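If you want to try this on your own projects, here is a minimal Python sketch of the formula above. The function name `stupid_factor` and the example counts are made up for illustration; they are not taken from any real repository, and the guard against zero stars is my own assumption about how to handle that edge case.

```python
def stupid_factor(forks: int, contributors: int, stars: int) -> float:
    """Return ((forks - contributors) / stars) * 100 as a percentage."""
    if stars <= 0:
        # Assumption: a project with no stars has no meaningful score.
        raise ValueError("stars must be a positive number")
    return (forks - contributors) / stars * 100


# Hypothetical counts, purely for illustration.
example = stupid_factor(forks=500, contributors=100, stars=1000)
print(f"{example:.2f}%")  # prints "40.00%"
```

A score above 100% means the project has more leftover forks than stars.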
I will display the results in several categories, then follow up with a grab bag list of interesting items.
Programming Language Extensions or Packages
|Spring Framework|81.92%|
|Ruby on Rails|30.49%|
|Jekyll Now|344.20%|
What is ‘Imagical Mine’, you ask? It’s a pocket Minecraft server! Also, Jekyll Now advertises that you can set up a blog with no programming.
I hope you had as much fun studying these results as I had finding them. I think this metric is much more accurate than you might first assume and should probably be considered by the community when making any software choice.