If you develop software and haven't just returned from the moon, you've
undoubtedly heard that GitHub is
being acquired by Microsoft. Depending on your affiliations you might be spelling
"being acquired by" as "selling out to". The rest of you are probably
wondering what on Earth a GitHub is, and why Microsoft would want one.
Let me explain.
Please note: this post isn't about my opinion of today's news. It's
really too early to tell, though I may get into that a little toward the
end. Instead, I'm going to explain what GitHub is, and why it matters.
But first I have to explain Git.
Git is a version-control system. (Version-control systems are sometimes
called "source code management" (SCM) systems. If you look closely you
might even have spotted "scm" in git's URL, git-scm.com.) Basically, a
version-control system lets you record the
complete history of a project, with what changes were made, who made
each change, when they changed it, and their notes about what they did and
why. It doesn't have to be a software project, either. It can be
recipes, photographs, books, the papers you're writing for school, or even
blog entries. (Yes, I do.)
Before git, most version-control systems kept track of changes in text
files (which of course is what all source code is) by recording which
lines are different from the previous version. (It's usually done by a
program called diff.) This was very compact, but it could
also be very slow if you had to undo all the changes between two versions
in order to see what the older one looked like.
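
Here's a rough sketch of that older approach, in Python. It's not taken from
any real version-control system; it assumes a "reverse delta" layout (the
newest version stored whole, older ones stored as changes to undo), and the
names make_delta, apply_delta, and DeltaStore are made up for illustration.
The thing to notice is that the number of steps grows the further back you go.

```python
import difflib


def make_delta(src, dst):
    """Describe how to turn the lines in src into dst, storing only new lines."""
    ops = []
    matcher = difflib.SequenceMatcher(a=src, b=dst, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # reuse src[i1:i2] unchanged
        else:
            ops.append(("insert", dst[j1:j2]))  # lines only dst has (empty for a deletion)
    return ops


def apply_delta(src, ops):
    """Rebuild dst from src plus the recorded operations."""
    out = []
    for op in ops:
        if op[0] == "copy":
            out.extend(src[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


class DeltaStore:
    """Hypothetical store: latest version whole, older versions as reverse deltas."""

    def __init__(self):
        self.latest = None
        self.reverse_deltas = []  # reverse_deltas[k] turns version k+2 into version k+1

    def commit(self, lines):
        if self.latest is not None:
            self.reverse_deltas.append(make_delta(lines, self.latest))
        self.latest = list(lines)

    def checkout(self, version):
        """Return (lines, deltas_applied) for a 1-based version number."""
        lines, steps = self.latest, 0
        for delta in reversed(self.reverse_deltas[version - 1:]):
            lines = apply_delta(lines, delta)
            steps += 1
        return lines, steps


store = DeltaStore()
store.commit(["flour", "water"])
store.commit(["flour", "water", "salt"])
store.commit(["flour", "milk", "salt"])
for n in (3, 2, 1):
    lines, steps = store.checkout(n)
    print(n, lines, "deltas applied:", steps)
```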
Git, on the other hand, is blindingly fast in part because it works in the
stupidest way possible (which is why it's called "git"). It
simply takes the new version of each file that changed since the last
version, zips it up, and stuffs it whole into its repository. So it takes
git about the same amount of time to roll a file back two versions or two
hundred.
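
If you'd like to see roughly what "zips it up and stuffs it whole" means,
here's a minimal sketch in Python. It stores each file version whole,
compressed with zlib, under a name derived from its content, which is how git
handles what it calls "loose" blob objects. The ObjectStore class and its
in-memory dictionary are my own simplification; real git writes files under
.git/objects and has further tricks for saving space later on.

```python
import hashlib
import zlib


class ObjectStore:
    """Toy content-addressed store, kept in memory instead of on disk."""

    def __init__(self):
        self.objects = {}

    def put(self, content: bytes) -> str:
        # Git prefixes a blob with "blob <size>" and a NUL byte, then names
        # the object after the SHA-1 hash of header plus content.
        data = b"blob %d\0" % len(content) + content
        key = hashlib.sha1(data).hexdigest()
        self.objects[key] = zlib.compress(data)  # the whole file, zipped
        return key

    def get(self, key: str) -> bytes:
        data = zlib.decompress(self.objects[key])
        _header, _nul, content = data.partition(b"\0")
        return content


store = ObjectStore()
old = store.put(b"Preheat oven to 350.\n")
new = store.put(b"Preheat oven to 375.\n")

# Getting any version back is one lookup and one decompress,
# no matter how many versions came after it.
print(store.get(old))
print(store.get(new))
```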
The other thing that makes git fast is where it keeps all of its
version information. Before git, most version-control systems used a
centralized repository on a server somewhere. (Subversion, one of the
best of these, even lets you browse the repository with a web browser.)
That means that all the change information is going over a network. Git
keeps its repository (these days everyone shortens that to "repo") on your
local disk, right next to your working copy, in a hidden subdirectory
called ".git".
Because its repo is local, and contains the entire history of your
project, you don't need a network connection to use git. On the beach, in
an airplane, on a boat, with a goat, it doesn't matter to git.
It's de-centralized. It gets a little more complicated when more
than one developer is working on a project.
Bob's been in the office all week working on a project. When his boss,
Alice, comes back from the open source conference she's been at all week,
all she has to do is tell git to fetch all the changes that Bob made while
she was away. Git gets them directly from Bob's repo. If Alice didn't
make any changes, that's called a "fast-forward" merge -- git just takes
the changes that Bob made, copies those files into Alice's repo, updates
her working tree, and it's done.
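
For the curious, the decision git is making there can be sketched in a few
lines of Python. This is a simplified model rather than git's own code: I'm
assuming history is just a table mapping each commit to its parents, and the
commit names and the merge_kind function are invented for the example.

```python
def ancestors(parents, commit):
    """Return the commit plus everything reachable through its parent links."""
    seen = set()
    stack = [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def merge_kind(parents, mine, theirs):
    """Classify what merging 'theirs' into 'mine' would involve."""
    if theirs in ancestors(parents, mine):
        return "already up to date"   # I already have everything they have
    if mine in ancestors(parents, theirs):
        return "fast-forward"         # they only added on top of what I have
    return "real merge needed"        # both sides changed since we last agreed


# Hypothetical history: c1 is where Alice left off, Bob added c2 and c3,
# and a1 is a change Alice might have made while she was away.
history = {"c1": [], "c2": ["c1"], "c3": ["c2"], "a1": ["c1"]}

print(merge_kind(history, "c1", "c3"))  # Alice made no changes
print(merge_kind(history, "a1", "c3"))  # Alice made a change of her own
```

The first call is the easy case described above; the second is the trickier
one, where both people have changes the other lacks.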
It's a little trickier if Alice had time to make some changes, too. Now
Alice has to merge the two sets of changes, and then let Bob pull the
merged files onto his computer. By the way, a "pull" is just a
fetch followed by a merge, but it's so common that git has a shorthand way
of doing it. (I'm oversimplifying here, but this isn't the time to go into
the difference between merge and rebase. It's also not a good time to
talk about branches -- maybe some other week.) As you can imagine, this
gets out of hand pretty quickly, and it's even worse if there's a whole
team working on the project.
The obvious thing to do is for the group to have one repo on a server
somewhere that has what everyone agrees is the definitive set of files on
it. Bob pushes his changes to the server, and when Alice tries to push
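her changes, git balks and gives her an error message, because the server
has changes of Bob's that she doesn't have yet. Now it's Alice's
responsibility to pull Bob's changes, merge them with hers, make any
necessary fixes, and push the result to the server. Actually, in a real
team, Alice would send her proposed changes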
around by making a diff and sending email to the other team members to
review, and not actually push her changes until someone approves them.
In a large team, this is kind of a hub-and-spokes arrangement. You can
see where this is going, right?
GitHub is a company that provides a place
for people and projects to put shared git repositories where other people
can see them, clone them, and contribute to them. GitHub has become
wildly popular, because it's a great place to share software. If you have
an open-source software project, putting a public repo on GitHub is the
most effective way to reach developers. It's so popular that Google and
Microsoft shut down their own code-hosting sites (Google Code and CodePlex
respectively) and moved to GitHub. Microsoft, it turns out, is the
biggest contributor to projects hosted on GitHub.
Putting a public repository on GitHub is free. If you want to set up
private repositories, GitHub will charge you for it, and if your
company wants to put a clone of GitHub on its own private servers they can
buy GitHub Enterprise, but if your software is free, so's your space on
GitHub.
That's a bit of a problem, because the software that runs GitHub is
not free. That means that they need a steady stream of income to
pay their in-house developers, because they're not going to get any help
from the open-source developer community. GitHub lost $66 million in
2016, and doesn't really have a sustainable business model that would make
them attractive to investors. They needed to get acquired, or they had a
real risk of going under. And when a service based on proprietary
software goes under, all of their customers have a big problem. But their
users? Heh.
Everybody knows the old adage, "if you're getting a service for free
you're not the customer, you're the product." That's especially true for
companies like Google and Facebook, which sell their users' eyeballs to
advertisers. It's a lot less true for a company whose users can leave any
time they want, painlessly, taking all their data and their
readers with them. I'm sure most of my readers here on Dreamwidth
remember what happened to Livejournal when they got bought by the
Russians. Well, GitHub is being bought by Microsoft. It's not entirely
clear which is worse.
GitHub has an even worse problem than Livejournal did, because
"cross-posting" is basically the way git works. There's a company
called GitLab that looks a lot like
GitHub, except that their core software -- the stuff that wraps a slick
web interface around a git repository -- is open source. (They do sell
extensions, but most projects aren't going to need them.) If you want to
set up your own private GitLab site, it's free, and you can do it in ten
minutes with a one-line command. If you find bugs, you can fix them
yourself. You'll find a couple of great quotes from their blog at the end
of the notes, but the bottom line is that 100,000 repositories have moved
from GitHub to GitLab in the last 24 hours.
And once you've moved a project to GitLab, you don't have to worry about
what happens to it, because the open-source core of it will continue to be
maintained by its community. That's what happened when a company called
Netscape went belly-up: Mozilla Firefox is still around and doing fine.
And if the fact that GitLab is for profit is a problem for you, there's
Apache Allura, gitolite3, gitbucket, and gitweb (to name a few). Go for
it!
This so wasn't what I was planning to write today.
Notes:
@ Microsoft Reportedly Acquires GitHub | Linux Journal
The article ends with a list of alternatives:
Gitea
Apache Allura
GitBucket: A Git platform
GitLab
@ Microsoft acquires GitHub for $7.5 billion - TFiR
" According to reports, GitHub lost over $66 millions in 2016. At the same time
GitLab, a fully open source and decentralized service is gaining momentum, giving
users a fully open source alternative. "
@ Microsoft to acquire GitHub for $7.5 billion | Stories official press release
@ Microsoft + GitHub = Empowering Developers - The Official Microsoft Blog
@ A bright future for GitHub | The GitHub Blog
@ Congratulations GitHub on the acquisition by Microsoft | GitLab
" While we admire what's been done, our strategy differs in two key areas. First,
instead of integrating multiple tools together, we believe a single application,
built from the ground up to support the entire DevOps lifecycle is a better
experience leading to a faster cycle time. Second, it’s important to us that the
core of our product always remain open source itself as well. "
@ GitLab Ultimate and Gold now free for education and open source | GitLab
" It has been a crazy 24 hours for GitLab. More than 2,000 people tweeted about
#movingtogitlab. We imported over 100,000 repositories, and we've seen a 7x increase
in orders. We went live on Bloomberg TV. And on top of that, Apple announced an
Xcode integration with GitLab. "
Another fine post from The Computer Curmudgeon.