With the recent demise of Gitorious and Google Code, here are some thoughts I posted (in slightly modified form) a while ago to gnu-prog-discuss, on how to avoid dependencies on centralized version control systems. Federation isn't enough for this; we need replication, and I think the way to do it is with P2P protocols, along the lines of Twister, the P2P microblogging platform based on Bitcoin and BitTorrent. See http://twister.net.co for more info.

Twister uses a Bitcoin block chain to handle distributed name and public encryption key registration, a BitTorrent DHT for user profiles, and (separate, growing) torrents for posts and (encrypted) direct messages.

Now, what if we used similar infrastructure to build P2P-distributed version control repositories? Say you want to create/clone a repository: you register it just like you'd register a Twister account, and maybe have your client post an empty changeset (as described next).

To push a changeset to the P2P network, have your client post a message to your public message torrent, containing: an update to the refs list of your repository, (optionally) references to others' posts that contain objects your changeset requires, and (optionally) (a pack of) objects that your changeset adds to the previously-distributed set.
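To make the shape of such a message a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the class name PushMessage, its field names, the JSON encoding and the SHA-256 post id are all made up, and none of it comes from Twister or git.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict


@dataclass
class PushMessage:
    """One post in a repository's message torrent (hypothetical layout)."""
    repo: str                 # registered repository name
    refs: dict                # updated refs list: ref name -> object id
    requires: list = field(default_factory=list)  # ids of others' posts holding needed objects
    pack: bytes = b""         # optional pack of objects this changeset adds

    def encode(self) -> bytes:
        """Serialize to deterministic JSON; the pack is hex-encoded."""
        body = asdict(self)
        body["pack"] = self.pack.hex()
        return json.dumps(body, sort_keys=True).encode()

    def post_id(self) -> str:
        """Content-derived id that other posts could list in `requires`."""
        return hashlib.sha256(self.encode()).hexdigest()
```

With this layout, the empty changeset posted right after registration would just be a PushMessage with an empty refs dict, no requirements and no pack.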

To fetch or update someone else's repository, tell your client to "follow" that repository, so that it starts keeping a local replica of that repository's torrent. Some smarts are needed, compared with Twister, to further follow, or download on demand, the other dependencies referenced by the repos you follow, and the transitive closure thereof.
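The transitive closure part is plain graph traversal. A rough sketch, assuming the client can already tell which other repositories each repository's posts reference (the dependencies mapping and the function name below are hypothetical):

```python
from collections import deque


def repos_to_replicate(followed, dependencies):
    """Return the followed repos plus everything they transitively reference.

    followed: iterable of repository names the user follows.
    dependencies: dict mapping a repository name to the names of the
        repositories its posts reference (hypothetical input).
    """
    seen = set(followed)
    queue = deque(seen)
    while queue:
        repo = queue.popleft()
        for dep in dependencies.get(repo, ()):
            if dep not in seen:  # also guards against reference cycles
                seen.add(dep)
                queue.append(dep)
    return seen
```

A real client would probably discover dependencies lazily, as posts arrive, and might download them on demand rather than follow them outright, but the set it ends up needing is the same.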

I guess it might not be too hard to implement on top of the Twister code base. The greatest challenge might be to integrate it with git proper, though if the client just maintained the git pack files and packed-refs files in various dirs, as multiple git repos would (as opposed to in a leveldb, as it does for posts ATM), you could just tell git to look for packs in them and be done with it.
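One way to "tell git to look for packs" elsewhere is git's alternates mechanism: each line of objects/info/alternates inside a repository's git directory names an additional object directory to search. The sketch below is only that, a sketch: it assumes the P2P client lays out each replica as an object directory with a pack/ subdirectory, which is my assumption rather than anything Twister does today, and the function name is made up. Refs would still need separate handling.

```python
from pathlib import Path


def add_alternate(git_dir, object_dir):
    """Append object_dir to git_dir/objects/info/alternates if missing.

    object_dir is assumed to be laid out like a git object directory,
    i.e. with the replicated pack files under object_dir/pack/.
    Returns the resulting list of alternates entries.
    """
    alternates = Path(git_dir) / "objects" / "info" / "alternates"
    alternates.parent.mkdir(parents=True, exist_ok=True)
    entry = str(Path(object_dir).resolve())
    entries = alternates.read_text().splitlines() if alternates.exists() else []
    if entry not in entries:
        entries.append(entry)
        alternates.write_text("\n".join(entries) + "\n")
    return entries
```

Calling it once per followed repository would make the replicated objects visible to ordinary git commands in the local clone, without copying them.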

The advantage is then that nobody will rely exclusively on a third party to hold their repos: they will be replicated on the network, by those interested in the repositories to begin with (and then some). The more popular a repository is, the more replicas it will have.

Plus, nobody would have to worry about pointing people at undesirable (e.g., blob-ridden) repositories, or about where to host their forks or branches. For the former, just be careful, when you push, to push a pack without referencing packs in the undesirable third parties' repos; maybe keep separate follow sets for which repos to fetch for merges and which repos to use as baselines for published changes. For the latter, the P2P network would take care of hosting in a P2P fashion, without central control or authority, and one would only have to worry about what one actually chooses to publish (though participating in the P2P network might optionally involve storing and forwarding replicas of third parties' torrents).

So blong...