Monday, January 18, 2010

Xus file sharing

It's been a long time since my last post; various RL intrusions encroached on Xus. Now I'm starting the file sharing piece. I'm basing it on what I did in Plexus, but starting right away with Git integration instead of putting it off, since that seems to be a better way to actually have Git integration :).

Xus has kind of a weird profile for a p2p system. It's made to handle a mass of small clusters, rather than a giant, monolithic cloud. It's really based on the needs of Plexus, but I suspect other systems out there can benefit from it. In a monolithic p2p storage system like BitTorrent or FreePastry's PAST, the cloud can (theoretically) be so large that the data can "live in the cloud". In a game like Plexus (or Sauerbraten), however, the clusters are small and may often drop to 0 users (like when the players for a particular world are asleep), so there needs to be a way to restore cluster state (so a world with no players is still up-to-date when a player joins, even if that player wasn't around recently).

The most straightforward way I could think of to accomplish this is just to use a Git repository for the Xus file cache and mirror it to a remote Git repository for backup/restore. When a topic space master boots a topic space, it can pull from a remote Git repository to restore the state. To avoid a single point of failure, the owners can push to more than one designated repository. The DHT portion of the repository is only useful for peers that are connected, since the DHT expands and contracts, causing chunks to be copied around over time.
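
Here's a minimal sketch of the backup/restore idea: push to every designated repository, and restore from the first one that answers. Everything in it is made up for illustration (BackupSketch, Remote, MemoryRemote, backup, restore); the in-memory map stands in for a real Git push/pull, and none of this is the actual Xus or JGit API.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of backup/restore across several designated repositories. */
class BackupSketch {
    /** Stand-in for a remote Git repository; a real one would wrap Git push/pull. */
    interface Remote {
        Map<String, String> pull() throws IOException;
        void push(Map<String, String> state) throws IOException;
    }

    /** In-memory remote whose reachability can be toggled to simulate an outage. */
    static class MemoryRemote implements Remote {
        final String name;
        boolean reachable = true;
        private Map<String, String> stored = new HashMap<>();

        MemoryRemote(String name) { this.name = name; }

        public Map<String, String> pull() throws IOException {
            if (!reachable) throw new IOException(name + " unreachable");
            return new HashMap<>(stored);
        }

        public void push(Map<String, String> state) throws IOException {
            if (!reachable) throw new IOException(name + " unreachable");
            stored = new HashMap<>(state);
        }
    }

    /** Pushes the state to every designated repository; returns how many accepted it. */
    static int backup(List<? extends Remote> remotes, Map<String, String> state) {
        int succeeded = 0;
        for (Remote remote : remotes) {
            try {
                remote.push(state);
                succeeded++;
            } catch (IOException e) {
                // One repository being down is tolerated; the others still hold a copy.
            }
        }
        return succeeded;
    }

    /** Restores from the first reachable repository, in list order. */
    static Map<String, String> restore(List<? extends Remote> remotes) {
        List<String> failures = new ArrayList<>();
        for (Remote remote : remotes) {
            try {
                return remote.pull();
            } catch (IOException e) {
                failures.add(e.getMessage());
            }
        }
        throw new IllegalStateException("no designated repository reachable: " + failures);
    }

    public static void main(String[] args) {
        MemoryRemote primary = new MemoryRemote("primary");
        MemoryRemote mirror = new MemoryRemote("mirror");
        Map<String, String> state = new HashMap<>();
        state.put("world.map", "v1");

        backup(Arrays.asList(primary, mirror), state);
        primary.reachable = false; // simulate the single point of failure going away
        System.out.println(restore(Arrays.asList(primary, mirror)));
    }
}
```

Note that taking the first reachable repository assumes every repository got the last push. If a push can partially fail, the restore side would need some way to pick the most recent copy, which this sketch doesn't attempt.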

Storage:
  1. each resource is tracked with a tag: an outer directory containing a chunk-list file and a subdirectory with standard names
  2. creator sends packed delta to topic owner
  3. owner pushes delta to remote Git repository
  4. owner returns signatures for the chunks, etc. to creator
  5. creator stores chunks, etc. in the DHT.
  6. creator gets an event when storage process is complete
Retrieval:
  1. request tag from owner, which responds with ID of chunk-list file
  2. retrieve chunk-list from DHT (which delegates to owner if chunk not in cache)
  3. retrieve chunks from DHT (possibly delegating to owner to fill cache)
  4. requester gets an event when the retrieval process is complete
Having several owners is a good thing in this model, since they back the DHT cache.
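
To make the storage and retrieval steps above concrete, here's a rough in-memory sketch. It makes several assumptions that aren't part of the design above: chunk IDs are hex SHA-1 hashes and double as the "signatures" the owner returns, the "packed delta" is just a list of byte arrays, the owner's map stands in for the Git repository, and the completion event is a plain Runnable. All the names (FileSharingSketch, Owner, Dht, storeResource, retrieve) are hypothetical, not the Xus API.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.*;

/** Hypothetical in-memory sketch of the storage and retrieval steps. */
class FileSharingSketch {
    /** Hex SHA-1 of the bytes, used here as a chunk ID (an assumption). */
    static String id(byte[] data) {
        try {
            StringBuilder hex = new StringBuilder();
            for (byte b : MessageDigest.getInstance("SHA-1").digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Topic owner: holds the authoritative copy (standing in for the Git repository). */
    static class Owner {
        final Map<String, byte[]> repo = new HashMap<>(); // ID -> bytes
        final Map<String, String> tags = new HashMap<>(); // tag -> chunk-list ID

        /** Storage steps 2-4: accept the chunks, record the tag, return the chunk IDs. */
        List<String> store(String tag, List<byte[]> chunks) {
            List<String> ids = new ArrayList<>();
            for (byte[] chunk : chunks) {
                String chunkId = id(chunk);
                repo.put(chunkId, chunk);
                ids.add(chunkId);
            }
            byte[] chunkList = String.join("\n", ids).getBytes(StandardCharsets.UTF_8);
            String listId = id(chunkList);
            repo.put(listId, chunkList);
            tags.put(tag, listId);
            return ids;
        }
    }

    /** DHT cache that delegates to the owner on a miss and remembers the result. */
    static class Dht {
        final Map<String, byte[]> cache = new HashMap<>();
        final Owner owner;
        int ownerFetches = 0; // how often the owner had to fill the cache

        Dht(Owner owner) { this.owner = owner; }

        void put(String chunkId, byte[] data) { cache.put(chunkId, data); }

        byte[] get(String chunkId) {
            byte[] data = cache.get(chunkId);
            if (data == null) {
                data = owner.repo.get(chunkId);
                if (data == null) throw new NoSuchElementException(chunkId);
                ownerFetches++;
                cache.put(chunkId, data);
            }
            return data;
        }
    }

    /** Creator side of storage: hand chunks to the owner, fill the DHT, fire the event. */
    static void storeResource(String tag, List<byte[]> chunks, Owner owner, Dht dht,
                              Runnable onComplete) {
        List<String> ids = owner.store(tag, chunks);
        for (int i = 0; i < chunks.size(); i++) dht.put(ids.get(i), chunks.get(i));
        onComplete.run();
    }

    /** Retrieval steps 1-3: tag -> chunk-list ID -> chunk list -> chunks. */
    static byte[] retrieve(String tag, Owner owner, Dht dht) {
        String listId = owner.tags.get(tag);
        if (listId == null) throw new NoSuchElementException(tag);
        String listText = new String(dht.get(listId), StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (listText.isEmpty()) return out.toByteArray(); // resource with no chunks
        for (String chunkId : listText.split("\n")) {
            byte[] chunk = dht.get(chunkId);
            out.write(chunk, 0, chunk.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        Owner owner = new Owner();
        List<byte[]> chunks = Arrays.asList("hello ".getBytes(StandardCharsets.UTF_8),
                                            "world".getBytes(StandardCharsets.UTF_8));
        storeResource("greeting", chunks, owner, new Dht(owner),
                      () -> System.out.println("stored"));
        Dht latecomer = new Dht(owner); // a peer that joined with an empty cache
        System.out.println(new String(retrieve("greeting", owner, latecomer),
                                      StandardCharsets.UTF_8));
    }
}
```

The latecomer case is the one the owners exist for: a peer with an empty cache still gets everything, because each miss falls through to the owner and is then cached.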


So, that's the idea. I decided I'm going to include the source to JGit and its dependencies (JSch and JZlib) with this. Requiring a properly installed command-line Git is just too much work to make developers do (or force their users to do). Requiring developers to download, test, and package a compatible version of JGit wouldn't be nice either. So I have some bloat now. I really don't see the file sharing part as being quite so optional, anyway. From my experience, nontrivial p2p apps tend toward BitTorrent the same way most apps tend toward email clients (that's someone's law; maybe I'll get a comment with a citation for it).
