At this point, Plexus breaks files into 50K chunks (if I remember correctly) and keys them by their SHA-1 hashes. If we include rsync's rolling checksum with the chunk ID, that should be enough (I think) for a peer to determine whether it can reuse a chunk from its current copy of the file instead of downloading it. I'll have to check rsync to see what its default chunk size is. We chose 50K based on tuning with Pastry, and Xus's behavior will probably be very different.
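Here's a rough sketch of how that could work, just to make the idea concrete. It is not Plexus or Xus code: the function names (`weak_checksum`, `roll`, `chunk_ids`, `find_reusable`) are made up for illustration, and the weak checksum is an rsync-style two-part sum as I understand it, not rsync's exact implementation. The new file is described as a list of (weak checksum, SHA-1) pairs, one per fixed-size chunk, and a peer slides a window over its old copy to find chunks it already has.

```python
import hashlib

M = 1 << 16  # both halves of the weak checksum are kept modulo 2^16


def weak_checksum(block):
    """rsync-style weak checksum of a block, returned as the pair (a, b)."""
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, size):
    """Slide the window one byte: drop out_byte, append in_byte."""
    a = (a - out_byte + in_byte) % M
    b = (b - size * out_byte + a) % M
    return a, b


def chunk_ids(data, size):
    """Chunk IDs for a file: (weak checksum, SHA-1 hex digest) per chunk."""
    ids = []
    for off in range(0, len(data), size):
        block = data[off:off + size]
        ids.append((weak_checksum(block), hashlib.sha1(block).hexdigest()))
    return ids


def find_reusable(local, remote_ids, size):
    """Map chunk index in the new file -> offset in the local (old) file.

    Only full-size windows are tried, so a short final chunk is never
    matched here and would simply be downloaded.
    """
    wanted = {}
    for idx, (weak, strong) in enumerate(remote_ids):
        wanted.setdefault(weak, []).append((idx, strong))

    found = {}
    if len(local) < size:
        return found

    a, b = weak_checksum(local[:size])
    off = 0
    while True:
        # The cheap weak checksum filters candidates; SHA-1 confirms them.
        for idx, strong in wanted.get((a, b), ()):
            if idx not in found:
                digest = hashlib.sha1(local[off:off + size]).hexdigest()
                if digest == strong:
                    found[idx] = off
        if off + size >= len(local):
            break
        a, b = roll(a, b, local[off], local[off + size], size)
        off += 1
    return found
```

Any chunk index missing from the result of `find_reusable(old_bytes, chunk_ids(new_bytes, size), size)` would still have to be fetched from a peer. The point of the rolling checksum is that the window can move one byte at a time cheaply, so a chunk is found even when an insertion earlier in the file has shifted it off the old chunk boundaries.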
Thursday, January 21, 2010
Use rsync with Xus file sharing?
I'm wondering if it would be a good thing to use a variant of the rsync algorithm with Xus file transfer. This would save a lot of bandwidth for large files that change only slightly. One example is Sauerbraten maps, which can be 2 MB or more (after compression). When you move a tree, only a small part of the map file changes, but without rsync-like behavior, pushing an update requires transferring the whole file.