Monday, September 21, 2009

Xus: A Simple Peer-to-Peer Platform

A friend of mine wanted to implement peer-to-peer source code repositories with social networking features, such as popularity ratings. Plexus was already on its way to being able to support something like that, so I decided to break out the back end of Plexus into a separate project called Xus. We started talking about what such a thing could do.

Then the bottom fell out of Plexus.

Apparently, the FreePastry project is winding down. It was, in my opinion, the most promising of the major peer-to-peer platforms: straightforward and relatively small. We chose it over JXTA for our communication in Plexus. We actually started with JXTA, back in 2002, but JXTA was just too over the top, with its excessive design patterns, large surface area, and lack of a usability layer. Why does Sun seem to delight in requiring so much ritual from developers to do the simplest things with their libraries? Pastry is way cleaner than JXTA. Like JXTA, Pastry was made to scale to thousands of nodes in a cloud and had interesting routing schemes and other mechanisms to promote scalability and reliability. We liked it a lot. We did experience a few problems:
  1. An improperly configured NATted peer could royally screw up the cloud, because all of the peers take part in routing and connect directly to each other.
  2. We had to roll our own presence using our own heartbeats, because Pastry doesn't inform you when a peer disconnects and you can't easily enumerate the peers on a broadcast channel (publishing topic).
This is all moot now, because FreePastry is not going to be actively developed.


So I'm announcing the Xus project. Here's the web page and here's the source, so far. There's a VERY simple, console-based chat to test it. I don't have unit tests yet. I've only been working on this for about a week and I'm not in the TDD habit. Maybe I should be. Anyway, because of the FreePastry news, Xus has to be a peer-to-peer platform in its own right, and the other stuff will be services on top of it. Here's a snippet from the doc:
Xus is a peer-to-peer protocol with a reference implementation in Scala. The point of Xus is to provide a peer-to-peer platform that is simple, powerful, practical, and easy to use. People can use it for games, social networks, source code repositories, or whatever.

  • Simple: able to be implemented with a relatively small amount of code so that it's easy to maintain and fix
    • Layered: there are 3 layers. Layering separates the concerns of the protocol and makes it easier to understand
    • Self contained: the reference implementation does not rely on third party libraries, reducing cognitive overhead for future developers
    • Scala: reference implementation is in Scala, which allows for smaller code that is "less noisy" than Java
    • Straightforward implementation: simple, direct routing
  • Powerful: strong enough to support common p2p tropes and some new ones
    • Messaging: multicast, unicast, dht, and delegated and direct p2p messaging
    • Delegated messaging allows peers behind NATs to receive messages from other peers
    • File storage, backed with Git
    • Replicated properties for simple shared data
  • Practical
    • Medium scale allows you to know when peers disconnect
    • Works through NATs
  • Easy to use
    • Built-in port testing: peers can request other peers to test their port configurations
    • API is small with a minimum of required setup
    • Reference implementation will include a UPnP port configuration tool (ported from the Plexus code base)
Xus is centered around "topic spaces," which are groups of topics. A topic space is a slice of the total peer-to-peer network, consisting of a medium amount of peers, and each peer can connect to many topic spaces. A topic space will probably have 100 or fewer peers, depending on the app. All of the peers in a topic space connect to the active owner, which does all of the message routing for the space. So, if a peer wants to broadcast a message to a topic, it sends the message to the active owner of the topic's space and the owner then sends it to each peer in the space. Any peer can potentially be the active owner of a topic space if it can accept connections and it is, in fact, an owner of that topic space (Xus uses strong authentication for this kind of thing). Peers which can't accept connections can still participate; they just can't be active topic space owners.
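To make the routing concrete, here is a minimal sketch of a topic space whose active owner fans each broadcast out to the connected peers. It is written in Python for brevity and is not taken from the Scala reference implementation: the names `Peer`, `TopicSpace`, `activate_owner`, and `broadcast` are all hypothetical, authentication is reduced to a name check, and it assumes the sender also gets a copy of its own broadcast.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical peer; 'inbox' stands in for a real network connection."""
    name: str
    can_accept_connections: bool = True
    inbox: list = field(default_factory=list)

    def deliver(self, topic, sender, payload):
        self.inbox.append((topic, sender, payload))


class TopicSpace:
    """A group of topics whose traffic is routed by one active owner."""

    def __init__(self, owners):
        # Stand-in for strong authentication: owners are known by name.
        self.owner_names = {peer.name for peer in owners}
        self.active_owner = None
        self.peers = {}  # peers currently connected to the active owner

    def activate_owner(self, peer):
        # Only an authorized owner that can accept connections may route.
        if peer.name not in self.owner_names:
            raise PermissionError(f"{peer.name} does not own this space")
        if not peer.can_accept_connections:
            raise ValueError(f"{peer.name} cannot accept connections")
        self.active_owner = peer

    def join(self, peer):
        if self.active_owner is None:
            raise RuntimeError("no active owner to connect to")
        self.peers[peer.name] = peer

    def broadcast(self, sender, topic, payload):
        """The sender hands the message to the owner, which fans it out."""
        if sender.name not in self.peers:
            raise RuntimeError(f"{sender.name} is not connected")
        for peer in self.peers.values():
            peer.deliver(topic, sender.name, payload)
        return len(self.peers)  # number of copies the owner sent
```

Note that a peer behind a NAT (one with `can_accept_connections=False`) can still join and broadcast in this sketch; it is only rejected by `activate_owner`.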

I use NIO in the reference implementation, so this routing isn't computationally intensive, but astute readers will notice that this broadcasting arrangement takes up a relatively large amount of bandwidth for the active topic space owner, compared to everyone else. Hence the "medium amount of peers". By the way, Xus accomplishes broadcasting in a way that mirrors the most common architecture for first person shooter servers, so it CAN work extremely well (see Sauerbraten). There are issues and constraints, of course.
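A rough back-of-the-envelope estimate shows why the owner's bandwidth is the limiting factor. The function names and all the numbers below (message rate, message size) are made up for illustration, and the sketch assumes the owner forwards every broadcast to every connected peer, including the sender.

```python
def peer_upload_bytes_per_sec(msgs_per_sec, msg_bytes):
    """An ordinary peer uploads each of its broadcasts once, to the owner."""
    return msgs_per_sec * msg_bytes


def owner_upload_bytes_per_sec(peers, msgs_per_sec, msg_bytes):
    """The owner receives peers * msgs_per_sec broadcasts every second
    and sends one copy of each to all of the connected peers."""
    incoming_msgs = peers * msgs_per_sec
    return incoming_msgs * peers * msg_bytes


# Hypothetical load: every peer broadcasts one 200-byte message per second.
for n in (10, 100, 1000):
    print(n, owner_upload_bytes_per_sec(n, 1, 200))
# 10 peers   ->      20,000 bytes/s
# 100 peers  ->   2,000,000 bytes/s
# 1000 peers -> 200,000,000 bytes/s
```

Under these assumptions the owner's upload grows with the square of the number of peers while an ordinary peer's upload stays flat, which is why the size of a topic space is kept moderate.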

See, this is just the sort of thing that Pastry was trying to avoid by having all of the peers do routing, but that was also what accounted for some of the major complexity in Pastry and all of the problems we had with it. Thus in Xus, broadcasting is limited to medium-sized groups of peers, but this allows Xus to be more reliable and faster -- latency is usually much lower with this architecture. Xus is for when you want to build a peer-to-peer app quickly and defer scalability issues until you realize that your project has, in fact, become wildly popular. At that point, there is nothing preventing you from making layers 1 and 2 of the Xus stack more scalable. You'll even be able to reuse the Xus protocol, probably adding messages and attributes that are required to support routing.

The protocol is XML-based and I'm using fastinfoset (which is built into Java) to take the fat off the wire. This allows you to use your favorite XML tools without worrying about how bloated the communication is. I'm using TCP for connections to simplify things. Maybe I'll look into SCTP, but for now, it's just TCP. The protocol is full duplex but does support request-response in cases where messages could fail, such as direct p2p messages.
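One common way to layer request-response on top of a full duplex connection is to tag a request with an id and match it against the id echoed in the reply. The sketch below illustrates that idea only; it is not the Xus wire format. Messages are plain dictionaries rather than XML, and the `id` and `reply_to` fields and the `Connection` class are hypothetical.

```python
import itertools


class Connection:
    """Full duplex link: either side may send at any time."""

    def __init__(self, send_to_wire):
        self._send_to_wire = send_to_wire  # callable that transmits a message
        self._ids = itertools.count(1)
        self._pending = {}                 # request id -> response callback

    def send(self, body):
        """Fire-and-forget message; no response is expected."""
        self._send_to_wire({"body": body})

    def request(self, body, on_response):
        """Message that may fail (e.g. direct p2p); remember who is waiting."""
        msg_id = next(self._ids)
        self._pending[msg_id] = on_response
        self._send_to_wire({"id": msg_id, "body": body})
        return msg_id

    def receive(self, message):
        """Return True if the message was a response, False otherwise."""
        reply_to = message.get("reply_to")
        if reply_to is None:
            return False                   # ordinary message, not a response
        callback = self._pending.pop(reply_to, None)
        if callback is not None:
            callback(message)
        return True
```

Because only requests carry an id, ordinary traffic such as broadcasts stays one-way, and the bookkeeping cost is paid only for messages that can fail.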

I'm not entirely sure whether to have broadcasts return a response with a list of the nodes that failed to accept the broadcast for whatever reason (security or lack of resources, perhaps). This sounds like it would be useful for some applications. Right now, Plexus is what's driving the requirements. If other developers see needs for things like this, they are welcome to jump on board the project and help out.
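If broadcasts did report failures, the owner's side might look something like the following sketch. This is one possible shape for an undecided feature, not something Xus does today: the function name, the `Refused` exception, and the idea of representing each peer as a callable are all assumptions made for illustration.

```python
class Refused(Exception):
    """Hypothetical error raised when a peer will not accept a broadcast."""


def broadcast_with_failures(peers, topic, payload):
    """Send to every peer; return (name, reason) for each one that failed.

    'peers' maps a peer name to a callable taking (topic, payload).
    """
    failed = []
    for name, deliver in peers.items():
        try:
            deliver(topic, payload)
        except Refused as err:
            failed.append((name, str(err)))
    return failed


def accept(topic, payload):
    pass  # delivered successfully


def reject_for_security(topic, payload):
    raise Refused("security")


peers = {"alice": accept, "bob": reject_for_security, "carol": accept}
print(broadcast_with_failures(peers, "chat", "hi"))
# [('bob', 'security')]
```

An empty list would mean every peer accepted the message, so applications that don't care about failures could simply ignore the response.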
