This Statement is False: 2009

Sunday, October 11, 2009

Xus Milestone 1 complete

According to my only somewhat extensive automated tests, Milestone 1 (Basic Xus) is now complete. That includes layers 1 and 2 and the property service (transient and persistent properties). Plexus relies heavily on its property service (currently backed by FreePastry) and I think it will be useful to other programs as well. It's kind of a persistent broadcast; it's really just for relatively small databases, like if you have to track a few hundred things for the topic, like player locations in a game or account credit, for instance. Plexus uses properties for player presence (nickname, which world they're connected to, etc.) and object information (location, orientation, costume, etc.) If you need to store large amounts of data, it might be appropriate to use the file storage service (coming soon and featuring optional Git integration) or maybe to add a database service (and contributing it back to the project would be nifty, too). DHT message routing is useful for this sort of thing, so you don't necessarily have to replicate all the data to all of the peers if you are storing a large amount.

When a peer successfully joins a topic, the topic owner sends it the current property values. Peers broadcast property settings or deletions to the topic and the topic owner keeps track of the properties for when new members join. This creates a very simple (some might even say simplistic) database for each topic. It's just key-value pairs of strings (like in Java's Properties object).

When you set a property, you specify whether the setting should be persistent. If a peer has storage turned on, persistent properties are stored in an XML file so that they are there next time the peer starts up. It usually only important for topic owners to store persistent properties, because other peers always get the latest properties for a topic when they connect to it (and they may have changed since the peer was last connected anyway).

Wednesday, October 7, 2009

Xus' file storage service and Git

Xus has a 2-pronged approach for integrating its file storage with Git:

the basic file storage will just use ordinary file storage, with an approach similar to Git's, ensuring that Xus' model is "compatable" with Git's
provide a plugin that uses JGit (http://www.eclipse.org/egit/) for people who want to maintain a local Git repository

Each peer can be optionally Git-enabled, since the peers will only interact with a local Git repository. It doesn't use Git's protocol to exchange data, because we want Xus to be able to swarm downloads, but a Git-enabled peer could use Git-clone to seed its cache. Since the file storage protocol doesn't have to interact with Git, Xus just needs #1 to make sure that a peer can fetch everything it needs over the net. Since Git uses SHA-1 hashes as keys for almost every type of object, this shouldn't bloat the protocol too much. Note that when I say "protocol" here, I'm talking about the file storage service protocol, which rides inside of layer 2 messages, not Xus' layer 1 or layer 2 protocols.

The file storage service will break files into chunks and use the DHT to spread the chunks around. This is how we do it right now in Plexus; the directory and chunk models are ours and Plexus uses PAST (Pastry's DHT file storage service) to store and retrieve the objects (chunks, file chunk lists, and directories). Xus DHT file service will be similar to PAST's but not entirely the same. The chunk/chunk list/directory model will be similar to Plexus' but it also needs branches, commits, and tags, in order to use Git.

The Git model has been a milestone on Plexus since last year because I think it's crucial to power an open, collaborative gaming environment. In a completely open environment where anyone can change anything, version history ensures that people can't destroy data; they can only create new versions without some of the data, and signed commits make spoofing very difficult, allow people to rate authors based on their work, etc.

This also has the side effect of making a p2p version of Git :).

I don't think totally reimplementing PAST is a good idea for small clouds, because I believe data copying would kill the network. If the cloud is small, with peers joining and leaving slowly over time but never getting larger than 5, each time a new peer joins, it gets a copy of all of the data in storage. PAST does have caching and it works very well for large clouds where storage is built up slowly over time, but I don't think it works well for small rings. For this reason, I'm planning on only having the topic space masters replicate automatically (hopefully starting with a Git-cloned cache seed), but having "ordinary" peers "fill in the holes" from the topic space master peers on-demand. This will slow down initial requests, but amortize the impact on the network.

In practice, I think JGit integration will probably be the default, but I still want to provide it as a plugin as a courtesy to projects so they don't have to link in yet another third party library if they don't require it. Also, I want non-Git-enabled peers to be able to work with Git-enabled ones seemlessly. At least this is what seems like a good idea at this time :).

This implements a form of swarming download, similar in spirit to bittorrent, but not similar in the way it actually functions; you can simultaneously download chunks of files from different peers and several peers will have the chunks, but it's organized quite differently.

Also :) I'm thinking that it would be a good idea to give each directory (with all of its versions) a separate "view" in Git; i.e. new directories are branched from HEAD in Git (like when you make static web pages for Github: http://pages.github.com/). This will give each directory an independent set of branches, allowing people to store many different directories in a single Git repository without having to check out a ton of stuff just to get the directory you want, AND it also allows you to reuse the files from other directories because they're all in one Git repository.

Sunday, October 4, 2009

Xus: Layers 1 and 2 work now

I have layers 1 and 2 working now, with automated tests (it's on github). Layer 1 is simple packets and layer 2 is p2p messaging. That covers:

strong authentication (challenge/response on connect with RSA keys & signatures)
direct peer-to-peer messages
broadcast messages
unicast messages
dht messages (delegated to the peer with the peer id closest to the dht key)
delegated peer-to-peer messages (one peer sending to another peer through a topic)

Right now, as long as your peer doesn't spoof its public key, it can connect to any topic (authorization is part of layer 3). I think one other thing I should add to layer 1 is a heart beat, so you can quickly detect dropped connections (TCP timeouts are notoriously unreliable).

Layer 3 is the service layer and has these components:

Properties
Topic authorization and management (including peer "accounts")
File sharing
Port configuration testing
Connect-back requests (for when only one peer can accept connections)

A note about topic management: the idea is to use topic space 0 as a "root" topic space for a cluster of spaces, with topic 1 in space 0 handling administrative concerns for the cluster, like id vending for topic spaces and small id vending for peers (so messages don't have to contain a giant peer ids). Topic 0,1 would contain a set of properties about accounts and topics which would also allow banning users from clusters and members-only clusters (allowing private and pay services).

I was thinking back to the very start of my peer-to-peer experiments, so I searched my email archives to see when I started with this. It was April, 2001 when I first started talking p2p architectures over with my friend Fritz. A little while later, I made the p2pmud project on source forge; for project paleontologists, here's a link to the old p2pmud forums.

The project transitioned through several languages and architectures, eventually adopting FreePastry in Plexus so that we didn't have to write and maintain our own peer-to-peer networking layer. So now we have to do that and, ironically, the architecture's pretty close to the one I came up with in 2001, except that it's simpler. In 2001, I had envisioned a set of peers that routed requests to other, natted peers, but now we just have one peer doing the routing for a "topic space" with a bunch of topic spaces clustering to form the entire peer-to-peer grid.

Thursday, September 24, 2009

andThen andThen andThen... Scala, where's my car?

When I saw Dude Where's My Car?, several years ago with my wife (who is Chinese), she started laughing right as they pulled into the parking lot of a Chinese restaurant. I had no idea why, but she said to wait until the end of the scene. In the scene, after they order each item, a voice from the speaker says, "and then?" and continues after they run out of things to order until Ashton Kutcher eventually destroys the speaker in frustration. When the scene was over, I asked her why she started to laugh at the beginning of the scene. She said she had heard people quoting that at work, but she never realized that the Chinese name of the restaurant, which she saw at the beginning of the scene, was "And Then."

Appropriately, Scala defines infinite loops in actors using andThen. Here's how...

In Scala, one way to compose functions is "f andThen g", which is the same as

def newFunc[T](x: T) = g(f(x))

You can think of it as "evaluate f(x) and then evaluate g with that result". It's an easy way to chain functions and it's good for noncommutative operations, where "order matters," like matrix operations (the same type of idea you would use in implementing a safe navigation operator {quickly ducking for cover}). For instance, if you want make a function that takes a matrix, transposes it and inverts it and then returns the determinant, you could write that like this:

def transInv = transpose andThen invert andThen determinant

Those not familiar with functional programming will note the lack of arguments and return type in the definition of transInv. TransInv gets its arg types from transpose and its return type from determinant, so you can still use transInv(myMatrix). The definition of Function1[A, R].andThen is (here, Function1 takes an argument of type A and returns a result of type R):

def andThen[A](g: R => A): T1 => A = { x => g(apply(x)) }

Scala defines infinite loops in actors using andThen:

def loop(body: => Unit): Unit = body andThen loop(body)

So, if want to simulate the scene in Dude, Where's My Car?, using an actor, you can say:

loop {choose}

which translates to:

((_: Unit) => choose) andThen (_ => choose) andThen (_ => choose) andThen...

Monday, September 21, 2009

Xus: A Simple Peer-to-Peer Platform

A friend of mine wanted to implement peer-to-peer source code repositories with social networking features, like popularity ratings and such and Plexus was already on the way to being able to support something like that, so I decided to break out the back end of Plexus into a separate project, called Xus. We started talking about what such a thing could do.

Then the bottom fell out of Plexus.

Apparently, the FreePastry project is winding down. It was, in my opinion, the most promising of the major peer-to-peer platforms. It was straightforward and relatively small. We chose it over JXTA for our communication in Plexus. We actually started with JXTA, back in 2002, but JXTA was just too over the top with the excessive design patterns, the large surface area, the lack of a usability layer, and things of this nature. Why does Sun seem to delight in requiring so much ritual from developers to do the simplest things with their libraries? So Pastry is way cleaner than JXTA. Like JXTA, Pastry was made to scale to 1000s of nodes in a cloud and had interesting routing schemes and other mechanisms to promote scalability and reliability. We liked it a lot. We did experience a few problems:

An improperly configured NATted peer could royally screw up the cloud, because all of the peers take part in routing and connect directly to each other
We had to roll our own presence using our own heartbeats, because Pastry doesn't inform you when a peer disconnects and you couldn't easily enumerate the peers on a broadcast channel (publishing topic)

This is all moot now, because FreePastry is not going to be actively developed.

So I'm announcing the Xus project. Here's the web page and here's the source, so far. There's a VERY simple, console-based chat to test it. I don't have unit tests yet. I've only been working on this for about a week and I'm not in the TDD habit. Maybe I should be. Anyway, because of the FreePastry news, Xus is a peer-to-peer platform and the other stuff will be services on top of it. Here's a snippet from the doc:

Xus is a peer-to-peer protocol with a reference implementation in Scala. The point of Xus is to provide a peer-to-peer platform that is simple, powerful, practical, and easy to use. People can use it for games, social networks, source code repositories, or whatever.

Simple: able to implemented with a relatively small amount of code so that it's easy to maintain and fix
Layered: there are 3 layers. Layering separates the concerns of the protocol and makes it easier to understand
Self contained: the reference implementation does not rely on third party libraries, reducing cognitive overhead for future developers
Scala: reference implementation is is Scala, which allows for smaller code that is "less noisy" than Java
Straightforward implementation: simple, direct routing
Powerful: strong enough to support common p2p tropes and some new ones
Messaging: multicast, unicast, dht, and delegated and direct p2p messaging
Delegated messaging allows peers behind NATs to receive messages from other peers
File storage, backed with Git
Replicated properties for simple shared data
Practical
Medium scale allows you to know when peers disconnect
Works through NATs
Easy to use
Built-in port testing — peers can request other peers to test their port configurations
API is small with a minimum of required setup
Reference implementation will include a UPnP port configuration tool (ported from the Plexus code base)

Xus is centered around "topic spaces," which are groups of topics. A topic space is a slice of the total peer-to-peer network, consisting of a medium amount of peers and each peer can connect to many topic spaces. Probably 100 or fewer peers in a topic space -- depends on the app. All of the peers in a topic space connect to the active owner, which does all of the message routing for the space. So, if a peer wants to broadcast a message to a topic, it sends a message to the active owner of the topic's space and the owner then sends a message to each peer in the space. Any peer can potentially be an active topic space owner if it can accept connections and it is, in fact, an owner of that topic space (Xus uses strong authentication for this kind of thing) . Peers which can't accept connections can still participate; they just can't be active topic space owners.

I use NIO in the reference implementation, so this routing isn't computationally intensive, but astute readers will notice that this broadcasting arrangement takes up a relatively lot of bandwidth for the active topic space owner, compared to everyone else. Hence the "medium amount of peers". By the way, Xus accomplishes broadcasting in a way that mirrors the most common architecture for first person shooter servers, so it CAN work extremely well (see Sauerbraten). There are issues and constraints, of course.

See, this is just the sort of thing that Pastry was trying to avoid by having all of the peers do routing, but it was also the thing that accounted for some of the major complexity in Pastry and all of the problems we had with it. Thus in Xus, broadcasting is limited to medium-sized groups of peers but this allows Xus to be more reliable and faster -- latency is usually much lower with this architecture. Xus is for when you want to build a peer-to-peer app quickly and defer scalability issues to when you realize that your project has, in fact, become wildly popular. At that point, there is nothing preventing you from making levels 1 and 2 of the Xus stack more scalable. You'll even be able to reuse the Xus protocol, probably adding messages and attributes that are required to support routing.

The protocol is XML-based and I'm using fastinfoset (which is built-in to Java) to take the fat off the wire. This allows you to use your favorite XML tools without worrying about how bloated the communication is. I'm using TCP for connections to simplify things. Maybe I'll look into SCTP, but for now, it's just TCP. The protocol is full duplex but does support request-response in cases where messages could fail, like direct p2p messages, for instance.

I'm not entirely sure whether to have broadcasts return a response with a list of the nodes that failed to accept the broadcast for whatever reason (security or lack of resources, perhaps). This sounds like it would be useful for some applications. Right now, Plexus is what's driving the requirements. If other developers see needs for things like this, they are welcome to jump on board the project and help out.

Sunday, September 6, 2009

Can functional programming really offer something substantial to Java?

This post on Scala-lang talks about research done on how the style of language affects productivity (the time it takes to understand other peoples' code). It is specifically comparing Java with Scala. Here is the paper that the post refers to.

Scala allows functional programming, as in Scheme, as opposed to procedural or object-oriented programming, as in Java, but Scala allows a nice blend of the two, which is a unique approach among programming languages. Yes, there was LOOPS and CLOS, but Scala's approach is quite different and I think it's much a more integrated one. There seem to be more and more posts out there about the importance of functional languages. Personally, I agree with the premise: that functional techniques lead to denser code and that can be (if used properly) easier to understand and maintain. Actually, this is related to one thing that I emphatically agree with the eXtreme Programming camp on: don't use accessors unless absolutely necessary (i.e. use refactoring to insert them when you have need of them).

There is a controversy here, though. Every now and then, over the years, I get extreme push-back on my coding style, because some programmers REALLY hate functional style. They act as if it's offensive -- as if people who use functional techniques are snobs and feel they're too good to use the straightforward, meat-and-potatoes approaches that the gumshoe programmer in the street uses. That's not it at all for me -- I use procedural techniques too. Just not all the time; not as a golden hammer.

It IS possible to use functional techniques in Java (using anonymous inner classes), but they are ugly and verbose. Nevertheless, I use them in places where I feel they're appropriate. Some people have complained bitterly about this in the past. That hasn't had an impact on my code, because of my position and the unique constraints and properties of it but in other circumstances I would be forced to write in what I consider a less effective way.

Maybe this attitude will change with the attention that functional languages are getting. I was a Smalltalk programmer in the 80s and 90s, and I remember the Java hype. A lot of Smalltalkers were saying that it would pass because it was just hype. After all, they said, look how inferior Java is with its primitive types, substandard collection library, and skeletal GUI support, compared to Smalltalk's! And anyway, it was just the new new thing. I figured Java was going to eat Smalltalk's lunch. I made noises so that people at our company got trained in Java and I made a technology that connected Smalltalk to Java so that Smalltalk could essentially use Java as a display medium, which I thought might slow Java's impact on Java (as of last year, there was still at least one Smalltalk project using that technology).

Are functional languages just being hyped now as the latest attempt at a silver bullet or do they REALLY have something to offer? I've always thought they were interesting and powerful and fun to program with. As I mentioned before, the functional paradigm has been around a long, LONG time: Church introduced Lambda Calculus in 1932 and Turing introduced his machine in 1936. They were both VERY different approaches to computing and the Turing Machine approach is currently "in vogue."

That may change. What if functional languages are shown to scale better in several dimensions, such as code comprehension and multithreaded programming? I guess we'll see, but it's starting to smell that way to me. I did a lot of programming in both LISP and Smalltalk, back in the day. When I switched to Java, the lack of lambda in JDK 1.0.X was a real pain. When they introduced inner classes in 1.1, things got better, but so far they really don't have the brevity of Smalltalk's blocks.

Groovy and Scala have nice lambdas and Scala really focuses on functional tools. My money's on a change that integrates functional tools into the JVM. Maybe that'll come quickly as more and more deep pocketed groups adopt Scala. That's how Smalltalk got mainstreamed in the 90s. Even if Scala doesn't get super broad adoption, if high tech, high finance, and infrastructure orgs continue to adopt Scala, there will be a nice little Scala niche for a lot of people I know.

Tuesday, September 1, 2009

Livecoding: connecting the dots from Lambda Calculus to Gibson's cowboys

Dot #1: Lambda Calculus...

My wife and I took a vacation recently and I had a chance to learn some more Lambda Calculus; how the Y-combinator works, in particular. At Purdue, we focused on Turing Machines and either we didn't really cover LC or I slept during that part. If you haven't, I HIGHLY recommend learning LC. Lambda Calculus is the most elegant system I, personally, know of (maybe that's not saying much). You functional-types out there will already know this next part, but here's some history and stuff.

It was invented by Alonzo Church, who was Turing's Ph. D. adviser (later they proved that the Lambda Calculus and Universal Turning Machines are equivalent, which was one really impressive feat). The Y-combinator is a way to define recursive functions using only lambda binding. Lambda binding is very simple and it's pretty much what Lambda Calculus is. It more or less just lets you assign values to variables and then use them in an expression, like this (in LISP):

( (lambda (x) (print x)) 50)

(lambda (x) (print x)) is a function definition that says to print whatever x is bound to. Using it like in the above "applies" the function to the argument by first "binding" x to the value 50 and then "evaluating" the expression (print x). Pretty simple concept. The trick is how to define recursive functions using this mechanism, since both the set of variables to be bound and the expression to evaluate are separate. In order to be able to call a function, you either need to pass it in as an argument or have it defined by an "outer" function.

A recursive function can't take itself as its own argument and it's also not defined in an outer scope, so it essentially has to be given a copy of itself as an argument. The Y-combinator is a relatively clean way to do this. I wasn't able to understand it while we were out on our vacation, because I (deliberately) left my computer at home. Once home though, I was able to type things into LISP and play with them to finally be able to grok the combinator (hmm, "grok the combinator" sounds like some sort of euphemism).

Dot #2: Scala...

So, that launched me on a journey to rediscover functional programming, since I hadn't messed with it really since the late 80s, when Roy Riggs and I (TEAM CTHULHU) were working on our MUD. My friend, Serj, reminded me of this functional language for the JVM called Scala and told me that James Strachan wrote in a blog post that if someone had shown him the new Scala book back in 2003, he never would have written Groovy. Groovy has been my favorite scripting language for quite a few years, because it integrates with Java so well (low barrier to adoption for Java coworkers). So, I started messing with Scala. I even made this blog so I could post some musings and info about my Ober project, which I updated from JavaScript to Scala. I recoded a bunch of Groovy stuff for Dataswarm in Scala and that's worked out really well.

Dot #3: Scheme...

I met this guy on a plane to Israel a few months ago. He's in his 50s and he got his masters in education, applying virtual worlds to education. he's not a programmer (yet), but he's been wanting to essentially "do cyberspace" for education since 1996. He got a 17 year old Israeli programmer to work with him and make a bunch of interesting mockups. They tried to commercialize it and whatnot, but never really got anywhere. When I told him about Plexus (which is currently stalled, by the way), he was sort of shocked into action and really got inspired to work on his project again. I asked him that since Plexus was open source anyway, why not build on top of it instead of from scratch?

Eventually, I realized that he really needed a better way to get a feel of what's currently possible, etc. and that would require him to know something of how to program. I looked for a good intro to pure programming, with little between him and the algorithm and that led me to PLT Scheme (http://plt-scheme.org/). There's a GREAT book on programming that basically plows through a few semesters of CS very quickly (How to Design Programs here: http://docs.plt-scheme.org/). The quick intro also does graphics on the command-line, which is pretty slick, letting you define shapes that print out like smileys. This reminded me of "the good old days" in our LISP MUD when we could type in code and change the game as it was running.

Dot #4: testing and log analysis...

I've been doing a lot of log analysis with our technology at work. We're working on making it "bullet proof" now, with the goal of being able to remove the battery from a laptop in a demo, sticking it back in, and starting back up without data loss (after the file system recovers). This requires some heavy journaling and online backup. Test runs can be long and produce giant diagnostic logs (0.5 - 2G). I have to comb through these logs and find errors. Then I have to trim them and transfer them from Kansas to my machine in Israel so I can inspect them more closely. This takes a lot of ad-hoc scripting and the needs change after I fix each bug. This reminds me of what we did with our MUD, because I'm "writing code as I go". One of my hopes with the Ober project is that I can have a nice tool to do this with.

Dot #5: Gibson's cowboys...

I started rereading Neuromancer and it's even MORE impressive than it was the first time I read it. Early in the book, Armitage tells case that he was there when they "invented" Case -- i.e. when they created the software cowboys use for net running. This reminded me of typing code "in-game" with our MUD and I remembered how cool it was to adapt the game as people were playing it. So I was thinking that there ought to be some powerful stuff programmers could do that we might not really be doing yet.

I had told Serj about PLT Scheme. He's early in college and I'm trying to mentor him to be a guru -- I think he has the potential. He started messing with Scheme and reading the book and sent me a link to "fluxus" http://www.pawfal.org/fluxus/ This is a "livecoding" system. People are using this to actually code music and graphics in front of an audience: http://toplap.org/uk/about/ The first vid is fairly poor and what you might fear such a performance would be like (sorry guys who made the vid), but the rest are very interesting, I think. Again, kind of the type of thing that we used to do with our MUD.

Dot #6: livecoding in Scala...

So people are working on creating systems specifically to accomodate changing programs as they run. You could say "that's what a shell is" and you'd be right, to an extent. Shell tools are powerful ways to slice and dice text, but for things like the log analysis I'm doing, I think I need something more like an IDE. Something that gives programmers an environment they can use to modify code and test it in a tight loop, with access to the code, the data, and intermediate results as the program is running, like in the Haskell Hackery performance here http://toplap.org/uk/about/ it's well worth watching. That's more like EMACS or WILY than a shell, I think.

I'm hoping Ober can do something like this for Scala, backed with something like Eclipse, to get syntax highlighting, and whatnot. One idea might be to make Ober an Eclipse plugin instead (or in addition to) a stand-alone environment. It might be interesting to "skin" Eclipse and remove the right-button menus, replacing the behavior with Oberon menus (or 'tags' in Wily -- a single Oberon menu bar above all of the editors might be sufficient, with some commands to open subpanels or something). Perhaps running in the debugger will help or ClassGhost (http://classghost.sourceforge.net/).

One sticky issue is that Java can't redefine arbitrary methods in a running VM -- you can't redefine a method using a different signature as you could in Smalltalk and you can't add or remove fields. A quick check of the new features of JDK7 doesn't show any evidence of change to these limitations. Coding conventions would help with this (kind of like writing "structured COBOL"). As opposed to hotswapping methods, you can just "reload" a class, which defines a NEW class with the SAME name. Production servlet hotswappers are familiar with the perils of this.

Ober's command/namespace registry makes it a system like a shell or Oberon. Indirecting through the registry instead of using hard pointers allows you to reload a class file without worrying about signatures, provided the fundamental command interface didn't change (i.e. the command itself should be the only thing keeping references to its objects). A central storage area where commands could place simple data would allow commands to keep state that was protected from redefinition and they could be initialized with this state.

So here's a simple way to get what I think would be decent support for livecoding with Scala:

run Ober out of the debugger so you can use hotswapping when possible, using Ober to access intermediate results and as a scratchpad for data munging and Eclipse to change code
make the class loading namespace check for hints in the class, like annotations for namespaces and such
provide per-command data storage from the Ober object (a trait can make this even easier to access)
make a Reload command as a fallback for when hotswapping doesn't work
make a log that appends useful Ober commands, such as prefilled Reload commands and access to previous command results and provide easy access to this from the Ober object and a command to show the log
keep commands clean -- don't hold references to other commands' objects; go through the command registery to get them
simplify access to the command registry to make it simple for commands to use it to access each other

Friday, August 28, 2009

Whimsical Jerusalem Pedestrian Crossing Signs

I noticed these in walking around in Jerusalem and I haven't seen a lot about them on the web yet, so I thought I'd post them. The last one's a fake, obviously put up by a concerned citizen who thought that werewolves weren't getting the representation they deserved. There are several more that I haven't photographed, including at least 2 more fakes: a man dragging a baby, and an angel as well as at least two authentic ones, a man with one hand in a fist and the other with fingers spread and a fat man. I'll post those eventually.

Thursday, August 20, 2009

Streamlining Ober

I made some changes to Ober that make it quicker to use. Now, when you right-click a word, Ober does everything in path-like fashion:

if the word is an Ober built-in, execute it
if the word is a class name, run the main with the args to end of line
if the word is a shell command, run it
if the word is a file, directory, or URL, open it

If you middle click, that sets a hint to tell the command that if it is appropriate, open a new viewer instead of replacing the current one.

Also, I made a runnable jar for it all, so you don't have to build it.

Here's the new doc page

Monday, July 27, 2009

The power of text

After spending 4 or 5 years working on servers with decent amounts of data going through them, I've had to do a fair bit of forensic work with saved databases, giant log files, etc. AWK, sed, Groovy, EMACS, and their kin have been my friends. You can use them to distill unstructured text into a structured form for summaries, trending, etc.

I am starting to turn Ober into a console for this type of forensic work. I can see how farsighted Niklaus Wirth was when he made Oberon. This doesn't look to me like a temporary phase in computing. If anything, unstructured text is even more prevalent today than it has been in the past. Google has shown people what you can do with it. We'll see where Ober goes from here.

For the future, I'm planning on adding some more Wily-like features to Ober, like preceding commands with |, <, and >. Here's an update on Ober...

Ober is becoming usable now. Here are the items I checked off of the list from the last post:

execute HOME/.oberrc on startup (if it exists)
Scala expressions embedded in {...}
ctrl-click opens a file. Actually, it tells the namespaces to "surf", so surfing depends on the namespaces your viewer is using
fixed redraw bug

I also added some cosmetic commands, like TrackPos, Width, ViewerPos, and Height.

Using the context-sensitive surfing, I made a "data browser" to examine some object databases I use. It was pretty simple, since I could already dump objects and display the guids of the objects they reference, so when you ctrl-click on a guid, it goes to the viewer for that object (or creates it) where it displays the dump of that object. If you type things or mess up the text and you want to redisplay, you just ctrl-click on the viewer name and it refreshes.

Here is the Ober's current source code and here is Ober's current help page.

Wednesday, July 22, 2009

A tribute to Niklaus Wirth

If you haven't heard of Niklaus Wirth's Oberon or you haven't seen it, you should check it out. Messing with it reminded me of when I first learned Smalltalk in 1988. There's an air of playfulness and potential to it, just like in Smalltalk-80. The current version of Smalltalk is called Squeak and it's well worth checking out as well.

Oberon is sort of a Pascal take on Smalltalk. Like Smalltalk, it's both a language and an environment. Like Smalltalk, it has built-in programming tools (compiler, editor, etc), and it's interactive; you're programming "in" the language, so the edit-compile-run cycle is VERY tight (like when you're using the scala command and you can evaluate expressions). Oberon's environment was similar to Smalltalk's, but had some major differences and innovations, the most valuable one being (in my opinion) that you can click on a word anywhere in the environment and it will attempt to execute it as a command. There are other innovations too, but that's my favorite.

Harnessing this capability, Rob Pike at AT&T based a text editor/shell on Oberon as part of Plan 9, called Acme. Then, Gary Capell made Wily for Unix. Then, in 2002, I made one based on Wily for Ruby, called Meep. And THEN, I rewrote it in Java and renamed it "Ober," to point back to Oberon (and because ober means "waiter" in German). Somewhere along the line, I don't remember whether it was in Meep or just Ober, I added a "namespace" twist, to allow scoped commands, like Oberon has, but identified in the "tag" of the viewer (the text field at the top of the viewer which contains handy commands).

So. Time for another rewrite. Here's Ober-scala now.

It's two eclipse projects, one Java, and one Scala 2.8 (I'm having trouble with Java and Scala in the same project). To run it, just use the "Scala Ober" launcher that's in the scala project. There's a LOT of room for improvement, of course, but it functions, at least. On Linux. I haven't tested it on Windows. Sorry Windows :(. It's not that I think Windows sucks, it's just that I don't use it and it's pretty inconvenient for me to mess with it on my current setup.

Why did I do this? I wanted a "programmer's workbench" that was a combination program runner, simple editor, and shell. You know, like EMACS, but by Scala and for Scala. And the Oberon environment is very cool. When I read that Martin Odersky studied under Niklaus Wirth, I figured, hey, maybe some Scala people would be interested in this thing too. We'll see.

What can it do? If you righth-click a word, it will try to execute it as a command. At this point, there are 3 types of namespaces:

closure namespaces, which look up named closures
class namespaces, which look up classes
system namespaces, which look up shell commands

So, you can test it by right-clicking on some of the words that are already there when you start it (like New, Newcol, Del, Quit, ...). You can type in the name of a class in your class path and right-click on that (like tc.ober.Test, for instance); it will split the rest of the line by spaces and send the strings to the main method of the class. You can type in a shell command and right-click that; it will execute the word plus the rest of the line as a system command (using Runtime's exec(String) method).

What's left?

execute .oberrc on startup
Scala commands embedded in {...}
file handling (crtl-click on a word and it opens as a file). See Wily for this behavior.
meta commands for your .oberrc (commands to define commands, etc.)
classpath manipulation
fix bugs (like when you delete the last viewer in a track, it doesn't redraw)

Happy hacking!

Friday, July 17, 2009

Safe navigaion in Scala, the aftermath

Wow, I sure stirred up a hornet's nest with my safe navigation experiment. Some people posted some very useful information and good alternative approaches!

As for comments about "ugliness", being able to read this in 6 months, etc.; I'm not making a policy decision here, just experimenting. The fact that people are so put off by experimentation makes me wonder... Anyway, we'll see if it comes in handy or not. Things like this can easily be "compiled" by the mind as "programming idioms".

But as for unreadable, it's only a cascade of lambdas, after all. Personally, I've been very comfortable with functional style, since I first learned LISP, freshman year in college. It's amazing to me how emotional some programmers get when they see code using functional style. Does this mean "imperative -> good, functional -> bad?"

I guess there's always been a divide between functional and imperative approaches. I'm glad Church and Turing could see that their different approaches were accomplishing the same thing. I really think more universities should teach functional languages in their freshman level courses, before people have a chance to ingrain imperative as the one true way. My university (Purdue) did not; I just happened to learn it from a friend, because I was interested in what LISP was and how to use it. At the time, I was amazed (and a bit envious) to find out that some universities taught LISP or Scheme to freshmen.

I think LISP and Scheme are excellent first languages, because they abstract computation away from the machine; they don't encourage you to simulate a computer in your head when you write code. Simulating a stack machine can be a great barrier to actually understanding recursion and higher order functions. Freshman and sophomore year, I helped quite a few students learn recursion, but I had to "unteach" the technique of "stack simulation" first, so they actually had a chance of understanding something like A-B recursion or composite recursion.

It's interesting, because I just started teaching a guy to program last week, and I chose to start with Scheme (although eventually he probably needs to learn Java). I found good site with some basic tutorials and a free book on programming that actually teaches recursion the way I taught it freshman year, way back in 1984. It looks like I'm not the only one with those ideas about how to teach recursion.

For those interested, here's the PLaneT Scheme Site and here's the book (linked from their site as well).

Thursday, July 16, 2009

Safe navigaion in Scala, take 2

OK, revisiting this topic after a very short time, I have a new implementation, again using that interesting anonymous class technique from James Iry's comment:

implicit def richOption[T](x : T) = new {
 def ??[B](f : T => B):B = if (x != null) f(x) else null.asInstanceOf[B]
}

I changed the operator from "?!" to "??" because that's a lot easier to type and I chose "??" instead of "?" so people wouldn't confuse it with the questi-colon operator from Java and C. It's stand-alone (doesn't depend on Option or anything) and I had to use a cast to get the types to work out properly, but the test case from my last post does work

println(p3??(_.bud)??(_.bud)??(_.bud)??(_.name.drop(1)))

prints null and

println(p3??(_.bud)??(_.bud)??(_.name.drop(1)))

prints ubba.

For those new to Scala, the expressions in parentheses are anonymous functions because they use "_" as identifiers. (_.bud) expands to (x => x.bud), where Scala chooses an x that is unique in the scope of the expression. Of course, you can put whatever you want in the final block there, or the other blocks. You could say this instead

p3??(_.bud)??(_.bud)??(x => println(x.name.drop(1)))

You might want to do this if you only want to print provided that the full path exists for instance, which demonstrates that this construct is more powerful than Groovy's, because each step can be more than just another increment along the path.

This has an advantage over the previous solutions of not requiring the ! operator at the end to extract the value out of the option because the ?? operator returns a raw value instead of an Option. Not having used Scala for more than a week (I haven't even used LISP since the late 80s), I can't speak as an authority on when it's appropriate to use Options, but this seems to me like an alternative to Option for at least some cases.

Safe navigation operators for Scala

Welcome to my first blog post.

I searched the net for a Scala analog to Groovy's safe navigation, but I didn't find anything, so I made my own. It's like a monad but I haven't thought much about the implications of really making it into a real monad so I haven't monadized it at this point.

Here is some example data to demonstrate how you use these safe navigation operators in Scala:

case class Person(var name: String, var bud: Person = null)

val p1 = new Person("bubba")
val p2 = new Person("fred", p1)
val p3 = new Person("mary", p2)

println(opt(p3)?!(_.bud)?!(_.bud)!(_.name.drop(1)))

This code prints "ubba". It uses ?! like Groovy's ?. operator (?. isn't a legal operator in Scala because of the .), but you need the extra parentheses or it doesn't parse like you would want (I think Scala needs the parentheses to disambiguate the scope for the _). The final ! operator returns the closure value instead of wrapping it with an OptionalRef.

If you go one more step, you'll get () as the return value:

println(opt(p3)?!(_.bud)?!(_.bud)?!(_.bud)!(_.name.drop(1)))

That code prints "()". If you don't like the first opt(Any) and you want to do implicit conversions, you can import OptionalRef.Implicit._

So, it's noisier than Groovy, but a lot less noisy than nested "if" expressions.

Here's the code, modified from James Iry's comment:

def notNull[T](x : T) = if (x == null) None else Some(x)
implicit def richOption[T](x : Option[T]) = new {
  def ?![B](f : T => B) = notNull(f(x get))
  def ![B](f : T => B) = if (x isDefined) f(x get)
}

I like that much better than my old, slightly longer, code :)

object OptionalRef {
 object Implicit {
  implicit def obj2OptionalRef[T <: Object](input: T): OptionalRef[T] = {
    if (input == null) NoRef else Ref(input)
  }
 }
 def opt[T](obj: T) = Ref[T](obj)
 abstract class OptionalRef[+T] {
  def isEmpty: Boolean
  def ?![U](st: (T) => U): OptionalRef[U] = {
   (if (isEmpty) NoRef else st(get)) match {
    case r: OptionalRef[U] => r
    case null => NoRef
    case o: U => Ref(o)
   }
  }
  def !(block: (T) => Any): Any = if (!isEmpty) block(get)
  def get: T
 }
 class Ref[+T](x: T) extends OptionalRef[T] {
  def isEmpty = false
  def get = x
  override def toString = "Ref(" + get + ")"
 }
 object Ref {
  def apply[T](obj: T) = new Ref(obj)
 }
 object NoRef extends OptionalRef[Nothing] {
  def isEmpty = true
  def get = throw new NoSuchElementException("NoRef.value")
  override def toString = "NoRef"
 }
}

This Statement is False