I just dropped a new repository on GitHub about this -- just wanted to put a pointer here.
Friday, October 1, 2021
Thursday, December 24, 2015
Transactional memory for operational transformation in Leisure
It's been a very long time since my last post but we haven't been idle. We've been working on Leisure, which grew out of Lazp. Leisure has become a collaborative, dynamic, polyglot, computing environment, but more about that in future posts. If you're curious about Leisure, you can find the repository here, although we haven't formally announced it, yet. It's functional but buggy.
For now, here's a short note about how Leisure manages collaborative data manipulation. Leisure's collaborative documents contain a combination of content, code, and data, all represented by org-format markup in plain text. Data in Leisure documents resides in markup-delimited regions of text, such as org-mode source blocks containing yaml data and code in a Leisure document can change that data. This means that, beyond just normal collaboration, Leisure requires an exclusion mechanism so that data can stay consistent when several users access and change it.
Currently available OT (operational transformation) libraries and services work fine for eventually consistent collaborative document editing but these systems (Google Realtime API, TogetherJS, operational-transform.js, sharejs) don't work so well for collaborative applications that use the document for their state because they don't support any kind of locking (pessimistic or optimistic).
Leisure adds "guarded operations" to OT for collaborative data management. Guarded operations add a "compare and set" operation to OT which Leisure uses to implement a sort of STM (software transactional memory) where the memory arena is the entire text document. A guarded operation pairs an OT operation with a collection of text ranges. A guarded operation will return failure to the sender if any of the ranges were altered before the operation could be applied, otherwise it returns success.
Leisure's STM transaction facility takes a transaction function which returns a set of guarded replacements and returns a promise. To attempt transaction, Leisure evaluates the function, combines he returned replacements into a guarded operation and sends it to the OT server. Leisure will reattempt the transaction until it succeeds, after which it will resolve the promise. This is analogous to the way STM works (see this Wikipedia article).
Using this facility in Leisure, developers can safely and easily write multi-user programs that collaborate on document data. This is the standard way to access and alter data in Leisure so the use of STM becomes transparent to the developer. As with STM, developers don't have to worry about data concurrency so much as just which code to group into which transactions.
For now, here's a short note about how Leisure manages collaborative data manipulation. Leisure's collaborative documents contain a combination of content, code, and data, all represented by org-format markup in plain text. Data in Leisure documents resides in markup-delimited regions of text, such as org-mode source blocks containing yaml data and code in a Leisure document can change that data. This means that, beyond just normal collaboration, Leisure requires an exclusion mechanism so that data can stay consistent when several users access and change it.
Currently available OT (operational transformation) libraries and services work fine for eventually consistent collaborative document editing but these systems (Google Realtime API, TogetherJS, operational-transform.js, sharejs) don't work so well for collaborative applications that use the document for their state because they don't support any kind of locking (pessimistic or optimistic).
Leisure adds "guarded operations" to OT for collaborative data management. Guarded operations add a "compare and set" operation to OT which Leisure uses to implement a sort of STM (software transactional memory) where the memory arena is the entire text document. A guarded operation pairs an OT operation with a collection of text ranges. A guarded operation will return failure to the sender if any of the ranges were altered before the operation could be applied, otherwise it returns success.
Leisure's STM transaction facility takes a transaction function which returns a set of guarded replacements and returns a promise. To attempt transaction, Leisure evaluates the function, combines he returned replacements into a guarded operation and sends it to the OT server. Leisure will reattempt the transaction until it succeeds, after which it will resolve the promise. This is analogous to the way STM works (see this Wikipedia article).
Using this facility in Leisure, developers can safely and easily write multi-user programs that collaborate on document data. This is the standard way to access and alter data in Leisure so the use of STM becomes transparent to the developer. As with STM, developers don't have to worry about data concurrency so much as just which code to group into which transactions.
Monday, August 25, 2014
DOMCursor, a tool for filtered DOM tree cursoring
While working on Leisure, I searched for tools to traverse text in DOM trees, but most of the ones I found were fairly limited. One nice one is Rangy (https://github.com/timdown/rangy), which has a TextRange that can traverse by characters and words with rules for skipping over invisible text, collapsing contiguous whitespace, etc. Leisure, however is what you might call an “ultra-rich text” environment and needs more power. This isn't my announcement of Leisure, by the way -- it's got a bit farther to go for that -- so consider this a teaser :).
Leisure documents are orgmode files and the environment has a couple ways to present them, some of which sprinkle controls and views in among the editable text. During the design, I decided to use the contenteditable attribute and use the text in among the sprinkled controls and views as the actual document text, as opposed to emulating document editing, like code mirror does. Over the years, I experimented with different approaches, such as using shadow DOM, which I eventually abandoned, because of complications and performance issues with importing large CSS files into many shadow trees.
After dumping the shadow DOM approach, I moved those controls and views from the shadow back into the light, marking them with a data attribute (data-nonorg), so I could skip over them. I had a bunch of functions to help with this, so I collected them into a class called DOMCursor, to make it easier to debug maintain.
There are several web-based editors out there, but not much in the way of tools to help people who are bulding their own and I’ve seen posts asking for tools like this. DOMCursor is already open source and part of the Leisure project, but I decided to release it as a separate project, without any Leisure dependencies. And I even went to the trouble to document it!
Here’s the project and here are some details from the source file (which doubles as the readme):
This readme file is also the code.
DOMCursors are immutable – operations on them return new DOMCursers.
There are two ways to get mutabile cursors, sending @mutable() or
sending @withMutations (m)-> …
A DOMCursor has a node, a position, a filter, and a type.
Leisure documents are orgmode files and the environment has a couple ways to present them, some of which sprinkle controls and views in among the editable text. During the design, I decided to use the contenteditable attribute and use the text in among the sprinkled controls and views as the actual document text, as opposed to emulating document editing, like code mirror does. Over the years, I experimented with different approaches, such as using shadow DOM, which I eventually abandoned, because of complications and performance issues with importing large CSS files into many shadow trees.
After dumping the shadow DOM approach, I moved those controls and views from the shadow back into the light, marking them with a data attribute (data-nonorg), so I could skip over them. I had a bunch of functions to help with this, so I collected them into a class called DOMCursor, to make it easier to debug maintain.
There are several web-based editors out there, but not much in the way of tools to help people who are bulding their own and I’ve seen posts asking for tools like this. DOMCursor is already open source and part of the Leisure project, but I decided to release it as a separate project, without any Leisure dependencies. And I even went to the trouble to document it!
Here’s the project and here are some details from the source file (which doubles as the readme):
DOMCursor
Filtered cursoring on DOM trees. DOMCursors can move forwards or backwards, by node or by character, with settable filters that can seamlessly skip over parts of the DOM.This readme file is also the code.
DOMCursors are immutable – operations on them return new DOMCursers.
There are two ways to get mutabile cursors, sending @mutable() or
sending @withMutations (m)-> …
A DOMCursor has a node, a position, a filter, and a type.
- node: like with ranges, a DOM node
- position: like with ranges, either the index of a child, for elements, or the index of a character, for text nodes.
- filter: a function used by @next() and @prev() to skip over portions of DOM. It returns
- truthy: to accept a node but its children are still filtered
- falsey: to reject a node but its children are still filtered
- ‘skip’: to skip a node and its children
- ‘quit’: to end to make @next() or @prev() return an empty DOMCursor
- type: ‘empty’, ‘text’, or ‘element’
Labels:
dom,
javascript,
leisure,
programming
Wednesday, May 15, 2013
Trying out a Coding Dojo
We had our first Coding Dojo session, today. I wanted to start a "Programming Dojo" and I found that some people had already been doing that for years, so I decided to use their wisdom :). Here's a great video, linked from the same site. The guy making this gestures with the camera a lot. If you get easily motion-sick, maybe you should just listen to it.
The idea is to have 2 people pair programming in front of a group, with the group kibitzing. Techniques are discussed only in the context of code and programming is the path to learning in the coding dojo. I'm introducing people, here, to functional programming concepts. Here is the "kata" we used, today:
Kata: Person
Shortly, we'll be moving on to lists and this kata will form a foundation for that. This "person" isn't too much different from a cons-cell.
We're doing this at lunch, right now. If it works out, I'll move it to evenings and invite people.
The idea is to have 2 people pair programming in front of a group, with the group kibitzing. Techniques are discussed only in the context of code and programming is the path to learning in the coding dojo. I'm introducing people, here, to functional programming concepts. Here is the "kata" we used, today:
Kata: Person
Learning about "selector functions"
1. Write a "person" function, person(name, address, gender):
2. Write person() without using comparisons or control structures
- it returns a "person"; something which can be used to retrieve the values
- you man not create a new array, object, or string
- you may use comparisons and control structures (if, for, while, switch, etc.)
Shortly, we'll be moving on to lists and this kata will form a foundation for that. This "person" isn't too much different from a cons-cell.
We're doing this at lunch, right now. If it works out, I'll move it to evenings and invite people.
Saturday, March 24, 2012
Lazp: an untyped, lazy, functional language with support for runtime metaprogramming
So far, the only untyped, lazy language I've found that's being maintained is Lazy Racket. I've been interested in Lambda Calculus for a while, though, and LISP/Scheme aren't quite what I want (and they're quite a bit larger, too), so I decided to extend the Lambda Calculator into a standalone language. Here's the repository. The goal is to make it native, but for now, it's still in JavaScript, although it will run "standalone" in node.js (and also in browsers). Here are the language features I'm interested in:
The code's in Coffeescript and the parser and code generator is all in one file with no dependencies, called Lazp.cs (about 375 lines of code).
I'm looking for help on this, so feel free to leave a comment if you're interested and feel free to fork it in Github; it's licensed with Zlib.
- lazy, so no side effects, because they can really mess up a lazy language
- untyped; Haskell is already there for typed, lazy languages
- metaprogramming; the eval function uses ASTs that are LC functions
- We'll rewrite Lazp's parser and code generators in Lazp, so it's self-hosting
- I "hand assembled" the AST functions, so they're in Lazp, already
- This means metaprogramming in Lazp can happen at parse time and also at runtime (hey, just like in FORTH!)
- _lit v -- literal value
- _ref v -- variable reference: v should be a string
- _lambda var body -- lambda binding: var should be a string or number, body is an AST
- _apply func arg -- function application: func can be any AST function except a _prim
- _prim arg rest -- primitive call: rest is either a _lit, a _ref, or another _prim (which allows more args)
- _name name ast -- name a type
- _lit x = λf . f x
- _ref x = λf . f x
- _lambda v f = λg . g v f
- _apply func arg = λf . f func arg
- _prim arg rest = λf . f arg rest
- _name nm ast = λf . f nm ast
So, what's already working? I had a big advantage, here, because I could use the Lambda Calculator as a prototype. So, right now, this returns "hello":
eval (_apply (_lambda x (_ref x)) (_lit hello))This calls eval with an AST that applies the identity function to "hello".
The code's in Coffeescript and the parser and code generator is all in one file with no dependencies, called Lazp.cs (about 375 lines of code).
I'm looking for help on this, so feel free to leave a comment if you're interested and feel free to fork it in Github; it's licensed with Zlib.
Wednesday, January 4, 2012
I added an intro, a summary, and a few more ideas to the Code Google post. It needs a better name that doesn't just mean "search" to people and also doesn't include trademarked words :).
Tuesday, December 27, 2011
Code Google
UPDATE: Added structure and a few more ideas...
Intro
Have you ever seen someone include a giant open source library in a project, just to access some small part of its functionality? It's very important to have someone else write, debug, and maintain a large part of your code, but when you do this over and over, it can add quite a lot of bloat to an app for comparatively little gain in functionality.
This is a huge problem, as I see it, but maybe there is a solution?
What about allowing people to integrate only small subsets of modules into their code. I'm calling this idea "code google," for lack of a better term, although it's a LOT more than just search. It involves search, analysis, IDE integration, and social networking, so that library developers can track what parts of their code are actually being used. It's far from trivial to implement, but I think it could be very useful
Before our company started, a friend and I wrote a fair-sized game in LISP (a couple tens of thousands of lines) and later versions used OO languages (both class-based and prototype-based). My friend and I seem both to be coming to the conclusion that what is happening in the Java world is not a good thing. A lot of modern programming seems actually to be just gluing together libraries and frameworks. Many times, frameworks are chosen because of a few of their features and glued onto other frameworks similarly chosen.
Module systems concentrate on versioning in the large -- each framework is versioned as a whole, despite the fact that so many people (apparently) pick a small set of features and use that. Programmers achieve modularity by marking off a segment of code and declaring it to be a module. APIs are defined explicitly by developers, but I, on the other hand, often need to use implicit APIs that the developers didn't consider to be "first class," when they wrote their code.
Code Search
What I think is needed, in addition to explicit module declaration, is a "code Google" that searches the vast sea of code and finds a small subset that does what I need at the time. Curiously enough, this already exists, in part, for Haskell programmers, with Hoogle, that lets you search for functions based on their type signatures. It might be hard to imagine how this could possibly work in Java, because you won't necessarily know the name of the actual type you are looking for, but it becomes more practical as Java gets closer to LISP (or Haskell), because it becomes easier to express generic behavior without knowing as many explicit names. A "code Google" for Java could allow you to inline a method signature without knowing the name of the SAM type that the code needs. This could be done before Java actually has a lambda syntax.
Closures will go a long way toward making Java simpler to use, since so many design patterns are trivial with them -- I made a command-driven framework in Java 1.0.4, before there were even anonymous inner classes and had to make a separate class file for each of several hundred commands. You can look at how much much easier AWT became after the introduction of anonymous inner classes for an example of this detangling; you don't have to subclass Button, anymore, when you make a GUI, you just make an anonymous event handler inner class. In Java 8 (or later?), each of these 1.0.4 files would just be a lambda expression -- same anonymous inner class mechanism as in Java 6, but it's a LOT easier on the eyes.
Code Snippet Retrieval
In addition to search, Code Google should be able to retrieve the dependencies of the code snippet you need and pack it all up for you in a library jar with source included. Hoogle does not do anything like this -- it just gets you to the module version. Again, as Java gets closer to LISP, dependencies won't need to be as tangled up because as closures become easier to use and more prevalent in code, it becomes easier to write code that stands on its own (see the AWT evolution for examples). It's already possible to do this in Java, by using anonymous inner classes, but these are verbose and a lot of people don't like to use them (except when they write event handlers :) ).
Social Networking and IDE integration
Once you have the snippet, all jarred up, there should be a way to track the search query and the version of the module it came from, so you can automatically update your code. When someone downloads a snippet, there should be a social networking tie-in with tools that the developers use so that they can see an annotation in their code to let them know that people are using that particular API, so that the the developer doesn't unintentionally change or remove that implicit API. If the search does fail because the code is no longer in a module, someone can still spawn another open source module to support it. The social networking support should also indicate how many people are using a piece of functionality and allow people to attach comments to the code so the developers and other users can see them.
As an alternative to IDE integration, there should also be command line and web-based tools for your project. Mods to Gitweb and Fossil, maybe?
Summary
The Eclipse Code Recommenders plugin looks like it's trying to do the search portion, at least.
Intro
Have you ever seen someone include a giant open source library in a project, just to access some small part of its functionality? It's very important to have someone else write, debug, and maintain a large part of your code, but when you do this over and over, it can add quite a lot of bloat to an app for comparatively little gain in functionality.
This is a huge problem, as I see it, but maybe there is a solution?
What about allowing people to integrate only small subsets of modules into their code. I'm calling this idea "code google," for lack of a better term, although it's a LOT more than just search. It involves search, analysis, IDE integration, and social networking, so that library developers can track what parts of their code are actually being used. It's far from trivial to implement, but I think it could be very useful
Before our company started, a friend and I wrote a fair-sized game in LISP (a couple tens of thousands of lines) and later versions used OO languages (both class-based and prototype-based). My friend and I seem both to be coming to the conclusion that what is happening in the Java world is not a good thing. A lot of modern programming seems actually to be just gluing together libraries and frameworks. Many times, frameworks are chosen because of a few of their features and glued onto other frameworks similarly chosen.
Module systems concentrate on versioning in the large -- each framework is versioned as a whole, despite the fact that so many people (apparently) pick a small set of features and use that. Programmers achieve modularity by marking off a segment of code and declaring it to be a module. APIs are defined explicitly by developers, but I, on the other hand, often need to use implicit APIs that the developers didn't consider to be "first class," when they wrote their code.
Code Search
What I think is needed, in addition to explicit module declaration, is a "code Google" that searches the vast sea of code and finds a small subset that does what I need at the time. Curiously enough, this already exists, in part, for Haskell programmers, with Hoogle, that lets you search for functions based on their type signatures. It might be hard to imagine how this could possibly work in Java, because you won't necessarily know the name of the actual type you are looking for, but it becomes more practical as Java gets closer to LISP (or Haskell), because it becomes easier to express generic behavior without knowing as many explicit names. A "code Google" for Java could allow you to inline a method signature without knowing the name of the SAM type that the code needs. This could be done before Java actually has a lambda syntax.
Closures will go a long way toward making Java simpler to use, since so many design patterns are trivial with them -- I made a command-driven framework in Java 1.0.4, before there were even anonymous inner classes and had to make a separate class file for each of several hundred commands. You can look at how much much easier AWT became after the introduction of anonymous inner classes for an example of this detangling; you don't have to subclass Button, anymore, when you make a GUI, you just make an anonymous event handler inner class. In Java 8 (or later?), each of these 1.0.4 files would just be a lambda expression -- same anonymous inner class mechanism as in Java 6, but it's a LOT easier on the eyes.
Code Snippet Retrieval
In addition to search, Code Google should be able to retrieve the dependencies of the code snippet you need and pack it all up for you in a library jar with source included. Hoogle does not do anything like this -- it just gets you to the module version. Again, as Java gets closer to LISP, dependencies won't need to be as tangled up because as closures become easier to use and more prevalent in code, it becomes easier to write code that stands on its own (see the AWT evolution for examples). It's already possible to do this in Java, by using anonymous inner classes, but these are verbose and a lot of people don't like to use them (except when they write event handlers :) ).
Social Networking and IDE integration
Once you have the snippet, all jarred up, there should be a way to track the search query and the version of the module it came from, so you can automatically update your code. When someone downloads a snippet, there should be a social networking tie-in with tools that the developers use so that they can see an annotation in their code to let them know that people are using that particular API, so that the the developer doesn't unintentionally change or remove that implicit API. If the search does fail because the code is no longer in a module, someone can still spawn another open source module to support it. The social networking support should also indicate how many people are using a piece of functionality and allow people to attach comments to the code so the developers and other users can see them.
As an alternative to IDE integration, there should also be command line and web-based tools for your project. Mods to Gitweb and Fossil, maybe?
Summary
- Code Search indexes vast amounts of code and lets people search for what they need by specifying type signatures in addition to just unstructured text
- Code Snippet Retrieval incorporates a snippet and its dependencies into your project's codebase and includes metadata that describes where the snippet came from (the module, version number, site, etc.)
- IDE Integration puts this information at your fingertips so users can be informed when new versions are available and developers can be informed which pieces of their code that people are finding important and how many people are using the code
- Social Networking lets people comment and collaborate with the developers to keep them connected to the users of their code
The Eclipse Code Recommenders plugin looks like it's trying to do the search portion, at least.
Subscribe to:
Posts (Atom)