Thursday, January 19, 2017

channels in Common Lisp

preramble


One of the things that Go (and Erlang, among others) got right was using channels as the main mechanism for synchronizing and communicating between threads. This post describes a minimal, portable (as in based on bordeaux-threads) implementation in roughly 100 lines of Common Lisp. I suspect there are plenty of more elaborate implementations out there. The point of this exercise is cutting the idea down to its core; for educational reasons, and because I prefer more focused code than what's generally offered.

fifo queues


A channel needs a fifo queue to buffer items; the most obvious way to get there in Common Lisp is a list that keeps track of its tail.
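A minimal sketch (names are mine): the queue is a list plus a pointer to its last cons, giving constant time push at the tail and pop at the head.

```lisp
(defstruct queue
  (head nil)
  (tail nil))

(defun queue-push (q item)
  ;; Add ITEM at the tail in constant time, by splicing a fresh cons
  ;; onto the cons we remembered last time.
  (let ((cell (list item)))
    (if (queue-tail q)
        (setf (cdr (queue-tail q)) cell)
        (setf (queue-head q) cell))
    (setf (queue-tail q) cell))
  item)

(defun queue-pop (q)
  ;; Remove and return the item at the head.
  (let ((item (pop (queue-head q))))
    (unless (queue-head q)
      (setf (queue-tail q) nil))
    item))

;; (let ((q (make-queue)))
;;   (queue-push q 1)
;;   (queue-push q 2)
;;   (list (queue-pop q) (queue-pop q))) ; => (1 2)
```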


semaphores


The easiest way to keep track of buffer length and block threads on access when needed is to use semaphores. The most popular threading library in Common Lisp, bordeaux-threads, unfortunately lacks semaphores. Luckily, it provides locks and condition variables, which are all we really need.
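A sketch of a counting semaphore built from a bordeaux-threads lock and condition variable; names are mine.

```lisp
;; Requires bordeaux-threads, e.g. (ql:quickload "bordeaux-threads").

(defstruct semaphore
  (count 0)
  (lock (bt:make-lock))
  (condition (bt:make-condition-variable)))

(defun semaphore-signal (sem)
  (bt:with-lock-held ((semaphore-lock sem))
    (incf (semaphore-count sem))
    (bt:condition-notify (semaphore-condition sem))))

(defun semaphore-wait (sem)
  (bt:with-lock-held ((semaphore-lock sem))
    ;; Loop to guard against spurious wakeups; CONDITION-WAIT
    ;; atomically releases and reacquires the lock.
    (loop while (zerop (semaphore-count sem))
          do (bt:condition-wait (semaphore-condition sem)
                                (semaphore-lock sem)))
    (decf (semaphore-count sem))))
```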


channels


A channel is defined as a locked fifo queue that uses two separate semaphores for controlling access. Both semaphores share the same lock as the queue.
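A sketch of the whole thing on top of bordeaux-threads; all names are mine, and the semaphore from the previous section is inlined here, extended to accept a shared lock, so the block stands on its own.

```lisp
;; Requires bordeaux-threads, e.g. (ql:quickload "bordeaux-threads").

(defstruct (sem (:constructor make-sem (count lock)))
  count
  lock
  (condition (bt:make-condition-variable)))

(defun sem-wait (sem)
  ;; Caller must hold the shared lock; blocks until COUNT is positive.
  (loop while (zerop (sem-count sem))
        do (bt:condition-wait (sem-condition sem) (sem-lock sem)))
  (decf (sem-count sem)))

(defun sem-signal (sem)
  ;; Caller must hold the shared lock.
  (incf (sem-count sem))
  (bt:condition-notify (sem-condition sem)))

(defstruct (channel (:constructor %make-channel (lock free used)))
  lock free used
  (head nil) (tail nil))

(defun make-channel (max)
  ;; FREE counts empty buffer slots, USED counts buffered items;
  ;; both semaphores share the queue's lock.
  (let ((lock (bt:make-lock)))
    (%make-channel lock (make-sem max lock) (make-sem 0 lock))))

(defun channel-put (ch item)
  (bt:with-lock-held ((channel-lock ch))
    (sem-wait (channel-free ch))
    (let ((cell (list item)))
      (if (channel-tail ch)
          (setf (cdr (channel-tail ch)) cell)
          (setf (channel-head ch) cell))
      (setf (channel-tail ch) cell))
    (sem-signal (channel-used ch)))
  item)

(defun channel-get (ch)
  (bt:with-lock-held ((channel-lock ch))
    (sem-wait (channel-used ch))
    (let ((item (pop (channel-head ch))))
      (unless (channel-head ch)
        (setf (channel-tail ch) nil))
      (sem-signal (channel-free ch))
      item)))

;; (let ((ch (make-channel 5)))
;;   (bt:make-thread (lambda ()
;;                     (dotimes (i 3) (channel-put ch i))))
;;   (list (channel-get ch) (channel-get ch) (channel-get ch)))
;; => (0 1 2)
```

Blocking falls out for free: waiting on a semaphore's condition variable releases the shared lock, so a blocked reader never prevents a writer from filling the buffer, and vice versa.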


example



You may find a full implementation of these ideas and more here; multi-threaded generators are the best current example of channel use. And feel free to join the ongoing discussion on Hacker News.

peace, out

Sunday, January 15, 2017

coroutine iters in Common Lisp

preramble


This post describes an implementation of iterators in Common Lisp that uses coroutines based on conditions and restarts. Going through the motions one more time brought up memories and pain from past iterative experiences in various languages; it's clearly something we are still in the process of figuring out. The worst anti-pattern I've been forced to deal with so far is Java's Iterables/Iterators/Streams mess, where each API got some parts right and completely forgot about the others. At least having no extensible protocol for iteration is better than having three; we're off to a good start.

simple


First of all, extension and use of iterators has to be trivial; otherwise we'll find more convenient ways around them. This is the reason you'll find most Common Lisp libraries consing lists and/or writing custom DO-macros, even though it's clearly the opposite of bottom up. It's with iterating as with testing: they are stepping stones to the point where real problem solving begins, and they need to be trivial or they will get jumped over.

flexible


Ideally, extending the iterator protocol shouldn't take more than calling into the library on iteration; the less ceremony, coupling and restrictions, the better. And iterators are not allowed to randomly fail above a certain number of items, like consing lists inevitably does.

encapsulated


Besides support for, wait for it, iterating; I prefer as few mentions of iterators as possible in user code. It's low-level plumbing; a platform that allows building more interesting algorithms on top, and I want user code to be about those algorithms.

the holy grail


The holy grail is an extensible iterator with the above properties that doesn't require cutting algorithms into small pieces and spoon-feeding them. Unfortunately, implementing simple, fast and portable coroutines is often tricky business; and Common Lisp is no exception. I briefly entertained the idea of using cl-cont as a foundation, but it lacks the performance and simplicity I'm looking for. Which is when I remembered that Common Lisp supports conditions and restarts; a few macros on top to create the illusion of iterating, and that might actually work. And it does. It's still not quite as fast as I would like, around 10x slower than consing lists; but since conditions are first-class citizens in Common Lisp, their performance is already on enough tables. (Update 21/1 - I managed to triple performance by adding a user-specified batch size to reduce the number of call stack traversals; the details are in the repo). Above all, the implementation is both trivial and portable compared to the available options.

use



extension


The following examples are from cl4l's table and index packages; both are included to better illustrate the amount of ceremony involved in hooking into the iterator protocol. Notice how iterators are completely independent from library code except for calls to '(iter-yield ...)'. The main reason DO-macros are provided is to help with unpacking items returned from the iterator; the rest is delegated to the extension helper macro '(do-iter ...)'. Either macro could deal with unpacking any iterator with the same item encoding.


implementation
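A minimal sketch of the condition/restart trick, with my own names rather than cl4l's actual API: the producer signals each item as a condition, and the consumer handles it and invokes a restart to hand control back.

```lisp
(define-condition iter-yield-condition ()
  ((value :initarg :value :reader iter-value)))

(defun iter-yield (value)
  ;; Signal VALUE and offer a restart the consumer can invoke to
  ;; resume the producer right where it left off.
  (restart-case (signal 'iter-yield-condition :value value)
    (resume-iter ())))

(defmacro do-iter ((var iter) &body body)
  ;; Drive ITER (a thunk that calls ITER-YIELD), binding VAR to each
  ;; yielded value and running BODY before resuming.
  `(handler-bind ((iter-yield-condition
                    (lambda (c)
                      (let ((,var (iter-value c)))
                        ,@body)
                      (invoke-restart 'resume-iter))))
     (funcall ,iter)))

;; (do-iter (x (lambda ()
;;               (dotimes (i 3) (iter-yield i))))
;;   (format t "got ~a~%" x))
;; prints got 0, got 1, got 2
```

Since SIGNAL with an unhandled non-error condition simply returns, a producer run outside DO-ITER just computes silently; the illusion of iterating comes entirely from the handler/restart pair.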


Really, that's it; once you're able to cut to the core of any problem, that's about as complex as it will look. It's all equations in the end.

A full implementation of these ideas and more may be found here.
peace, out

Monday, January 9, 2017

defer in Lisp

The Go designers had good reasons to choose 'defer' as the one true mechanism for executing code on scope exit. The usual Lisp strategy of writing custom WITH-macros breaks down as soon as allocation and/or clean up depend on decisions made in the macro body. Implementing a comparable general purpose mechanism in Lisp is trivial, and well worth the effort. The implementation below adds the capability to defer to named scopes.

usage



implementation
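A sketch of the mechanism; names are mine. Each named scope binds a special variable holding a list of deferred thunks, run in reverse order on exit.

```lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun defer-sym (name)
    ;; Map a scope name to the special variable holding its actions.
    (intern (format nil "*DEFER-~a*" (symbol-name name)))))

(defmacro with-defer (name &body body)
  (let ((actions (defer-sym name)))
    `(let ((,actions nil))
       (declare (special ,actions))
       (unwind-protect (progn ,@body)
         ;; PUSH built the list in reverse, so cleanups run
         ;; last-in, first-out.
         (mapc #'funcall ,actions)))))

(defmacro defer (name &body body)
  (let ((actions (defer-sym name)))
    `(locally (declare (special ,actions))
       (push (lambda () ,@body) ,actions))))

;; Deferring from an inner scope to a named outer one:
;; (with-defer outer
;;   (with-defer inner
;;     (defer outer (format t "outer cleanup~%"))
;;     (defer inner (format t "inner cleanup~%"))
;;     (format t "body~%")))
;; prints body, then inner cleanup, then outer cleanup
```

Using special variables means DEFER works at any call depth below the scope, which is exactly what WITH-macros can't offer once clean up depends on decisions made in the body.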



peace, out

Lispy indexing

I would assume that most coders out there know at least a couple of ways to get around the problem of quickly and flexibly indexing data on composite keys. I know I'm responsible for my share of the maps-of-maps and fixed-format string key karma. It's Greenspun's Tenth Rule for indexing: any project that doesn't have access to general purpose indexes will eventually solve the same problems in other, less rational ways.

general purpose indexes


I've come to define a general purpose index as an ordered set of records with optionally unique composite keys. The Common Lisp implementation described here uses functions to specify keys. Since indexes delegate actual indexing to ordered sets, they support the same features out of the box.
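A sketch of the idea with invented names, using a plain sorted list in place of a real ordered set, and numeric key parts for brevity:

```lisp
(defstruct (index (:constructor make-index (&rest keys)))
  keys          ; functions extracting key parts from a record
  (records nil)) ; records kept sorted on the composite key

(defun index-key (idx rec)
  ;; Build the composite key by applying each key function in turn.
  (mapcar (lambda (fn) (funcall fn rec)) (index-keys idx)))

(defun key< (x y)
  ;; Lexicographic comparison; key parts assumed numeric here.
  (loop for a in x
        for b in y
        do (cond ((< a b) (return t))
                 ((> a b) (return nil)))
        finally (return nil)))

(defun index-add (idx rec)
  ;; Insert REC in key order; a real implementation would delegate to
  ;; an ordered set and optionally enforce key uniqueness.
  (setf (index-records idx)
        (merge 'list (list rec) (index-records idx) #'key<
               :key (lambda (r) (index-key idx r))))
  rec)

;; (let ((idx (make-index #'first #'second)))
;;   (index-add idx '(2 1 :b))
;;   (index-add idx '(1 2 :a))
;;   (index-records idx)) ; => ((1 2 :A) (2 1 :B))
```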


transactions


The other reason many projects develop an SQL dependency is transaction support. As data and access patterns become more complex, manual coordination quickly loses its appeal. Transaction support may sound daunting, but we're not aiming for the clouds. Assuming single-threaded use, automatically tracking added and removed items, and performing the opposite operations on rollback, is trivial to implement and frees mental resources for more interesting tasks.
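A sketch of the rollback mechanism under the stated single-threaded assumption; names are mine.

```lisp
(defvar *trans* nil
  "The current transaction, or NIL outside of one.")

(defstruct trans
  (undo nil))

(defun on-rollback (fn)
  ;; Register an undo thunk with the current transaction, if any.
  (when *trans*
    (push fn (trans-undo *trans*))))

(defmacro with-transaction (&body body)
  ;; On a non-local exit, undo thunks run in reverse order of
  ;; registration, restoring the previous state.
  (let ((committed (gensym)))
    `(let ((*trans* (make-trans))
           (,committed nil))
       (unwind-protect
            (multiple-value-prog1 (progn ,@body)
              (setf ,committed t))
         (unless ,committed
           (mapc #'funcall (trans-undo *trans*)))))))

;; (let ((items nil))
;;   (flet ((add (x)
;;            (push x items)
;;            (on-rollback (lambda () (setf items (remove x items))))))
;;     (ignore-errors
;;       (with-transaction
;;         (add 1)
;;         (add 2)
;;         (error "boom")))
;;     items)) ; => NIL
```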


A final example before we wrap up; this one shows how the API supports non-unique use:


A full implementation of these ideas and more may be found here.

peace, out

Thursday, January 5, 2017

Lisp, L stands for Leverage

There seems to be a lot of confusion among non-Lispers about the reasons why so many otherwise seemingly intelligent coders swear by this funny-looking curiosity from the 70's. And the name keeps popping up; from Emacs, which is primarily coded in its own flavor of Lisp; to JavaScript, which borrows most of its power from Lisp.

leverage


It's about Leverage, period. Lisp was created to make it possible to solve very complex problems, problems so complex that holding back any power from users would have rendered the result useless. The designers of Lisp went all in to provide the best tool set they could imagine for solving the most complex problems they could imagine, and mostly managed to leave egos on the shelf while doing so. The reason it isn't changing much is because it's a master's tool set, forged from an ocean of experience; the types of software people build in Lisp change at least as fast as the rest of the world.

memoization


Memoization is the idea that it might be worthwhile to cache the results of an expensive, potentially parameterized computation. The rest of this post describes an implementation of general purpose memoization in Common Lisp.

context


To begin with, we need somewhere to index memoized results on computation and parameters. A closure would do for simple cases, but in more complex scenarios user code commonly needs more control over the cache. We could give up and pass contexts around explicitly; but in Lisp circles this kind of manual, repetitive work is generally frowned upon. Instead, we will store the context in a special variable. Special variables are global variables done right. Binding one affects the entire call stack originating from the binding, regardless of depth; and nothing else. We provide a default binding to make simple cases trivial.
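A sketch of such a context, assuming a hash table keyed on EQUAL as the cache; names are mine.

```lisp
(defvar *memo* (make-hash-table :test #'equal)
  "Default memoization context; EQUAL test to support composite keys.")

(defmacro with-memo ((&optional (table '(make-hash-table :test #'equal)))
                     &body body)
  ;; Rebind the context for the extent of BODY, including every
  ;; function it calls; the default binding keeps simple cases simple.
  `(let ((*memo* ,table))
     ,@body))
```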


computation


The next question is how to model parameterized computations. The obvious answer is to use functions, and we will provide a functional interface either way; but we can generalize further with a macro. This allows specifying a condition and handing flow control back to user code.

Given the macro, implementing a functional interface on top is effortless:
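A sketch of both layers, with invented names; the context variable from the previous section is repeated so the block stands on its own.

```lisp
(defvar *memo* (make-hash-table :test #'equal))

(defmacro do-memo ((&rest args) &body body)
  ;; Cache BODY's result in *MEMO*, keyed on the literal body plus the
  ;; runtime values of ARGS.
  (let ((key (gensym)) (val (gensym)) (found? (gensym)))
    `(let ((,key (list ',body ,@args)))
       (multiple-value-bind (,val ,found?)
           (gethash ,key *memo*)
         (if ,found?
             ,val
             (setf (gethash ,key *memo*) (progn ,@body)))))))

(defun memoize (fn)
  ;; The functional interface on top of the macro; FN itself is part
  ;; of the key, so different functions never collide.
  (lambda (&rest args)
    (do-memo (fn args)
      (apply fn args))))

;; (let* ((calls 0)
;;        (add (memoize (lambda (x y) (incf calls) (+ x y)))))
;;   (funcall add 1 2)
;;   (funcall add 1 2)
;;   calls) ; => 1
```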


usage



A full implementation of these ideas and more may be found here.

peace, out

Sunday, January 1, 2017

on tests

I prefer my tests simple and flexible, designed around the application rather than imposed on it. These days, I usually start with basic functions and assertions; and add what I need when I need it. The holy grail for me would be a reusable library that allows the same progressive approach without getting in the way.

The flexibility I'm looking for is the freedom to structure my tests any way I feel like, to change my mind as many times as I have to without rewriting everything; and the capability to include/exclude tests in a run without touching the code.

On the subject of simplicity, besides being a requirement for flexibility, benchmarking is worth mentioning. I find that most test frameworks either completely skip the subject, or make too big a deal out of it. I want the option to warm up caches, repeat tests and print a table of run times without bending over backwards.

I've found that implementing tests as regular functions helps fulfill the goals stated above. And I've grown fond of the idea of reusing regular assertions for signaling errors. In the end, all assertions are tests; which means that you want to deal with them the same way in any given scenario, embedded or external makes no difference. Common sense recognizes the fact that organizing something as volatile as tests into rigid hierarchies is futile; tagging tests with keywords allows a more dynamic, gradual approach.

Each test in libc4life is implemented as a regular function that contains a set of tags in its definition. Tests can be grouped into suites, further modularized using sub tests to specify additional tags, and called directly from within other tests. When running a suite of tests, two sets of tags may be specified: one containing tags to include, and one containing tags to skip. Tests matching tags in the first set are included, unless they also match tags in the second set.
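The selection rule is simple enough to sketch in Common Lisp (libc4life itself is C; names here are mine):

```lisp
(defun run-test-p (test-tags include exclude)
  ;; A test runs when it matches a tag in INCLUDE and none in EXCLUDE.
  (and (intersection test-tags include)
       (not (intersection test-tags exclude))
       t))

;; (run-test-p '(:lists :fast) '(:lists) '(:slow)) ; => T
;; (run-test-p '(:lists :slow) '(:lists) '(:slow)) ; => NIL
```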

Each test run includes a specified number of warmups and repetitions, and prints a formatted table of test names and run times to stdout. Assertion errors are first class and can be dealt with like any other error, and defining NDEBUG allows test code to control if failures should terminate or be reported at the end of the run.


peace, out

Saturday, December 31, 2016

set, match, meta operator

The common set operations; difference, join/intersect and union; may all be efficiently implemented on top of a meta operator that I've come to call 'match'. It does what it says: finds the next matching element, given two sets. The following examples are from libc4life's binary sets, but the idea is universally applicable.

the match operator



set operations


Once match is given, the actual operations read like poetry. For brevity, only join/intersect is included here; remaining operations vary only in the way they deal with matches/gaps.
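For reference in Common Lisp rather than libc4life's C, a self-contained sketch of match and a join/intersect built on it; names are mine, and element comparison is fixed to < for brevity.

```lisp
(defun bset-match (xs ys)
  ;; Walk two sorted lists, returning their tails at the next matching
  ;; element, or NIL when either side runs out.
  (loop while (and xs ys)
        do (cond ((< (first xs) (first ys)) (pop xs))
                 ((< (first ys) (first xs)) (pop ys))
                 (t (return (values xs ys))))))

(defun bset-isect (xs ys)
  ;; Join/intersect in terms of the match operator: collect every
  ;; match, then step past it on both sides.
  (let (result)
    (loop (multiple-value-setq (xs ys) (bset-match xs ys))
          (unless xs (return (nreverse result)))
          (push (pop xs) result)
          (pop ys))))

;; (bset-isect '(1 2 3 5) '(2 3 4 5)) ; => (2 3 5)
```

Difference and union follow the same shape; they only differ in whether the skipped-over gaps or the matches end up in the result.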


Besides simplifying the implementation, exposing the underlying mechanism like this also gives user code a second chance for optimized access when provided operations don't fit the problem.

peace, out