Karl Hans Janke Kollaborativ
"GERMANIA 'Fabrikat'"

Closed algebraic data types for Ruby

Today, a little detour off the path to world domination. Or maybe not; who knows.

I use Ruby at work and I guess it's pointless to deny that I regularly cringe at it for its rude failure at being just like Haskell. Even so, I have to admit that it is rather flexible.

Algebraic data types in Haskell look like this:

data Expr = Lam String Expr
          | App Expr Expr
          | Var String

It says that there are three kinds of expressions, called lambdas, applications and variables, consisting of the given data fields (e.g. a variable name and a body for lambdas, and so on). Operations on such a type are defined by giving cases for each kind of argument.

Off the shelf, there are no algebraic data types in Ruby. Google didn't find anything, so I thought about it. With the right plumbing, it's possible to do the following:

class Expr
    ctor :lam, :var => String, :body => Expr
    ctor :app, :op => Expr, :arg => Expr
    ctor :var, :name => String
end

Not much longer and only slightly uglier! \o/ And with type checks, too.

The result is a class Expr with three singleton constructor methods, similar to the usual new. Each takes a single hash as its argument that assigns values to the relevant fields. All fields must be provided and their types must match the definition. For example:

id = Expr.lam(:var=>"x", :body=>Expr.var(:name=>"x"))   # (\x.x)

The ctor method defines accessors to the constructor tag and each field. A typical method definition on Exprs would look like this:

class Expr
    def eval(env={})
        if is_lam? then
            self
        elsif is_app? then
            f = op.eval(env)
            if f.is_lam?
                x = arg.eval(env)
                f.body.subst(f.var,x).eval
            else
                raise "operator is not a function"
            end
        elsif is_var? then
            env[name] || self
        end
    end
end

Obviously, this is in contrast to the object-oriented way of distributing operations on different kinds of things over respective (sub-) class definitions. Aside from being a matter of taste, I prefer the above in this case, because the definitions of evaluation for different kinds of expressions have to fit together. Their interaction is easier to view when the definition is in one place instead of scattered across multiple classes.

On to the plumbing: everything happens in ctor, which is an instance method of Class, so every class can call it inside its own definition (similar to convenience methods like attr_reader and such). The routine looks like this:

class Class
    def ctor(ctorname, argtypes)
        # add singleton constructor method to self
        self_ = (class << self; self; end)
        self_.module_eval do
            define_method(ctorname) do |argvalues|
                x = allocate

                # check argument types
                # [...]

                # fill instance variables of x
                x.instance_eval do
                    @ctor = ctorname
                    @args = argvalues
                end

                x
            end # constructor
        end # singleton methods of our ADT

        # inspection methods
        attr_reader :ctor
        attr_reader :args

        # define accessors and convenience methods on self
        # [...]
    end
end
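
For illustration, here is a minimal, self-contained sketch of how the two elided parts (the argument checks and the accessors) could be filled in. It is not the code from adt.rb: the exception classes, the error messages and the use of define_singleton_method are assumptions made for this example.

```ruby
# Hypothetical stand-alone variant of ctor; not the code from adt.rb.
class Class
  def ctor(ctorname, argtypes)
    # singleton constructor, e.g. Expr.lam(:var => "x", :body => ...)
    define_singleton_method(ctorname) do |argvalues|
      # all declared fields must be present, and no others
      missing = argtypes.keys - argvalues.keys
      extra   = argvalues.keys - argtypes.keys
      raise ArgumentError, "#{ctorname}: missing #{missing.inspect}" unless missing.empty?
      raise ArgumentError, "#{ctorname}: unknown #{extra.inspect}" unless extra.empty?

      # every value must be an instance of its declared type
      argtypes.each do |field, type|
        value = argvalues[field]
        unless value.is_a?(type)
          raise TypeError, "#{ctorname}.#{field}: expected #{type}, got #{value.class}"
        end
      end

      x = allocate
      x.instance_variable_set(:@ctor, ctorname)
      x.instance_variable_set(:@args, argvalues.dup)
      x
    end

    # inspection methods
    attr_reader :ctor, :args

    # predicate such as is_lam?
    define_method("is_#{ctorname}?") { @ctor == ctorname }

    # field accessors such as body or name; they fail on the wrong variant
    argtypes.each_key do |field|
      define_method(field) do
        raise NoMethodError, "#{@ctor} has no field #{field}" unless @args.key?(field)
        @args[field]
      end
    end
  end
end

class Expr
  ctor :lam, :var => String, :body => Expr
  ctor :app, :op => Expr, :arg => Expr
  ctor :var, :name => String
end

id = Expr.lam(:var => "x", :body => Expr.var(:name => "x"))   # (\x.x)
```

With this sketch, Expr.var(:name => 42) raises a TypeError and Expr.app(:op => id) raises an ArgumentError because :arg is missing.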

The full source code along with a complete lambda evaluator is in the file attached below. The evaluator also supports destructive in-place updates, effectively yielding graph reduction.

Note: The copyright notice states the ISC license, meaning do whatever you want as long as you leave the notice intact.

Attachment: adt.rb

PS. Happy Birthday, you know who you are.

Do you have a MUA ready to follow mailto: links?

stift-uhr.klein.jpg

Just wondering, because if you do, you can now leave comments on posts on this site. Of course, my solution is unusual and specifically excludes HTML forms, CGI scripts or any other webcrap. Honestly though, I think it's kind of elegant like this:

Anyway, with this yak now out of the way, back to important work. I've got a few crazy ideas on which I'd very much appreciate feedback. Stay tuned.

telnetd on

kabelstraenge.klein.jpg
random photo: elaborately wired contraption (Technikmuseum Berlin)

I bought a Fritz!Box 7050 for cheap on eBay the other day. It's a home router with built-in DSL modem, WLAN, and a POTS interface. The reason I got it was that I'd like to ditch my aging, bulky, and increasingly flaky ISDN phone in favour of using my mobile as a VOIP handset for the Fritz!Box.

The device itself works very well, including my old phone, WLAN, and ADSL2+. Unfortunately, as it turns out, the 7050 doesn't support VOIP handsets in the stock firmware. Google says, however, that you can actually run full-blown Asterisk on it. Now, the reason for this post is my amusement over the method to get a shell on the thing. Connect a phone, dial #96*7*, et voila:

telnetd.klein.jpg
Well, thanks! :)

A lot of frustrating fiddling later, however, I've decided to give up for today. I got Asterisk to run, and dialing out from the PC via X-Lite actually worked, but SIP registration fails on my Nokia E51 for unknown reasons.

For later reference, here's the list of the relevant links I found:

Semiautomating Accountancy for Fun and Profit

z23.klein.jpg
Zuse Z23 @ Technikmuseum Berlin: definitely a semiautomatic accountant!

Is anybody out there using ledger or one of its siblings for their personal accounting? If not, take this as a recommendation. It's a command-line tool to generate various financial reports from a plain text listing of account transactions. If you happen to have access to your bank transactions in csv format, the script I wrote yesterday may be useful to you. It reads comma-separated values from stdin and writes ledger entries to stdout.

If you're German, your likely way to get csv files from your bank is via HBCI. The right tool for the job appears to be aqbanking. Hooking it up to the bank is a bit of fiddling, so I'll reproduce the quick how-to here. This is assuming authentication via PIN/TAN:

$ aqhbci-tool4 adduser -t pintan --context=1 --hbciversion=300 \
        -b BLZ -u NUTZERKENNUNG -c KUNDENKENNUNG \
        -s SERVERURL \
        -N "Real Name"
$ aqhbci-tool4 getsysid -c KUNDENKENNUNG
$ aqhbci-tool4 getaccounts -c KUNDENKENNUNG

To fetch transactions from all accounts and print them in csv format:

$ aqbanking-cli request -c /tmp/foo.ctx --transactions
$ aqbanking-cli listtrans -c /tmp/foo.ctx

The csv2ledger script is tailored to the default output format of the above. I have also written a small shell script that drives these two commands and pipes the result through the converter. It accepts an optional date range to restrict the output.
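
To give an idea of what such a conversion involves, here is a minimal Ruby sketch. It is not the csv2ledger script: the column names (date, payee, amount, currency), the comma separator and the account names are hypothetical and would have to be adapted to the actual export format.

```ruby
require "csv"

# Convert CSV text with a header row into ledger entries.
# Column and account names are assumptions made for this sketch.
def csv_to_ledger(csv_text, account: "Assets:Checking", other: "Expenses:Unknown")
  entries = CSV.parse(csv_text, headers: true).map do |row|
    # The amount is copied verbatim to avoid floating-point rounding.
    # The second posting carries no amount, so ledger balances it itself.
    format("%s %s\n    %-30s  %s %s\n    %s\n",
           row["date"], row["payee"],
           account, row["amount"].strip, row["currency"],
           other)
  end
  entries.join("\n")
end
```

Used as a filter, print csv_to_ledger($stdin.read) reads CSV from stdin and writes the entries to stdout. Ledger needs at least two spaces between the account name and the amount, which is why the format string pads the account column.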

Appendix:

The Skein hash function in 256 lines of C

Note: I'm going to switch this thing over to English now, because I expect to ask some non-Germans for feedback in the future. I might also translate some old posts.

teichufer.klein.jpg
A view across the lake at HAR 2009 towards the CCC's geodesic party tent. My friends and I camped just outside the right edge of the picture.

So here's the latest installment of my exploits into the forbidden realm of implementing cryptographic primitives.

After building my little crypto chat experiment last month, one thing sorely missing was message authentication (from p2p.c):

printf("receiving packets on port %d\n", LOCALPORT);
printf("CAUTION: Message senders can be spoofed.\n");

The obvious solution to this problem is a message authentication code, particularly because the Diffie-Hellman setup already yields shared secrets between any two parties. A typical way to construct MACs is to take a cryptographic hash function and compute its value over a combination of the message and the secret (the standard construction of this kind is called HMAC). So I set out to find a nice little hash function which could be easily and elegantly implemented. Unfortunately, the obvious candidates didn't quite satisfy me. I kept looking and eventually ended up with the promising description of Skein:

Skein is a new family of cryptographic hash functions. Its design combines speed, security, simplicity, and a great deal of flexibility in a modular package that is easy to analyze.

Without much regret, I committed the next step up on the ladder of serious crimes in the construction of crypto systems: I set out to use an unproven algorithm. Hooray! >:)

Incidentally, one very nice feature of Skein is that it already offers a mechanism to turn it into a keyed hash function. If my understanding is correct, this is essentially due to the fact that Skein is derived from a block cipher (actually called Threefish ;)). I have yet to implement this MAC mode, but it's basically a detail once the rest is set up.
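
For comparison, the generic HMAC construction mentioned earlier can be spelled out in a few lines of Ruby. This sketch follows RFC 2104 and uses SHA-256 purely as a stand-in, since Ruby's standard library does not ship Skein. It is neither Skein's own MAC mode nor code from p2p.c or skein.c, and the function name hmac_sha256 is made up for the example.

```ruby
require "digest"
require "openssl"

# HMAC (RFC 2104) over SHA-256, whose block size is 64 bytes.
def hmac_sha256(key, message)
  block_size = 64
  key = key.b
  # keys longer than one block are hashed first, shorter ones zero-padded
  key = Digest::SHA256.digest(key) if key.bytesize > block_size
  key = key.ljust(block_size, "\0".b)

  ipad = key.bytes.map { |b| b ^ 0x36 }.pack("C*")
  opad = key.bytes.map { |b| b ^ 0x5c }.pack("C*")

  # H(opad || H(ipad || message))
  inner = Digest::SHA256.digest(ipad + message.b)
  Digest::SHA256.hexdigest(opad + inner)
end

# cross-check against the OpenSSL binding
tag = hmac_sha256("shared secret", "hello")
reference = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), "shared secret", "hello")
puts tag == reference
```

The nested hashing is what protects plain hash-then-compare schemes against length-extension tricks. A keyed mode built into the hash, as in Skein, makes the two-pass wrapper unnecessary.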

As of yesterday, the code finally produces the correct output on the official one-byte test vector. Feel free to try it on the longer ones. What took me so long? First, there was HAR. I had it pretty much complete at that point, except for one of those nasty segfault bugs. When I took a good hard look at things again this week, it turned out to be an overlong memset() corrupting my stack. God, I love those! ;)

There are some limitations to the code at this point:

Appendix: skein.c