Thursday, April 26, 2007

Deus Ex Automatis, Part VII

Up to this point, all the code presented here runs in sequence. This is not, however, by design so much as by accident. Theoretically, every cell of a cellular automaton has access to its own tiny little computer (e.g. an XOR gate), and its own tiny little memory (e.g. 2 bits). And when we build it that way, we can start running the cells in parallel.

A somewhat more realistic approximation of this theoretical model would be to run enough cells on any given processor to fill up its memory, and to have it communicate without any friction with the adjacent processors.

There is no reason that a few of these babies couldn't run a cellular automaton with a trillion or so cells.
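To make that concrete, here's a tiny sketch in Python (my illustration, not code from the talk) of why cells parallelize so naturally: each cell's next value depends only on the previous generation, so the per-cell work can be farmed out to as many workers as you like.

```python
def next_cell(old, i):
    """A cell becomes the XOR of its left and right neighbors
    (the rule discussed earlier in this series)."""
    return old[i - 1] ^ old[(i + 1) % len(old)]  # indices wrap around the grid

def next_generation(old, map_fn=map):
    # Every cell reads only the previous generation and writes only its
    # own slot, so map_fn could in principle be a parallel map: there is
    # no shared state to lock within a generation.
    return list(map_fn(lambda i: next_cell(old, i), range(len(old))))

grid = [1, 0, 0, 0, 0, 0, 0, 0]
grid = next_generation(grid)
```

The double-buffering (reading `old`, writing a fresh list) is exactly what lets every cell's tiny computer run at once.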

The hardware limitation to what we can effectively learn is rapidly dwindling. The only limitation to how well a curve can be fit is how well the underlying slice correlates to the curve. This is a matter of achieving sufficient parallelism, tweaking our models, and brute-forcing our way through enough cell-space to figure out what classes of data are good fits.

One of the beauties of this field is that it's not really theoretical, at least not in the sense that math or physics is theoretical. Well, theoretical physics, at least. It's practical: either our models will match experimental data, or they won't.

There are open questions worth additional scrutiny.

Should we model using infinite, or finite grids?

What is the optimal structure of grids for various problem sets?

The only way we can find answers to these questions is to dive in and start hacking. If you're interested, email me.

I've now posed the questions. If any of you know the answers, feel free to pop me an email. Otherwise, I hope I've stimulated a few questions of your own.



Tuesday, April 24, 2007

Deus Ex Automatis, Part VI

Having arrived at Part Six of our series, I am now going to make a rash hypothesis:
The rule we've just described is computationally universal.
Just a hunch, though. Not getting into a proof yet.

Because even if it is universal (as some cellular automata have proven to be), who cares? It's probably not a very efficient computer if you want to do something like add 2 and 2 together and print the result on a screen. Besides, it doesn't need to be computationally universal to be useful.

So what might we do with such a model?

The short answer to that question is the entire point of my presentation at the emerging technology conference and this series: Fit Curves Better. Particularly curves that equations are bad at.

As a final point of clarification before diving into a bit more code, by "curve" I mean "series of 1's and 0's." As I assume any right-thinking modern programmer would mean by the word. Now on to our curve fitting:
class Fit {
    has Bool $.match is rw = Bool::False;
    has Int $.magnitude is rw = 0;

    method fit (:$model, :$sample) {
        if ($model == $sample) {
            $.match = Bool::True;
            $.magnitude++;
        }
        else {
            $.match = Bool::False;
        }
    }
}
This module is dead simple. It compares our model (a cell of our cellular automaton) to our sample (a bit of some so-called "random" data); when the two are the same, it sets $fit.match() to true and increments the magnitude counter.

In real life, I don't use a module to do this, but it's easier to explain the principle of it before diving into any more detailed examples. At this point in the talk at ETech, I proceeded to run a script that matched an arbitrary string of 1's and 0's against my cellular automaton. Here's a version of that script:

my Bool @data = <001001000011111101101010100010001000010110100011000>.split("");

use Automaton;
use Fit;
my $diameter = 19;
for (0..$diameter-1) -> $slice {
    my Automaton $ca .= new(:diameter($diameter), :rule(6)); # initialize the CA
    $ca.next() for 0..30;                                    # "seed" the CA
    my Fit $compare .= new();
    for (0..@data.elems-1) -> $stage {
        $compare.fit(:model($ca.grid.state()[$slice]), :sample(@data[$stage]));
        $ca.next;
    }
    my $fit = $compare.magnitude()/@data.elems() * 100;
    say " - Slice $slice fits: " ~ floor($fit) ~ "%";
}
And here's the output of the script (finally!)
- Slice 0 fits: 50%
- Slice 1 fits: 52%
- Slice 2 fits: 47%
- Slice 3 fits: 52%
- Slice 4 fits: 58%
- Slice 5 fits: 52%
- Slice 6 fits: 60%
- Slice 7 fits: 47%
- Slice 8 fits: 49%
- Slice 9 fits: 66%
- Slice 10 fits: 62%
- Slice 11 fits: 66%
- Slice 12 fits: 60%
- Slice 13 fits: 56%
- Slice 14 fits: 64%
- Slice 15 fits: 50%
- Slice 16 fits: 45%
- Slice 17 fits: 50%
- Slice 18 fits: 45%
As you can see, the best fit is about 66%, only 16 percentage points above the 50% we would expect from matching our arbitrary data against a totally random data set.
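As a back-of-the-envelope sanity check (my arithmetic, not part of the original talk): with 51 bits of data, the spread you'd expect from a pure coin-flip model is easy to compute.

```python
import math

# A model guessing at random matches each of the 51 sample bits with
# probability 0.5.  The standard deviation of the observed match *rate*
# for n independent bits is sqrt(p * (1 - p) / n).
n, p = 51, 0.5
sigma = math.sqrt(p * (1 - p) / n)   # about 0.07

# The best slice matched 66% of the bits, 16 points above chance:
excess = 0.66 - p
sigmas = excess / sigma              # roughly 2.3 standard deviations
```

So a 66% fit on a sample this short is suggestive, but not far outside what luck alone could produce.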

So where's the catch?

Tune in for the final segment soon!



Friday, April 20, 2007

Deus Ex Automatis, Part V(c)

In our final segment of Part V, we tie together all the code into a single, cohesive whole, and run it. In the interest of getting it done more quickly, and in a literate programming style, I've noted some of the nice features of Perl 6 in the comments.

A simple Copy/Paste should get this code going for you in Pugs:
use v6-alpha;  # By Christmas, we should be able to say "use v6;"
class Rule { # See Deus Ex Automatis Part V(a)
    has Int $.number is ro;
    has Int $.neighbors is ro;
    has Bool %.lookup is rw;
    submethod BUILD (:$.number, :$.neighbors) {
        for ( 0 .. (2 ** $.neighbors - 1) ) -> $key {
            my $format = "\%0" ~ $.neighbors ~ "b";
            %.lookup{sprintf($format,$key)} = ?($.number +& (2 ** $key)); # test bit $key of the rule number
        }
    }
}

class Grid {
    has Int $.diameter is rw;
    has Bool @.state is rw;

    submethod BUILD (:$.diameter) {
        @.state = 1,;
        @.state.push( 0 xx ($.diameter-1) );
    }
}

class Automaton does Rule does Grid { # "does" gives us a mix-in pattern
    has Rule $.rule is rw; # the "Rule" is a reference to our own class
    has Grid $.grid is rw; # as is Grid

    submethod BUILD (:$diameter, :$rule) { # new() takes two arguments
        $.grid = Grid.new(:$diameter);
        $.rule = Rule.new(:number($rule), :neighbors(2));
    }

    method next { # The "guts" where we apply the rule to the grid
        # Also, incidentally, our first method !!
        my @old = $.grid.state();
        for ( 0 .. $.grid.diameter-1 ) -> $i {
            $.grid.state[$i] = ~+$.rule.lookup{ @old[$i] ~ @old[$i-1] };
        }
        return Bool::True; # Lets us easily use this in a loop
    }
}

##
# The script! This will print beautiful cellular automata on your screen.
my Automaton $ca .= new(:diameter(10), :rule(6));
say $ca.grid.state while $ca.next(); # next() always returns True, so this runs until you interrupt it
Whew!

I may come back later and edit this post to add a bit more commentary. In the meantime, note that this code (in a somewhat complicated, and somewhat slow manner) implements the class of cellular automata we've been discussing through this entire series.
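If you don't have a Pugs build handy, here's a rough Python translation of the same three classes (my translation; it follows the Perl 6 code above, except that it prints ten generations instead of looping forever):

```python
class Rule:
    """Unpack a Wolfram-style rule number into a lookup table."""
    def __init__(self, number, neighbors):
        self.number = number
        self.neighbors = neighbors
        self.lookup = {}
        for key in range(2 ** neighbors):
            bits = format(key, "0{}b".format(neighbors))
            self.lookup[bits] = bool(number & (2 ** key))

class Grid:
    """A one-dimensional, wrapped grid seeded with a single 1."""
    def __init__(self, diameter):
        self.diameter = diameter
        self.state = [1] + [0] * (diameter - 1)

class Automaton:
    def __init__(self, diameter, rule):
        self.grid = Grid(diameter)
        self.rule = Rule(number=rule, neighbors=2)

    def next(self):
        # As in the Perl version: each cell looks at itself and its
        # left-hand neighbor (index -1 wraps around to the far end).
        old = self.grid.state[:]
        for i in range(self.grid.diameter):
            key = "{}{}".format(old[i], old[i - 1])
            self.grid.state[i] = int(self.rule.lookup[key])
        return True

ca = Automaton(diameter=10, rule=6)
for _ in range(10):  # the original loops forever
    print("".join(str(c) for c in ca.grid.state))
    ca.next()
```

For rule 6 the unpacked table is the XOR table, so each new cell is simply itself XORed with its left neighbor.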

I'm going to move on in the next segment to talking about why it's important.



Deus Ex Automatis, Part V(b)

We've already got code for rules from last post. Now we need a grid.
class Grid {
    has Int $.diameter is rw;
    has Bool @.state is rw;

    submethod BUILD (:$.diameter) {
        @.state = 1,;
        @.state.push( 0 xx ($.diameter-1) );
    }
}

Fairly simple, takes one argument: diameter. Building a new grid is done like so:
pugs> my Grid $grid .= new( :diameter(10));
pugs> say $grid.state;
1000000000
Once again, we've taken some shortcuts. The grid code assumes a single dimension, with a second, emergent dimension (often called "time") that happens when you apply the rule.

Now that we've got a rule, and got a grid, we can put them together into a cellular automaton, and discover some more syntactic (and semantic) sugar from Perl 6.



Monday, April 16, 2007

Deus Ex Automatis, Part V(a)

Now (finally!) for some code. I've had a bit of a hard time getting this to format correctly, so if there are problems, pop me an email, and I'll try again.

I have two reasons for presenting examples in Perl 6. Reason number one: Perl 6 is reasonably easy for native English speakers to read, even if you're not a hacker. Reason number two: it's fun.

Hopefully it also does something useful, such as compiling without errors.
class Rule {
    has Int $.neighbors = 2;
    has %.lookup = ( '11'=>'0', '10'=>'1', '01'=>'1', '00'=>'0' );
}
This code isn't particularly useful by itself, but if you want to try it, just copy and paste that code into Pugs, a Perl 6 implementation, and call it like so:
pugs> my Rule $rule .= new();
pugs> say "Rule has $rule.neighbors() neighbors."
Rule has 2 neighbors.
Note that both $.neighbors and %.lookup have not one, but two punctuation marks at the beginning of the word. The first is a sigil. The sigil designates $.neighbors to be a scalar, and %.lookup to be a hash (i.e. a lookup table). The second character (both have a period) tells the class to create accessors for each, so we can do things like $rule.neighbors().

One of the most satisfying parts of writing code in Perl 6 is that my spell checker doesn't raise a red flag (or underline) for most of the code I write.

The code above is totally static, and could really have been just a built-in data structure. Perhaps a more useful Rule could accept, as arguments, a number of neighbors and a rule number of the kind often used in constructing simpler cellular automata. That could be done like so:

class Rule {
    has Int $.number is ro;
    has Int $.neighbors is ro;
    has Bool %.lookup is rw;

    submethod BUILD (:$.number, :$.neighbors) {
        for ( 0 .. (2 ** $.neighbors - 1) ) -> $key {
            my $format = "\%0" ~ $.neighbors ~ "b";
            %.lookup{sprintf($format,$key)} = ?($.number +& (2 ** $key)); # test bit $key of the rule number
        }
    }
}

Perl 6 gives us a pretty, built-in way to customize object construction: "submethod BUILD". What we've done here is add a method that unpacks the Wolfram-style rule number into a lookup table when you create a rule.
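If the bit-twiddling is opaque, here's the same unpacking in Python (my sketch, purely for illustration): key k, written in binary, names a neighborhood, and bit k of the rule number gives the answer for that neighborhood.

```python
def unpack(number, neighbors=2):
    """Decode a Wolfram-style rule number into a lookup table:
    bit k of the number answers the neighborhood spelled by k in binary."""
    table = {}
    for k in range(2 ** neighbors):
        key = format(k, "0{}b".format(neighbors))
        table[key] = (number >> k) & 1
    return table

# Rule 6 is 0b110: neighborhoods 01 and 10 map to 1, the rest to 0.
print(unpack(6))  # {'00': 0, '01': 1, '10': 1, '11': 0}
```

Which is exactly the hand-written lookup table from the static class above.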

As an exercise for the reader, here's how to make the same rule we've been talking about all along:
pugs> my Rule $rule .= new(:neighbors(2), :number(6));
Why?



Thursday, April 12, 2007

Pike, Stallman, Patents, and Civility

A couple weeks back, Rob Pike revived this weblog, which in turn triggered this discussion (among others, I'm sure). The only post on the site at the time of this writing (other than the short "revival" post) is a long rant about Richard Stallman that Pike wrote in 1991, and re-posted in 2004.

Rob Pike is a towering genius of computing.

And he's dead wrong.

To begin with, Pike's 1991 rant is showing its age. Free Software is about as mainstream and commercial as you can get these days. Hell, Sun Microsystems is releasing just about everything they do under the GPL, and that's just for starters. IBM, Red Hat, Oracle... all big players in Free Software.

But that aside, Pike intentionally misrepresents Stallman's viewpoints, ostensibly for the sake of making a point:
"Free Software is like Free Love, a hippie pipe dream in which computing is free from venality, commercial interests, even capitalism."
Now, Stallman himself may be a bit of a hippie pipe dream. He is, as far as I can tell, free from venality, commercial interests, and, yes, capitalism. He is the President of a prestigious foundation, yet he makes no salary.

But Free Software?

Come on, Free Software is more like Free Markets than it is like a hippie pipe dream. Pike almost surely knew this when he wrote the rant. And if he didn't know it then, he certainly knew it when he re-posted it in 2004.

Pike's rant goes on to talk about how civilized everyone at the "protest" was. He even cites this as evidence of him being right, and Stallman being wrong. He could just as easily have taken it to show how civil the Free Software movement was, even in 1991. But he didn't.

Software patents do hurt programmers.

I am a programmer.

So they hurt me.

I don't believe they're good for AT&T, but they're definitely not good for programmers. If you don't understand why, read Donald Knuth's explanation. But that hasn't kept me (or Stallman, or the rest of us) from being civil.

I started out by calling Rob Pike a towering genius.

He is. But he's dead wrong on the issue of software patents.



Monday, April 9, 2007

Oh. My. Goodness.

This may be the best thing in the world.

Ever.



Friday, April 6, 2007

Deus Ex Automatis, Part IV

When I got to this stage in the talk at the emerging technology conference, I glossed over a few of the finer points of mechanics, in hopes that diving right into code would be a better way for most of the audience to understand what I was talking about. In retrospect, I think I would have been better served to dig a bit deeper into the mechanics, which is what this article will do.

The grid of cells we looked at in yesterday's segment is incomplete. There are cells of the grid that have only a single neighbor, namely those on the right and left edges. In fact, the top cell has no neighbors at all. Yet the rule we outlined needs to look at two neighbors.

There are at least a half-dozen ways around this problem.
  1. Make the grid of cells big enough that one need never reach the edge.
  2. Simulate an infinite grid by doing lazy evaluation at the edges (more on this later).
  3. "Wrap" each dimension of the grid, making each end neighbors with its opposite.
  4. Use a lower-dimensional cellular automaton to specify values on the edges.
  5. Assign a static value to every cell on the edge of the grid.
  6. Combine one or more of these approaches.
It's beyond the scope of this segment to describe the benefits and drawbacks of each method. But method three, which we will use below, does merit a closer look.

Note that I have started with the top cell labeled "A0," taken each right-hand neighbor in turn, and, when I reached the end, made the last cell neighborly with the first cell. Think of this as wrapping the grid around on itself.
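In code, wrapping is nothing more than modular arithmetic on the cell index; a quick Python sketch (function and variable names are mine):

```python
def neighbors(state, i):
    """Left and right neighbors of cell i on a wrapped (circular) grid."""
    diameter = len(state)
    left = state[(i - 1) % diameter]   # cell 0's left neighbor is the last cell
    right = state[(i + 1) % diameter]  # the last cell's right neighbor is cell 0
    return left, right

ring = [1, 0, 0, 0, 0]
```

The modulus makes the edges disappear: every cell has exactly two neighbors, no special cases.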

The cellular automaton now has a new property: diameter.

Which brings us to the all-important question: which kind of grid is simpler, infinite or finite?

Well, who knows? I think the finite grid is simpler, because even though it has a new property (namely, diameter), it's smaller and it's calculable. In addition to that, using this kind of a finite grid causes the cellular automaton we've been discussing here to exhibit interesting properties. If you look again at the previous section, note that even with a diameter of 25, there is a great deal of complexity to the structure.

It may even be universal, in the same sense that Wolfram's Rule 110 and Conway's Game of Life have already been proven to be.

With that last addition to the mechanical overview, we're now going to commit our lengthy explanation into some extraordinarily concise code.



Wednesday, April 4, 2007

Deus Ex Automatis: Part III

Now for some mechanics.

A cellular automaton is a model. Untangled from all the biological, physical, and mathematical metaphor, cellular automata are simple.

They exist on a grid of cells:


As pictured here, cells are represented by the egg shapes with letters in them. As you may have guessed, the cells give cellular automata the first part of their name. Between the cells are arrows, which declare the "neighborhood" of a cell. A cell's neighbors have arrows pointing into that cell.

Some cellular automata differentiate between one neighbor and another. In this series we'll look at some of each. Note, in passing (don't get hung up on it) that any cellular automaton that does not distinguish between neighbors can be easily represented by one that does. So the former should just be considered a kind of short-cut for the latter.

Inside each cell is exactly one member of a finite set.

{1,0}

Different models use different symbols in the set. 1 and 0 are very common, as are black and white. As long as the set is finite, it can even include more than two symbols.

If you're with me so far, this next piece is the last before we put it all together.

In order to pick which symbol from the finite set a cell should display, the cell examines its neighbors, and only its neighbors, and applies a rule. Programmers think of this rule as code in some programming language, but it can be even simpler than that. Here is an example of a rule:

left=1, right=1 then cell=0
left=1, right=0 then cell=1
left=0, right=1 then cell=1
left=0, right=0 then cell=0

This rule could be stated in the following casual shorthand: if exactly one of a cell's neighbors is a 1, that cell should be a 1. Otherwise, it should be a 0.
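If that shorthand sounds familiar, it should: the rule is exclusive-or. A short check in Python (mine, purely for illustration):

```python
# The rule table from above, keyed by left and right neighbor.
rule = {'11': '0', '10': '1', '01': '1', '00': '0'}

# "Exactly one neighbor is a 1" is exclusive-or on the two neighbors.
for left in '01':
    for right in '01':
        assert rule[left + right] == str(int(left) ^ int(right))
```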

It's important to note that at least one 1,200-page book has sold a fair number of copies by pointing out that a very simple rule can generate complexity out of simplicity. Put another way, complexity is simply a feature of some systems, even without entropy or chance.

The rule I just described would yield us output that looks something like this:

1101111011000011101110000
1011000110100010011001000
1110100101110011010101100
1001110111001010111111010
1101001100101111100000111
0011101010111000010000100
0010011111100100011000110
0011010000010110010100101
1010111000011101011110111
0111100100010011110001100
0100010110011010001001010
0110011101010111001101111
1101010011111100101011000
1011111010000010111110100
1110000111000011100001110
1001000100100010010001001
0101100110110011011001101
1111010101101010110101011
0000111111011111101111110
0000100000110000011000001
1000110000101000010100001
0100101000111100011110001
1110111100100010010001001
0001100010110011011001101
1001010011101010110101011
0101111010011111101111110
0111000111010000011000001
1100100100111000010100001
0010110110100100011110001
1011101101110110010001001

And with output like that, who needs a million monkeys typing?



Tuesday, April 3, 2007

Deus Ex Automatis: Part II

The history of cellular automata is intertwined with the history of the computer itself.

It was easy to imagine cool things to do with computers. Hence the emerging technology conference: a talk by Microsoft Labs next door, another talk by Amazon Web Services out in the courtyard, and only the cool kids here. That's you all, the cool kids.

With a few notable exceptions, everyone decided to ignore cellular automata. Which is why von Neumann is often pictured with a computer (notably, the EDVAC), instead of a self-reproducing automaton.

Very few.

Ten or so years after von Neumann's untimely death, his friend Arthur W. Burks published his papers on automata theory. I read the originals. Genius, but clearly unfinished. He was a busy guy. You know, inventing computers and the atomic bomb and game theory.

E.F. Codd (yes, the same E.F. Codd who gave us relational databases) had a brief romance with cellular automata, producing a simplification of JVN's self-reproducing machine.

And from there, the field exploded, experiencing slow linear growth for another forty years.

John Conway's Game of Life is a cellular automaton that mimics a biological ecology.

Rudy Rucker uses a particularly gnarly class of automata to describe jellyfish.

Chris Langton made a famous one to describe the way an ant walks, but seems to focus on swarms now.

Thomas Schelling's segregation model was a cellular automaton in all but name.

As was most of Donald Greenspan's 1973 book Discrete Models, featuring a rant about how infinity is un-mathematical.

And we can't forget the preeeeeeetty seeeashells courtesy of Stephen Wolfram.

And all that only took another forty years.

Now that I've given you the history lesson, you all know why I became a computer programmer instead of a historian.

Before closing I should note, in passing, that there has been a steady stream of philosopher-hackers over those forty years, starting with Konrad Zuse and his Rechender Raum (translated into English as Calculating Space), who insist that the universe itself is a calculation.

There will be time enough for that later, though.



Deus Ex Automatis, Part I

A cellular automaton, of course, is a kind of discrete model, which as you will see shortly can be a useful thing for hackers. Today's post is the beginning of my O'Reilly Emerging Technology presentation, in a heavily edited version.

I will present the talk in a series of posts, outlining the discovery and history of this kind of model. I'll follow with a mechanical overview of cellular automata, for hackers. Next, a few examples in the fabulous Perl 6 programming language (using Pugs, of course). Last, but not least, I will present the long list of open questions I posed at the end of my talk. These were the start of a Q/A session at the conference, and I hope they will start one here as well.

When I gave the talk in San Diego, I started with some witty remarks, facilitated by the poorly functioning A/V system. On the original text of the speech, which I will post with the original slides at the end of these journal entries, I had the following note after my intro:
DRINK, PAUSE, LOOK THOUGHTFUL

So if you can manage that before moving on, take a drink, pause, and look thoughtful.

Cellular Automata.

Really rolls off of the tongue.

John von Neumann (a name you'll hear a lot) established the study of these models as a field. He was, unfortunately for all of us, a native speaker of physics. Worse yet, his second language was Hungarian. He's the one who saddled us with the name.

Cellular Automata.

I first encountered cellular automata about seven years ago, shortly after moving to Washington, DC. Living in Washington affords some advantages for doing research. For instance, our neighborhood library turns out to have every book ever published, and then some.

Because of my proximity to the Library of Congress, when Stephen Wolfram came out of hiding with A New Kind of Science, I wasn't one of the people who had to camp out in front of the bookstore for three days to get one.

I just went down to the library.

But that was just the beginning. If you're planning a visit to Washington, make sure to send me an email. I'll take you on the cellular automatour. The real treasure trove is in the Library of Congress manuscript reading room. John von Neumann left his papers- published and unpublished- to the library when he died.

Which you can just ask for.

Wow.

The letters between Stanislaw Ulam and John von Neumann are a great starting point. The two probably should be credited as co-founders of the field. Stanislaw Ulam had a great deal of affection for von Neumann, always calling him "Johnny". His letters were full of practical advice, grand plans, and occasional favors he would ask of von Neumann.

John von Neumann inevitably wrote back something utterly banal.

"Don't you owe me fifty cents?" Of course that was 1947. Fifty cents was a lot of money then.

Which brings me to my next point. If this is a sixty-year-old field, why was I giving a presentation at an Emerging Technology conference one week ago? Tune in tomorrow to find out!



Deus Ex Automatis

The talk in San Diego has come and gone, and it was incredible fun. Thanks to all of you who posted materials, emailed me, posted on your blogs, and came to the talk.

I had a great time.

I learned things.

For instance, that the title for my talk, had I used the ablative correctly, would have been Deus Ex Automatis. File that under things I never would have known.

The slides are linked above. The text of the talk, I'll be trickling out in segments here on my online journal. The first segment, this afternoon, will be my "historical" introductions. Lucky for you all, I'm no more historian than I am Latin scholar.


