Thursday, December 24, 2009
Promises of Parallel in Perl6
Been reading through the Perl6 specs again. I like the way the synopses make "promises of parallelizability" on three families: junctions, hyperops, and feeds. I also like the way these three promises differ from one another. This post is a little thought experiment, in which I will act as the interpreter for all three.
Junctions
Of the three, junctions are the most difficult for me to grok. Junctions are a family of four: all(), any(), none(), and one(). One obvious place where this family of four comes in handy:
say "foo" if all(1,2,3) # says foo
The semantics, in this case, are obvious. In Boolean context, all evaluates to True if all of the arguments are true, any if any are true, none if (and only if) none are true, and one if exactly one is true. In this case, it's meant as a nice shorthand for:
say "foo" if (1 && 2 && 3) # says foo
Except under the covers, it's much different (or at least I suspect it is, inside the brain of Larry Wall). Since junctions carry a promise of autothreading, the former is a way of handing off to three threads in any order, and the latter (probably) executes in a promised order, (probably) with short-circuiting. Concentrating on the former, it's not that interesting the way it's written, so let's imagine it a bit differently:
say "foo" if any(bar(1), bar(2), bar(3));
Basically, what I imagine this to be saying is, "go execute bar() wherever it is convenient, in boolean context, and return the result here. I will continue whenever either: a) one of you comes back true, or b) all of you come back false." In fact, you can imagine doing something like this:
do_stuff() if any(breadthfirst($tree), depthfirst($tree));
This could be shorthand for: "do a breadth first and a depth first at the same time, and whenever one of you returns True, the other one can just stop." It's a very nice kind of syntax in a world where we have a few processors to play around with.
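To make that reading concrete, here is a minimal sketch in Python rather than Perl 6. It models what I imagine any() doing, not what any interpreter actually does. The names any_junction and bar are made up for illustration. One caveat: Python threads cannot be killed, so the "other one can just stop" part is only approximated by cancelling work that has not started yet.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def any_junction(*thunks):
    """Run zero-argument callables concurrently.

    Returns True as soon as one of them comes back truthy,
    or False once all of them have come back falsy.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(thunks))) as pool:
        futures = [pool.submit(thunk) for thunk in thunks]
        for future in as_completed(futures):
            if future.result():
                # Short-circuit: cancel anything that has not started yet.
                # Work that is already running is left to finish.
                for other in futures:
                    other.cancel()
                return True
    return False


def bar(n):
    # Hypothetical stand-in for the bar() in the post.
    return n % 2 == 0


if any_junction(lambda: bar(1), lambda: bar(2), lambda: bar(3)):
    print("foo")
```

The callables are wrapped in lambdas so that nothing runs until the junction decides where to run it, which is the part of the promise I find interesting.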
Hyperops
Hyperoperators differ slightly from junctions: they still promise autothreading, but they cannot short-circuit in the same way a junction can. Hyperops, instead, execute a block-alternatively-statement (in Perl6 this is called a "blast", I think) on every single member of a list. It looks like this:
[1,2,3]>>.say # prints 1\n2\n3\n
Once again, it's a pretty simple example that executes .say on every member of the list. This, under the covers, might be done in any order, even somewhere else. As such, any side effects might also occur out-of-order. I've heard that the order of results is generally preserved, but I'm not at all clear how much that matters in most cases.
Anyhow, back to the previous example, if I did something like [1,2,3]>>.bar(), it would promise to execute bar() three times with three topics (1, 2, and 3), and the side effects of all three would be preserved. One can imagine how this would be useful for building cellular automata, where one has a grid to keep in sync, and one might do something like @grid>>.next() to yield a subsequent state.
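Here is how I picture the hyperop promise, again as a Python sketch and not as a description of Rakudo's internals. The helper hyper and the function bar are hypothetical names. Every element gets processed, side effects may land in any order, and the results come back in the order of the input list.

```python
from concurrent.futures import ThreadPoolExecutor


def hyper(func, items):
    """Apply func to every item, possibly in parallel.

    No short-circuiting: every item is processed.
    The returned list keeps the order of the input.
    """
    with ThreadPoolExecutor() as pool:
        return list(pool.map(func, items))


seen = []  # records the order in which side effects actually happened


def bar(n):
    seen.append(n)  # side effect; its order across threads is not guaranteed
    return n * 10


results = hyper(bar, [1, 2, 3])
print(results)       # result order follows the input list
print(sorted(seen))  # every topic was visited exactly once
```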
Feeds
Feeds are autothreading just like the previous two, but instead of promising that all will be executed, or short-circuiting, the control of the timing of the autothreading is left to the context (caller?). A feed looks something like this:
my $a,$b,$c <== 0..Inf
That <== thing is the feed. Basically, what this says is "ask my pointy side how many it wants, grab that many from my blunt end when my pointy end needs them, execute them in any order, anywhere, but give them back to the pointy end in the order they were asked for." It's not short-circuiting like a junction, and it doesn't guarantee execution on the whole list like a hyperop, but it does provide some pretty nuanced control when combined with either itself or its friends.
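The "grab only as many as the pointy end wants" part can be sketched with a lazy iterator in Python. This models the demand-driven side of a feed only. It does not model running the producer in another thread, and feed is a name I made up for the sketch.

```python
from itertools import count, islice


def feed(source, wanted):
    """Pull exactly `wanted` items from a lazy source, in order."""
    return list(islice(source, wanted))


naturals = count(0)  # stands in for 0..Inf; nothing is computed up front

a, b, c = feed(naturals, 3)   # the sink asks for three values
next_two = feed(naturals, 2)  # a later request picks up where the first left off

print(a, b, c)
print(next_two)
```

The infinite range never gets fully evaluated. The consumer decides how much work happens, which is the difference from a hyperop.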
I'm still thinking about it, but this increasingly seems to be a nice approach to parallelization, which has good dwimmery (e.g. if you're doing hypers and junctions, even if you weren't trying to autothread, it probably does what you meant), nice fine-grained control (execute all, execute some number, execute with short-circuiting), and still mixes nicely with other parts of the language.
I'm not sure if there are other parallel constructions either already in the spec or on their way in, but I don't find any immediate holes in the semantics here. Some additional control over how and where the processes are spread around might be achieved with some introspection on the current context, which may require some additional syntax, but that may also be left as an exercise for the reader.
Monday, December 21, 2009
Merry Christmas
In celebration of the Federal Government having a snow day today, I tweeted this:
.........X.........
........XXX........
.......XX..X.......
......XX.XXXX......
.....XX..X...X.....
....XX.XXXX.XXX....
...XX..X....X..X...
..XX.XXXX..XXXXXX..
.XX..X...XXX.....X.
XX.XXXX.XX..X...XXX
say <. X>[my@i=0 xx 9,1,0 xx 9];map {say <. X>[@i=map {%([^8]>>.fmt("%03b") Z 0,1,1,1,1,0,0,0){@i[($_-1)%19,$_,($_+1)%19].join}},^19]},^9;
It's my geekery for the day. It is written in Perl 6, and it prints out a Christmas tree, which is also a cellular automaton (Wolfram rule #30). Thanks to the phenomenal people at Parrot and Rakudo, it looks like this when it is run:
.........X.........
........XXX........
.......XX..X.......
......XX.XXXX......
.....XX..X...X.....
....XX.XXXX.XXX....
...XX..X....X..X...
..XX.XXXX..XXXXXX..
.XX..X...XXX.....X.
XX.XXXX.XX..X...XXX
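For anyone who would rather not untangle the one-liner, here is the same automaton spelled out as a Python sketch. It assumes what the Perl 6 above does: 19 cells, a single live cell in the middle, wraparound at the edges, and the rule 30 lookup table. The function names are mine.

```python
RULE = 30         # Wolfram rule number; its bits are the lookup table
WIDTH = 19        # number of cells in a row
GENERATIONS = 10  # the starting row plus nine more


def step(cells, rule=RULE):
    """Compute the next generation, wrapping around at the edges."""
    n = len(cells)
    return [
        # The three neighbours form a 3-bit index into the rule number.
        (rule >> (cells[(i - 1) % n] << 2 | cells[i] << 1 | cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def render(cells):
    """Draw dead cells as '.' and live cells as 'X'."""
    return "".join(".X"[cell] for cell in cells)


def tree(width=WIDTH, generations=GENERATIONS):
    """Return the rows of the tree, starting from one live cell in the middle."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = []
    for _ in range(generations):
        rows.append(render(cells))
        cells = step(cells)
    return rows


print("\n".join(tree()))
```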
Merry Christmas, Perl6!
Labels: christmas, perl, perl6, software
Tuesday, December 15, 2009
Short Description of Gödel's Proof
Inspired by Mark Jason Dominus, my short description of Gödel's first incompleteness theorem:
No set of rules is both consistent and complete.
For favorite long(er) explanations, see the appendix to Rudy Rucker's Infinity and the Mind or Nagel and Newman's Gödel's Proof. If you read the latter, read the one edited by Douglas Hofstadter, who is awesome.
Saturday, December 5, 2009
Perl6 Advent
Thanks in part to the Perl 6 Advent Calendar, in part to Ed showing me how easy it was to pop a subversion repo into github, and in part to whim, I dusted off some old Perl6 and got it running under Rakudo. The gang on #perl6 was smart and fun as always, and to top it all off, there's a butterfly now. Anyone who wants to see the code, it's on Github.
Labels: cellular automata, perl, perl6, programming
Thursday, December 3, 2009
On Old Being New
Continuing on the riff Brian started, I was reflecting on some stuff I built a while back. In 2001 and 2002, I was one of the people who worked on Democrats.org. There were a whole mess of people involved in that project, from a whole mess of different organizations, and it was fun. I even married one of the people on the project.
One of the parts I was responsible for was the contribution processing code. I kept working on this through the end of 2004, and during the last part, it was pretty high volume. The system was written in Perl (HTML::Mason and XML::Comma), ran under mod_perl, and it processed credit cards in real-time.
We used a cool trick to keep browsers from hanging up on long-running processes (sometimes the bank would take up to a minute to get back to us about the credit card). We immediately checked for easily caught errors (for instance, credit cards have a check digit that can help catch mistyping), then gave the browser an immediate response. That response was an HTML page written into a file on the server, which basically told the browser, "Yeah, still working on it, check back here in five seconds." In the background, we ran a forked process to talk to the bank that didn't talk to the browser directly. When the bank got back to us, the forked process would write our response into that same file, and the next time the browser checked back, they would see a response: "Thanks for Contributing," or whatever.
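The shape of that trick fits in a few lines. This is a Python sketch, not the original Perl: it uses a thread where we used a forked process, the check digit test is the standard Luhn algorithm, and start_contribution and the charge callback are hypothetical names standing in for the real code and the real bank.

```python
import tempfile
import threading
from pathlib import Path


def luhn_ok(number):
    """Check a card number's check digit with the Luhn algorithm."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def start_contribution(card, status_file, charge):
    """Catch cheap errors now, then talk to the bank in the background.

    `charge` is a hypothetical callable that blocks until the bank answers.
    Returns the worker thread, or None if the card failed the quick check.
    """
    if not luhn_ok(card):
        status_file.write_text("Please check your card number.")
        return None
    # Immediate response: the browser polls this file until it changes.
    status_file.write_text("Still working on it, check back here in five seconds.")

    def worker():
        approved = charge(card)
        status_file.write_text(
            "Thanks for contributing!" if approved else "Sorry, the card was declined."
        )

    thread = threading.Thread(target=worker)
    thread.start()
    return thread


with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "status.html"
    job = start_contribution("4111111111111111", page, lambda card: True)
    job.join()  # a real browser would simply poll the page again
    print(page.read_text())
```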
We spent a lot of time refining the system over the next few years, and by the time the general election came around, we had processed some absurd nine figure sum of money with those couple hundred lines of Perl.
That basic strategy (do something easy, respond immediately, run long processes asynchronously) seems to be generating quite a lot of interest these days. Node.js does it for I/O, Google Go has it built into the language, NGINX does it for the Web, AnyEvent does it for Perl, Twisted in Python, EventMachine in Ruby, and the list goes on. There are differences in the way each runs this process, but that core description seems to hold, and it shares a lot in common with the way we did it nearly a decade ago. There's even a spec for this kind of thing in the works.
Now, as the inimitable Ryan Dahl points out, there are plenty of reasons not to do development this way. Cultural reasons and technical reasons. He rejects the "it's too hard" argument, and gives some pretty convincing demos in counterpoint, and maybe he's right. Or maybe "cultural reasons" and "it's too hard" are the same thing, and all we really need is someone to come along and hand it to us on a silver platter. Probably an opportunity there.
Wednesday, December 2, 2009
Ongoing Conversation
Brian mentions a conversation we've been having on his blague. In conversations, Brian and I generally take opposite positions, then swap halfway through, then swap back at the end, with some modifications. In our last conversation, I made a lot more modifications than he.
My previous post, written from Chrome OS, made the assertion that Google didn't rethink the operating system, they just threw it out entirely. In the conversation with Brian, I mentioned it appeared they were trying to do the same with web servers. In a couple dozen lines of Go (the new Google programming language), there's a reasonably nice web server.
Brian wasn't impressed. Not with any of it.
Which appears to be Brian's superpower. He's a hard guy to impress. Me, I'm easy. But if Brian is impressed, then it's probably really impressive. With me, it's just a flavor of the week. His impressively dispassionate point was they weren't throwing anything out, they were just divvying it up differently. They haven't thrown away the Operating System, they've just put it on a different computer. And while they're doing it, they've stuck us with a sub-standard UI toolkit. They haven't thrown away the web server, they've just put it in a library, where it's hard to see.
Good points, Brian. Good points. I think you've got me. So let's explore your idea a little more.
I was mentioning, in the same conversation (or maybe a different one), that I've been doing a little bit of programming on the Arduino recently. People ask me for help sometimes, because I've posted some code here in the past, and I try to help. The Arduino is a little computer, costs about thirty bucks, and comes with a nifty little editor that helps with compiling and loading code onto the thing. Really cool. It's all open source, hardware and software, and it's pretty easy to use.
It's the only computer I use that doesn't have an operating system, but as I got to thinking about that, I realized it wasn't really true. It does have an operating system. That nifty little text editor runs on Linux (in my case, probably Mac and Windows for other folks), and it has a bunch of convenient stuff that you can include in your programs without rewriting.
Which is really all an operating system is, I guess. It's just running on a different computer.
So, let me try again, Brian. What do you think about this as a paradigm shift: what if the future of computing consists of a bunch of appliances (virtual and physical) where the only "operating system" is a bunch of libraries that may or may not be used and live on a different computer? When you run a database server, instead of being something that lives in your operating system, it's something that lives in your virtualization environment, until you put it on a real piece of hardware. When you put it on the real piece of hardware, it's the only thing running there. No OS, no libraries, no nothing, just the bare database.
Of course, most of the time, you're using it on your desktop, with your OS-of-choice (have you switched to Windows 7 yet?), where it's running under VirtualBox or KVM or VMware, or whatever you use. And that's your OS.
It's kind of the opposite of what Java promised us: software designed to run in only one particular place, optimized for that place. The wacky thing is, it will probably deliver on Java's promise of write once, run anywhere, since virtual machines are coming along so quickly.
Which reminds me, how do I distinguish between a virtual-machine-like-Java that runs bytecode for a particular programming language family and a virtual-machine-like-VMware that runs a whole operating system?