Thursday, December 3, 2009
On Old Being New
Continuing on the riff Brian started, I was reflecting on some stuff I built a while back. In 2001 and 2002, I was one of the people who worked on Democrats.org. There were a whole mess of people involved in that project, from a whole mess of different organizations, and it was fun. I even married one of the people on the project.
One of the parts I was responsible for was the contribution processing code. I kept working on this through the end of 2004, and during the last part, it was pretty high volume. The system was written in Perl (HTML::Mason and XML::Comma), ran under mod_perl, and it processed credit cards in real time.
We used a cool trick to keep browsers from hanging up on long-running processes (sometimes the bank would take up to a minute to get back to us about the credit card). We immediately checked for easily caught errors (for instance, credit cards have a check digit that can help catch mistyping), then gave the browser an immediate response. That response was an HTML page written into a file on the server, which basically told the browser, "Yeah, still working on it, check back here in five seconds." In the background, we ran a forked process to talk to the bank, one that never talked to the browser directly. When the bank got back to us, the forked process would write our response into that same file, and the next time the browser checked back, they would see a response: "Thanks for Contributing," or whatever.
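The two halves of that trick can be sketched in modern Python (the original was a couple hundred lines of Perl under mod_perl; `talk_to_bank` and the file layout here are illustrative, not the actual code):

```python
import os

def luhn_valid(number: str) -> bool:
    """Check a card number's Luhn check digit to catch simple typos."""
    digits = [int(d) for d in number if d.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

STATUS_PAGE = ("<html><head><meta http-equiv='refresh' content='5'></head>"
               "<body>Still working on it, check back in five seconds.</body></html>")

def start_charge(txn_id: str, card: str, status_dir: str) -> str:
    """Validate cheaply, write a 'still working' page, then fork the slow part."""
    if not luhn_valid(card):
        raise ValueError("card number fails check digit")
    path = os.path.join(status_dir, txn_id + ".html")
    with open(path, "w") as f:
        f.write(STATUS_PAGE)          # the browser polls this file
    pid = os.fork()
    if pid == 0:                      # child: talk to the bank, then overwrite the file
        result = talk_to_bank(card)   # hypothetical slow call, up to a minute
        with open(path, "w") as f:
            f.write("<html><body>%s</body></html>" % result)
        os._exit(0)
    return path                       # parent returns to the browser immediately
```

The check-digit test costs nothing and catches most typos before any bank round-trip; everything slow happens in the child process while the browser happily polls a static file.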
We spent a lot of time refining the system over the next few years, and by the time the general election came around, we had processed some absurd nine figure sum of money with those couple hundred lines of Perl.
That basic strategy (do something easy, respond immediately, run long processes asynchronously) seems to be generating quite a lot of interest these days. Node.js does it for I/O, Google Go has it built into the language, NGINX does it for the Web, AnyEvent does it for Perl, Twisted in Python, EventMachine in Ruby, and the list goes on. There are differences in the way each runs this process, but that core description seems to hold, and it shares a lot in common with the way we did it nearly a decade ago. There's even a spec for this kind of thing in the works.
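The same shape, in today's event-loop style, can be sketched with Python's asyncio (analogous to what Node.js or Twisted provide; the ticket format is made up for the example):

```python
import asyncio

results = {}

async def slow_bank_call(txn_id):
    """Stand-in for an upstream call that may take a while."""
    await asyncio.sleep(0.05)
    results[txn_id] = "approved"

async def handle_request(txn_id):
    # Kick off the slow work and answer the client right away.
    task = asyncio.create_task(slow_bank_call(txn_id))
    return task, "202 Accepted: poll /status/" + txn_id

async def main():
    task, reply = await handle_request("t1")
    # reply is available immediately; the bank call finishes later
    await task
    return reply, results["t1"]

reply, status = asyncio.run(main())
print(reply, status)
```

Same idea, different plumbing: instead of fork() and a status file, an event loop juggles the slow work, but the client still gets an instant "working on it" and checks back for the result.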
Now, as the inimitable Ryan Dahl points out, there are plenty of reasons not to do development this way. Cultural reasons and technical reasons. He rejects the "it's too hard" argument, and gives some pretty convincing demos in counterpoint, and maybe he's right. Or maybe "cultural reasons" and "it's too hard" are the same thing, and all we really need is someone to come along and hand it to us on a silver platter. Probably an opportunity there.
Comments:
You left out POE, which has been doing this kind of thing for years (since 1998) :)
A couple other frameworks take it a step further and become "job queues" where you can have many workers (threads or processes) which take predefined jobs off a queue, do the work and then update the jobs. Things like Gearman, TheSchwartz (and I'm sure there's more).
We now have a custom in-house job queue that we use at PlusThree which takes that same original idea of yours and expands on it to do other things besides just contributions (sending emails, geocoding data, large list uploads, etc).
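The job-queue pattern Peters describes can be sketched with Python's standard library (Gearman and TheSchwartz are far more featureful; the job names and stand-in functions here are illustrative):

```python
import queue
import threading

jobs = queue.Queue()   # predefined jobs waiting for a worker
done = {}              # job_id -> result, updated as work completes

def worker():
    while True:
        job = jobs.get()
        if job is None:            # sentinel: shut this worker down
            break
        job_id, func, arg = job
        done[job_id] = func(arg)   # do the work, record the result
        jobs.task_done()

# a small pool of workers pulling jobs off the shared queue
threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()

# stand-ins for the real jobs: geocoding, email sends, list uploads...
jobs.put(("geocode-1", str.upper, "dc"))
jobs.put(("email-1", len, "hello@example.org"))

jobs.join()                        # wait until every job is marked done
for _ in threads:
    jobs.put(None)                 # one sentinel per worker
for t in threads:
    t.join()
```

The queue decouples "accept the job" from "do the job," which is exactly the contribution-processing trick generalized: any slow task can go on the queue, and you scale by adding workers.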
Heh, the original idea was hardly mine; I think Kwindla Kramer taught me that trick, but it had been in Perl since v3, IIRC (fork(), that is). It wasn't until Perrin came over from Ticketmaster (I think) that PlusThree started making serious progress on queuing. Nice to hear that it's still progressing.
It's interesting to think about how far the "hey, I'm working on this" promise has been pushed. As the internet becomes our time-sharing system (yay! I don't have to hand this paper-tape off to an operator to get my promise!) we need to push that message further, quicker and to more people.
It amazes me how much the UNIX philosophy got right. All of these wonderful evolutions in web tech (AJAX, fixing the C10K problem, extending HTTP to be asynchronous and bi-directional) seem to me to be an effort to bring backgroundable processes, time-sharing, and pipes (with feedback) to the people.
And they don't even know they have been indoctrinated.
Heh, I was thinking about how much we used to use mkfifo when we were parsing those same large data sets Peters was talking about up above. It wasn't a queuing system, as such, but it sure acted like one :-P
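The mkfifo-as-makeshift-queue idea can be demonstrated in a few lines of Python on a POSIX system (threads stand in for the separate shell processes that would actually sit on each end of the pipe):

```python
import os
import tempfile
import threading

# a named pipe acting as a crude one-shot queue between producer and consumer
workdir = tempfile.mkdtemp()
fifo = os.path.join(workdir, "work.fifo")
os.mkfifo(fifo)

def producer():
    # opening for write blocks until a reader opens the other end
    with open(fifo, "w") as f:
        f.write("record-1\nrecord-2\n")

t = threading.Thread(target=producer)
t.start()
with open(fifo) as f:              # consumer drains the pipe like a file
    lines = f.read().splitlines()
t.join()
os.remove(fifo)
os.rmdir(workdir)
print(lines)
```

Not a real queuing system (no persistence, no multiple consumers), but the blocking-open handshake gives you producer/consumer coordination for free, which is why it worked so well for streaming those big data sets.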