Wednesday, October 3, 2007

Internet Is The New COBOL

I've been trying to quantify exactly what it is about software as a service that bugs me so much. Yeah, my experiences with it haven't been all that winning so far, but I shouldn't let my anecdotes taint the promise of the paradigm, right?

As Joel pointed out, computers will keep rolling out faster and with more memory, so don't sweat the micro-optimizations and build more features instead. No way do I disagree with that; Moore's Law keeps on chugging away, and today's monster rig is 18 months from now's Dell.

But as computers inexorably get faster, with more memory and bigger hard drives, our line to the outside world remains fairly constant. Putting aside the comparatively minor problem of pretending that software standards ever worked for anything beyond the trivial (can you name one standard that gave you anything approaching wood?), SaaS seems like a plug-ugly mistake for that reason.

  • The success of SaaS is predicated on the use of a scarce resource, the network.

This problem was driven home while performance testing our services. Internally, on a modest (that's a nice way of saying "hand-me-down") application server and database, we were able to push pretty good numbers through. We're feeling good about ourselves and how it's performing, so why not point the load servers at our staging environment up at the production facility? It's a managed site and the hardware isn't vintage, so we're expecting to see some solid throughput.

We start the ramp-up, get about halfway to where we peaked in our internal testing when things start shitting the bed. The load servers start timing out and they drop out of the test.

We saturated our T1.

Oh. Network pipes don't double in size every 18 months? So we set off scrambling again. Do we order another T1 or two? They'll take 30-45 days to install, and then we'll be paying to have them lie fallow after we run our few days' worth of tests. Maybe more importantly, we're not sure that our clients have any fatter pipes than we do.
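
For a sense of how low that ceiling sits, here's a back-of-the-envelope sketch in Python. The T1 line rate is the standard 1.544 Mbit/s; the payload size and header overhead are numbers I'm making up for illustration, not measurements from our tests.

    # Back-of-the-envelope ceiling for service calls over a single T1.
    # The payload size and overhead factor are assumptions, not measurements.
    T1_BITS_PER_SEC = 1.544e6      # nominal T1 line rate
    PAYLOAD_BYTES = 8 * 1024       # assume ~8 KB per request/response pair
    OVERHEAD = 1.10                # rough fudge for HTTP/TCP/IP headers

    bits_per_call = PAYLOAD_BYTES * 8 * OVERHEAD
    calls_per_sec = T1_BITS_PER_SEC / bits_per_call
    print("Max sustainable calls/sec on one T1: %.1f" % calls_per_sec)
    # ~21 calls/sec, before retransmits or anyone else sharing the pipe.
    # The application servers never even get a chance to sweat.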

Do we find someone with bigger pipes than we've got and tote our load machines over there for a few days? They'll gladly let us set up shop there for a perfectly unreasonable price. Oh, but our connection to our back-end won't work from there, so we'll need to be teleconferencing with someone on-site monitoring the servers. That complicates things, and we still don't know that performance is even a problem.

What we do know is that the network is causing a lot of problems that we can't easily throw more hardware at. When it comes to what a computer can do, the graph trends up and to the right.

When it comes to the stuff backing your service calls, how much shit can you stuff in that five pound sack?

XML is bloated. Really, really bloated. It was designed as a human-readable markup language (it's what puts the ML in XML) but basing communications protocols on it was a dubious decision, hindsight or otherwise. Five pounds.

JSON is less bloated, but JSON parsers aren't as ubiquitous as XML parsers, and business people will object because they can't juggle two acronyms at the same time. Their tech guys don't know JSON either, but they have a sneaking suspicion that it means way more work for them, so you're getting doubly steered back to XML. Six pounds.

You can compress the HTTP that either of them gets shot out over but, like JSON, not all clients are going to be able to deal with compression. Six and a half pounds.
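
To put rough numbers on those pound counts, here's a toy comparison in Python. The order payload is something I'm inventing for illustration, and zlib's deflate stands in for HTTP-level gzip; real service payloads will shake out differently, but the ratios usually look about like this.

    # Toy size comparison: the same made-up record as XML, as JSON, and deflated.
    import zlib

    xml = ("<order><id>12345</id><status>shipped</status>"
           "<items><item><sku>A-100</sku><qty>2</qty></item></items></order>")
    json_doc = '{"id":12345,"status":"shipped","items":[{"sku":"A-100","qty":2}]}'

    for label, doc in (("XML", xml), ("JSON", json_doc)):
        raw = len(doc)
        packed = len(zlib.compress(doc.encode("utf-8")))
        print("%-4s raw: %3d bytes   deflated: %3d bytes" % (label, raw, packed))
    # JSON skips the closing-tag tax; compression claws back more still,
    # but only if both ends of the wire can actually speak it.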

In college, professors told me that in the Bad Old Days of computing, you didn't own the computers you worked on. You paid oodles of money to lease an IBM rig and keep it running, and even then it shipped with more hard drive space and more CPUs than were ever turned on, capacity IBM could switch on remotely (over a phone line that you paid for).

"But professor, that's awful! You pay all that money and you don't even get all the computer that you could be using? And you have to pay for a phone line so their techs can dial in and turn the magic on?"

"No, that's a good thing. You built your applications under constraints and when you ran into a wall because your app was running too slowly or you were running out of disk space, a call and a few hours later, magic happens and your app's running fine and disk space is no longer an issue."

Curiously, Amazon's following IBM's lead with their S3 and EC2 offerings. Need more space? Got it. More computational power? Bingo bango.
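
For what it's worth, this is roughly what "need more space" looks like with the boto library against S3. The credentials, bucket name, and file below are placeholders, not anything real; the point is how little ceremony the storage side asks for.

    # A sketch of "need more space? got it." using the boto library.
    # Credentials, bucket name, and file are placeholders, not real values.
    from boto.s3.connection import S3Connection

    conn = S3Connection("ACCESS_KEY_ID", "SECRET_ACCESS_KEY")
    bucket = conn.create_bucket("example-load-test-dumps")   # hypothetical bucket

    key = bucket.new_key("results/run-001.log")
    key.set_contents_from_filename("run-001.log")            # push a local file up

    # The storage just appears. Nobody stops to ask whether the pipe
    # you're pushing all of it through can keep up.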

God help you if you need more bandwidth to make those calls to S3 or EC2. Not even God can help you if your clients are running into a brick wall because they've saturated their pipes calling your services.

Like buying IBM, basing your architecture around decentralized, network-hosted servers with flexibly vast resources won't get you cockpunched by most people for making an impossibly wrong decision, but I'll still hate on you because that's how I do.

  1. We already knew that storage space and computational power were cheap and vast. Amazon's maybe made them more so, but that's nothing new.
  2. For what it is, the pay-as-you-go model isn't awful. You wouldn't consider it if you could build your own disparately-hosted server farm, but you don't got the bankroll to roll like that, which is why you've gone this route.
  3. Wait a fucking second. You knew that the network wasn't going to get any faster and you designed your application around using it extensively anyway?

Congratulations. You've discovered the brave new frontier of decentralized internets architecture and it looks a whole lot like a fucking mainframe.

Web 2.0, meet Web 0.7. Web 0.7, meet UNIVAC.
