Python Notes

Wednesday, September 15, 2004

Do we really need SQL?

It's been a few busy weeks since I returned to the world of business application development. This time, however, I'm having a chance to take a fresh look at it. Although I'm still too used to some tools to change my mind very quickly (IDEs, for example), there are a few things that I'm really keen to try out. One of them sounds like anathema -- do I really need a true RDBMS for my apps? (Note: I'm using the terms SQL and RDBMS rather interchangeably for the purposes of this discussion. I hope my readers will not find it offensive.)

A long time ago, I read an interesting paper criticizing the overuse of RDBMSs (I tried to Google it, but I can't find it now). The author -- an old-time IBM researcher -- argued that RDBMSs are not needed after all, and that (almost) everything that is done with SQL can be done more efficiently with old-style flat files and batch processing. It's hard to agree with his conclusion, but he pointed out several issues with relational databases and SQL that are worth thinking about. The main problem is that relational theory is a purely mathematical construction that overgeneralizes the problem at hand, and in the process imposes several layers of abstraction on things that can be done dirt cheap if you just do them the "ugly way". Seen from this perspective, SQL is plain overkill for lots of applications, and one can live happily with a much simpler persistence model.

Then today, I read another article, on object prevalence: the idea of keeping all business objects in memory and making them durable by logging every change to disk. The argument is sound, and it made me think (again) about the actual need for RDBMSs. It's now clear to me that there are few good reasons to deploy a full-fledged RDBMS for small apps, and even for relatively big apps. Storing data in memory sounds better than most people realize. Many apps never have more than a few thousand records, and storing everything in memory is a clear winner in terms of performance. Even big tables -- those with a few hundred thousand records -- can actually be stored in memory if well structured.
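
To make the prevalence idea a bit more concrete, here is a minimal sketch in Python. It is not taken from the article mentioned above, nor from any particular prevalence library; the class name PrevalentStore, its methods, and the sample records are all made up for illustration. The assumption is that every change is appended to a log file before it is applied in memory, and that the log is replayed on startup. A real implementation would also take periodic snapshots so the log does not grow forever, which is left out here.

```python
import os
import pickle
import tempfile


class PrevalentStore:
    """Hypothetical in-memory record store, made durable by an append-only log."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.records = {}
        self._replay()

    def _replay(self):
        # Rebuild the in-memory state by re-applying every logged command.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "rb") as log:
            while True:
                try:
                    op, key, value = pickle.load(log)
                except EOFError:
                    break
                self._apply(op, key, value)

    def _apply(self, op, key, value):
        if op == "put":
            self.records[key] = value
        elif op == "delete":
            self.records.pop(key, None)

    def _execute(self, op, key, value=None):
        # Write the command to disk first, then change the in-memory state.
        with open(self.log_path, "ab") as log:
            pickle.dump((op, key, value), log)
            log.flush()
            os.fsync(log.fileno())
        self._apply(op, key, value)

    def put(self, key, value):
        self._execute("put", key, value)

    def delete(self, key):
        self._execute("delete", key)

    def find(self, predicate):
        # Queries are plain Python over in-memory data; no SQL involved.
        return [r for r in self.records.values() if predicate(r)]


# Sample usage with invented data, writing the log to a temporary directory.
log_path = os.path.join(tempfile.mkdtemp(), "customers.log")
store = PrevalentStore(log_path)
store.put(1, {"name": "Ana", "city": "Lisbon"})
store.put(2, {"name": "Bob", "city": "Porto"})
store.delete(2)

# Opening the same log again simulates restarting the application.
restarted = PrevalentStore(log_path)
print(restarted.find(lambda r: r["city"] == "Lisbon"))
```

The point of the sketch is that reads never touch the disk, while writes cost one small append. With a few thousand records, a list comprehension over a dictionary is likely fast enough that indexes and a query planner are not missed.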

So why do we need to deploy RDBMSs today? There are still a few things that an RDBMS/SQL combo does well, and that are good reasons to keep the RDBMS backend.

  • Customers are used to it. Tell your customer that you're using a SQL database, and he'll feel fine about it. Tell him that you're using a proprietary high-performance scheme at absolutely zero extra cost and watch his reaction.
  • SQL makes report generation easier. While this is not strictly true, there are countless report generators around, and some of them will probably be able to connect to your database and retrieve data using SQL. This puts more power in the hands of the customers, and buys them peace of mind.
  • SQL is network enabled. It's easy to separate the SQL server from the application code, whereas a custom memory-based persistence setup needs special design to take networking into account. On the other hand, if your app is not going to need to exchange raw data with anyone else, then your app server can do just fine by serving custom XML feeds to all its clients.
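
As a rough illustration of the last point, here is how an app server might turn in-memory records into a custom XML feed. This is only a sketch: the function name records_to_xml, the tag names, and the sample record are assumptions made for the example, and the part that actually sends the string over the network is left out.

```python
import xml.etree.ElementTree as ET


def records_to_xml(records, root_tag="customers", item_tag="customer"):
    """Serialize a dict of {key: {field: value}} records as an XML string.

    Hypothetical helper; field names must be valid XML tag names.
    """
    root = ET.Element(root_tag)
    for key, fields in sorted(records.items()):
        item = ET.SubElement(root, item_tag, id=str(key))
        for name, value in fields.items():
            ET.SubElement(item, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Sample usage with invented data.
feed = records_to_xml({1: {"name": "Ana", "city": "Lisbon"}})
print(feed)
```

Any client that can parse XML can consume such a feed without knowing how the data is stored on the server side.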



For now, I'm just wondering if the entire SQL stuff is really worth the price for some of my current assignments. I think that it's better to be conservative in this regard, so I'll probably keep the RDBMS backend -- for now. Future projects may be different...
