Several reasons, actually, but this post, "I, Cringely . The Pulpit . Data Debasement | PBS", gets to one of them.
Thanks in part to Larry Ellison's hard work and rapacious libido, databases are to be found everywhere. They lie at the bottom of most web applications and in nearly every bit of business software. If your web site uses dynamic content, you need a database. If you run SAP or any ERP or CRM application, you need a database. We're all using databases all the time, whether we actually have one installed on our personal computers or not.
But that's about to change.
We're entering the age of cloud computing, remember? And clouds, it turns out, don't like databases, at least not as they have traditionally been used.
This fact came out in my EmTech panel and all the experts onstage with me nodded sagely as my mind reeled. No database?
No database.
Not only is a (relational) database (server) hard to back up reliably, a processing bottleneck, and a single point of failure, it also doesn't distribute worth a damn, so it doesn't scale.
Hash tables, on the other hand...
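To make the hash-table remark a little more concrete, here is a minimal sketch of why a key-value table distributes so easily: each key hashes to exactly one node, so no node needs to know about any other. The class name `ShardedTable`, the use of plain dicts as stand-ins for nodes, and the choice of SHA-256 are all illustrative assumptions, not anything taken from the linked article.

```python
import hashlib


class ShardedTable:
    """Toy key-value store split across several 'nodes' (plain dicts here).

    Hypothetical illustration: a real system would put each node on its
    own machine and handle failures and resizing, which this sketch ignores.
    """

    def __init__(self, num_nodes):
        self.nodes = [dict() for _ in range(num_nodes)]

    def _node_for(self, key):
        # Hash the (string) key and map it onto one node. The same key
        # always lands on the same node, so lookups need no coordination.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        return self._node_for(key).get(key, default)


table = ShardedTable(num_nodes=4)
table.put("user:42", {"name": "example"})
print(table.get("user:42"))
```

What the sketch leaves out is exactly what a relational database provides: nothing here spans two keys, so there are no joins and no multi-key transactions to coordinate.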
Date: 2008-10-04 06:09 pm (UTC)
Not only is a (relational) database (server) hard to back up reliably, a processing bottleneck, and a single point of failure, it also doesn't distribute worth a damn, so it doesn't scale.
Yes, many of the fundamental properties that a database provides, such as the atomicity, consistency, and isolation of transactions, force the database to become a bottleneck. Mere persistence of data is no problem; the reason Google can keep the whole internet in memory is that when you google something, you aren't writing data. The internet doesn't change very fast, and if you google a web page that doesn't really exist any more, that isn't catastrophic.
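A rough sketch of that asymmetry, under stated assumptions: writes that must be atomic and isolated are forced through a single lock, while reads that tolerate stale data can be served from a copy without any coordination. The `Ledger` class and its method names are hypothetical, chosen only for illustration.

```python
import threading


class Ledger:
    """Toy account store: serialized writes, uncoordinated stale reads."""

    def __init__(self, balances):
        self._balances = dict(balances)
        self._lock = threading.Lock()    # every write funnels through here
        self._snapshot = dict(balances)  # possibly stale copy for readers

    def transfer(self, src, dst, amount):
        # Atomic and isolated: no other writer can observe or interleave
        # with the half-finished transfer. This is the bottleneck.
        with self._lock:
            if self._balances[src] < amount:
                raise ValueError("insufficient funds")
            self._balances[src] -= amount
            self._balances[dst] += amount

    def publish_snapshot(self):
        # Refresh the read-only copy from time to time.
        with self._lock:
            self._snapshot = dict(self._balances)

    def read_stale(self, account):
        # No lock taken: the answer may be out of date, like a search
        # result for a page that no longer exists.
        return self._snapshot[account]


ledger = Ledger({"alice": 100, "bob": 0})
ledger.transfer("alice", "bob", 30)
print(ledger.read_stale("alice"))  # still the old balance until a snapshot is published
```

Any number of copies of the snapshot could be handed to any number of readers, which is the part that scales; the lock is the part that does not.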
So you would expect database companies to put as little functionality in the data server as possible, to minimize the degree to which the data server is a bottleneck. In reality, the opposite happens. Database providers put more and more bells and whistles into their data servers, including the ability to run programs there. Why?
For the same reason that PepsiCo and the Coca-Cola Company started to sell bottled water. (Aquafina and Dasani, respectively. I had to look that up, but I was confident that they would sell bottled water just as I am confident that the sun will rise tomorrow.) It's hard for cola beverage companies to increase their cola beverage market share, so they have branched out into other kinds of soft drinks and bottled water. And solid food, too; PepsiCo owns Quaker Oats and Frito-Lay.
Similarly, it's hard for database companies to increase their database market share. But other opportunities exist, because most persistent data in the world is not stored in relational databases, and because computers don't just store data, they also compute. So now data servers can compute. When I was working for a big database company, my work involved putting a Java engine in the data server. This makes some sense, because computing something locally is faster than sending data halfway around the world, but it makes the data server more of a bottleneck.
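The trade-off in that last sentence can be sketched with a toy model that simply counts what crosses the network and what the server has to process. `DataServer`, `fetch_all`, and `run_on_server` are hypothetical names for illustration and do not correspond to any real database API.

```python
class DataServer:
    """Toy stand-in for a data server that tallies traffic and work."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.values_shipped = 0  # values sent over the (imaginary) network
        self.rows_computed = 0   # rows the server itself had to process

    def fetch_all(self):
        # Client-side computing: every row travels to the client.
        self.values_shipped += len(self._rows)
        return list(self._rows)

    def run_on_server(self, func):
        # Server-side computing: only the result travels, but the
        # server spends its own cycles on every row.
        self.rows_computed += len(self._rows)
        self.values_shipped += 1
        return func(self._rows)


client_side = DataServer(range(1000))
total_a = sum(client_side.fetch_all())

server_side = DataServer(range(1000))
total_b = server_side.run_on_server(sum)

print(total_a == total_b)
print(client_side.values_shipped, client_side.rows_computed)
print(server_side.values_shipped, server_side.rows_computed)
```

Both approaches give the same answer; they differ only in where the cost lands, and the second one lands it on the machine that is already the bottleneck.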
I don't expect the market for big centralized data servers to disappear overnight, but there are applications for which other solutions are preferable.
Date: 2008-10-04 07:25 pm (UTC)
But there's also the "if all you have is a hammer..." effect; it's not so much that database companies are trying to add computation so that they can increase their market share, it's that they need computation for some purposes anyway, and want to broaden their product category as much as possible because it's what they know how to sell and what their existing customers want.
(Merely branching out into adjacent but different technologies is another matter; some companies are able to pull this off, and some fail miserably at it. I've worked for both kinds.)
There are, as you point out, ways of breaking an application up so as to minimize the amount of time spent going through the bottleneck, but people (and companies) who are most familiar with databases are less likely to see them than those who approach it from a different direction.