NuoDB at the BBBT: Another One Bringing SQL to the Cloud

Today’s presentation in front of the BBBT was by NuoDB’s CTO, Seth Proctor. NuoDB is a small company with big investments. What makes them so interesting? It’s the same question raised by many of the other platform presenters at the BBBT: how do we get real databases in the Cloud?

Hadoop is an interesting experiment and has clearly brought value to the understanding of massive amounts of unstructured data. The main value, though, remains that it’s cheap. The lack of SQL means it’s ok for point solutions that don’t stress its performance limitations. Bringing enterprise database support to the cloud is something else.

The main limitation is that Hadoop and other unstructured databases aren’t able to handle transactional systems while those still remain the major driver in operating businesses.

NuoDB has redesigned the database from the ground up to be able to run distributed across the internet. They’ve created a peer-to-peer structure of processes, with separate processes to manage the database and SQL front end transaction issues.

Seth pointed out that they “have done nothing new, just things we know put together in a new way.” He also pointed out they have patents. My gripe about patents for software is an issue for another day, but that dichotomous pairing points to one reason (Apple’s patent on a rounded rectangle is another example of the broken patent system, but off the soap box and onwards…).

It’s clear that old line RDBMS systems were designed for large, on-premises servers. The need for a distributed system is clear and NuoDB is at the forefront of creating that. One intriguing potential strength, one there wasn’t time to discuss in the presentation, is a statement about the object-oriented structure needed for truly distributed applications.

Mr. Proctor stated that the database schema is in object definitions, not hard coded into the database. He added that provides more flexibility on the fly. What it also could mean is that the schema isn’t restricted to purely RDBMS schemas and that future versions of their database could support columnar and even unstructured database support. For now, however, the basic ability to change even a standard row-based relational database on the fly without major impacts on performance or existing applications is a strong benefit.

As the company is young and focused on the distributed aspects of performance, it was also admitted that their system isn’t one for big data, even structured data. They’re not ready for terabytes, not to mention petabytes, of data.

The Business

That’s the techie side, but what about business?

The company is focused on providing support for distributed operational systems. As such, Seth made clear they haven’t looked at implementations supporting both operational and analytical systems. That means BI is not a focus and so the product might not be the right system for providing high level business insight.

In addition, while I asked about markets, I mainly got an answer about Web sites. They seem to think the major market isn’t Global 1000 businesses looking to link distributed operational systems but that Web commerce sites are their sweet spot. One example referred to a few times was in transactional systems for businesses selling across a country or around the world. If that’s the focus, it’s one that needs to be made more explicit on their web site, which really doesn’t discuss markets in the least.

It’s also an entry into the larger financial markets space. It and medical have always been two key verticals for new database technologies due to the volumes of information. That also means they need to prioritize the admitted lack of large database support or they’ll hit walls above the SMB market.

The one business thing that bothers me is their pricing model. It’s based on the number of hosts. As the product is based on processes, there’s no set number of processes per host. In addition, they mentioned shared hosting, places such as AWS, where hosts may be shared by several of NuoDB’s customers or where load balancing might take your processes and have them on one host one day and multiple hosts the next.

Host-based pricing seems to be a remnant of on-premises database systems that Cloud vendors claim to be leaving. In a distributed, internet-based setup, who cares how big the host is, where the host is, or anything else about the host? The work the customer cares about is done by the processes, the objects containing the knowledge and expertise of NuoDB, not the servers owned by the hosting firm. I would expect that Cloud companies would move from processors to process.
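To make the concern concrete, here is a minimal sketch of the arithmetic. Everything in it is an assumption for illustration: the `Deployment` type, the function names and the rates are hypothetical and are not NuoDB’s actual price list. It only shows that the same set of processes can produce different bills under per-host pricing, depending on where a load balancer happens to place them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """A hypothetical snapshot of where a customer's processes run."""
    hosts: int       # number of hosts the processes are spread across
    processes: int   # number of database processes doing the work


def host_based_cost(deployment: Deployment, rate_per_host: float) -> float:
    """Bill by host count, regardless of how much work is done."""
    return deployment.hosts * rate_per_host


def process_based_cost(deployment: Deployment, rate_per_process: float) -> float:
    """Bill by process count, regardless of where the processes land."""
    return deployment.processes * rate_per_process


# Same six processes, placed differently on two different days.
packed = Deployment(hosts=1, processes=6)
spread = Deployment(hosts=3, processes=6)

# Illustrative rates only (assumed, not vendor pricing).
RATE_PER_HOST = 100.0
RATE_PER_PROCESS = 20.0

for name, d in (("packed", packed), ("spread", spread)):
    print(name,
          host_based_cost(d, RATE_PER_HOST),
          process_based_cost(d, RATE_PER_PROCESS))
```

Under these assumed rates the per-host bill changes with placement while the per-process bill does not, which is the predictability argument for pricing by process.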

Summary

NuoDB is a company focused on reinventing the SQL database for the Cloud. They have significant investment from the VC and business markets. However, it would be foolish to think that Oracle, IBM and other existing mainstream RDBMS vendors aren’t working on the same thing. What NuoDB described to the BBBT used most of the right words from the technology front and they’re ramping up their development based on the investments, but it’s too early to say if they understand their own products and markets enough to build a presence for the long term.

They have what looks like very interesting technology but, as I keep repeating in review after review, we know that’s not enough.

Cloudera at the BBBT: The limits of Open Source as a business model

Way back, in the dawn of time, there were AT&T and BSD, with multiple flavors of each base type of Unix. A few years later, there were only Sun, IBM and HP. In a later era, there was this thing called Linux. Lots of folks took the core version, but then there were only Red Hat and a few others.

What lessons can the Hadoop market learn from that? Mission critical software does not run on freeware. While open source lowers infrastructure costs and can, in some ways, speed feature enhancements, companies are willing to pay for knowledge, stability and support. Vendors able to wrap the core of open source up in services to provide the rest make money and speed the adoption of open-source based solutions. Mission critical applications run on services agreements.

It’s important to understand that distinction when discussing such interesting companies as Cloudera, whose team presented at last Friday’s BBBT session. The company recently received a well-publicized, enormous investment based on the promise that it can create a revenue stream for a database service based on Hadoop.

The team had a good presentation, with Alan Saldich, VP Marketing, pointing out that large, distributed processing databases are providing a change from “bringing data to compute” to “bringing compute to data.” He further defined the Enterprise Data Hub (EDH) as the data repository that is created in such an environment.

Plenty of others can blog in detail about what we heard about the technology, but I’ll give it only a high level glance. The Cloudera presenters were very open about their product being an early generation and they laid out a vision that seemed to be good. They understand their advantages are the benefits of Cloud and Hadoop (discussed a little more below) but that the Open Source community is lagging in areas such as access to and control of data. Meeting such key IT needs is what will help their adoption and provide a revenue stream, and their knowing that is a good sign.

I want to spend more time addressing the business and marketing models. Cloudera does seem to be struggling to figure out how to make money, hence the need for such a large investment from Intel. Additional proof is the internal confusion of Alan saying they don’t report revenues and then showing us only bookings, while Charles Zedlewski, VP Products, had a slide claiming they’re leading their market in revenue. Really? Then show us.

They do have one advantage: the Cloud model lends itself to a pricing model based on nodes and, as Charles pointed out, that’s a “business model that’s inherently deflationary” for the customer. Nodes get more powerful, so customers regularly get more bang for the buck.
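The “deflationary” point is simple arithmetic, sketched below. The price and the capacity figures are made-up numbers for illustration, not anything Cloudera presented, and `cost_per_unit_of_capacity` is a hypothetical helper. The only assumption carried over from the presentation is that the price is charged per node while nodes keep getting more powerful.

```python
def cost_per_unit_of_capacity(price_per_node: float, capacity_per_node: float) -> float:
    """Effective price the customer pays for one unit of node capacity."""
    if capacity_per_node <= 0:
        raise ValueError("capacity_per_node must be positive")
    return price_per_node / capacity_per_node


# Assumed flat per-node price and assumed relative node capacity by hardware generation.
PRICE_PER_NODE = 1000.0
capacity_by_generation = [1.0, 2.0, 4.0]

effective_costs = [
    cost_per_unit_of_capacity(PRICE_PER_NODE, capacity)
    for capacity in capacity_by_generation
]

for generation, cost in enumerate(effective_costs, start=1):
    print(f"generation {generation}: {cost:.2f} per unit of capacity")
```

With a flat per-node price, each more powerful generation lowers the effective cost of the same workload, which is good for the customer and a revenue headwind for the vendor unless node counts grow.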

On the other side, I don’t know that management understands that they’re just providing a new technology, not a new data philosophy. While some parts of the presentation made clear that Cloudera doesn’t replace other data repositories except for the operational data store, different parts implied it would subsume others without giving a clear picture of how.

A very good point was the partnerships they’re making with BI vendors to help speed integration and access of their solution into the BI ecosystem.

One other confusion is that Cloudera, and the market as a whole, doesn’t seem to be clearly differentiating that the benefits of Hadoop come from multiple technologies: both the software that helps better manage unstructured data and the simple hardware/OS combination that comes from massively parallel processing, whether the servers are in the Cloud or inside a corporate firewall. Much of what was said about Hadoop had to do with the second issue, and so the presenters rightfully got pushback from analysts who saw that RDBMS technologies can benefit from those same things, which minimizes that as a differentiator.

Charles did cover an important area of both market need and Cloudera vision: Operational analytics. The ability to quickly massage and understand massive amounts of operational information to better understand processes is something that will be enhanced by the vendor’s ability to manage large datasets. The fact that they understand the importance of those analytics is a good sign for corporate vision and planning.

Open source is important, but it’s often overblown by those new to the industry or within the Open Source community. Enterprise IT knows better, as it has proved in the past. Cloudera is in the right place at the right time, with a great early product and an understanding of many of the issues that need to be addressed in the short term. The questions are only about the ability to execute on both the messaging and programming sides. Will their products meet the long term needs of business critical applications and will they be able to explain clearly how they can do so? If they can answer correctly, the company will join the names mentioned at the start.