Category Archives: Enterprise Software

DataHero at the BBBT: A Startup Getting It Right

First, on a tangent not directly focused on the product: Thank you, Chris Neumann, CEO of DataHero. After hearing presenters from multiple companies consistently use the wrong words over the last few months, you used both premise and premises in the appropriate places. Thanks!

As you might gather, Wednesday’s presentation at the BBBT was by DataHero. A fairly young company, less than three years old, DataHero is focused on “Delivering a self-service Cloud BI solution that enables enterprise and SMB users to analyze and visualize their SAAS-based data without IT.”

Self-service BI is what almost all the players, both new and mature companies, are trying to provide these days. That means DataHero is another player attempting to help business knowledge workers connect to data, analyze it and gather useful, actionable information without heavy intervention by business analysts and IT.

Cloud is also where everyone’s moving, since it has so many advantages for all areas of software. DataHero, as a small company, isn’t just in the Cloud. They’ve smartly decided to begin by focusing on public Cloud applications with accessible APIs.

While that initially simplifies things, the necessity to handle complexity still exists in that world. Mike Ferguson, another BBBT member analyst, pointed out that many of his clients have multiple, customized Salesforce.com instances, and that’s bringing the upgrade issues seen in on-premises systems into the Cloud world. Chris acknowledges that and understands the need to grow to handle the issue, but knows that at DataHero’s current size there’s enough of a market for an initially more focused solution.

A strategic issue comes up with the basic nature of the Cloud. Mr. Neumann mentioned Cloud being opposed to centralized data, but that’s not quite so. Depending on how Cloud systems are set up, they can help or hinder centralization of data. However, right now he is accurate in that most of the growth of Cloud is departmental in nature. It’s also further blurring the always fuzzy line between enterprise and SMB markets by providing applications that both groups can leverage.

Another area that shows thought in their growth strategy is entry into new markets. Chris is clear that they dip their toes into an arena, check reactions, and if those are positive then try to partner with as many companies in the space as possible to maintain neutrality. That means they don’t get locked into the first vendor the first client wants to work with, regardless of market control, leaving flexibility for customers. Their partner page, though young, clearly shows that strategy in effect. That’s a good move and I wish more vendors would think that way.

Another key growth issue is data cleansing. Right now, DataHero does none, expecting that the source system provides that capability. However, as clients use more and more source systems, there’s a need for cleansing to resolve data clashes between those systems. That’s something the team at DataHero says they’re aware of, though again it’s future growth (no time frames, as per legal sanity…).

The demo was very interesting. The other founder, Jeff Zabel, has a strong history in designing interfaces for software in vehicles, meaning usability really matters. That can be seen in a very clear and simple interface. It is easy to use. However, as pointed out by many other companies, 80% of business data has a location component, and many of DataHero’s competitors are far ahead of them in the area of geospatial information. That’s a key area they’ll have to improve.

Summary

DataHero is a young company with a young product. The key is that they aren’t just looking at their cool product and customizing solely based on first sales. The BI tool is clearly fully fledged for the market segment they’ve chosen for initial release, and they have thought through their growth strategy in far more detail than I’ve seen from other vendors who have presented at the BBBT.

If they execute their vision, and I see no reason why they wouldn’t, the folks at DataHero have a bright future.

Splunk at BBBT: Messages Need to Evolve Too

Our presenters last Friday at the BBBT were Brett Sheppard and Manish Jiandani from Splunk. The company was founded on understanding machine data and the presentation was full of that phrase and focus. However, machine data has a specific meaning and that’s no longer all Splunk does. They speak about operational intelligence, but that message needs to bubble up and take over.

Splunk has been public since 2012 and has over 1200 employees, something not many people realize. They were founded in 2004 to address the growing amount of machine data and the main goal the presenters showed is to “Make machine data accessible, usable and valuable to everyone.”

However, their presentation focused on Splunk’s ability to access IVR (Interactive Voice Response) and Twitter transcripts, and that’s not machine data. When questioned, they pointed out that they don’t do semantic analysis but focus on the timestamp and other machine generated data to understand operational flow. Still, while you might stretch and call that machine data, they also displayed some very simple analytics on the occurrence of keywords in text, and that’s not it.

It’s clear that Splunk has successfully moved past pure machine data into a more robust operational intelligence solution. However, being techies from the Bay Area, it seems they still have their focus on the technology and its origins. They’re now pulling information from sources other than just machines, but are primarily analyzing the context of that information. As Suzanne Hoffman (@revenuemaven), another BBBT member analyst, pointed out during the presentation, they’re focused on the metadata associated with operational data and how to use that metadata to better understand operational processes.

Their demo was typical: nothing great, but all the pieces were there. The visualizations are simple and clear, and they claim the data is accessible to BI vendors for better analytics. However, note that they have a proprietary database and provide access through ODBC and an API. Mileage may vary.

There was also a confusing message in the claim that they’re not optimized for structured data. Machine data is structured. While it often doesn’t have clear field boundaries, there’s a clear structure, and simple parsing lets you know what the fields and data are in the stream. What they really mean is that they’re not optimal for RDBMS data. They suggest that you integrate Splunk and relational data downstream via a BI tool. That makes sense, but again they need to clarify and expose that information in a better way.
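To make that point concrete, here’s a minimal Python sketch of how machine data with no declared field boundaries still parses into clean fields. The log line, field names and values are invented for illustration; this isn’t a sample of what Splunk actually ingests.

```python
import re
from datetime import datetime

# A hypothetical log line; the format and field names are invented for
# illustration, not taken from Splunk.
line = '2014-06-27T14:03:22Z host=web01 status=504 latency_ms=2310 path=/checkout'

# No declared schema, but the structure is obvious: a timestamp followed
# by key=value pairs. One regex recovers the fields.
timestamp_str, _, rest = line.partition(' ')
event = {'timestamp': datetime.strptime(timestamp_str, '%Y-%m-%dT%H:%M:%SZ')}
event.update(pair.split('=', 1) for pair in re.findall(r'\S+=\S+', rest))

print(event['timestamp'], event['status'], event['latency_ms'])
# 2014-06-27 14:03:22 504 2310
```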

And then there’s the messaging nit. While business is my main focus, technology presented with the incorrect words jars an educated audience. Splunk is not the first company, nor, sadly, will it be the last, to have people who are confused about the difference between “premise” and “premises.” However, usually it’s only one person in a presentation. The slides and both presenters showed a corporate confusion that leads me to the premise that they’re not aware of how to properly present the difference between Cloud and on-premises solutions.

Hunk: On the Hadoop Bandwagon

Another messaging issue was the repeated mention of Hunk without an explanation. Only later in the presentation did they focus on it. Hunk is their product that puts the Splunk Enterprise technology on Hadoop. Let me be clear: it’s not just accessing Hadoop information for analysis but moving the storage from their proprietary system to Hadoop.

This is a smart move and helps address those customers who are heavily invested in Hadoop and, at least at the presentation level, they have a strong message about having the same functionality as in their core product, just residing on a different technology.

Note that this is not just helping the customer, it helps Splunk scale their own database in order to reach a wider range of customers. It’s a smart business move.

Security, Call Centers and Changing the Focus

The focus of their business message and a large group of customer slides is, no surprise, on network security and call center performance. The ability to look at the large amount of data and provide analysis of security anomalies means that Splunk is in the Gartner Magic Quadrant for SIEM (Security Information and Event Management).

In addition, IVR was mentioned earlier. That, combined with other call center data, allows Splunk to provide information that helps companies better understand and improve call center effectiveness. It’s a nice bridge from pure machine data to more full-featured data analysis.

That difference was shown by what I thought was the most enlightening customer slide, one about Tesco. For my primarily US readers, Tesco is a major grocery chain, with divisions focused on everything from the corner market to supermarkets. They are headquartered in England, are the major player in Europe and the second largest retailer by profit after Walmart.

As described, Tesco began using Splunk to analyze network and website performance, focused on the purely machine data concerns for performance. As they saw the benefit of the product to more areas, they expanded to customer revenue, online shopping cart data and other higher level business functions for analysis and improvement.

Summary

Splunk is a robust and growing company focused on providing operational intelligence. Unfortunately, their messaging is lagging their business. They still focus on machine data as the core message because that was their technical and business focus in the last decade. I have no doubts that they’ll keep growing, but a better clarification of their strategy, priorities and messages will help a wider market more quickly understand their benefits.

Datawatch at BBBT: Another contender and another question of message

Yesterday’s presentation to the BBBT was by Datawatch personnel Ben Plummer, CMO, and Jon Pilkington, VP Products. As they readily admit, they’re a company with a long history about which most people in the industry have never heard. They were founded in the 1980s and went public in the 1990s. Their focus is data visualization, but much of their business has been reseller and OEM agreements with companies including SAP, IBM and Tibco.

The core of their past success was basic presentation of flat file information through their Monarch product. It was only with the acquisition and initial integration of Panopticon in 2013, providing access to far more unstructured data, that they rebranded around data visualization and began to push strongly into the BI space.

The demo was very standard. Everyone wants to show their design interface and how easy it is to build dashboards. Their demonstration was in the middle of the pack. The issue I had was the messaging. It’s no surprise that everyone claiming to be a visualization company needs to show visualization, but if you’re not one of the very flashy companies, your message about building your visualization should be different.

Datawatch’s strengths seem to be three-fold:

  • Access to a very wide variety of data sources.
  • Access to in-motion (streaming) data.
  • Full service from data access to presentation.

While Ben’s presentation talked about the importance of the Internet of Things and argued that real-time data is transactional, Jon’s presentation didn’t support those points. Datawatch is another company working to integrate structured and non-structured data, and they seem to have a good focus on real-time; those need to be messages throughout their marketing, and that means in the demo too.

Back from that tangent to the mainline. The third point is a major key. Major ETL and data warehouse vendors aren’t going away, but for basic BI it adds cost and time to have to look at both an ETL tool and a data visualization tool, which may not work together as well as the demoware indicates (a surprise, I know…). The companies who can deliver the full data supply chain from source to visualization can much more quickly and affordably add value for the business managers wanting better BI. I know it’s a fine line to message that while still working with vendors who overlap somewhat, but that’s why the term coopetition was coined.

They seem to have a good vision, but they haven’t worked to create a consistent and differentiated message. That could be because of resources, and hopefully that will change. In February of this year Datawatch issued a common stock offering that netted them more cash. Hopefully some of that will be spent on creating strong and consistent marketing. That also includes such simple things as changing press releases to be visible from the PR link as HTML, not just PDFs.

Summary

I know you’re getting tired of hearing the following refrain, but here it is again: I’ve heard this message before. The market is getting crowded with companies trying to support modern BI that’s a blend of structured and unstructured data. Technologists love to tweak products and think that minor, or even major, technical features that aren’t visibly relevant to the market should sell the product all by themselves. Just throw some key market points on top and claim you have no competitors because your technology is so cool.

BI and big data are cool right now and there are a large number of firms attempting to fill a need. Datawatch seems to have the foundations for a good, integrated platform from heterogeneous data access to visual presentation of actionable information. That message needs to quickly become stronger and clearer. This is a race: being in shape isn’t enough, you have to have the right strategy and tactics to win. Datawatch has a chance; will they stumble or end up on the podium?

NuoDB at the BBBT: Another One Bringing SQL to the Cloud

Today’s presentation in front of the BBBT was by NuoDB’s CTO, Seth Proctor. NuoDB is a small company with big investments. What makes them so interesting? The same thing as with many of the other platform presenters at the BBBT: how do we get real databases in the Cloud?

Hadoop is an interesting experiment and has clearly brought value to the understanding of massive amounts of unstructured data. The main value, though, remains that it’s cheap. The lack of SQL means it’s ok for point solutions that don’t stress its performance limitations. Bringing enterprise database support to the cloud is something else.

The main limitation is that Hadoop and other unstructured databases aren’t able to handle transactional systems while those still remain the major driver in operating businesses.

NuoDB has redesigned the database from the ground up to be able to run distributed across the internet. They’ve created a peer-to-peer structure of processes, with separate processes to manage the database and SQL front end transaction issues.

Seth pointed out that they “have done nothing new, just things we know put together in a new way.” He also pointed out that they have patents. My gripe about software patents is an issue for another day, but that dichotomous pairing points to one reason for it (Apple’s patent on a rounded rectangle is another example of the broken patent system, but off the soap box and onwards…).

It’s clear that old-line RDBMS systems were designed for large, on-premises servers. The need for a distributed system is clear, and NuoDB is at the forefront of creating one. One intriguing potential strength, one there wasn’t time to discuss in the presentation, is a statement about the object-oriented structure needed for truly distributed applications.

Mr. Proctor stated that the database schema lives in object definitions, not hard coded into the database, and added that this provides more flexibility on the fly. What it also could mean is that the schema isn’t restricted to purely RDBMS schemas, and that future versions could add columnar and even unstructured storage. For now, however, the basic ability to change even a standard row-based relational database on the fly, without major impacts on performance or existing applications, is a strong benefit.
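There was no deep demo of this, so below is only a minimal sketch of what an on-the-fly schema change looks like from the application side. Python’s built-in sqlite3 stands in for NuoDB purely for illustration; the claimed NuoDB advantage is that this kind of live DDL doesn’t disrupt a running, distributed database.

```python
import sqlite3

# sqlite3 is a stand-in; nothing here is NuoDB's API. The point is only
# the shape of the operation: the application stays up while DDL runs.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)')
conn.execute('INSERT INTO orders (total) VALUES (19.99)')

# Schema changes on the fly; existing rows pick up the default.
conn.execute("ALTER TABLE orders ADD COLUMN currency TEXT DEFAULT 'USD'")

for row in conn.execute('SELECT id, total, currency FROM orders'):
    print(row)  # (1, 19.99, 'USD')
```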

As the company is young and focused on the distributed aspects of performance, it was also admitted that their system isn’t one for big data, even structured data. They’re not ready for terabytes, not to mention petabytes, of data.

The Business

That’s the techie side, but what about business?

The company is focused on providing support for distributed operational systems. As such, Seth made clear they haven’t looked at implementations supporting both operational and analytical systems. That means BI is not a focus and so the product might not be the right system for providing high level business insight.

In addition, while I asked about markets, I mainly got an answer about Web sites. They seem to think the major market isn’t Global 1000 businesses looking to link distributed operational systems; rather, Web commerce sites are their sweet spot. One example referred to a few times was transactional systems for businesses selling across a country or around the world. If that’s the focus, it’s one that needs to be made more explicit on their web site, which really doesn’t discuss markets in the least.

It’s also an entry into the larger financial markets space. Financial services and medical have always been two key verticals for new database technologies due to their volumes of information. That also means NuoDB needs to prioritize the admitted lack of large database support or they’ll hit walls above the SMB market.

The one business thing that bothers me is their pricing model. It’s based on the number of hosts. As the product is based on processes, there’s no set number of processes per host. In addition, they mentioned shared hosting, places such as AWS, where hosts may be shared by multiple NuoDB customers or where load balancing might take your processes and put them on one host one day and multiple hosts the next.

Host-based pricing seems to be a remnant of the on-premises database systems that Cloud vendors claim to be leaving. In a distributed, internet based setup, who cares how big the host is, where the host is, or anything else about the host? The work the customer cares about is done by the processes, the objects containing the knowledge and expertise of NuoDB, not the servers owned by the hosting firm. I would expect Cloud companies to move from pricing on processors to pricing on processes.
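A quick back-of-the-envelope sketch, with entirely made-up prices, shows why host-based pricing sits oddly in an elastic Cloud: the same workload can cost twice as much depending on where a load balancer happens to place the processes.

```python
# Hypothetical numbers, purely to illustrate the pricing argument; these
# are not NuoDB's actual prices or license terms.
PRICE_PER_HOST = 1000      # assumed license cost per host
PRICE_PER_PROCESS = 300    # assumed license cost per process

placements = {
    'monday':  {'host1': 4},               # all four processes on one host
    'tuesday': {'host1': 2, 'host2': 2},   # load balancer spreads them out
}

for day, hosts in placements.items():
    host_cost = PRICE_PER_HOST * len(hosts)
    process_cost = PRICE_PER_PROCESS * sum(hosts.values())
    print(day, 'host-based:', host_cost, 'process-based:', process_cost)

# Same workload every day, but host-based pricing doubles on Tuesday;
# process-based pricing stays constant.
```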

Summary

NuoDB is a company focused on reinventing the SQL database for the Cloud. They have significant investment from the VC and business markets. However, it would be foolish to think that Oracle, IBM and other existing mainstream RDBMS vendors aren’t working on the same thing. What NuoDB described to the BBBT used most of the right words from the technology front and they’re ramping up their development based on the investments, but it’s too early to say if they understand their own products and markets enough to build a presence for the long term.

They have what looks like very interesting technology but, as I keep repeating in review after review, we know that’s not enough.

Teradata Aster at the BBBT. Is a technology message sufficient?

Last Friday’s visitors to the BBBT were from Teradata Aster. As you’ve noticed, I tend to focus on the business aspects of BI. Because of that, this blog entry will be a bit shorter than usual.

That’s because the Teradata Aster folks reminded me strongly of my old days before I moved to the dark side: They were very technical. The presenters were Chris Twogood, VP, Product and Services Marketing, and Dan Graham, Technical Marketing.

Chris began with a short presentation about Aster. As far as it got into marketing was pointing to the real problem of the proliferation of analytic tools and noting that, as with all platform products, Aster is an attempt to better integrate a heterogeneous marketplace.

As with others who have presented to the BBBT, Chris Twogood also pointed out that R and other open source solutions aren’t any more sufficient for a full BI solution managing big data and analytics than are pure RDBMS solutions, so a platform has to work with both the old and the new.

The presentation was then handed over to Dan Graham, that rare combination of a very technical person who can speak clearly to a mixed level audience. His first point was a continuation of Chris’, speaking to the need to integrate SQL and MapReduce technologies. In support of that, he showed a SQL statement he said could be managed by business analysts, not the magical data scientist. There will have to be some training for business analysts, but that’s always the case in a fast moving industry such as ours.

Most of the rest of the presentation was about his love of graphing. BI is focused on providing more visual reporting of highly complex information, so it wasn’t anything new. Still, what he showed Teradata focusing upon is good and his enthusiasm made it an enjoyable presentation even if it was more technical than I prefer. It also didn’t hurt that the examples were primarily focused on marketing issues.

The one point with which I will take issue is the wall he tried to set up between graph databases and the graph routines Aster is leveraging. He claimed they’re not really competing with graph databases because, Dan posited, they are somehow different.

I pointed out that whether graphs are created in a database, in routines layered on top of SQL or Java, or in a BI vendor’s client tools only matters from a performance standpoint; they are all providing graphical representations to the business customer. That means they all compete in the same market. Technical distinctions do not make for business market distinctions, other than as technical components of cost and performance that impact the organization. There wasn’t a clear response showing they were thinking at a higher level than technological differences.

Summary

Teradata has a long and storied history with large data. They are a respected company. The question is whether or not they’re going to adapt to the new environments facing companies, with the explosion of data that’s primarily non-structured and has a marketing focus. Will they be able to either compete or partner with newer companies in the space?

Teradata has long focused on large data, high performance database solutions, and they seem to be clearly on the right path with their technology. The question is whether the same is true of their strategic and marketing focus. They built their name on large databases for the few companies that really needed their solutions. Technology came first, and marketing was almost totally technical, aimed at the people who understood the issue.

The proliferation of customer service and Web data means that the BI market is addressing a much wider audience for solutions managing large amounts of data. I trust that Teradata will build good technology, but will they realize that marketing has to become more prominent to address a much larger and less technical audience? Only time will tell.

TDWI Webinar Review: Business-Driven Analytics. Where’s business?

Today’s TDWI webinar was an overview of their latest best practices report. The intriguing thing was that the numbers show BI & Analytics still aren’t business driven. As Dave Stodder, Director of Research for Business Intelligence, pointed out, there are two key items supporting that conclusion. First, more than half of companies have BI in less than 30% of the organization, showing that a large number of businesses aren’t prioritizing BI. Second, most of the responses to questions about BI show that it’s still something controlled and pushed by IT.

One point Dave mentioned was the still overwhelming presence of spreadsheets. They aren’t going away soon. A few vendors who have presented at the BBBT have also pointed out their focus on integrating spreadsheets rather than ignoring all the data that resides in them or demanding everything be collected in a data repository. The sooner more vendors realize they need to work with the existing business infrastructure rather than fight against it, the better off the industry will be.

Another interesting point was the influence of the CMO. I regularly read analysts and others talking about how the “CMO has a bigger IT budget than the CIO!” The numbers from the TDWI survey don’t bear that out. One slide, a set of tables representing different CxO level positions’ involvement in different areas of the IT buying process, shows the CMO up near the CIO for identifying the need, but far behind in every other category – categories that include “allocate budget” and “approve budget.” In tech firms, and especially in Silicon Valley, people look around at other firms involved in the internet and forget they’re a small subset of the overall market.

Another intriguing point was brought out in the survey. Of companies with Centers of Excellence or similar groups to expand business intelligence, the list of titles involved in those groups shows an almost complete absence of business users. It seems that IT still thinks of BI as a cool toy they can provide to users, not something that business users need to be involved in to ensure the right things are being offered. Only 15% show line of business management involved, while a pathetic 4% show marketing’s involvement.

The last major point I’ll discuss is an interesting but flawed question/answer table. The question was how business-side leadership is doing during different aspects of a BI project. The numbers aren’t good. However, as we’ve just discussed, business isn’t included as much as it should be. There are two things that make me wonder:

  • What would the charts look like if the responses were split to show how IT and business respondents each answered the question?
  • Is it an issue of IT not involving business or business not getting involved when opportunities are presented?

Summary

TDWI’s overview of the current state of business-driven BI & analytics seems to show that there’s a clear demand from the business community, but there doesn’t seem to be the business involvement needed to finish the widespread expansion of BI into most enterprises.

What I’d like to see TDWI focus on next is the barriers to that spread: the things that both IT and business see as inhibitors to expanding the role of modern BI tools in the business manager’s and CxO suite’s daily decision making.

It’s a good report, but only as a descriptive analysis of current state. It doesn’t provide enough information to help with prescriptive action.

EXASOL at the BBBT: Big Data, fast database. Didn’t I just hear this?

Friday’s EXASOL presentation to the BBBT brought a strong feeling of déjà vu. I’ve already blogged about Tuesday’s Actian presentation and, to be honest, while there were technical differences, I came to the same conclusion about the business model. But first, a thanks to Microsoft for the autocorrect feature. Otherwise typing EXASOL in all caps each time would have been bothersome.

The EXASOL presenters were Aaron Auld (@AaronAuldDE), CEO, and Kevin Cox (@KJCox), Director Sales and Marketing.

I mentioned technical differences. First, and foremost, they didn’t start with hardware but with an initial algorithm for massively parallel processing (MPP). They figured it was a great way to speed up database performance and stuck with column-oriented relational technology. That’s allowed them to work on multi-terabyte systems with fast performance.
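To illustrate the MPP idea in miniature (and only the idea; this is not EXASOL’s algorithm), here’s a toy Python sketch: a column is split into partitions, each worker aggregates its own slice, and the partial results are combined. The columnar layout is what lets each worker touch only the one column the query needs.

```python
from multiprocessing import Pool

def partial_sum(partition):
    # Each worker scans only its slice of the one needed column.
    return sum(partition)

if __name__ == '__main__':
    revenue_column = list(range(1_000_000))  # stand-in for a stored column
    n_workers = 4
    chunk = len(revenue_column) // n_workers
    partitions = [revenue_column[i * chunk:(i + 1) * chunk]
                  for i in range(n_workers)]

    with Pool(n_workers) as pool:
        total = sum(pool.map(partial_sum, partitions))
    print(total)  # same answer as sum(revenue_column), computed in parallel
```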

They have published some great TPC-H benchmark numbers, often being two orders of magnitude better than the competitors. While admitting that TPC stats are questionable since they’ve been defined by the big vendors to benefit their performance, often don’t reflect real life queries and often don’t use typical hardware, the numbers were still impressive. In addition, it was a smart business move as a small company blowing away the big vendors’ benchmarks helps elevate visibility and get them into doors.

However, let’s look back at Actian. They also talked about TPC, but they used the TPC-DS benchmark. How do you compare? Well, you can’t.

One other TPC factoid: just like their competitor, there’s no clear information on true multi-user performance in today’s mobile age. No tests with large numbers of connected clients were mentioned.

So the results are great, but how do they fight the Hadoop bandwagon? They understand that open source is cheaper from a license standpoint, but they also point out that their performance saves money in direct comparison when you total all the costs of an implementation. People forget that while hardware prices have dropped, servers aren’t free.

Unfortunately, from a business model, it looks like they’re making the typical startup mistake of focusing on their product rather than business needs. They understand that ROI matters, but it seems to be too far down the list in their corporate messaging.

Another major advantage they have in common with the previous presenters is that sticking with SQL makes it easier to build an ecosystem that includes the existing vendors from ETL through visualization. However, they seem to be a bit further behind the curve in building those partnerships. While they have a strong strategic understanding of that, they need to bubble it up the priority list.

[Figure: EXASOL platform offering]

One critical business success they have is their inclusion in the Dell Founders Club 50. That means advice and cooperation from Dell to help improve their performance and expand their presence. For a small company to have access not only to Dell at the technical level but also to bring customers to Dell Solution Centers for demonstrations is a great thing.

While they have been focused on MPP and large customers, the industry move to the Cloud also means they are looking at smaller licensing including a potential one-node free trial.

However, as mentioned in the lead, they seem to have the same business model issue as their competitors: they’re focused on the bleeding-edge market that thinks the main message is performance. While they know there are other aspects to the buying decision, they went back, again and again, to performance. They have the whole picture in mind, but they’re not yet thinking of the mass market.

Organizations such as TDWI, Gartner and Forrester have all reported the high percentage of organizations that are considering big data and how to get a handle on the vast volume of information coming from heterogeneous sources. There’s clearly demand building up behind the dam. The problem seems to be that those organizations are trying, as major IT organizations always do, to understand how best to integrate new technologies and capabilities with as little pain as possible. Meanwhile, the vendors still seem to be focused on the early adopters with their messaging. That leaves dollars on the table and slows adoption of new technology.

Summary

EXASOL seems to have a strongly performing and highly scalable database technology to work with large data sets. Yet, like many companies in the business intelligence space it comes back to audience. Are they still aiming at early adopters or will they focus on the mass market?

Have BI and big data advanced to the point where people need to think about the chasm and how to better address business needs, not just technical issues? I think so, and I hope they adjust their business focus.

The company seems to have great potential, but will they turn that into reality? As the great Yogi Berra said, “It’s like deja vu all over again.”

Actian at the BBBT: Hadoop Big Data for the Enterprise Mass Market?

In the mid-90s, Sybase rolled out its new database. It was a great leap forward in performance and they pushed it like crazy. Sybase’s claims were justified, but it was a new way to look at databases, and Sybase loudly announced how different it was from what people were used to using. Oops. They sold almost none of it, hit a financial wall, and never quite recovered.

That came to mind during yesterday’s BBBT presentation by Actian. Their technology foundation goes back to Ingres and that means they’ve been in the database market a long time. The question is whether or not they’ve learned from past case studies.

The presenters were John Santaferraro, VP of Solution and Product Marketing, and Emma McGrattan, SVP Engineering. They gave a great technical overview of Actian’s offerings. Put simply, they’re providing a platform for Big Data access. At the core is Hadoop, but they’ve taken their deep understanding of RDBMS technology and incorporated SQL access. That clearly opens up two things:

  • Better access to partners for ETL and analytics
  • The ability for the mass of business analysts to get at Hadoop data to more easily perform their jobs.

That’s a great thing, and I’ll discuss later whether they’re taking that technology to the right markets. Before that, however, I should point out the main competitive point they repeatedly hit on. TPC benchmarks are public, so they went out and compared themselves to whom they consider, rightly, to be their main competition: Cloudera Impala. Their results are seen in the chart below.

[Figure: Actian’s TPC-DS comparison with Cloudera Impala]

They returned to this time and time again. On the other hand, they discussed the full platform intelligently but only briefly.

They also covered more of the technology, and there’s a lot of it. As a former Computer Associates company, they grow by acquisition. It’s not just a renamed Ingres; it has acquired VectorWise, Versant, Pervasive and ParAccel. Many companies have had trouble acquiring and integrating firms, but the initial descriptions seem to show a consolidated platform.

One caveat: we had no demo. The explanation was that the Hadoop Summit demo went so well that they’re in the middle of moving it to a new server, and IT didn’t give them a heads up. Believable, and, again, I personally am not too worried. As a former field guy, I know how little emphasis to put on a short demo.

So what did I think was the key technology, if not performance? That’s next.

Hadoop meets SQL

To folks focused on the largest data sets, and to others who, as with car ownership, like speed for its own sake, the performance is impressive. To me, that’s not the key. Rather, it’s the ability to bridge the Hadoop-SQL divide. As John Santaferraro pointed out, orders of magnitude more business analysts and business users know SQL than know MapReduce and the related underpinnings of Hadoop.
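The gap is easy to see side by side. Below is the one-line SQL an analyst would write, followed by roughly the same aggregate expressed map/reduce-style in Python. The table and column names are made up, and real MapReduce adds job setup, shuffling and cluster plumbing on top of this sketch.

```python
from collections import defaultdict

# What the analyst writes (hypothetical table and columns):
sql = "SELECT region, SUM(sales) FROM orders GROUP BY region"

# Roughly the same aggregate, map/reduce-style:
orders = [('west', 100), ('east', 250), ('west', 75)]

def mapper(record):
    region, sales = record      # emit key/value pairs
    yield region, sales

def reducer(mapped):
    totals = defaultdict(int)   # group by key, then aggregate
    for region, sales in mapped:
        totals[region] += sales
    return dict(totals)

mapped = (kv for record in orders for kv in mapper(record))
print(reducer(mapped))  # {'west': 175, 'east': 250}
```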

[Figure: Actian platform for big data]

While other Big Data companies have been building bridges to ETL, data cleansing, analytics and other tools in the ecosystem, custom work to do that is time consuming. Opening the ability to use standard, existing SQL tools means you can more quickly build a stronger ecosystem.

Why does that matter?

What is the market

During the presentation, the Actian team was asked about their sweet spot. Is it folks already playing with Hadoop who want better access to enterprise data, or is it companies who’ve heard about Hadoop but haven’t yet stepped in to try it because of all the questions? Their answer was the first group. I think that’s wrong; however, I understand why they answered that way.

Another statement from John was that they are in Silicon Valley, and everyone there thinks everyone uses Hadoop because everyone there does. He admitted that’s not true outside that small region. However, sometimes it’s hard to fight the difference between what you intellectually know and what you’re used to. I’ve seen it in multiple companies, and I think it’s happening here.

The mass of global businesses haven’t yet touched Hadoop. It’s very different from what the typically overburdened and underfunded IT organization does, and that much change is scary. Silicon Valley is full of early adopters, it attracts them. In addition, there are plenty of early adopters out there for the picking. However, there are now a lot of vendors in the BI and big data spaces and we’re getting close to a tipping point. The company that figures out how to cross the chasm first is the one who will make it big.

It’s not pure performance that will attract the mass market, it’s how to get the advantages of big data in the most affordable way with the easiest transition path. It’s the ability to quickly leverage existing IT infrastructure and to join it with the newest technology.

Once again, it’s evolution rather than revolution that will win the day.

Summary

From what I saw of the platform, it’s a great start. The issue I see is the focus on the wrong market. The technology will always be important, but critical as it is, it only exists to solve business problems. Actian seems to have a good handle on the technology and is on a path to integrate and leverage all the acquisitions into a solid platform, but will they be able to explain why that matters to the right market?

There is hope for that. One thing discussed is that their ability to bridge SQL and Hadoop means they are working on building partnerships with major vendors to extend their ecosystem. If they focus on that, they have a great chance of being very successful and being the company that brings Hadoop to the wider IT market.

Twitter: @actiancorp, @santaferraro & @emmakmcgrattan

TDWI and HP Webinar: Modernizing the Data Warehouse

After a couple of mediocre webinars, it was nice to see TDWI get back on track. This week’s seminar was sponsored by HP Vertica and discussed Data Warehousing Modernization. The speakers were Philip Russom, from TDWI, and Steve Sarsfield, Product Marketing Manager, HP Vertica.

Philip led with the five key reasons organizations need to modernize Enterprise Data Warehouses (EDWs):

  • Analytics
  • Scale
  • Speed
  • Productivity
  • Cost Control

He pointed out that TDWI research shows the first three to be far more of a key focus for companies than the others. One key point was that cost control should have more of an impact than it does. Mr. Russom pointed out that even if your EDW performs properly today, much of the new technology is based on open source and less expensive servers, so a rethink of your warehouse can bring clear ROI. As he put it, “Modernization is a great opportunity to rethink economics.”

Another major point was the simple fact, overlooked by many zealots, that EDWs aren’t going anywhere. Sure, there are newer technologies that allow for analytics straight from operational data stores (ODSs) and other places, but there will always be a place for the higher latency accumulation of information that is the EDW.

After that setup, Steve Sarsfield gave the expected sponsor pitch for how HP Vertica helps companies modernize. It’s also fair to say that his presentation was better than most. It walked the right line, avoiding the overly salesy and overly technical extremes of many sponsor pitches.

Sarsfield’s main point is that Hadoop is great for ODSs but implementations still haven’t gotten up to speed in joins and other data manipulation capabilities seen in the mature SQL environment. He described HP Vertica as having the following key components:

[Figure: HP Vertica key components]

I think the only one that needs explanation is the last, projections. If not, please let me know and I’ll expand on the others. Projections are, simply put, the HP method for replacing indices. Columnar databases don’t provide the index structures that standard row-based RDBMS systems do.
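Here’s a toy sketch of the idea, not Vertica’s internals: a projection is, in effect, the needed columns stored pre-sorted, so a range predicate becomes a binary search plus a contiguous scan rather than an index lookup into row storage.

```python
from bisect import bisect_left, bisect_right

# Row store in arrival order; finding a key range means a full scan
# unless a separate index is maintained.
rows = [(3, 'c'), (1, 'a'), (4, 'd'), (2, 'b')]

# A "projection": the same data kept pre-sorted on the query key, so the
# sort order itself does the job of an index.
projection = sorted(rows)
keys = [k for k, _ in projection]

# WHERE key BETWEEN 2 AND 3 becomes binary search + contiguous slice.
lo, hi = bisect_left(keys, 2), bisect_right(keys, 3)
print(projection[lo:hi])  # [(2, 'b'), (3, 'c')]
```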

It was a good overview that should bring HP into the mix for anyone looking to modernize their EDW environment.

The final point that came up during Q&A was about Big Data. It’s one many folks have made, but we know how much you listen to analysts pontificating…

Philip Russom pointed out, as many have, that Big Data isn’t about the size of the data but about managing the complexity of modern data. He made that point while pitching the most recent TDWI Best Practices Report, Evolving Data Warehouse Architectures in the Age of Big Data. What Philip pointed out was that the judges regularly came back with clear opinions that complexity was more important than database size. Very large databases where people were just doing aggregations of columns weren’t interesting. It was the ability to link to multiple sources and provide advanced insight through analytics that the judges felt most reflected the power in the concept of Big Data.

All told, it was a smooth and informative presentation that hopefully helped its IT audience understand a bit more about the issues involved in modern data warehousing. It was time well spent.