Category Archives: Cloud

Webinar: IBM, Actuate and Cirro describe faster analytics

Today’s webinar was hosted by Database Trends and Applications. While there are important things to discuss, I’ll start with an amusing point: the inverse relationship between company size and presenter title found in almost every webinar was wonderfully on display here. The three presenters were:

  • Mark Theissen, CEO, Cirro
  • Peter Hoopes, VP/GM, BIRT Analytics Division, Actuate
  • Amit Patel, Program Director, Data Warehouse Solutions Marketing, IBM

The topic was “Accelerating your Analytics for Faster Insights.” That is a lot to cover in less than an hour, made briefer still by a tag team of three people from different companies. I must say I was pleasantly surprised with how well they integrated their messages.

Mark Theissen was up first. There were a lot of fancy names for what Cirro does, but think of it as ETL; that’s much easier. Mark’s point is that no single repository can handle all enterprise data, even if that made sense. Cirro’s goal is to provide on-demand distributed analytics, using federation to link multiple data sources in order to help businesses analyze more complete information. It’s a strong point that people have forgotten in the last few years amid the typical “the latest craze will solve everything” focus on Hadoop, which minimizes the work of getting to multiple sources.

Peter Hoopes then followed to talk about doing the analytics. One phrase he used deserves more discussion: “speed wins.” So many people are focused on the admittedly important area of immediate retail feedback on the web and with mobile devices. There, yes, speed can win. However, not always. Sometimes thought helps too. That’s one reason why complex analysis for high-level business strategy and planning is different than putting an ad on a phone as you walk by a store. There are clear reasons for speed, even in analytics, but it should not be the only focus in a BI decision.

IBM’s Amit Patel then came on to discuss the meat of the matter: DB2 BLU. This is IBM’s foray into in-memory, columnar databases. It’s a critical addition to the product line. There are advantages to in-memory that have created a need for all major players to have an offering, and IBM does the “me too!” well; but how does IBM differentiate itself?

As someone who understands the need for integration of transaction and analytic systems and agrees both need to co-exist, I was intrigued by what Amit had to say: transactions go into the normal DB2 environment while being shadowed into the columnar BLU environment to speed analytics. Think about it: transactions can still be managed with the row-oriented technologies best suited for them while the information is, in parallel, moved to the analytics database that happens to be in memory. It seems to be a good way to begin to blend the technologies and let each do what works best.
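
For the curious, here is a minimal sketch of that shadowing idea. It is not IBM’s implementation, just an illustration of the concept: each transaction lands in a row-oriented store, and a column-oriented copy is maintained in parallel so analytic scans can run over single columns.

```python
# Illustrative only: a toy row store with a columnar shadow copy (not DB2 BLU).
from collections import defaultdict

row_store = []                     # row-oriented: one record per transaction
column_store = defaultdict(list)   # columnar shadow: one list per column

def insert_transaction(txn: dict) -> None:
    """Write the transaction to the row store and shadow it into the column store."""
    row_store.append(txn)
    for column, value in txn.items():
        column_store[column].append(value)

insert_transaction({"order_id": 1, "region": "EMEA", "amount": 120.0})
insert_transaction({"order_id": 2, "region": "APAC", "amount": 75.5})

# OLTP-style access touches whole rows; the analytic sum scans a single column.
print(row_store[0])
print(sum(column_store["amount"]))
```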

For a slightly techie comment, I did like what Mr. Patel was saying about IBM’s management of memory and CPU. After all, while IBM is one of the largest software vendors in the world, too many folks forget its hardware background. One quick mention of “hardware vendors such as Intel and IBM…” was a great touch, adding a message that can help IBM differentiate its knowledge of MPP from that of pure software companies. As a marketing guy, I smiled big time at the smooth way that was brought up.

Summary

The three presenters did a good job in pointing out that the heterogeneous nature of enterprise data isn’t going away, rather it’s expanding. Each company, in its own way, put forward how it helps address that complexity. Still, it takes three companies.

As the BI market continues to mature, the companies who manage to combine the enterprise information supply chain components most smoothly will succeed. Right now, there’s a message being presented by three players. Other competitors also partner for ETL, data storage and analytics. It sounds interesting, but the market’s still young. Look for more robust messages from single vendors to evolve.

NuoDB at the BBBT: Another One Bringing SQL to the Cloud

Today’s presentation in front of the BBBT was by NuoDB’s CTO, Seth Proctor. NuoDB is a small company with big investments. What makes them so interesting? It’s the same question behind many of the other platform presentations at the BBBT: how do we get real databases into the Cloud?

Hadoop is an interesting experiment and has clearly brought value to the understanding of massive amounts of unstructured data. The main value, though, remains that it’s cheap. The lack of SQL means it’s ok for point solutions that don’t stress its performance limitations. Bringing enterprise database support to the cloud is something else.

The main limitation is that Hadoop and other unstructured databases aren’t able to handle transactional systems while those still remain the major driver in operating businesses.

NuoDB has redesigned the database from the ground up to run distributed across the internet. They’ve created a peer-to-peer structure of processes, with separate processes managing database storage and the SQL front-end transaction work.

Seth pointed out that they “have done nothing new, just things we know put together in a new way.” He also pointed out they have patents. My gripe about patents for software is an issue for another day, but that dichotomous pairing points to one reason for it (Apple’s patent on a rounded rectangle is another example of the broken patent system, but off the soapbox and onwards…).

It’s clear that old-line RDBMS systems were designed for major, on-premises servers. The need for a distributed system is clear and NuoDB is on the forefront of creating one. One intriguing potential strength, one there wasn’t time to discuss in the presentation, is a statement about the object-oriented structure needed for truly distributed applications.

Mr. Proctor stated that the database schema lives in object definitions and is not hard coded into the database. He added that this provides more flexibility on the fly. What it also could mean is that the schema isn’t restricted to purely RDBMS schemas, and that future versions of their database could support columnar and even unstructured data. For now, however, the basic ability to change even a standard row-based relational database on the fly without major impacts on performance or existing applications is a strong benefit.
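
To make the idea concrete, here is a minimal sketch, assuming nothing about NuoDB’s actual internals: the schema is an ordinary object that can be altered while old data stays readable, rather than a structure baked into the storage format.

```python
# Illustrative only: schema defined as objects, changeable on the fly (not NuoDB code).
from dataclasses import dataclass, field

@dataclass
class Column:
    name: str
    sql_type: str
    default: object = None

@dataclass
class TableSchema:
    name: str
    columns: list = field(default_factory=list)

    def add_column(self, column: Column) -> None:
        # An on-the-fly schema change: new rows carry the column,
        # old rows fall back to the default when read.
        self.columns.append(column)

    def read_row(self, stored: dict) -> dict:
        return {c.name: stored.get(c.name, c.default) for c in self.columns}

orders = TableSchema("orders", [Column("order_id", "BIGINT"), Column("amount", "DECIMAL")])
old_row = {"order_id": 1, "amount": 99.0}            # written before the change
orders.add_column(Column("currency", "CHAR(3)", "USD"))
print(orders.read_row(old_row))                       # old data still readable
```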

As the company is young and focused on the distributed aspects of performance, it was also admitted that their system isn’t one for big data, even structured data. They’re not ready for terabytes, not to mention petabytes, of data.

The Business

That’s the techie side, but what about business?

The company is focused on providing support for distributed operational systems. As such, Seth made clear they haven’t looked at implementations supporting both operational and analytical systems. That means BI is not a focus and so the product might not be the right system for providing high level business insight.

In addition, while I asked about markets I mainly got an answer about Web sites. They seem to think the major market isn’t Global 1000 businesses looking to link distributed operational systems but that Web commerce sites are their sweet spot. One example referred to a few times was transactional systems for businesses selling across a country or around the world. If that’s the focus, it’s one that needs to be made more explicit on their web site, which really doesn’t discuss markets in the least.

It’s also an entry into the larger financial markets space. Finance and medical have always been two key verticals for new database technologies due to the volumes of information. That also means they need to prioritize fixing the admitted lack of large-database support or they’ll hit walls above the SMB market.

The one business thing that bothers me is their pricing model. It’s based on the number of hosts. As the product is based on processes, there’s no set number of processes per host. In addition, they mentioned shared hosting, places such as AWS, where hosts may be shared by multiple NuoDB customers or where load balancing might put your processes on one host one day and across multiple hosts the next.

Host-based pricing seems to be a remnant of the on-premises database systems that Cloud vendors claim to be leaving behind. In a distributed, internet-based setup, who cares how big the host is, where the host is, or anything else about the host? The work the customer cares about is done by the processes, the objects containing the knowledge and expertise of NuoDB, not the servers owned by the hosting firm. I would expect Cloud companies to move from pricing processors to pricing processes.

Summary

NuoDB is a company focused on reinventing the SQL database for the Cloud. They have significant investment from the VC and business markets. However, it would be foolish to think that Oracle, IBM and other existing mainstream RDBMS vendors aren’t working on the same thing. What NuoDB described to the BBBT used most of the right words from the technology front and they’re ramping up their development based on the investments, but it’s too early to say if they understand their own products and markets enough to build a presence for the long term.

They have what looks like very interesting technology but, as I keep repeating in review after review, we know that’s not enough.

Actian at the BBBT: Hadoop Big Data for the Enterprise Mass Market?

In the mid-90s, Sybase rolled out its new database. It was a great leap forward in performance and they pushed it like crazy. Sybase’s claims were justified, but it was a new way to look at databases and Sybase loudly announced how different it was from what people were used to using. Oops. They sold almost none of it, hit a financial wall, and never quite recovered.

That came to mind during yesterday’s BBBT presentation by Actian. Their technology foundation goes back to Ingres and that means they’ve been in the database market a long time. The question is whether or not they’ve learned from past case studies.

The presenters were John Santaferraro, VP of Solution and Product Marketing, and Emma McGrattan, SVP Engineering. They gave a great technical overview of Actian’s offerings. Put simply, they’re providing a platform for Big Data access. At the core is Hadoop, but they’ve taken their deep understanding of RDBMS technology and incorporated SQL access. That clearly opens up two things:

  • Better access to partners for ETL and analytics
  • The ability for the mass of business analysts to get at Hadoop data to more easily perform their jobs.

That’s a great thing, and I’ll discuss later whether they’re taking that technology to the right markets. Before that, however, I should point out the main competitive point they repeatedly hit on. TPC benchmarks are public, so they went out and compared themselves to the vendor they rightly consider their main competition: Cloudera Impala. Their results are seen in the chart below.

Actian’s TPC-DS comparison with Cloudera Impala

They returned to this time and time again. On the other hand, they discussed the full platform intelligently but only briefly.

They also covered more of the technology, and there’s a lot of it. Like Computer Associates, from which Ingres was spun out, they grow by acquisition. Actian isn’t just a renamed Ingres; it has acquired VectorWise, Versant, Pervasive and ParAccel. Many companies have had trouble acquiring and integrating firms, but the initial descriptions seem to show a consolidated platform.

One caveat: we had no demo. The explanation was that the Hadoop Summit demo went so well that they’re in the middle of moving it to a new server and IT didn’t give them a heads up. Believable, but I personally am not too worried. As a former field guy, I know how little emphasis to put on a short demo.

So what did I think was the key technology, if not performance? That’s next.

Hadoop meets SQL

To folks focused on the largest data sets, and to those who, as with car ownership, like speed for its own sake, the performance is impressive. To me, that’s not the key. Rather, it’s the ability to bridge the Hadoop-SQL divide. As John Santaferraro pointed out, orders of magnitude more business analysts and business users know SQL than know MapReduce and the related underpinnings of Hadoop.

Actian’s Hadoop platform for big data

While other Big Data companies have been building bridges to ETL, data cleansing, analytics and other tools in the ecosystem, custom work to do that is time consuming. Opening the ability to use standard, existing SQL tools means you can more quickly build a stronger ecosystem.
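
As a rough illustration of the gap being bridged, here is the same question answered two ways: once in MapReduce-style code and once in the SQL a business analyst already knows. The data and names are made up; neither snippet is Actian’s.

```python
# Illustrative only: "total sales by region" in MapReduce style vs. SQL style.
from itertools import groupby
from operator import itemgetter

sales = [("EMEA", 120.0), ("APAC", 75.5), ("EMEA", 60.0)]

# MapReduce style: explicit map, shuffle/sort and reduce phases.
mapped = [(region, amount) for region, amount in sales]      # map
mapped.sort(key=itemgetter(0))                               # shuffle/sort
totals = {region: sum(amount for _, amount in group)         # reduce
          for region, group in groupby(mapped, key=itemgetter(0))}
print(totals)  # {'APAC': 75.5, 'EMEA': 180.0}

# SQL style: one declarative statement the analyst's existing tools can generate.
query = """
SELECT region, SUM(amount) AS total_sales
FROM sales
GROUP BY region;
"""
```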

Why does that matter?

What is the market?

During the presentation, the Actian team was asked about their sweet spot. Is it folks already playing with Hadoop who want better access to enterprise data, or is it companies who’ve heard about Hadoop but haven’t yet stepped in to try it because of all the open questions? Their answer was the first group. I think that’s wrong; however, I understand why they answered that way.

Another statement from John was that they are in Silicon Valley, and everyone there thinks everyone uses Hadoop because everyone there does. He admitted that’s not true outside that small region. However, sometimes it’s hard to fight the difference between what you intellectually know and what you’re used to. I’ve seen it in multiple companies, and I think it’s happening here.

The mass of global businesses haven’t yet touched Hadoop. It’s very different from what the typically overburdened and underfunded IT organization does, and that much change is scary. Silicon Valley is full of early adopters; it attracts them. In addition, there are plenty of early adopters out there for the picking. However, there are now a lot of vendors in the BI and big data spaces and we’re getting close to a tipping point. The company that figures out how to cross the chasm first is the one that will make it big.

It’s not pure performance that will attract the mass market, it’s how to get the advantages of big data in the most affordable way with the easiest transition path. It’s the ability to quickly leverage existing IT infrastructure and to join it with the newest technology.

Once again, it’s evolution rather than revolution that will win the day.

Summary

From what I saw of the platform, it’s a great start. The issue I see is a focus on the wrong market. The technology will always be important, but critical as it is, it only exists to solve business problems. Actian seems to have a good handle on the technology and to be on a path to integrate and leverage all the acquisitions into a solid platform, but will they be able to explain why that matters to the right market?

There is hope for that. One thing discussed is that their ability to bridge SQL and Hadoop means they are working on building partnerships with major vendors to extend their ecosystem. If they focus on that, they have a great chance of being very successful and being the company that brings Hadoop to the wider IT market.

Twitter: @actiancorp, @santaferraro & @emmakmcgrattan

TDWI and HP Webinar: Modernizing the Data Warehouse

After a couple of mediocre webinars, it was nice to see TDWI get back on track. This week’s seminar was sponsored by HP Vertica and discussed Data Warehousing Modernization. The speakers were Philip Russom, from TDWI, and Steve Sarsfield, Product Marketing Manager, HP Vertica.

Philip led with the five key reasons organizations need to modernize Enterprise Data Warehouses (EDWs):

  • Analytics
  • Scale
  • Speed
  • Productivity
  • Cost Control

He pointed out that TDWI research shows the first three to be far more of a key focus for companies than the others. One key point was that cost control should have more of an impact than it does. Mr. Russom pointed out that even if your EDW performs properly today, much of the new technology is based on open source and less expensive servers, so a rethink of your warehouse can bring clear ROI, as he noted with “Modernization is a great opportunity to rethink economics.”

Another major point was the simple fact, overlooked by many zealots, that EDWs aren’t going anywhere. Sure, there are newer technologies that allow for analytics straight from operational data stores (ODSs) and other places, but there will always be a place for the higher latency accumulation of information that is the EDW.

After that setup, Steve Sarsfield gave the expected sponsor pitch for how HP Vertica helps companies modernize. It’s fair to say his presentation was better than most. It walked the right line, avoiding the overly salesy and overly technical extremes of many sponsor pitches.

Sarsfield’s main point is that Hadoop is great for ODSs but implementations still haven’t gotten up to speed in joins and other data manipulation capabilities seen in the mature SQL environment. He described HP Vertica as having the following key components:

HP Vertica “secret sauce” components (slide)

I think the only one that needs explanation is the last, Projections. If not, please let me know and I’ll expand on the others. Projections are, simply put, the HP method for replacing indices. Columnar databases don’t provide the index structures that standard row-based RDBMS systems do.
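
For readers who want a picture of the concept, here is a toy sketch, not Vertica’s implementation: instead of a secondary index, keep a pre-sorted, column-oriented copy of just the columns a query needs, so range scans and aggregations stay fast.

```python
# Illustrative only: a "projection" as a sorted, column-oriented copy (not Vertica code).
import bisect

rows = [
    {"order_id": 3, "region": "EMEA", "amount": 60.0},
    {"order_id": 1, "region": "APAC", "amount": 75.5},
    {"order_id": 2, "region": "EMEA", "amount": 120.0},
]

# A projection over (region, amount): column-oriented and sorted by region.
projection = sorted((r["region"], r["amount"]) for r in rows)
regions = [p[0] for p in projection]
amounts = [p[1] for p in projection]

# Range scan on the sort key without touching full rows or a separate index.
lo = bisect.bisect_left(regions, "EMEA")
hi = bisect.bisect_right(regions, "EMEA")
print(sum(amounts[lo:hi]))  # total EMEA amount: 180.0
```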

It was a good overview that should bring HP into the mix for anyone looking to modernize their EDW environment.

The final point that came up during Q&A was about Big Data. It’s one many folks have made, but we know how much you listen to analysts pontificating…

Philip Russom pointed out, as many have, that Big Data isn’t about the size of the data but about managing the complexity of modern data. He made that point while pitching the most recent TDWI Best Practices Report, Evolving Data Warehouse Architectures in the Age of Big Data. What Philip pointed out was that the judges regularly came back with clear opinions that complexity was more important than database size. Very large databases where people were just doing aggregations of columns weren’t interesting. It was the ability to link to multiple sources and provide advanced insight through analytics that the judges felt most reflected the power in the concept of Big Data.

All told, it was a smooth and informative presentation that hopefully helped its IT audience understand a bit more about the issues involved in modern data warehousing. It was time well spent.

GoodData at the BBBT

Today’s BBBT presentation was by GoodData and I’m still waiting. Vendor after vendor tells us that they’re very special because they’re unique when compared to “traditional BI.” They don’t seem to get that the simple response is “so what?” Traditional BI was created decades ago, when offering software in the Cloud was not reasonable. Now it is. Every young vendor has a Cloud presence and I can’t imagine there’s a “traditional” company that isn’t moving to a Cloud offering. BI is not the Cloud. I want to hear why they have a business model that differentiates them from today’s competitors, not from the ones in the 1990s. I’m still waiting.

Almost all the benefits mentioned were not about their platform; they weren’t even about BI. What was mentioned were the benefits that any application gets by moving to the Cloud. All the scalability, shareability, upgradability and other Cloud benefits do not a unique buying proposition make. Where they will matter is if GoodData has implemented those techniques faster and better in the BI space than the many competitors who exist.

Serial founder Roman Stanek wants his company to provide a strong platform for BI based on Open Source technology. The presentation, however, didn’t make clear whether he really has that. He had the typical NASCAR slide, but only under NDA, with only a single company mentioned as an open reference. His technological vision seems to be good, but it’s too early to say whether or not the major investments he has received will pay off.

What I question is his business model. He and his VP of Marketing, Jeff Morris, mentioned that 2/3 of their revenue comes from OEM agreements, embedding their platform into other applications. However, his focus seems to be on trying to grow the other third, the direct sales to the Fortune 2000. I’m not sure that makes sense.

Another business model issue is that the presenters were convinced that the Cloud means they can provide a single version of the product to all customers. They correctly described the headaches of managing multiple versions of on-premises software (even if they managed to avoid saying “on-premise” only a third of the time). However, the reason those versions exist is that people don’t want to switch from comfortable versions at the speed of the vendor. While the Cloud does allow security and other background fixes to be pushed easily to all customers, any reasonable company will have to provide some form of versioning to allow customers a range of time to convert to major upgrades.

A couple of weeks ago, 1010data went the other direction, clearly admitting that customers prefer multiple versions. I didn’t mention it in my blog post on that presentation, even though I thought they went too far the other way with too many versions, but combined with GoodData thinking there should be only one, now’s as good a time as any to mention it. Good Cloud practices will help minimize the number of versions that need to be active for your customers, but it’s not reasonable to think that means a single version.

At the beginning of the presentation, Roman mentioned a product as a negative reference: Crystal Reports. At this point, I don’t think that comparison is at all negative. Nothing that GoodData showed us led me to believe that they can really get access to the massively heterogeneous data sources in true enterprise business. He also showed nothing that indicates an ability to provide the top-level analysis and display required in that market. However, providing OEM partners a quick and easy way to add basic BI functions to their products seems to be a great way to build market share and bring in revenue. While Crystal Reports seems archaic, it was the right product with the right business plan at the right time, and it became the de facto standard for many years.

The presentation left me wondering. There seems to be a sharp team, but there wasn’t enough information to see if vision and product have gelled to create a company that will succeed. The company’s been around since 2008 and just officially released the product, yet it has a number of very interesting customers. That can’t be based just on the strong reputation of Mr. Stanek; there has to be meat there. How much, though, is open to question based on this presentation. If you’re considering an operational data store in the Cloud, talk with them. If you want more, get them to talk to you more than they talked to us.

Cloudera at the BBBT: The limits of Open Source as a business model

Way back, in the dawn of time, there were AT&T and BSD, with multiple flavors of each base type of Unix. A few years later, there were only Sun, IBM and HP. In a later era, there was this thing called Linux. Lots of folks took the core version, but then there were only Red Hat and a few others.

What lessons can the Hadoop market learn from that? Mission critical software does not run on freeware. While open source lowers infrastructure costs and can, in some ways, speed feature enhancements, companies are willing to pay for knowledge, stability and support. Vendors able to wrap the core of open source up in services to provide the rest make money and speed the adoption of open-source based solutions. Mission critical applications run on services agreements.

It’s important to understand that distinction when discussing such interesting companies as Cloudera, whose team presented at last Friday’s BBBT session. The company recently received a well-publicized, enormous investment based on the promise that it can create a revenue stream for a database service based on Hadoop.

The team had a good presentation, with Alan Saldich, VP Marketing, pointing out that large, distributed processing databases are providing a change from “bringing data to compute” to “bringing compute to data.” He further defined the Enterprise Data Hub (EDH) as the data repository that is created in such an environment.

Plenty of others can blog in detail about what we heard about the technology, so I’ll give it only a high-level glance. The Cloudera presenters were very open about their product being an early generation, and they laid out a vision that seemed good. They understand their advantages are the benefits of Cloud and Hadoop (discussed a little more below) but that the Open Source community is lagging in areas such as access to and control of data. Providing those key needs to IT is what will help their adoption and provide a revenue stream, and their knowing that is a good sign.

I want to spend more time addressing the business and marketing models. Cloudera does seem to be struggling to figure out how to make money, hence the need for such a large investment from Intel. Additional proof is the internal confusion of Alan saying they don’t report revenues and then showing us only bookings, while Charles Zedlewski, VP Products, had a slide claiming they’re leading their market in revenue. Really? Then show us.

They do have one advantage: the Cloud model lends itself to a pricing model based on nodes and, as Charles pointed out, that’s a “business model that’s inherently deflationary” for the customer. Nodes get more powerful, so customers regularly get more bang for the buck.

On the other side, I don’t know that management understands that they’re just providing a new technology, not a new data philosophy. While some parts of the presentation made clear that Cloudera doesn’t replace other data repositories except for the operational data store, different parts implied it would subsume others without giving a clear picture of how.

A very good point was the partnerships they’re making with BI vendors to help speed integration and access of their solution into the BI ecosystem.

One other confusion is that Cloudera, and the market as a whole, isn’t clearly differentiating that the benefits of Hadoop come from multiple technologies: both the software that helps better manage unstructured data and the simple hardware/OS combination that comes from massively parallel processing, whether the servers are in the Cloud or inside a corporate firewall. Much of what was said about Hadoop had to do with the second issue, and so the presenters rightfully got pushback from analysts who saw that RDBMS technologies can benefit from those same things, minimizing that as a differentiator.

Charles did cover an important area of both market need and Cloudera vision: Operational analytics. The ability to quickly massage and understand massive amounts of operational information to better understand processes is something that will be enhanced by the vendor’s ability to manage large datasets. The fact that they understand the importance of those analytics is a good sign for corporate vision and planning.

Open source is important, but it’s often overblown by those new to the industry or within the Open Source community. Enterprise IT knows better, as it has proved in the past. Cloudera is at the right place at the right time, with a great early product and an understanding of many of the issues that matter in the short term. The questions are only about the ability to execute on both the messaging and programming sides. Will their products meet the long-term needs of business-critical applications and will they be able to explain clearly how they can do so? If they can answer correctly, the company will join the names mentioned at the start.

VisualCue at BBBT: A New Paradigm for Operational Intelligence

The latest presentation to the BBBT was by Kerry Gilger, President and Founder of VisualCue™ Technologies. While I find most of the presentations interesting, this was a real eye-opener.

Let’s start with a definition of operational intelligence (OI): tools and procedures to better understand ongoing business operations. It is a subset of BI focused on ongoing operations in manufacturing, call centers, logistics and other physical operations, where the goal is not just to understand the high-level success of processes but to better understand, track and correct individual instantiations of the process.

A spreadsheet with a row of data for each instantiation is a cumbersome way to quickly scan for the status of individual issues. The following image is an example of VisualCue’s solution: A mosaic of tiles that show different KPIs of the call center process, with a tile per operator, color coded for quick understanding of the KPIs.

VisualCue call center mosaic

The KPIs include items such as call times, number of calls and sales. The team understands each element of the tile, and a quick review shows the status of each operator. Management can quickly drill down into a tile to see specifics and take corrective action.
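
As a rough sketch of the idea (not VisualCue’s product; the KPI names and thresholds here are invented for illustration), each operator gets a tile whose elements are colored by threshold, so a manager can scan the whole floor at a glance:

```python
# Illustrative only: map each operator's KPIs to traffic-light colors for a tile mosaic.
def kpi_color(value: float, good: float, warn: float, higher_is_better: bool = True) -> str:
    """Map a KPI value to a traffic-light color based on simple thresholds."""
    if not higher_is_better:
        value, good, warn = -value, -good, -warn
    if value >= good:
        return "green"
    return "yellow" if value >= warn else "red"

operators = [
    {"name": "Ana", "calls": 48, "avg_call_min": 4.2, "sales": 9},
    {"name": "Ben", "calls": 31, "avg_call_min": 7.8, "sales": 2},
]

# One "tile" per operator: each KPI element gets its own color.
for op in operators:
    tile = {
        "calls": kpi_color(op["calls"], good=45, warn=35),
        "avg_call_min": kpi_color(op["avg_call_min"], good=5.0, warn=7.0, higher_is_better=False),
        "sales": kpi_color(op["sales"], good=8, warn=4),
    }
    print(op["name"], tile)
```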

The mosaic is a quick way to review all the instantiations of a given process, a new and exciting visualization method in the industry. However, they are a startup and there are issues to watch as they grow.

They have worked closely with each customer to create tiles that meet needs. They are working to make it easier to replicate industry knowledge to help new customers start faster and less expensively.

The product has also moved from full on-site code to a SaaS model to provide shared infrastructure, knowledge and more in the Cloud.

VisualCue understands that operational intelligence is part of the BI space and has begun to work with standard BI vendors to integrate the mosaic with the other informational elements that make up a robust dashboard. That work is rightfully in its infancy given the company’s evolutionary stage; if they keep building and expanding the relationships, there’s no problem.

However, the thing that must change to make it a full-blown system is how they access the data. It’s understandable that a startup expects a customer to figure out all its own data access issues and provide a single-source database to drive the mosaics, but they’re going to have to work more closely with ETL and other vendors to provide a more open access methodology as they grow, and a more dynamic, open data definition and access model than “give us a lump of data and we’ll work with it.”

Given where the company is right now, those caveats are more foibles than anything else. They have the time to build out the system and their time has, correctly, been spent in creating the robust visualization paradigm they demonstrated.

If Kerry Gilger and the rest of his team are able to execute the vision he’s shown, VisualCue will be a major advancement in the ability of business management to quickly understand operations in a way that provides instant feedback to improve performance.

1010data at the BBBT: Cool technology without a clear strategy

The presenters at last Friday’s BBBT session were from 1010data. The company provides some complex and powerful number crunching abilities through a spreadsheet paradigm. As with many small technical companies, they have the problem of trying to differentiate between technology and a business solution.

Let’s start with the good side: the engine seems very cool. They use distributed technology in the Cloud to provide the ability to rapidly filter through very large data sets. It’s no surprise that their primary markets seem to be CPG and financial companies, as those deal with high volumes of daily data. It’s also no surprise because those markets have very technical business users who are used to looking at data via spreadsheets.

The biggest problem is that spreadsheets are fine for looking at raw data, but not for understanding anything except a heavily filtered subset of it. That’s why the growth of BI has been in visualization. Everything 1010data showed us involved heavy work with filters, functions, XML and more. The few graphics they showed look twenty years out of date and quite primitive by modern standards. This is a tool for a power user.

Another issue, showing the secondary thought given to re-use and display of information, is their oxymoronically named QuickApps. As the spreadsheet analysis is done in the cloud on the live data set, reusing the information in reports takes a lot of work. The technical presenter was constantly diving into complex functions and XML code. That’s not quick.

When asked about that, the repeated refrain was about how spreadsheets are everywhere. True, but the vast majority of Microsoft Excel™ users use no functions, or only the very simplest such as sum() and a few others. Only power users create major results, and BI companies have grown out of the move from Excel to better ways of displaying results.

I must question whether CEO and Co-founder Sandy Steier understands where the company fits into the BI landscape. He constantly referred to Cognos and MicroStrategy as if they’re the current technology leaders in BI. Those solutions are good, but they are not the focus of conversation when talking about the latest in visualization or in-memory technologies. The presentation did have one slide that listed Tableau, but their web site was devoid of references to the modern generation (or they were well hidden). Repeated questions about relationships with visualization vendors were deflected to other topics and not addressed.

Of key focus was an early statement by Mr. Steier that data discovery is self-service reporting. That seems to be the typical technical person’s confusion between technology and business needs. Data discovery is the ability to understand relationships between pieces of data to build information for decision making. Self-service reporting is just one way of telling people what you’ve discovered. Self-service business intelligence is a larger issue that includes components of both.

I very much liked the technology but I must question whether the management of 1010data has the vision to figure out what they want to do with it. Two of many possible options show the need for that choice. First, they can decide to be a new database engine, providing a very powerful and fast data repository from which other vendors can access and display information. Second, they can focus on adding real visualization to help them move past the power users so that regular business users can directly leverage the benefits. The two strategic choices mean very different tactics for implementation.

To summarize: I was very impressed with 1010data’s technology but am very concerned about their long term potential in the market.

ETL across the firewall: SnapLogic at the BBBT

SnapLogic presented at the BBBT last Friday. I was on the road then, so I watched the video today. The presentation was by Darren Cunningham, VP Marketing, and Craig Stewart from product management. It was your basic dog and pony show with one critical difference for the BI space: they understand hybrid systems.

Most of the older BI vendors are still on-premises and tip-toeing into the Cloud. Most of the newer vendors are proudly Cloud. The issue for enterprises is that they are clearly in a strongly hybrid situation, with a very mixed set of applications inside and outside the firewall. Companies talk about supporting hybrid systems, but dig down and you find out it’s one way or the other, with minimal thought given to supporting the other half.

Darren made it clear from the beginning that SnapLogic understands the importance of a truly hybrid environment. They are, ignoring all the fancy words, ETL for a hybrid world. They focus on accessing data equally well regardless of on which side of the firewall it resides. Their partner ecosystem includes Tableau, Birst and other BI vendors, while SnapLogic focuses on providing the information from disparate systems.

Their view was supported by a number of surveys they’d performed. While the questions listed had the typical tilt of vendor-run surveys, they still provided value. The key slant, which has implications for their strategic planning, shows up in one survey question on “Technical Requirements of a Cloud Integration Platform.” “Modern scalable architecture” came in first while “Ease of use for less technical users” was third.

As Claudia Imhoff accurately pointed out, the basic information might be useful, but it’s clear from their presentation that this was an IT-focused survey and should be treated as such. It would be interesting to see the survey done for both IT and business users to see the difference in priorities.

SnapLogic looks like they have a good strategy; the thing to watch is how they grow. The key founder is Gaurav Dhillon, one of the founders of Informatica. He had a good strategy but was replaced when the company grew to a point where he couldn’t figure out the tactics to get over the chasm (full disclosure: I worked at Informatica in 1999, when it hit the wall. I’m not unbiased). Let’s hope he learned his lesson. There’s a clear opportunity for SnapLogic’s software, and it seems to be going well so far, but we’ll need to watch how they execute.

TDWI Webinar – BI in the Cloud

Today, TDWI held a webinar on BI in the Cloud. The simple summation: It’s slowly gaining a foothold, but it’s early.

The presentation was a tag team between Fern Halper, Research Director, Advanced Analytics, and Suzanne Hoffman, Sr. Director, Analyst Relations at Tableau Software, the webinar’s sponsor. Fern Halper’s focus, along with plugging her books, was that she sees people beginning to turn the corner in understanding and using the Cloud for BI. A number of indicators of that came from slides of survey results from the recent TDWI conference. One slide showed 25% of attendees rejecting the public Cloud, at least in the short term, and 36% not yet knowing. That means only 39% are either already using it or planning to use it. It’s growing, but not yet the norm.

Another key aspect of her talk was on a subject many people don’t consider until it’s too late. While people looking at the Cloud focus on the risks of getting on the cloud, such as security and compliance issues, there are issues related to a key concern that exists regardless of where your data resides: Vendor choice.

Your costs have gone up more than expected. Another vendor comes along with features you really need. What do you do? It’s hard enough to migrate between applications that you manage on premises. When your data is hosted by others, what is the access scenario? How will you get your data from one system to another? Decision makers should be planning exit strategies as part of the purchase decision.

Suzanne Hoffman’s segment covered, at a much higher and briefer level, the content of the BBBT presentation I’ve previously described. Due to that brevity and a question from one attendee, I learned something I missed the other day: Tableau Online is the server only. Anyone accessing it must have at least one copy of the desktop version to set up the database relations. In an era when more and more companies are recreating their interfaces in HTML5 in order to provide full functionality anywhere, it’s interesting that a company often described as a disruptive technology is lagging in this respect. This isn’t a problem in the short term, as delivery on multiple devices is still provided and it’s not much of a problem setting up the links on a desktop, but it’s something to watch.

Companies are tiptoeing toward the Cloud, and Fern Halper’s presentation shows us the momentum building. We haven’t reached an inflection point, and I’d be surprised if we see one in the next 12-18 months, but it’s good to see TDWI keeping an eye on things and giving us signposts along the way.