
Webinar review: TDWI on Streaming Data in Real Time, in Memory

The Internet of Things (IOT) is something more and more people are considering. Wednesday’s TDWI webinar topic was “Stream Processing: Streaming Data in Real Time, in Memory,” and the event was sponsored by both SAP and Intel. Nobody from Intel took part in the presentation. Given my other recent post about too many cooks, that’s probably a good thing, but there was never a clear reason expressed for Intel’s sponsorship.

Fern Halper began with an overview of how TDWI is seeing data streaming progress. She briefly described streaming as dealing with data while still in motion, as opposed to data in warehouses and other static structures. Ms. Halper then proceeded to discuss the overlap between event processing, complex event processing and stream mining. The issue I had is that she should have spent a bit more time discussing those three terms, as they’re a bit fuzzy to many. Most importantly, what’s the difference between the first two?

The primary difference is that complex event processing is when data comes from multiple sources. Some of the same steps are necessary as in ETL. That’s why the in-memory message was important in the presentation. You have to quickly identify, select and merge data from multiple streams, and in-memory is the way to most efficiently accomplish that.
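To make the "identify, select and merge" step concrete, here is a minimal Python sketch of merging two time-ordered streams in memory and flagging high readings that arrive from different sources close together. The `Event` fields, the five-second window and the threshold are all assumptions for illustration; nothing here reflects how SAP's product works internally.

```python
import heapq
from typing import Iterable, Iterator, List, NamedTuple, Tuple


class Event(NamedTuple):
    timestamp: float  # seconds since a shared epoch (assumed)
    source: str       # which stream produced the event
    value: float      # the reading itself


def merge_streams(*streams: Iterable[Event]) -> Iterator[Event]:
    """Lazily merge already time-ordered streams into one time-ordered stream."""
    return heapq.merge(*streams, key=lambda e: e.timestamp)


def correlated_alerts(
    events: Iterable[Event], window: float = 5.0, threshold: float = 100.0
) -> Iterator[Tuple[Event, Event]]:
    """Yield pairs of high readings from different sources within `window` seconds."""
    recent: List[Event] = []  # high readings still inside the window, held in memory
    for event in events:
        # Drop readings that have aged out of the window.
        recent = [e for e in recent if event.timestamp - e.timestamp <= window]
        if event.value >= threshold:
            for earlier in recent:
                if earlier.source != event.source:
                    yield (earlier, event)
            recent.append(event)


# Hypothetical sample data for two machine feeds.
pump = [Event(1.0, "pump", 80.0), Event(4.0, "pump", 120.0), Event(20.0, "pump", 130.0)]
valve = [Event(2.0, "valve", 90.0), Event(6.0, "valve", 150.0)]

alerts = list(correlated_alerts(merge_streams(pump, valve)))
```

Only the small `recent` list is held in memory, which is the point of the in-memory argument: the correlation is decided as events pass by rather than after they land in a warehouse.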

Ms. Halper presented the survey results about the growth of streaming sources. As expected, it shows strong growth should continue. I was a bit amused that it asked about three categories: real-time event streams, IOT and machine data. While it might make sense to ask about the different terms, as people are using multiple words, they’re really the same thing. The IoT is about connecting things, which translates to machines. In addition, the main complex events discussed were medical and oil industry monitoring, with data coming from machines.

Jaan Leemet, Sr. VP, Technology, at Tangoe then took over. Tangoe is an SAP customer providing software and services to improve their IT expense management. Part of that is the ability to track and control network usage of computers, phones and other devices, link that usage to carrier billing and provide better cost control.

A key component of their needs isn’t just that they need stream processing, but that they need stream processing that also works with other less dynamic data to provide a full solution. That’s why they picked SAP’s Event Stream Processor, not only for the independent functionality but because it also fits in with their SAP ecosystem.

One other decision factor is important to point out, given the message Hadoop and other NoSQL folks like to give. SAP’s solution works in a SQL-like language. SQL is what IT and business analysts know, so the smart bet for rapid adoption is to understand that and do what SAP did. Understand the customer and sales become easier. That shouldn’t be a shock, but technologists are often too enamored of themselves to notice.

Neil McGovern, Sr. Director, Marketing, at SAP gave the expected pitch. It was smart of them to have Jaan Leemet go first and it would have been better if Mr. McGovern’s presentation was even shorter so there would have been more time for questions.

Because of the three presenters, there wasn’t time for many questions. One of the few questions for the panel asked if there was such a thing as too much data. Neil McGovern and Jaan Leemet spent time talking about the technology of handling lots of streaming data, but only in generalities.

Fern Halper turned it around and talked about the business concept of too much data. What data needs to be seen at what timeframe? What’s real-time? Those have different answers depending on the business need. Even with the large volume of real-time data that can be streamed and accessed, we’re talking about clustered servers, often from a cloud partner, and there’s no need to spend more money on infrastructure than necessary.

I would have liked to have heard a far more in-depth discussion about how to look at a business and decide which information truly requires streaming analysis and which doesn’t. For instance, think about a manufacturing floor. You want to quickly analyze any data that might indicate failures that would shut down the process, but the volumes of information that allow analysis of potential process improvements don’t need to be analyzed in the stream. That can be done through analysis of a resultant data store. Yet all the information can be coming across the same IoT feed because it’s a complex process. Firms need to understand their information priority and not waste time and money analyzing information in a stream for no purpose other than you can.
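The manufacturing-floor split can be sketched in a few lines of Python. This is only an illustration of the prioritization idea: the `Reading` fields, the `URGENT_LIMITS` thresholds and the `Router` class are hypothetical names, not anything shown in the webinar.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Reading:
    sensor: str
    kind: str     # e.g. "vibration", "temperature", "cycle_time"
    value: float


# Hypothetical limits: only these kinds of readings can stop the line.
URGENT_LIMITS: Dict[str, float] = {"vibration": 8.0, "temperature": 95.0}


@dataclass
class Router:
    alerts: List[Reading] = field(default_factory=list)  # analyzed in the stream
    store: List[Reading] = field(default_factory=list)   # analyzed later, in batch

    def handle(self, reading: Reading) -> None:
        # Everything is kept for later process-improvement analysis.
        self.store.append(reading)
        # Only potential failure signals get immediate attention.
        limit = URGENT_LIMITS.get(reading.kind)
        if limit is not None and reading.value > limit:
            self.alerts.append(reading)


router = Router()
feed = [
    Reading("press_7", "cycle_time", 12.4),
    Reading("press_7", "vibration", 9.1),
    Reading("oven_2", "temperature", 90.0),
]
for reading in feed:
    router.handle(reading)
```

All three readings arrive on the same feed, but only the vibration reading is examined in the stream; the rest wait in the store.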

Semantics and big data: Thought leadership done right

Dataversity hosted a webinar by Matt Allen, Product Marketing Manager at MarkLogic. Mr. Allen’s purpose was to explain to the audience the basic challenges involved in big data which can be addressed by semantic analysis. He did a good job. Too many people attempting the same spend too much time on their own product. Matt didn’t do so. Sure, when he did he had some of the same issues that many in our industry have, of over selling change; but the corporate references were minimal and the first half of the presentation was almost all basic theory and practice.

Semantics and Complexity

On a related tangent, one of the books I’m reading now is Stanley McChrystal’s “Team of Teams.” In it, he and his co-authors point to a distinction between complicated and complex. A manufacturing process can be complicated, it can have lots of steps but have a clearly delineated flow. Complex problems have many-to-many relations which aren’t set and can be very difficult to understand.

That ties clearly into the message put forward by MarkLogic. The massive amount of unstructured data is complex, with text rather than fields, and it needs ways of understanding potential meaning. The problems in free text are such things as:

  • Different words can define the same thing.
  • The same word can mean different things in different contexts.
  • Depending on the person, different aspects of information are needed for the same object.

One great example that can contain all of these issues was given when Matt talked about the development process. At different steps in the process, from discovery, to different development stages, to product launch, there’s a lot of complexity in the meanings of terms, not only in development organizations but between them and all the groups in the organization with whom they have to work.

Mr. Allen then moved from discussing that complexity to talking about semantic engines. MarkLogic’s NoSQL engine has a clear market focus on semantic logic, but during this section he did well to minimize the corporate pitch and only talked about triples.

No, not baseball. Triples are a syntactical tool to link subject (person), predicate (operates), object (machine). By building those relationships, objects can be linked in a less formal and more dynamic manner. MarkLogic’s data organization is based on triples. Matt showed examples of JSON, Turtle and XML representations of triples, very neatly sliding his company’s abilities into the theory presentation. It was a great example of how to mention your company while giving a thought leadership presentation without being heavy handed.
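For readers who have not worked with triples, a tiny Python sketch shows the idea using the person/operates/machine example. It is a plain in-memory list with a pattern matcher, not MarkLogic's storage model or API; the names and sample data are made up.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical facts about a plant floor.
triples: List[Triple] = [
    ("alice", "operates", "press_7"),
    ("bob", "operates", "lathe_2"),
    ("press_7", "located_in", "plant_a"),
    ("lathe_2", "located_in", "plant_a"),
]


def match(
    store: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t
        for t in store
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


def operators_in(store: List[Triple], plant: str) -> List[str]:
    """Follow two links: machine located_in plant, then person operates machine."""
    machines = {s for s, _, _ in match(store, predicate="located_in", obj=plant)}
    return sorted(s for s, _, o in match(store, predicate="operates") if o in machines)


people = operators_in(triples, "plant_a")
```

No schema says that people relate to plants. The link emerges by chaining two triples, which is the "less formal and more dynamic" linking described above.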

Semantics, Databases and the Company

The final part of the presentation was about the database structure needed to handle semantic analytics. This is where he overlapped the theory with a stronger corporate pitch.

Without referring to a source, Mr. Allen stated that relational databases (RDBMSs) can only handle 20% of today’s data. While it’s clear that a lot of the new information is better handled in Hadoop and less structured data sources, it’s a question of performance. I’d prefer to see a focus on that.

Another error often made by folks adopting new technologies was the statement that “Relational databases aren’t solving a lot of today’s problems. That’s why people are moving to other technologies.” No, they’re extending today’s technologies with less structured databases. The RDBMS isn’t going away, as it does have its purpose. The all or nothing message creates a barrier to enterprise adoption.

The final issue is the absolutist view of companies that think they have no competitor. Matt Allen mentioned that MarkLogic is the only enterprise database using triples. That might be literally true. I’m not sure, but so what? First, triples aren’t a new concept and object oriented databases have been managing triples for decades to do semantic analysis. Second, I recently blogged about Teradata Aster and that company’s semantic analytics. While they might not use the exact same technology, they’re certainly a competitor.

Summary

Matt Allen did a very good job exposing people to why semantic analysis matters for business and then covered some of the key concepts in the arena. For folks interested in the basics to understand how the concept can help them, watch the replay or talk with folks at MarkLogic.

The only hole in the presentation is that though the high level position setting was done well, the end where MarkLogic was discussed in detail had some of the same problems I’ve seen in other smaller, still technology driven companies.

If Mr. Allen simplifies the corporate message, the necessary addition at the end of the presentation will flow better. However, that doesn’t take away from the fact that the high level overview of semantic analysis was done very well, discussing not only the concepts but also a number of real world examples from different industries to bring those concepts alive for the audience. Well done.

Marketing lesson: How to cram too many vendors into too short a timeframe

I’ll start by being very clear: This is a slam on bad marketing. Do not take this column as a statement that the products have problems, as we didn’t see the products.

Database Trends and Applications magazine/website held a webinar. The first clue there was something wrong is that an hour-long seminar had three sponsors. In a roundtable forum, that could work, and the email mentioned it was a roundtable, but it wasn’t. Three companies, three sequential presentations. No roundtable.

It was titled “The Future of Big Data: Hybrid Architectures and Best-of-Breed”. The presenters were Reiner Kappenberger, Global Product Manager, HP Security Voltage, Emma McGrattan, SVP Engineering, Actian, and Ron Huizenga, ER/Studio Product Manager, Embarcadero. They are three interesting companies, but how would the presentations fit together?

They didn’t.

Each presenter had a few minutes to slam through a pitch, which they did with varying speeds and content. There was nothing tying them into a unified vision or strategy. That they all mentioned big data wasn’t enough and neither was the time allotted to hear significant value from any of them.

I’ll burn through each as the stand-alone presentations they were.

HP Security Voltage

Reiner Kappenberger talked about his company’s acquisition by HP earlier this year and the major renaming from Voltage Security to HP Security Voltage (yes, “major” was used tongue-in-cheek). Humor aside, this is an important acquisition for HP to fill out its portfolio.

Data security is a critical issue. Mr. Kappenberger gave a quick overview of the many levels of security needed, from disk encryption up to authentication management. The main feature focus of Reiner’s allotted time was partial tokenization: being able to encrypt parts of a full data field. An example is disguising the first five digits of a US Social Security number while leaving the last four visible. While he also mentioned tying into Hadoop to track and encrypt data across clusters, time didn’t permit any details. For those using Hadoop for critical data, you need to find out more.
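The Social Security number example can be illustrated with a short Python sketch. To be clear about the assumptions: this is not HP Security Voltage's method, which was not shown. It swaps in a keyed hash (HMAC-SHA256) to derive five replacement digits, so it is one-way, and different inputs can occasionally map to the same token. The function name and key are hypothetical.

```python
import hashlib
import hmac


def tokenize_ssn(ssn: str, key: bytes) -> str:
    """Replace the first five digits with keyed-hash digits; keep the last four."""
    digits = [c for c in ssn if c.isdigit()]
    if len(digits) != 9:
        raise ValueError("expected a 9-digit SSN")
    head, tail = "".join(digits[:5]), "".join(digits[5:])
    # Derive five stand-in digits from the hidden part; this is illustrative
    # only and is not reversible the way real tokenization or
    # format-preserving encryption would be.
    mac = hmac.new(key, head.encode("ascii"), hashlib.sha256).digest()
    token = "".join(str(b % 10) for b in mac[:5])
    return f"{token[:3]}-{token[3:]}-{tail}"


demo_key = b"demo-key-not-for-production"  # hypothetical key
masked = tokenize_ssn("123-45-6789", demo_key)
```

The result keeps the familiar 3-2-4 shape and the last four digits, so downstream screens and reports that only need the tail keep working while the rest of the field is disguised.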

The case studies presented included a car company’s use of both live, Internet of Things feeds and recall tracking but, again, there just wasn’t enough time.

Actian

The next vendor was Actian, an analytics and business intelligence (BI) player based on Hadoop. Emma McGrattan felt rushed by the time limit and her presentation showed that. It would have been better to slow down and cover a little less. Or, well, more.

For all the verbiage it was almost all fluff. “Disruption” was in the first couple of sentences. “The best,” “the fastest,” “the most,” and similar unsubstantiated phrases flowed like water. She showed an Actian-built graph with product maturity and Hadoop strength on the two axes and, as if by magic, the only company in the upper right was Actian.

Unlike the presentations before and after hers, Ms. McGrattan’s was a pure sales pitch and did nothing to set a context. My understanding, from other places, is that Actian has a good product that people interested in Hadoop should evaluate, but this presentation was too little said in too little time with too many words.

In Q&A, Emma McGrattan also made what I think is a mistake, one that I’ve heard many BI companies get away from in the last few years. An attendee asked about the biggest concern when transitioning from EDW to Hadoop. The real response should be that Hadoop doesn’t replace the EDW. Hadoop extends the information architecture, and it can even be used to put an EDW on open source, but EDWs and big data analytics typically have two different purposes. EDWs are for clean, trusted data that’s not as volatile, while big data is typically transaction oriented information that needs to be cleaned, analyzed and aggregated before it’s useful in an EDW. They are two tools in the BI toolbox. Unfortunately, Ms. McGrattan accepted the premise.

Embarcadero

Mr. Huizenga, from Embarcadero, referred to evidence that the amount of data captured in business is doubling every 1.2 years and how the number of related jobs is also exploding. However, where most big data and Hadoop vendors would then talk about their technologies manipulating and analyzing the data, he started with a bigger issue: How do you begin to understand and model the information? After all, schema-on-write still means you need to understand the information enough to create schemas.

That led to a very smooth shift from the concept of modeling to Embarcadero. They’ve added native support for Hive and MongoDB, they can detect embedded objects in those schemas and they can visually translate the Hadoop information into forms that enterprise IT folks are used to seeing, can understand and can add to their overall architecture models.

Big data doesn’t exist in a void; to be successful it must be integrated fully into the enterprise information architecture. For those folks already using ERwin and those who understand the need to document modeling, Embarcadero offers a tool that should be investigated for the world of Hadoop.

Summary

Three good companies were crammed into a tiny time slot with differing success. The title of the seminar suggested a tie that was stronger than was there. The makings existed for three good webinars, and I wish DBTA had done that. The three firms and the host could have communicated to create an overall message that integrated the three solutions, but they didn’t.

If you didn’t see the presentation, don’t bother. Whichever company interests you, check it out. All three are interesting, though it might have been hard to tell from this webinar.

Review: Looker Webinar on Embedding BI

Looker held a webinar today. I recently blogged about their presentation to the BBBT community, but it’s an interesting company so it was worth another visit. The company is a business intelligence (BI) firm. With the presenters being Colin Zima and Zach Taylor, the presentation stayed at a much higher level than the previous presentation and was aimed at a business audience rather than analysts. It is always good to see a different view of things.

The focus of their presentation was why it’s good to embed BI in other applications, as opposed to using pure BI tools. It’s a good message but needs to be strengthened. Colin and Zach quickly mentioned embedding as if everyone understood it, then dove into the issues in evaluating the build v buy decision. They should have spent a couple of minutes explaining what they mean by embedding and which places they focus on embedding into.

Their build v buy decision discussion was standard and hit all the right points about letting companies focus on their competencies and leverage the BI industry’s competencies for analysis. Where embedding and build v buy really blend, and they could have hit harder, is the difference in ROI between embedding and having a separate BI visualization tool.

They did have a couple of case studies that were interesting. Ibotta is a company providing analytics to their consumer packaged goods clients. That’s a great application and a powerful use of BI in a business network, but I didn’t see much on what it was embedded into or how. That meant it didn’t fit into the overall scheme of the presentation.

The other key one was HubSpot using Looker to provide analytics to sales on sales performance. That’s done by embedding the analytics directly into the normal Salesforce.com windows the sales team sees every day. That’s a powerful message and one that I felt deserved a bit more time.

The only questionable message I heard was during Q&A, when somebody asked about their performance issues. As in the previous presentation, they talked about using the source data and not replicating for BI. They therefore said they didn’t have performance issues when scaling users, but that it was an issue for the databases. Well, that’s not quite true.

It’s not likely that all a company’s various data sources have been built to scale to lots of users. Companies will still use ODSs, data warehouses and other methods to parallel data and have multiple versions of the truth which require strong compliance to control. Companies will still have to spend time to analyze and prepare appropriate data sources that can handle large numbers of concurrent users. The advantage of Looker is not that you can avoid adding to the confusion to get performance; it is that whatever is provided to get good performance for Looker isn’t unique and limited to it but can serve other applications as well.

Looker is that rare young company that seems to not only have a good early generation product, but understands how to market their product to multiple audiences. As someone focused on software marketing, I think that’s great.

TDWI Webinar Review: Claudia Imhoff and SAP with an overview of the analytics supply chain

Tuesday’s TDWI webinar had a guest star: Claudia Imhoff. The topic was predictive analytics and the presentation was sponsored by SAP, so Pierre Leroux, Director of Product Marketing, SAP, also had his moment towards the end. Though the title was about predictive analytics, it’s best to view the presentation as an overview of the state of analytics, and there’s much more to discuss on that.

The key points revolved around a descriptive slide Ms. Imhoff presented to describe the changing analytics landscape.

[Slide: TDWI Imhoff analytics supply chain]

Claudia Imhoff described the established EDW information supply chain as being the left half of the diagram while the newer information, with web, internet of things (IOT) and other massive data sources adding the right hand side. It’s a nice, clean way of looking at things and makes clear that the newer data can still drive rather than eliminate the EDW.

One thing I’d say is missing is a good name for the middle box. Many folks call what Ms. Imhoff terms the Data Refinery a Data Lake or other similar names. My issue is that there’s really no need to list the two parts separately. In fact, there’s a need to have them seamlessly accessible as a whole, hence the growth of SQL for Hadoop and other solutions. As I’ve expressed before, the combination of the data integration and data refinery displayed is just the next generation of the ODS. I like the data refinery label, but think it more accurately applies to the full set of data described in the middle section of the diagram.

Claudia also described the four types of analytics:

  • Descriptive: What happened.
  • Diagnostic: Why it happened.
  • Predictive: What might happen.
  • Prescriptive: What to do when it happens.

It’s important to understand the difference in analysis because each type of report needs to have a focus and an audience. One nit I have with her discussion of these was the comment that descriptive analytics are the least valuable. Rather, they’re the least strategic. If we don’t know what happened, we can’t feed the other types of analytics, plus, reporting requirements in so much of business means that understanding and reporting what happened remains very valuable. The difference is not how valuable, but in what way. Predictive and prescriptive analytics can be more valuable in the long term, but their foundation still resides on descriptive.

Not more with the Data Scientist…

My biggest complaint with our industry at large is still the obsession with the mythical data scientist. Claudia Imhoff spent a good amount of time on the subject. It’s a concept with super human requirements, with Claudia even saying that the data scientist might be the one with deep business knowledge. Nope. Not going to happen.

In Q&A, somebody brought up the point I always mention: Why does it have to be one person rather than a team? Both Claudia Imhoff and Pierre Leroux admitted that was more likely. I wish folks would start with that as it’s reasonable and logical.

I was a programmer as folks began calling themselves software engineers. I never liked that. The job wasn’t engineering but a blend of engineering and crafting. There was art. The two presenters continued to talk about the data scientist as having an art component, but still think the magical person is a scientist. In addition, thirty years ago the developer was distanced much further from business, by development methods, technology and business practice. Being closer means, again, teamwork, with each person sharing expertise in math, coding, business and more to create a robust solution.

That wall has been coming down for years, but both technology and business are changing rapidly and are far more complex. The team notion is far more logical.

Business and Technology

The other major problem I had was a later slide and words accompanying it that implied it’s up to the business people to get on board with what the technologists are doing. They must find the training, they must learn that analytics are the answer to everything.

Yes, we’re able to provide better analytics faster to management than in the past. However, they’re not yet perfect nor will they be. Models are just that. As Pierre pointed out, models will never explain 100%.

Claudia made a great point earlier that one of the benefits of big data is to eliminate sampling and look at what the entire market is doing, but markets are still complex and we can’t glean everything. Technologists must get off the high horse and realize that some of the pushback from management is because the techies too often tend to dismiss intuition and experience. What needs to happen is for the messages to change to make it clear that modern analytics will help executives and line management make better decisions, not that it will replace their decision making.

In addition, quit making overly complex visualizations that have great scientific relevance but waste time. The users do not need to understand the complexities of systems. If we’re so darned smart, we can distill the visualizations to things easier to comprehend so that managers can get the information, add it to all the other information and experience, and make decisions.

Technologists must adapt to how business runs as much as business must adapt to leverage technology.

Summary

The title of the presentation misrepresents the content. It was a very good presentation for understanding the high level landscape of the analytics information supply chain and it’s a discussion that needs to be held more often.

You’ll notice I didn’t say much about the demo by Pierre Leroux. That’s because of technical issues between demo and webinar software. However, both he and Claudia Imhoff took questions about the industry and market and gave thoughtful answers that should help drive the conversation forward.

TDWI Webinar Review: What is Data Platform as a Service (dPaaS) and What Can it Do For Your Business?

Yesterday’s TDWI webinar was sponsored by Liaison Technologies, who did the same thing last year. It’s a push for another acronym. While the acronym isn’t needed, the concept is. Data Platform as a Service is just using the cloud to help with data integration. Gosh, complex, eh? I think it’s the natural progression of technology and business; it’s just data management on the cloud. But forget the marketing, let’s talk about the concept.

Cloud data management

The presentation’s first half was delivered by Phillip Russom. He started with some very trivial level setting but then quickly got to a key point. If you’ve been around for a while, you remember Best of Breed. That’s when each product-focused vendor, somewhere in the information supply chain, talked about its openness and how you could piece together a solution from different vendors. That made sense at the time, since many companies were each creating the early version of parts of a full solution.

As Phillip pointed out, times have changed. We now better understand business needs, have learned more about coding the requirements and can access far better hardware than we had fifteen years ago. That means IT is looking for what they couldn’t find back then: An integrated solution from a single or a far more limited number of vendors. They want something simpler than a hodgepodge of multiple systems.

The advantages of the cloud aren’t specific to data management. One very key business driver that was minimized in Mr. Russom’s presentation, but brought out later by Patrick Adamiak during his presentation and then revisited by both in the Q&A, is capex versus opex, something often ignored by technical folks. Having your own hardware and data center is not just costly, it’s part of capital expenditure. Service contracts with a cloud vendor are operational expenses. That means the CxO suite and Board are often happier with that because it’s not as locked in and creates flexibility in the corporate financial picture.

One nit I had with Mr. Russom’s presentation was his statement that cloud is another architecture, like client/server or the web. The cloud and web are client/server; that’s not the issue. It’s another architecture in two other key aspects: The already mentioned capex/opex divide, and the way it changes a software vendor’s ability to manage and update their software in comparison to on-premises installations.

One caution he gave that needed more explanation for folks new to the cloud was when Mr. Russom mentioned that you need to ask about the elasticity of the cloud implementation. For those who might not have heard the term, elasticity is the ability to grow or shrink cloud resources in order to match processing demands. In other words, if you get a big data dump from another source, can you quickly access more disk space? Or, from the Web side of the house: You’re hosting a big event or making a major announcement on your Web site: Can site resources be replicated quickly to handle the additional load then released when no longer needed?
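Elasticity is easy to show with a toy calculation. The Python sketch below picks how many server instances to run for a given load, growing for the big announcement and shrinking afterwards. The function name, the per-instance capacity and the instance limits are assumptions for illustration, not any cloud vendor's actual autoscaling API.

```python
import math


def desired_instances(
    requests_per_second: float,
    capacity_per_instance: float,
    min_instances: int = 1,
    max_instances: int = 20,
) -> int:
    """Return how many instances to run, clamped to the contracted limits."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


# A quiet day, a product announcement, and the morning after (hypothetical loads).
quiet = desired_instances(50, 100)
launch = desired_instances(950, 100)
after = desired_instances(120, 100)
```

The question to ask a provider is how quickly, and up to what ceiling, it will move between those numbers; the `max_instances` clamp stands in for whatever limit the contract imposes.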

Liaison

I was impressed by the fact that capex was mentioned on Patrick Adamiak’s first slide. Cloud technology has multiple advantages that can be communicated to IT, but it’s the capex/opex issue that will help close the deal in an enterprise setting. Liaison seems to understand the need to blend technical and business messages.

However, most of Mr. Adamiak’s presentation seemed to be about justifying the new acronym. The main slide compared dPaaS with other supposed solutions without admitting there’s really a lot of overlap between them. The columns weren’t as different as he’d like them to be.

His company slides didn’t seem any different than those I’ve seen from the many other firms in the space. Forget all of that, it was in a short webinar with TDWI, so he had limited time.

The fact is that Liaison claims they are where the market is going. They are vertically integrating the information supply chain while leveraging the cloud for its business and technology advantages. For those in IT looking to simplify their world, Liaison is a company that should be investigated.

Dell at BBBT: Addressing BI from IT

The most recent BBBT presentation was from Dell Software. Peter Evans, Sr. Integrated Solutions Development Consultant, and Steven Phillips, Product Marketing Manager – Big Data & Analytics, gave us an overview of Dell’s architecture for addressing business intelligence (BI).

[Slide: Dell platform, 2015-05-15]

What they’re working to accomplish is, no surprise, ensure that Dell’s hardware is able to be present throughout the BI supply chain. For that, they’re working to be application agnostic, though they mislabel it as “no lock-in.” What they’re saying is you can change your software vendors and Dell will still be there. There’s no addressing true lock-in, the difficulty in changing one software vendor to another based on level of openness to data in systems and other costs of moving.

One marketing nit that caught a number of us was Peter’s early claim that Dell is “probably the third largest software company in the world.” Right… First, as a now privately held company, we have no way to confirm that. Second, I’m not sure if he knows just how much revenue is needed to be near the top of that list.

IT First

Far too many young firms are overselling BI as something that will let business “avoid IT.” That’s not only impossible, it wouldn’t make sense if it was possible. IT has a clear place in organizing infrastructure, providing consistency, helping with compliance and doing other things a central organization should do.

Dell has started with IT. They’re used to dealing with IT and their solution is focused on helping IT enable business. What’s not clear is how well they can do such a thing in the new world. They’ve pieced a lot of different applications into an architecture and that would seem to require heavy IT involvement in much of what’s being provided.

On the good side, that knowledge means they better understand true enterprise business needs. Unlike many vendors, Dell has regulatory and statutory compliance at the forefront, very clear in its marketechture slides. While most companies understand they have to mention compliance, it’s usually people dealing with corporate business groups such as IT and legal who understand just how critical compliance is.

Neither Peter Evans nor Steven Phillips spoke clearly to the business user and that user’s desire for speed and flexibility. While younger companies need to move more to addressing the importance of IT, Dell needs to more strongly focus on the business customer, the ones who are often in charge of the BI and related software projects and spending.

Boomi Suggest

The technical piece that stuck with me the most was the discussion of Boomi Suggest. Boomi is Dell’s integration tool. Within it, there’s a cloud-based tool called Boomi Suggest. If users subscribe to it, the product tracks data linkages and the de-natured information is kept to help other customers more quickly map data sources and targets.

Mr. Evans says that Boomi Suggest has a database that now contains more than 16 million links. The intelligence on top of that is then able to provide a 92% accuracy rate in analyzing new links. The time savings that alone suggests is a major decision driver that should not be overlooked.

A Great Case Study: Asthma

While the case study didn’t address enough of the end user issues of timeliness, flexibility and more, it was a very interesting case study from an inclusive standpoint. The Dell team focused on asthma case management to show the breadth of data sources, the complexity of analytics and a full process that could be generalized from the healthcare sector in order to support their full platform message.

Slide: Dell asthma case study, 2015-05-15

As you can see, they are doing a lot of things with a variety of information, but they’re also doing it with a variety of products.

Summary

Dell’s decades of working with IT have helped it look at BI with a more complex eye that can address many of IT’s concerns. What we saw was an almost completely IT-focused solution and message. While BI-focused companies are going to have to move down and address important IT messages, Dell must go in the opposite direction. Unless the team can broaden their message to address the solution to more business teams, Dell’s expansion in the market will be severely limited because it’s the business groups that write the checks.

The presentation shows a great start. However, the questions are whether Dell can simplify the architecture to make it less complex, potentially by merging a number of their products, and whether they can learn about those folks they don’t have a history of directly understanding: the business user. If they can do that, the start will expand and Dell Software can help in the BI market.

Looker at the BBBT: A New Look at SQL Performance

The most recent BBBT presentation was by Looker. Lloyd Tabb, Founder & CTO, and Zach Taylor, Product Marketing Manager, showed up to display yet another young company’s interesting technology.

Looker’s technology is an application server that sits above relational databases to provide faster, more complex queries. They’ve developed their own language, LookML, to help with that. That’s no surprise, as Lloyd is a self-described language guy.

It’s also no surprise that the demos, driven by both Lloyd and Zach, were very coding heavy. Part of the reason that very technical focus exists is, as Mr. Tabb stated, that Looker thinks there are two groups of users: coders who build models and business managers who use the information. There is no room in that model for the business analyst, the person who understands how to communicate a complex business need to the coders and how to help the coders deliver something that is accessible to and understandable by the information consumer.

The bifurcation played out in the demonstration through an almost exclusive focus on code, code and more code, with a brief display of some visualization technology. The former was very good while the latter wasn’t bad but, in keeping with their mainly technical focus, had complex visualizations without good enough legends. They were visualizations that would be understood by technical people but need to be better explained for the business audience they claim to address.

As an early stage company, that’s ok. The business intelligence (BI) market is still young and very fragmented. You can get different groups in large companies using different BI tools. While Looker talks about 300 customers, as with most companies of their size those could be only such small groups. If they’re going to grow past those groups, they need to focus a bit more on how to better bridge technology and business.

They also have a good start in attracting the larger market because they support both cloud and on-premises systems. The former market is growing while the latter isn’t going away. Providing the ability for their server to run in either place will address the needs of companies on either side of the divide.

RDBMS ≠ SQL

One key to their system is they don’t move data. It stays resident on the source systems. Those could be operational systems, data warehouses, an ODS or whatever. What they must have is SQL. When asked about Hadoop and other schema-on-read systems, the Looker team stated they are an RDBMS-based application but they’ll work on anything with SQL access. I have no problem with the technology, but they need to be very clear about the split.

SQL came from the relational world, but as they pointed out in an aside, it isn’t limited to that. They should drop the RDBMS message and focus on SQL. As Lloyd Tabb said, “SQL is the right abstraction.” What I don’t know is whether he understands, being focused on technology and having those biases, that it isn’t the right abstraction because of some technical advantage but because it’s the major player. McDonald’s isn’t the best burger because it has the most stores. SQL might not be the best access method, but it’s the one business knows and so it’s the one the newer database companies and structures can’t ignore.

Last year, the BBBT heard from multiple companies including Actian and EXASOL, companies focused on providing SQL access to Hadoop. That’s as important as what Looker is doing. The company that manages to do both well will jump ahead of the pack.

Summary

Looker is a good, young company with some technical advantages that can greatly improve the performance of SQL queries to business databases and provide a basic BI front end to display the results. I’m not sure they have the resources to focus on both, and I think the former has the clearest advantage in the marketplace. Unless they have more funding and a strong management team that can begin to better understand the business side of the market, they will have problems addressing the visualization side of BI. They need to keep improving their engine, spread it to access more data sources, and partner with visualization companies to provide the front end.

Silwood at BBBT: Understand Packaged Software Metadata

Tuesday saw a rare, mid-week presentation at the BBBT. Silwood Technology, an Ascot, UK, company sent people to Boulder to present their technology. Roland Bullivant, Sales and Marketing Director, and Nick Porter, Technical Director (and a co-founder) were the presenters.

Silwood Safyr is focused on helping IT understand the metadata in their major packaged enterprise systems, primarily from SAP and Oracle with a recent addition of Salesforce. As those familiar with the enterprise application space know, there are a lot of tables in SAP and Oracle and documentation has never been, shall we say, close to perfect. In addition, all customers of those systems customize the applications, thereby making the metadata more difficult to understand. Safyr does a very good job at finding the technical metadata.

Let me make that clear: Technical metadata. The tables, indices and their relations are what is found. That’s extremely valuable, but not the full picture. Business metadata is not managed. I’ll discuss that in more detail below.

The company, as expected from European companies, uses partners rather than direct sales for its primary sales channel. In addition, they OEM white label products through IBM, CA and other firms. All told, Roland Bullivant says that 70% of their customers are via reseller channels. Also as expected, they still remain backline support for those partners.

Metadata Matters

As mentioned above, Safyr captures the database structure metadata. As Roland so succinctly put it, “The older packages weren’t really built with the outside world in mind.” The internal structures aren’t pretty and often aren’t easily accessible. However, that’s not the only difficulty in understanding an enterprise’s data structures.

Salesforce has a much simpler data structure, intentionally created to open the information to the ecosystem of partner applications that then grew up around the application. Still, as Mr. Bullivant pointed out, there are companies in Europe that have 16 or more customized versions in different countries or divisions, so understanding and meshing those disparate systems in order to build a full enterprise data model isn’t easy. That’s where Safyr helps.

But What Metadata?

Silwood Safyr is a great leap forward from having nothing, but there’s still much missing. While they build a data model, there’s not enough intelligence. For instance, they leave it to their users to figure out which tables are production and which are duplicates or other tables used just for performance. Sure, a table with zero rows usually means either a performance table or an unlocked app segment, but that’s left for the user rather than flagging, filtering and indicating any knowledge of the application and data structures.

Also, as mentioned above, there’s no business intelligence (gosh, where’d that word come from?). There’s nothing that lets people understand the business logic of the applications. That’s why this is a pure IT tool. The structures are just described in technical terms, exported to data modeling tools (a requirement for visualization; ERwin was used in the demo but they work with others) and then left to the analysts to identify all the information needed to clarify which tables are needed for which business purpose or customer.

One way to start working on that was indicated in Nick Porter’s demo. He showed that Safyr is good at not just getting table names, but also at accessing descriptive names and other metadata about the tables. That information needs to be leveraged to help prepare the results for use by people on the business side of the organization.

Where to Go From Here?

The main hole I see links back to the last section: the lack of emphasis on business knowledge. For instance, there’s a comparison function to analyze metadata between databases. However, as it’s purely on a technical level, it’s limited to comparing SAP with SAP and Oracle with Oracle. Given that differences in versions of those products can be significant, I’m not even sure how well that works across major version releases.

Not only do global enterprises have multiple versions of one vendor, they have SAP on one continent, Oracle on another and might acquire a new company that is using Salesforce. That lack of an ability to link business layers means that each package is working in a void and there’s still a lot of work required to build a coherent global picture.

Another part of their growth need is my usual soapbox. When the Silwood team was talking about how they couldn’t figure out why they weren’t growing as fast as they should, Claudia Imhoff beat me to the punch. She mentioned marketing. They’d earlier pointed out they don’t spend much on marketing and she quickly pointed out that’s a problem. This isn’t Field of Dreams, they won’t come just because you build it. Silwood marketing basics are good, with a lack of visible case studies being one hole, but they’re not pushing their message out through the channels.

Summary

Silwood Safyr is a good core product to help IT automate the documentation of data models in packaged enterprise software. It’s a product that should be of interest to every large enterprise using complex applications such as those by Oracle and SAP, or even multiple versions of simple databases such as Salesforce. However, there are two things missing.

The most important missing piece in the short term is the marketing necessary to help their resellers better understand the benefits both they and the end customer receive, to improve interest in reselling and to shorten sales cycles.

The second is to look long term at where they can grow the business. My suggestion is to work better with business logic within and across application vendors. That’s the key way they’ll defend their turf against the BI vendors who are slowly moving downstream to more technical data access.

The reason people want to understand data models isn’t out of curiosity, it’s to better understand business. Silwood has a great start in aiding enterprises in improving that understanding.

Rocket Software at BBBT: A Tale of Two Products

Last Friday’s BBBT presentation by an ensemble cast from Rocket Software was interesting, in both good and bad meanings of that word. They have some very interesting products that address the business intelligence (BI) industry, but they also have some confusion.

Bob Potter, SVP and GM, Business Intelligence, opened the presentation by pointing out that Rocket has more than $300 million (USD) in annual revenue yet many tech folks have never heard of them. One reason for the combination is they’ve done a good job in balancing both build and buy decisions to provide niche software solutions in a variety of places and on a number of platforms. Another is a strong mainframe focus. The third is that they don’t seem to know how to market. Let’s focus on just the two products presented to demonstrate all of these.

Rocket Data Virtualization

Most of the presentation was focused on Rocket Data Virtualization (DV). There are two issues it addressed. The first is accessing data from multiple sources without the need to first build a data warehouse. DV is the foundation of what was first thought of as the federated or virtual data warehouse. It’s useful. Gregg Willhoit, Managing Director, Research & Development, gave a good overview of DV and then delved into the product.

Rocket Data Virtualization is a mainframe-resident product to enhance data virtualization, running on IBM z. While this has the clear market limit of requiring a company large enough to have a mainframe, it’s important to consider this. There are still vast amounts of applications running on mainframes and it’s not just old-line COBOL. Mainframes run Unix, Linux and other OS partitions to leverage multiple applications.

An important point was brought up when Gregg was asked about access to the product. He said that Rocket is working with other BI industry partners, folks who provide visualization, so that they can access the virtualized data.

However, if you want to know more about the product, good luck. As I’ll discuss in more detail later, if you go to their site you’ll find all marcom fluff. It’s good marcom fluff, but diving deeper requires downloads or contacting sales people. That doesn’t help a complex enterprise sale.

Rocket Discover

The presentation was turned over to Doug Anderson, Solutions Engineer, for a look at their unreleased product Rocket Discover. It’s close, in beta, but it’s not yet out.

As the name implies, Rocket Discover is their version of a visualization tool. It’s a very good, basic tool that will compete well in the market except for two key things. The first is that they claimed Rocket is aiming at “high level executives” and that’s not the market. This is a product for business analysts. Second, while it has the full set of features that modern analysts will want, it’s based on a look and feel that’s at least a decade old.

On the very positive side, they do have a messaging feature built in to help with collaboration. It needs to grow, but this is a brand new product and they have seen where the market is going and are addressing it.

Another positive sign is this isn’t a mainframe product. It runs on servers (unspecified) and they’re starting with both on-premises and cloud options. This is a product that clearly is aimed at a wider market than they historically have addressed.

While they have understood the basics of the technology, the question is whether or not they understand the market. One teaser that shows that they probably don’t was brought up by another analyst who pointed out that Doug and others were often referring to the product as just Discover. Oracle has had a Discoverer product for many years. While Rocket might not have seen it on the mainframe, there will be some marketing issues if the company doesn’t always refer to the product as Rocket Discover, and they might have problems anyway. Their legal and marketing teams need to investigate quickly, before release.

Enterprise IT v Enterprise Software: Understanding the Difference

The product presentation, and a Q&A session that covered more issues with even more folks from Rocket taking part, show the problems Rocket will have. As pointed out, the main reason that so many people have never heard of Rocket is that it sells very technical solutions to enterprise IT. Those are direct sales to a very technical audience. However, enterprise software is more than enterprise IT.

Enterprise software such as ERP, CRM, SFA and, yes, BI, addresses business issues with technology. That means there will be a complex sales cycle involving people from different organizations, a cycle that’s longer and more involved than a pure sale to IT. I’m not sure that Rocket has yet internalized that knowledge. As mentioned above, their website is very fluffy, as if the thought is that the way you sell is to put up something pretty with mission and message only (though I argue against the current fad of multiple bands requiring scrolling; it’s neither pretty nor easy to use), then quickly get your techies talking directly to their techies. Perhaps that works when talking with techies only, but not in an enterprise sale.

That’s my biggest gripe about the software industry not understanding the need for product marketing. You must be able to build a bridge to both technical and business users with a mix of collateral and content that spans the gap. I’m not seeing that with Rocket.

In addition, consider the two products and the market. DV is very useful and there are multiple companies trying to provide the capability. While Rocket’s knowledge of and access to mainframe data is a clear advantage, the fact the product only runs on mainframes is a very limiting competitive message. I understand they have tied their horses very closely to IBM, and it makes sense to have a z option, but to not provide multiple platforms or a way for non-mainframe customers to use their more general concepts and technologies will retard growth.

If their plan is to provide what they know first then spread to other platforms, it’s a good strategy; but that wasn’t discussed.

Both products, though, have the same marketing issue. Rocket needs to show that it understands it is changing from selling almost exclusively to enterprise IT and needs to create a more integrated product marketing message to help sell to the enterprise.

There’s also the issue of how to balance the messages for the two products. For Rocket Data Virtualization to succeed, it really does need to work with the key BI vendors. Those companies will wonder about Rocket’s dedication to them while Rocket Discover exists. Providing a close relationship with those vendors will retard Rocket Discover’s growth. Pushing both products will be walking a tightrope and I haven’t seen any messaging that shows they know it.

Summary

Rocket is a company that is very strong on technology that helps enterprise IT. Both Rocket Data Virtualization and Rocket Discover have the basics in place for strong products. The piece missing is an understanding of how to message the wider enterprise market and even the mid- and small-size company markets.

Rocket Data Virtualization is the product that has the most immediate impact, with the clear differentiation of very powerful access to mainframe data, and it is the product I think should make the more rapid entry into its space. The question is whether or not they can spread platform support past the mainframe faster than other companies will realize the importance of mainframe data. In the short term, however, they have a great message if they can figure out how to push it.

Rocket Discover is a very good start for a visualization tool, but primarily on the technology side. They need to figure out how to jump forward in GUI and into predictive and other analytics to be truly successful going forward, but the market is young and they have time.

The biggest issue is whether Rocket will learn how to market and sell in broader enterprise and SMB sales, both to better address the multiple buyers in the sales cycle and to better communicate how both products interact in a complex marketplace.

Rocket is worth the look, they just need to learn how to provide the look to the full market.