Tag Archives: data warehouse

SiSense at the BBBT: High Performance BI at Low Cost?

Technology: Better integration of memory and Disk

The heart of their system is a patent pending technology that tightly integrates cpu cache, RAM and disk to better leverage all storage methods for higher performance. The opportunities that theory provides are enough that they’ve received $50 million (USD) in venture funding, $30 million in their latest round, earlier this year.

As they are a startup, it’s no surprise that the case studies given were for SMB or departments within enterprises. That’s the normal pattern, where a smaller group takes advantage of flexibility to try new products to solve focused problems. As their customer list includes companies such as Ebay, Wix, ESPN and Merck, companies with lots of data, those early entrants increase the potential if Sisense continues to perform.

Another key technology component is their columnar database. They created a proprietary one to be able to support their management technology. That’s completely understandable as their database isn’t purely on disk or memory, but in a combined mix that needs special database management.

The final key to their technology is that they worked to ensure the software runs on commodity chips from the X86 heritage. That means it runs on normal, affordable, off the shelf servers, not on high priced appliances. Sisense hardware price comparison

The combination of the speed and affordability of the technology is justification for the rounds of funding they’ve received.

Really full stack?

One fuzziness that I’ve mentioned with other full stack vendors is the ETL side of the process. The growth of Cloud companies such as Salesforce, and the accessibility of their APIs, means that you can get a lot of information out of systems aimed at SMB. However, true enterprise ETL means accessing a very wide variety of systems with much less easy or open APIs. When Mr. Bendov talked about multiple systems, it seems, from presentation and demo, that he’s talking about multiple instances of simple databases or open APIs, and not a breadth of source types. There wasn’t a lot of choice in the connection section of his application.

That’s not a problem for companies at Sisense’s state of maturity, as long as there’s a business plan to expand to more enterprise sources. They need to focus on proving the technology in the short term and having more heterogeneous access in their tool bag for the future.

Another issue is the question of what, exactly, their database is. Amit Bendov made a brief comment about not needed data warehouse, but as I and others quickly brought up, there are two problems with that statement. First, they would seem to be a data warehouse. They’re extracting information from source systems, transforming that information even if not into the old star-schema structures, and providing the aggregate information for analysis. Isn’t that a high level description of a warehouse? Second, as they’re young and focused on SMB or departments, as with other companies who serve visualization, they might need to look at customer demands and get access to corporate data warehouses as another source.

The old definition of a federated data warehouse seems to be evolving into today’s environment where sometimes an EDW is a source, other times a result and sometimes it’s made up of multiple accessible components such as Sisense and other databases. Younger companies who disparage EDWs need to be careful if they wish to address the enterprise market. The EDW is evolving, not dying off.

User interface and more

One of my first trips to Israel was, in part, when my boss and I had to bring a couple of UI specialists to show Mercury Interactive’s programmers why it might be nice to rethink application interfaces. It’s wonderful what twenty years have wrought. Amit Bendov says that Sisense has one UI specialist for every two programmers, and the user interface shows that. While I mentioned that they need broader ETL access, the simplicity of getting to sources is clear. While you still will need a business analyst to understand some column names, it’s a very easy to use interface.

The same is true in the visualization portions of their application. While it’s still a simpler tool, it has all the basics and is very clear to understand and use.

Paving the way for their spread into enterprise, the Sisense team also supports single-sign on, basic data access control, both in global administration and in the user interface, and other things that will be needed to convince a larger corporation to spread the technology.

Summary

Sisense looks like a startup in a great position. Their technology is well thought out and seems to be performing very well in the early stages. Affordable, fast, business intelligence is something nobody will turn down.

The challenge is two-fold:

Do they have the technology plans to help them address larger enterprise issues?
Do they have the mindset to understand the importance not only in marketing, but in changing the marketing to a more business focus?

This is the same refrain you’ve heard from me before and which you’ll hear again. This is the Chasm challenge. Their technology has a great start, but their web site and presentation show they aren’t yet thinking bigger and we’ll have to see what the future holds both for the technology and the messaging.

Business intelligence is a very visible market and one growing quickly. While small companies need to focus on the early adopters, they must very rapidly learn how to address the enterprise, both in products and marketing.

High performance BI at a reasonable cost is a great sell, but Sisense isn’t yet read for full enterprise. Sisense has a great start but life is fluid.

TDWI, Claudia Imhoff and SAP: Data Architecture Matters

Leave a reply

In a busy week for TDWI webinars, today’s presentation by Claudia Imhoff, Intelligent Solutions, and Lother Henkes, SAP, was about how the continuing discussion of the place in the data world for the data warehouse.

While many younger techies think the latest technology is a panacea and many older techies are far too skeptical for too long, the reality is that while the data warehouse is going nowhere, it has to integrate with the newer technologies to continue improving the information being provided to business knowledge workers.

One of Claudia’s early slides talked about data sources. While most people are focused on both the standard packaged software and the rush of non-structured data from the Web, call centers, etc, Claudia makes clear the item that companies are just beginning to realize and address: Sensor data is just as important as the rest and also driving data volumes. Business information continues to come from further afield and a wider variety of sources and all must be integrated.

Much of her talk, she mentioned, has come out of a couple of years of work between herself and Colin White, in formalizing the changing data architecture environment. Data warehouses are still the place for production reports and analytics, where data provenance and clarity are absolutely necessary while the techniques used on early stage data such as in streaming, Hadoop analytics, etc, are more exploratory and investigative. The duo posit that the combination of data integration, data management (including EDWs), data analysis and decision management are the “glue in the middle,” those things that bind sources, deployment and distribution technologies, and reporting and analytics options into a real system that provides value.

The picture they put together is good and Claudia Imhoff’s presentation should be looked at for a better understanding of where we are; but I wouldn’t be me if I didn’t have a couple of issues.

The first is a that she is a bit too enamored of mobile technology. It’s here and must be addressed, but statements such as “nobody has a desktop, everything is mobile” must be corrected. A JD Power survey last year showed that only 20% of tablets are used for work. On the other side, Forrester Research has pointed out a strong majority of business people are now using two devices for their information.

The issue for business intelligence is not that people are switching from desktops (including laptops in docking stations) but that smart providers of information need to build UIs that address the needs of large monitors, tablets and smartphones, addressing each device’s uniqueness while ensuring a similarity of user experience.

The second issue is a new term thrown out during the presentation. It’s “data refinery” and, as Claudia mentioned in her presentation, it’s the same thing others are calling a data swamp, data lake or numerous other terms. There’s an easy term everyone has used for years: Operational Data Store (ODS). I’m a marketing guy and I understand the urge for everyone to try to coin a term that will catch on, but it’s not needed in this case.

While it’s a separate topic (yeah, another concept for a column!), I’ll briefly point out my objections here. Even back in the late 1990s, during my brief sojourn at Informatica, we were talking about how the ODS can be used for more than only a place to use in order to quickly extract information from operational system so as not to stress them by doing transformations directly from such systems. They’ve always been a place to take an initial look at data before beginning transformations into star schemas and the like. The ODS hasn’t changed. What’s changed is the underlying technologies that support larger data stores and the higher level analytics that let us better analyze what’s in the ODS.

That brings us to one main point Claudia Imhoff made during her wrap-up, the section on business considerations. She points out that people really need to understand the importance of each data source and the data within it. Just because we can extract everything doesn’t mean we need to save everything. Her example was with customer sampling. Yes, you can get all the customer data, but only that which you need to narrow cast. For higher level decision making, those who understand confidence levels know that sampling can get to very high levels of certainty so sampling can still speed decision making and save costs. Disk space might be less expensive in the Cloud, but it’s not free. We’re in the job of helping businesses improve themselves, so we need to look at the bigger picture.

Her presentation was clearly strategic: We need to rethink, not reinvent, data modeling. Traditional techniques aren’t going away and neither are many of the new ones. Data management people need to understand how they combine.

No surprise, that was a great transition to Lother Henkes’ presentation. His key point is that SAP BW now can run on SAP HANA. It’s important even if all the capital letters look like shouting. HANA is SAP’s in memory, columnar database that’s their entry into the Cloud market to manage the high volumes of modern data. It’s a move to bridge the gap between the ODS and relational database arenas with one underlying infrastructure.

In such a brief webinar, it’s hard to see more than the theory, but it’s a clear move by SAP to do what Claudia Imhoff suggested, to take a fresh look at data models in order to understand how to better support the full range of data now being incorporated into business decision making.

TDWI: Evolving Data Warehouse Architectures in the Age of Big Data Analytics

Teich Communications

B2B Marketing, Analysis, Journalism