Webinar Review: Big Data addressed poorly

I’ve been in the computing business for almost thirty-five years, but until this year I was always working for vendors or systems integrators. As a newly minted analyst, I’ve stayed away from very negative reviews. I’ve watched a few bad webinars recently and made the choice not to blog about them. However, as I’ve seen more and more, I’ve realized that silence doesn’t help the industry and I can’t remain silent.

On Tuesday, I watched a webinar by David Loshin, President of Knowledge Integrity, and Ramesh Menon from Cray. It was not pretty.

Let’s take, for instance, David Loshin’s five points for big data:

  • Plan for scalability
  • Go heavy on Memory
  • Free your applications from the Hadoop 1.0 Execution Model
  • Real-time ingestion and integration
  • Feed the SQL need

The first item has been around since client/server applications first came to the fore. Big data has grown, in part, because of its ability to scale to large volumes of data. This is nothing new.

Memory? It was a great point years ago, with Tableau and others having pushed it for quite a while. However, over the last year or two we’ve been hitting the limits of pure memory solutions, and I’ve seen a number of presentations from vendors focused on better integrating memory and disk depending on data latency needs. David’s statement that “We will start seeing more applications using memory rather than disk” is wrong. We’ll see more applications better leveraging memory, but disk isn’t going anywhere.

The Apache Hadoop project’s release of Hadoop 2.0, with YARN, is a clear indication of the limitations of 1.0, which people have been talking about for years. However, in the presentation, leading with 2.0 would have been better than again lagging behind on the known issues with 1.0. Either people use Hadoop and already know the issues, or they haven’t yet used it and will start with 2.0.

Real-time ingestion isn’t the critical issue, and I would have liked to see him focus more on the second half of the fourth bullet. Real-time extraction of information is moving much more rapidly than the ability to integrate it with the rest of corporate information and to provide analytics on that information.

David’s final point is the only timely one. People have recently begun to remember that evolution is easier than revolution and I’ve seen a number of vendors begin to focus on providing access to the new data sources via SQL. A lot more people providing business insight to corporations know SQL and that needs to be made available. Ramesh Menon said it better, but the point is here.

The biggest problem I had was with Loshin’s forward-looking statement. I’ll almost ignore that nonsense about the data lake. He’s not the only one busy trying to use a new, supposedly fancier, term for the ODS, but I’ll mention it anyway. The issue was that he claimed he saw data management moving away from the data lake as we move to in-memory. Really? The ODS isn’t going anywhere. It’s nonsense to think that every bit of corporate information needs to reside in memory, just in case it might be needed. The ODS is becoming the central source of all operational and business data. Individual business intelligence tools and needs will drive in-memory usage for the near real-time needs of specific departmental, divisional or corporate level analytics, but there will always be a non-memory source for all that information in order to provide consistency and appropriate levels of control, and to handle data governance issues.

Now we turn to Ramesh Menon. His presentation was better than David Loshin’s, but not by much. I’m sorry, there’s no excuse for someone who puts himself forward as a voice of the industry to not understand the difference between a premise and a premises. Considering he used premise correctly in his presentation, it was terrible that it was used three times before that while describing “on-premise” computing. Everyone in our industry needs to sit down, focus and practice saying the right word.

His customer use case was a very jumbled story and an overcrowded slide with the main point being “the customer had a lot of data.” I wouldn’t have guessed. He needs to talk more about solutions and how Cray addresses the data.

As mentioned above, Ramesh had a very clear point about the difference between data scientists and business analysts being one reason that Hadoop 2.0 is important. The move from batch to lower latency access is part of the difference between a data scientist, someone wanting to be the priest at the temple, and a business analyst, a much larger group working to provide wider access to business information. Updating Hadoop is critical to the ability to keep it relevant.

That was a key point. The problem is that Ramesh isn’t the analyst, he’s Cray’s spokesperson. The discussion shouldn’t have been about generalities but about how Cray must have focused on Hadoop 2.0 for the Urika-XA appliance – but that wasn’t made clear. It was in the data sheet images plopped into the presentation, but reasons and results should have been openly discussed.

I’ll end with the one very interesting point from Mr. Menon’s presentation. He had a slide where he discussed four phases in the analytics pipeline: ETL, algorithms, analysis and visualization. His point was that there are very different resource requirements for each phase of the pipeline. This could be an entire presentation in itself, and Ramesh could focus on and expand it, explaining how Cray helps to address part or all of those requirements to help present Cray to the industry.

Summary

The analysis got a couple of things right, but was mostly too late or wrong. The corporate presentation didn’t clearly link Cray to the issues involved. Both presentation halves were far too generic and descriptive, with almost no prescriptive takeaways. Furthermore, you could tell from both the content and the presentation styles that neither presenter had put much time and effort into the webinar.

People need to learn that “there’s no such thing as bad press” is only something said by entertainers. It’s not enough to have a webinar to get your name out there. Lots and lots of companies are doing that. Thought needs to go into the presentation and practice needs to go into delivery.

There were some good tidbits in the presentation, but overall it was a mess. I was very disappointed in the hour that I lost.