TDWI Webinar — Engaging the Business, again from the technologist’s perspective

This week’s TDWI hosted webinar was about engaging business and, once again, it came from the standpoint of technologists rather than from business. There were some very good things said. However, until our industry stops thinking of business knowledge workers as children to be tutored and begins to think about them as people whose knowledge is the core of what we must encapsulate, we’ll continue to miss the mark and adoption of solutions will remain slow.

The main presenter was David Loshin, President of Knowledge Integrity. He began the presentation with a slide that describes his view of the definition of “data driven,” including three main points:

  • Focus on turning data into actionable knowledge that can lead to increased corporate value.
  • Aware of variance that can cause inconsistent interpretation.
  • Coordination among data consumers to enforce standards for utilization.

We should all clearly understand that the first item is not new and was not created by the business intelligence (BI) industry. Business has always been data driven. What we’re able to do now is access far more data than ever before so that we can provide a more robust view of the corporation.

Inconsistent Data v Inconsistent Utilization

The second bullet is a core point. Mr. Loshin used a couple of examples, such as sales territory, where definitions are fuzzy. One clear example for me is something I experienced directly 25 years ago, and it speaks more to the visualization side of the BI conundrum. I was working for a major systems integrator (SI) and my client was, well, let’s just say it was a large, fruit-based computing company.

A different SI had created an inventory system for the client’s manufacturing facility, but the system was a failure even though all the right data was in it. The problem was that the reports were great for the accounting department but not for inventory and manufacturing. We interviewed the inventory team and then rewrote the reports to present the information from their standpoint.

Too often, technologists get lost in the detailed data definitions and matching fields across data sources. That is critical, but it loses the big picture. Even when data is matched, different business people use data differently.

Which brings us to David Loshin’s third point. No, we don’t need to enforce exact standards for utilization. We need to ensure that the data each consumer refers to is consistent, but we must do a better job in understanding that different departments can utilize the exact same data in a variety of ways.
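To make that distinction concrete, here is a minimal Python sketch of one consistent set of inventory records feeding two different departmental views. Everything in it is hypothetical and invented for illustration (the `StockRecord` fields, the sample parts and the two view functions); it is not taken from the webinar or from the system described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StockRecord:
    """One shared, consistent record (hypothetical fields)."""
    part: str
    on_hand: int
    reorder_point: int
    unit_cost_cents: int  # integer cents to avoid float rounding


# Illustrative sample data only.
RECORDS = [
    StockRecord("bracket", on_hand=40, reorder_point=100, unit_cost_cents=50),
    StockRecord("fan", on_hand=500, reorder_point=200, unit_cost_cents=200),
    StockRecord("screw", on_hand=10_000, reorder_point=2_000, unit_cost_cents=1),
]


def accounting_view(records):
    """Accounting cares about the total value of stock, in cents."""
    return sum(r.on_hand * r.unit_cost_cents for r in records)


def inventory_view(records):
    """Inventory cares about which parts need reordering, and the shortfall."""
    return [
        (r.part, r.reorder_point - r.on_hand)
        for r in records
        if r.on_hand <= r.reorder_point
    ]


print(accounting_view(RECORDS))
print(inventory_view(RECORDS))
```

Both functions read the same records, so the data stays consistent, yet each department gets the answer to its own question. That is the kind of varied utilization that doesn’t need to be standardized away.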

Business Drivers and Data Governance

David did get to the key issue a bit later, on a slide titled Operationalizing Business Policies. He points out that it’s critical to ensure that “Information policies model the data requirements for business policy.” This is key and should be bubbled up higher in the mindset of our industry. While I hear it mentioned often, it seems to be honored more in the breach.

Time was spent discussing the importance of understanding different users and their varying utilization of data. As I mentioned in the introduction, the solution to the new complexities then veers from addressing business needs to ignoring history. In a previous blog post, I discussed how many in the industry seem to be ignoring the lessons to be learned from the advent of the PC. Mr. Loshin seems to be doing that when he talks about empowering the business users to set their own usability rules. He splits IT and business in the following way:

  • Business data consumers are accountable for the rules asserting usability for their views of the data.
  • IT becomes responsible for managing the infrastructure that empowers the business user.

The issue I have with that argument is a phrase that didn’t appear in this webinar until Linda Briggs, the moderator, mentioned it in a poll question right before Q&A: Data Governance. Corporations are increasingly liable for how they control and manage information. It does not make sense to allow each user to define their own data needs in a void. Rather than allow for massively expanded and relatively uncontrolled access to data and then later have to contract access, as corporations had to regain a handle on what was being done on scattered desktop computers, BI vendors should be positioning data governance from the start.

Whether it’s by executive fiat, a cross-functional team, or some other method, companies need to clarify data governance rules. Often, IT is the best intermediary between groups, actively participating in data governance definition as an impartial observer and facilitator. It is then the job of IT to ensure that it provides as open access as possible to business workers given their needs and the necessity of following governance rules.

There was one question, during Q&A, on the importance of data governance. I thought David Loshin again understated its importance, while Harald Smith, Director of Product Management at Trillium, the webinar sponsor, commented that “everyone is responsible for data governance.” That is my only mention of the sponsor, as I felt his portion of the presentation was a recitation of sound bites, talking points and buzzwords that didn’t provide any value to the hour.

Summary

David Loshin has a clear view of engaging the business and gets a number of key things correct. However, that view is one of a technologist looking over a self-imagined bridge separating technology and business. There’s not a bridge separating IT and business. They overlap in many critical areas and both must learn from and work well with each other.

TDWI Webinar Review: David Loshin & Liaison on Data Integration

The most recent TDWI webinar had a guest analyst, David Loshin of Knowledge Integrity. The presentation was sponsored by Liaison and that company’s speaker was Manish Gupta. Given that Liaison is a cloud provider of data integration, it’s no surprise that was the topic.

David Loshin gave a good overview of the basics of data integration as he talked about the growth of data volumes and the time required to manage that flow. He described three main areas to focus upon to get a handle on modern integration issues:

  • Data Curation
  • Data Orchestration
  • Data Monitoring

Data curation is the organization and management of data. While David accurately described the necessity of organizing information for presentation, the one thing in curation that wasn’t touched upon was archiving: the ability to preserve a history of information and make it available for later needs. That’s something the rush to manage data streams is forgetting. Both are important and the latter isn’t replacing the former.

The most important part of the orchestration Mr. Loshin described was in aligning information for business requirements. How do you ensure the disparate data sources are gathered appropriately to gain actionable insight? That was also addressed in Q&A, when a question asked why there was a need to bother merging the two distinct domains of data integration and data management. David quickly pointed out that there was no way not to handle both as they weren’t really separate domains. Managing data streams, he pointed out, was the great example of how the two concepts must overlap.

Data monitoring has to do with both data in motion, as in identifying real-time exceptions that need handling, and data for compliance, information that’s often more static for regulatory reporting.

The presentation then switched to Manish Gupta, who proceeded to give the standard vendor introduction. It’s necessary, but I felt his was a little too high level for a broader TDWI audience. It’s a good introduction to Liaison, but following Mr. Loshin there should have been more detail on how Liaison addresses the points brought up in the first half of the presentation. In a sales presentation, a team would lead with Mr. Gupta’s information, then the salesperson would discuss the products in more detail.

Both presenters had good things to say, but they didn’t mesh enough, in my view, and you can find out far more talking to each individually or reading their available materials.

Webinar Review: Big Data addressed poorly

I’ve been in the computing business for almost thirty-five years, but until this year it was always working for vendors or systems integrators. As a newly minted analyst, I’ve stayed away from very negative reviews. I’ve watched a few bad webinars recently and made the choice not to blog about them. However, as I’ve seen more and more, I’ve realized that doesn’t help the industry and I can’t remain silent.

On Tuesday, I watched a webinar by David Loshin, President of Knowledge Integrity, and Ramesh Menon from Cray. It was not pretty.

Let’s take, for instance, David Loshin’s five points for big data:

  • Plan for scalability
  • Go heavy on Memory
  • Free your applications from the Hadoop 1.0 Execution Model
  • Real-time ingestion and integration
  • Feed the SQL need

The first item has been around since client/server applications first came to the fore. Big data has grown, in part, because of its ability to scale to large volumes of data. This is nothing new.

Memory? It was a great point years ago, with Tableau and others having pushed it for quite a while. However, over the last year or two we’ve been hitting the limits of pure memory solutions and I’ve seen a number of presentations from vendors focused on better integrating memory and disk depending on data latency needs. David’s statement that “We will start seeing more applications using memory rather than disk” is wrong. We’ll see more applications better leveraging memory, but disk isn’t going anywhere.

The Apache Hadoop project’s release of Hadoop v2, with YARN, is a clear indication of the limitations of 1.0, which people have been talking about for years. However, in the presentation, leading with 2.0 would have been better than lagging behind by rehashing the known issues with 1.0. Either people use Hadoop and already know the issues, or they haven’t yet used it and will start with 2.0.

Real-time ingestion isn’t the critical issue, and I would have liked to see him focus more on the second half of the fourth bullet. Real-time extraction of information is moving much more rapidly than the ability to integrate it with the rest of corporate information and to provide analytics on that information.

David’s final point is the only timely one. People have recently begun to remember that evolution is easier than revolution and I’ve seen a number of vendors begin to focus on providing access to the new data sources via SQL. A lot more people providing business insight to corporations know SQL and that needs to be made available. Ramesh Menon said it better, but the point is here.

The biggest problem I had was with Loshin’s forward-looking statement. I’ll almost ignore that nonsense about the data lake, since he’s not the only one busy trying to use a new, supposedly fancier, term for the ODS, but I’ll mention it anyway. The issue was that he claimed he saw data management moving away from the data lake as we move to in-memory. Really? The ODS isn’t going anywhere. It’s nonsense to think that every bit of corporate information needs to reside in memory, just in case it might be needed. The ODS is becoming the central source of all operational and business data. Individual business intelligence tools and needs will drive in-memory usage for the near real-time needs of specific departmental, divisional or corporate level analytics, but there will always be a non-memory source for all that information in order to provide consistency, appropriate levels of control and to handle data governance issues.

Now we turn to Ramesh Menon. His presentation was better than David Loshin’s, but not by much. I’m sorry, there’s no excuse for someone who puts himself forward as a voice of the industry to not understand the difference between a premise and a premises. Considering he used premise correctly later in his presentation, it was terrible that he misused it three times before that while describing “on-premise” computing. Everyone in our industry needs to sit down, focus and practice saying the right word.

His customer use case was a very jumbled story and an overcrowded slide with the main point being “the customer had a lot of data.” I wouldn’t have guessed. He needs to talk more about solutions and how Cray addresses the data.

As mentioned above, Ramesh had a very clear point about the difference between data scientists and business analysts being one reason that Hadoop 2.0 is important. The move from batch to lower latency access is part of the difference between a data scientist, someone wanting to be the priest at the temple, and a business analyst, a much larger group working to provide wider access to business information. Updating Hadoop is critical to the ability to keep it relevant.

That was a key point. The problem is that Ramesh isn’t the analyst, he’s Cray’s spokesperson. The discussion shouldn’t have been about generalities but about how Cray must have focused on Hadoop 2.0 for the Urika-XA appliance, and that wasn’t made clear. It was in the data sheet images plopped into the presentation, but reasons and results should have been openly discussed.

I’ll end with the one very interesting point from Mr. Menon’s presentation. He had a slide where he discussed four phases in the analytics pipeline: ETL, algorithms, analysis and visualization. His point is that there are very different resource requirements for each phase of the pipeline. This could be an entire presentation itself, and Ramesh could focus on and expand this, explaining how Cray helps to address part or all of those requirements to help present Cray to the industry.

Summary

The analysis got a couple of things right, but was mostly too late or wrong. The corporate presentation didn’t clearly link Cray to the issues involved. Both presentation halves were far too generic and descriptive, with almost no prescriptive takeaways. Furthermore, you could tell from both the content and the presentation styles that neither presenter had put much time and effort into the webinar.

People need to learn that “there’s no such thing as bad press” is only something said by entertainers. It’s not enough to have a webinar to get your name out there. Lots and lots of companies are doing that. Thought needs to go into the presentation and practice needs to go into delivery.

There were some good tidbits in the presentation, but overall it was a mess. I was very disappointed in the hour that I lost.