I’ve published another article in my Management AI series that’s a subset of my forbes.com column. This one is a high-level introduction to natural language processing (NLP) and natural language generation (NLG).
Data discovery, as we defined it in the 1990s, has changed to take advantage of modern algorithms. It has moved past basic data analysis and is better at finding relationships between different pieces of information. I discuss this in detail in my forbes.com article.
My latest article in Search Data Management points out the importance of data quality in training machine learning systems.
My latest article on Forbes describes how different components of AI are beginning to work together to improve manufacturing.
A short overview of the subject on Forbes.
My Forbes article about how I think the interesting research is a step towards passing the Turing Test.
I’d been thinking of writing a column on p-values, since the claim that data “scientists” can provide valuable predictive analytics is a regular feature of the business intelligence (BI) industry. However, my heavy statistics work is years in my past. Luckily, there’s a great Vox article on p-values and how some scientists are openly stating that p < .05 isn’t stringent enough.
It’s a great introduction. Check it out.
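For readers who want a feel for what the threshold means, here is a minimal sketch of my own, not something taken from the Vox article. It simulates experiments where there is no real effect at all and counts how often a simple two-sided z-test still comes back “significant.” The function names, the sample size, the number of trials and the stricter 0.005 threshold are all illustrative assumptions.

```python
import math
import random


def two_sided_p_value(sample, null_mean=0.0, sigma=1.0):
    """Two-sided z-test p-value for the sample mean, assuming sigma is known."""
    n = len(sample)
    z = (sum(sample) / n - null_mean) / (sigma / math.sqrt(n))
    # For a standard normal, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(threshold, trials=20000, n=30, seed=1):
    """Share of no-effect experiments that still fall below the threshold."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Every sample is drawn from the null: mean 0, standard deviation 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sided_p_value(sample) < threshold:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for threshold in (0.05, 0.005):
        print(threshold, false_positive_rate(threshold))
```

When nothing is going on, the share of “significant” results should land close to the threshold itself. That is the point to keep in mind: at .05, roughly one in twenty no-effect experiments will still look like a finding.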
I’ve seen a few company webinars recently. As I have serious problems with their marketing, but don’t wish that to imply a problem with their technology, this post will discuss the issues while leaving the companies anonymous.
What matters is letting business decision makers separate the hype from what they really need to look at when investigating products. I’m in marketing and would never deny its importance, but there’s a fine line between good marketing and misrepresentation, and that line is both subjective and fuzzy.
As the title suggests, I’ll discuss the line by describing my views of two buzzwords in business intelligence (BI). The first has been used for years, and I’ve talked about it before: the concept of self-service BI. The second is the fairly new and rapidly increasing use of the word “machine” in marketing.
Self-Service Still Isn’t
As I discussed in more detail in a TechTarget article, BI vendors regularly claim self-service when their software isn’t. While advances in technology and user interface design are rapidly approaching full self-service for business analysts, the term is usually directed at business users. For them, the claim just isn’t true.
I’ve seen a couple of recent presentations that have that message strewn throughout the webinars, but the demonstrations show anything but that capability. The linking of data still requires more expertise than the typical business user has. Even worse, some vendors limit things further. The analysts still create basic reports and templates, within which business people can wander with a bit of freedom. Though self-service is claimed, I don’t consider that to approach self-service.
The result is that some companies provide a limited self-service within the specified data set, a self-service that strongly limits discovery.
As mentioned, the fact that self-service is either misunderstood or overpromised doesn’t change the fact that the technology still allows customers to gain far more insight than they could even five years ago. The key is to take the promises with a grain of salt.
When you see it, ignore the phrase “self-service.”
Prospective BI buyers need to focus on whether or not the current state of the art presents enough advantages over existing corporate methodologies to provide proper ROI. That means you should evaluate vendors based on specific improvements to your existing analytics, and you should rigorously test the products against your own needs and your team’s expertise.
Machine learning, to be discussed shortly, has exploded in usage throughout the software industry. What I recently saw, from one BI vendor, was a fun little marketing ploy to leverage that without lying. That combination is the heart of marketing and, IMO, differs from the nonsense about self-service.
Throughout the webinar, the presenter referred to the platform as “the machine.” Well, true. Babbage’s machines were analytic engines, the precursors to our computers, so complex software can reasonably be viewed as a machine. The usage brings to mind the concept of machine learning without actually claiming it.
That’s the difference: “self-service” states something the products aren’t, while “machine” might vaguely bring to mind machine learning but does not directly imply it. I am both amused and impressed by that usage. Bravo!
Machine Learning and Natural Language Processing
This topic needs a larger article, one I’m working on, but I would be remiss to not mention it here. The two previous sections do imply how machine learning could solve the self-service problem.
First, what’s machine learning? No, it’s not complex analytics. Expert systems (ES) are a segment of artificial intelligence focused on machines which can learn new things. Current analytics can use very complex algorithms, but they just drive user insight rather than provide their own.
Machine learning is the ability for the program to learn new things and to even add code that changes algorithms and data as it learns. A question to an expert system has one answer the first time, and a different answer as it learns from the mistakes in the first response.
Natural Language Processing (NLP) is more obvious. It’s the evolving understanding of how we speak, type and communicate using language. The advances have meant an improved ability for software to respond to people without their having to click through lots of parameters to set up a search. The goal is to allow people to type or speak queries and for the ES to then respond with information at the business level.
The hope I have is that the blend will allow IT to set up systems that can learn the data structures in a company and basic queries that might be asked. That will then allow business users to ask questions in a non-technical manner and receive information in return.
Today, business analysts have to directly set up dashboards, templates and other tools that are used by the business, often requiring too much technical knowledge. When a business person has a new idea, it has to go back through a slow cycle where the analyst has to hook in more data, add new templates and more.
When the business analyst can focus on teaching the ES where data is, what data is and the basics of business analysis, the ES can focus on providing a more adaptable and non-technical interface to the business community.
Machine learning, i.e. expert systems, and NLP are what will lead to truly self-service business applications. They’re not here yet, but they are on the horizon.
The new books section of my library had a text I almost didn’t check out. Unfortunately, I did. It’s “The Content Trap” by Bharat Anand, and it’s another great example of what academics miss about the real world. The book, from the fly leaf and introduction, presents itself as attempting to say that social networks are important and content isn’t. While the recent presidential election might imply that’s true, the author is supposedly knowledgeable about business and is focused on helping management strategy.
The problem is that I didn’t get twenty pages into the book before Mr. Anand displayed his complete misunderstanding of the business of technology. His chapter three is about “networks” and the first example purports to explain why Apple lost to Microsoft in the 1980s. He provides some semantically nil blather about “direct network effects” and “indirect network effects,” while assiduously avoiding what happened.
There are a number of reasons for Apple’s failure to get a significant market share at that time, among which are:
- Jobs and Wozniak ran a perfectionist organization while Gates and Allen quickly got “good enough” products to the market.
- Microsoft’s founders understood what IBM’s off-the-shelf production meant for rapidly entering a market while Apple wanted complete control of hardware, software and networking.
- Apple went for high-end price and élan rather than the factors that attract a business market quickly looking to move many things off of the mainframe and onto a manager’s desk.
- While Microsoft quickly adapted to larger screens, more functional mice, and other newer technology easier for business users, Apple stuck with the Mac’s small screen, one-button mouse, and other limitations for far too long.
While the author talks about “network effects,” he doesn’t seem to show any understanding of the key products that provided that for Microsoft: the elements that became Microsoft Office, in particular the spreadsheet. To talk about networks at a high, completely theoretical, level while claiming to give a case study does nothing to display an understanding of the issues involved.
That brings us back to Mr. Anand’s primary, fallacious, point. The PC didn’t create a network. Mainframe reports were already provided to a network. His page 13 graphic about the hub and spoke versus multiple connection network has a simplistic accuracy but again misses the point. In the traditional method, most content was centralized. What he misses is that the change wasn’t just users talking to each other around central content, as he presents it. What changed is that each user now has his or her own content that needs to be integrated.
The spreadsheet, and so many things since then, allowed individual managers to create their own content and then share it, faster than they previously could do the same. That led to a speed-up of business reactions.
However, it also led to multiple versions of content and the question of “versions of truth” that those of us in business intelligence address daily. We understand the power of networks, but also understand that without content and control over it, business will have serious problems.
Content and networks can be seen as two halves of a coin. However, as the Apple example shows, they’re really two faces of a die, with many other factors that also matter. Bharat Anand doesn’t seem to comprehend that, but seems instead to be quickly taking advantage of a market condition to abuse a network without content. It’s clear that, if you’re only interested in making money, networks will help. For example, an impressive academic title might get a lot of libraries to buy your book. However, to be truly of value, there must be content. The Content Trap lacks content. The author has made money and he’s added another line to his CV, but he’s added nothing of value to the ecosystem.
Nobody in business should pay attention to this book.