
TDWI Best Practices Report on Hadoop: A good report for IT, not executives

The latest TDWI Best Practices Report is concerned with Hadoop. Philip Russom is the author and the article is worth a read. However, it has the usual issue I’ve seen with many TDWI reports: very strong on numbers but missing the real business point. In journalism, there’s an expression called burying the lede, hiding the most important part of a story down in the middle. Mr. Russom gets his analysis correct, but I think the priorities and the focus need work. It’s a great report to use as a source by IT; it’s not a report for executives.

Why am I cranky? The report starts with an Executive Summary. The problem is that it isn’t aimed at executives but is something that lets technical folks think they’re doing well. It doesn’t tell executives why they should care. What are the business benefits? What are the risks? Those things are missing.

First, let’s deal with the humorous marketing number. The report mentions the supposedly astounding figure that “Hadoop clusters in production are up 60% in two years.” That’s part of the executive summary. You have to slide down into the body to understand that only 16% of respondents said they have HDFS in production. It’s easy for early adopters to grow a small percentage to a slightly larger small percentage; it’s much tougher to get a larger slice of the pie.
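To put the two numbers side by side, here is a minimal sketch of the arithmetic. It assumes, and the quoted report text does not confirm this, that the 60% figure is relative growth in the share of respondents with clusters in production. The function name is purely illustrative.

```python
from fractions import Fraction

def share_before_growth(current_share_pct, growth_pct):
    """Return the earlier share (in percent) implied by a relative growth rate.

    Assumption: growth_pct is relative growth of the share itself,
    not growth measured in percentage points.
    """
    return Fraction(current_share_pct) / (1 + Fraction(growth_pct, 100))

# 16% of respondents in production today, "up 60%" over two years
earlier = share_before_growth(16, 60)
print(f"Implied share two years ago: {float(earlier):.0f}%")
print(f"Gain in percentage points: {float(16 - earlier):.0f}")
```

Under that assumption, the headline number describes a move from about 10% to 16% of respondents, a gain of six percentage points.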

Philip Russom accurately deals with why it will take a bit for Hadoop to grow larger, but he does it past the halfway point of the article. Two things: Security and SQL.

Executives are concerned that technology helps business. Security ensures that intellectual property remains within the firm. It also ensures that litigation is minimized by not having breaches that could be outside regulatory and contractual requirements. Mr. Russom accurately discusses the security risks with Hadoop, but that begins down on page 18 and doesn’t bubble up into the executive summary.

So too is the issue of SQL. After writing about the problems in staffing Hadoop, the author gives a brief but accurate mention of the need to link Hadoop into the rest of a business’ information infrastructure. It is happening, as a sidebar comment points out with “Hadoop is progressively integrated into complex multi-platform environments.” However, that progress needs to speed up for executives to see the analytics from Hadoop data integrated into the big picture the CxO suite demands.

The report gives IT a great picture of where Hadoop is right now. As expected from a technical organization, it weighs the need, influence and future of the mystical data scientist too highly, but the generalities are there to help mid-level management understand where Hadoop is today.

However, I’ve seen multiple generations of technology come in, and Hadoop is still at an early adopter phase where too many proponents are too technical to understand what executives need. It’s important to understand risks and rewards, not a technical snapshot; and the latter is what the report is.

IT should read this report as valuable insight to what the market is doing. It’s, obviously, my personal bias, but the summary is just that, a summary. It’s not for executives. It’s something that each IT manager will use for its good resources to build their own messages to their executives.

JInfonet at the BBBT: OEM or Direct, a Decision is Necessary

Let’s cut to the chase, this is another company with a very good product and no idea how to message. Unless they quickly figure out and communicate the right message, they’ll need to get ready for acquisition as an exit strategy.

Jinfonet is a company founded, it seems, to clone Crystal Reports in Java. Hence the awkward name. JReport, their product, is full featured and we’ll get to that, but the legacy name using report will leave them behind if that remains their focus.

The presentation was primarily by Dean Yao, Director of Marketing, with demo support brought by the able Leo Zhao, Senior Systems Consultant. However, the presentation indicated the message problem.

Reports? What Reports?

The name of the product is JReport, but at no time in the three hours did a report make an appearance. They showed two different analyst charts, Nucleus Research and EMA, of the business intelligence (BI) industry to show where they were placed. BI. Yet when asked about competition, Dean Yao repeatedly mentioned they didn’t compete against BI vendors but focused on reports.

Their own presentation begs to differ:

JReport solution areas

Notice that reports are a secondary feature of one focus.

What’s also good and bad is that Leo Zhao’s demonstrations showed a very richly featured product that does compete against the other vendors. The only major hole wasn’t in functionality, it’s that the rich set of visualizations weren’t as pretty as most of the competition. That is in part because they are self-funded with more limited resources and partly because they’re great techies who haven’t prioritized visualizations as they should.

OEM or Direct?

OEM, in JInfonet’s business model, doesn’t only mean the product embedded in third party applications. Mr. Yao discussed how JReport is also regularly embedded in departmental IT applications. That is different than when companies use JReport as a standalone product.

Dean talked about how 30% of their business in recent years was direct, with the rest being OEM. At the same time, he mentioned that last year was around 50/50. That’s not a problem. What is an issue is that they don’t know why it was. Did sales focus on direct? Was one major direct client a large revenue outlier which skewed the results? They don’t seem to know.

That matters because the OEM and direct models are very different. With OEM, you let the other company deal with business messages. All you’re doing is presenting to them a good technical story and cost point compared to simpler products, a tiny segment of competition or doing nothing and losing out to their competitors.

Enterprise sales, on the other hand, require a focus on the end user, the folks using the products and the business issues they have. That is what’s missing from the presentation, their web site and the few pieces of collateral I reviewed.

One thing should also be said about the OEM to departments model. The cloud is changing the build v buy balance for many departments for the applications in which JReport is embedded, so I’m not sure how much longer this model will be of significant revenue.

Mr. Yao said they don’t do enterprise sales, but just sell to SMB and enterprise departments, so that means they’re not really competing against other BI vendors. A lot of the analysts on the call quickly jumped on that, pointing out that even one of the largest companies openly talks about its strategy of land and expand. “Just land” is not a long term strategy.

What’s that mean?

Right now the enterprise market is very fragmented, so there’s a space for a small company, but that won’t last long. Crystal Reports had a long run based on the technologies of the day, but it no longer is independent. Today, things are changing far more rapidly. The cloud is allowing BI firms to address small to global companies with similar products and the major players (and most smaller ones) are focused on that full business market.

Given the current product, JInfonet can go one of two ways. They can decide to completely focus on OEM, keep a technical message and just sell enterprise as it happens.

The other option, one I openly prefer, is that they realize that they have a very good product that does compete in the direct model and that they need to focus more on messaging. They can still provide to OEM, but that’s easier – it’s a subset of the full featured message.

The solution, though, resides in the folks who weren’t in Boulder: The founders. The company has been self-funded since 1998 and the founders are used to their control. I’ve seen companies fail because owners were unwilling to see that times have changed. They mistakenly think that pivoting markets says they did something wrong in the past, so they’re hesitant. It doesn’t say that, but only that the people have enough confidence to adapt to a new market with the same energy and intellect with which they addressed the original market.

JInfonet has great potential, but it will require a strong rethink and clarification of who they are in order to convert that to kinetic. From what I’ve seen of the product and two people, I hope they succeed.

AptiMap at BBBT: Improving Data Mapping

Today the BBBT held a special session. While most presentations are by companies with full products, existing sales and who typically have been around for a few years, today we had the pleasure of listening to Sherry Brown, President of AptiMap. This is a pure startup company, still tiny. She was looking for our always vocal analyst community’s opinions on her initial aim and direction. Not to surprise anyone who knows the BBBT, we gave that at full bore.

Ms. Brown’s goal is to provide a far easier way of mapping fields between source and target datasets for creating data warehouses and other data stores. It’s a great start and she has some initial features that will help. I’ll be blunt: I’m intentionally not going to say a lot. As mentioned, they are a very early startup and the software isn’t full fledged. That means any mention of what they have and don’t have could be inaccurate by next week. That’s not a bad thing, it’s what happens at that phase.

I will mention that the product is cloud based from the start.

The important question about whether or not to contact AptiMap is who you are and what you need. Most of the feedback to Sherry was about that. It was helping to focus the message. If I have correctly understood the consensus of the attendees, here are the critical things to focus upon while defining a market for the initial product:

  • Aimed at IT and business analysts
  • Folks currently using modeling tools or spreadsheets at a start
  • Focus on standard, enterprise data sources, from spreadsheets to RDBMS’s, Hadoop can wait
  • Mid-sized companies integrating their first sets of systems or trying to get a handle on their existing data
  • Might especially be good in the hands of consultants going into those types of companies
  • Many of the potential users are tablet users, so focus on that aspect of mobile

One final key, one that needs to be a full paragraph rather than a bullet and one that many technical startups don’t get while building their products based on user needs, is that users aren’t the only decision makers in the product. As mentioned, this is a cloud product and AptiMap will be expecting recurring revenue from monthly or annual fees. The business analyst is often not the person who approves those types of costs. The firm also needs to focus messages on the buyers, whether IT, line or consulting management, to build messages that help them understand the business benefit of providing the tool to their people.

Understanding your market matters. It will help the firm not only focus product, but also narrow down the marketing message and image to aim at the correct audience.

Too often, founders get a great technical idea and focus on a couple of users to fill out product features and then try to find a market. BI is moving too fast for that, the vision needs to be much more clearly set out much earlier than was needed in software companies twenty years ago.

Finally, I mentioned the cloud model but should also mention AptiMap is offering a 30-day free trial.

Summary

AptiMap has an initial product that can help people more rapidly and accurately create mappings between data sources and targets. It’s cloud based for easy access. It is, however, very early in the product and company life cycle.

I would suggest it primarily to analysts in mid-sized organizations or consultants who work with SMBs and want some quick-hit functionality to map data sources for the creation of data warehouses, ODS’s and other relationally oriented data repositories.

If you want to experiment inexpensively with an early product that could help, contact them.

Tableau at the BBBT: Strengthening the Business in Business Intelligence

Tableau was back at the BBBT last week. Last year’s presentation was a look ahead at v8.2. The latest visit was a look back at 2014 and a focus on v9.0. Francois Ajenstat, VP Product Management, was back again to lead us through product issues. The latest marketing presenter was Adriana Gil Miner, VP Corporate Communications.

Tableau Revenues

Ms. Gil Miner opened the morning with the look back at last year. The key point was the strength of their growth. They are not only pleased with the year-over-year growth, but the chart also shows last year’s revenue as a slice of revenue over Tableau’s lifetime. We’ll leave it simply as: They had a good year.

Another point in describing their size is that Adriana said they have 26,000 customer accounts. Some confusion with a later presentation number required clarification and this isn’t users, or even sites. We were told that the 26k is the number of paying company accounts. There were no numbers showing median account size or how far the outliers are on either extreme, but that’s a nice number for the BI space.

The final key point made by Adriana Gil Miner was localization. Modern companies almost all create products using Unicode or other methods that allow for language localization, but Tableau has made the strong push to provide localized software and data sets in multiple languages. My apologies for not listing them, there’s some weird glitch with my Adobe Reader that’s crashing only on their presentation while no other analyst is having the same problem, so I can’t provide a list. Please refer to your local Tableau rep for details.

Francois Ajenstat then took over. It was no surprise that his focus was on v9.0. He discussed it by focusing on nine points he views as key:

  • Access to more data sources
  • Answer more questions
  • Improve the user experience
  • Support analytics at scale
  • Performance as a differentiator
  • Support for mobile
  • Tableau Public redesign
  • Coming out next quarter

If you look at those, you might question why that many bullets? For instance, when it comes out is just a schedule issue and doesn’t rise to the level of the others. Tableau Public’s redesign just seems to be the obvious end of focusing on better user experience and performance.

However, a couple sound the same but should be differentiated. Analytics at scale and performance improvements overlap but aren’t identical. Francois showed both what they did to improve performance on clustered servers, helping both bigger data sources and more simultaneous users, and also demonstrating that they’ve done some great optimization in basic analytics for individuals.

One of the best parts was the honesty, now that they’re close enough to releasing v9.0, in admitting that early versions ran slowly. They showed quotes from beta testers talking about major performance improvements. In addition, Tableau Public is a great source for testing real-world analytics. Mr. Ajenstat pointed out that they took the 100 most accessed visualizations in Tableau Public and analyzed performance differences, seeing a 4x increase in performance on average. While it’s always important to generate internal tests to stress potential use, focusing on how businesses really use the tool is even more important in ensuring performance is seen as good in day-to-day usage by knowledge workers, not only in heavy loads by analysts doing discovery.

LOD Expressions

The one thing that really caught my eye about v9.0 is the incorporation of Level of Detail (LOD) expressions. BI firms have been adding drill-down analytics for a decade. Seeing a specific level of detail and then dropping down to a lower level is critical. However, that’s not enough.

What’s needed is to be able to visually compare the lower level details with overall numbers. For instance, a sales VP regularly wants to know not just how an individual sales person is doing, but also how that compares to the region and national numbers. Only within context can you gain insight.

Among the other things LODs help is the ability to bin aggregates. Again we can turn to sales to think about retail sales across categories while also comparing those to total sales or in a trend analysis.
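The idea of seeing a detail number next to its broader context can be sketched in a few lines of plain Python. This is not Tableau’s implementation or syntax, only an illustration of the concept; the sales figures, names and the `with_context` helper are all invented for the example, and it assumes one row per sales person.

```python
from collections import defaultdict

# Hypothetical sales rows: (salesperson, region, amount)
sales = [
    ("Avery", "East", 120),
    ("Blake", "East", 80),
    ("Casey", "West", 150),
    ("Drew", "West", 50),
]

def with_context(rows):
    """Attach region and national totals to each sales person's number."""
    # Aggregate at a coarser level of detail than the rows themselves
    region_totals = defaultdict(int)
    for _, region, amount in rows:
        region_totals[region] += amount
    national_total = sum(amount for _, _, amount in rows)

    result = []
    for person, region, amount in rows:
        result.append({
            "person": person,
            "region": region,
            "sales": amount,
            "share_of_region": amount / region_totals[region],
            "share_of_nation": amount / national_total,
        })
    return result

for row in with_context(sales):
    print(f'{row["person"]:<6} {row["sales"]:>4} '
          f'{row["share_of_region"]:.0%} of {row["region"]}, '
          f'{row["share_of_nation"]:.0%} of national')
```

The point of the sketch is that the region and national totals are computed at a different level than the rows being displayed, and are then shown alongside them rather than replacing them.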

While many companies are working to add more complex analysis, it’s clear that Tableau hasn’t only looked at how a very technical person can create an LOD. They’ve worked on an interface that, from the demo, is simple and clean enough for business end users to use. Admittedly, that’s what demos are supposed to do, but I’ve seen some try and fail miserably. This seems to be a good attempt to understand business intelligence with an emphasis on the first word.

Summary

Some of the very new startups make the mistake of thinking even the first generation BI companies are too old to innovate. Those companies aren’t and are still a threat. However, Tableau is not even in the first generation and is still more nimble yet. They have their eyes on the ball and are moving forward. Even more importantly, while still focusing on their technology, as do many startups, they seem to have become mature enough to start shifting focus from the IT and business analysts to the information consumers.

Understanding what the business knowledge workers throughout the business hierarchy need, in data and performance, is what will drive the next growth spurt. Tableau seems to have them in target.

IBM and the Cloud? Don’t write it off

At today’s investor meeting, IBM execs announced a target of $40 billion in revenue for cloud, analytics, mobile, social and security software by 2018. I expect to see folks talk about dinosaurs not being able to turn fast enough and predicting failure to meet that goal. I don’t know if they can do it, but to make such ardent predictions you’d have to ignore history.

Mid-sized Unix servers came along and folks talked about IBM going away.

IBM blew a chance to own the PC industry and the same predictions followed them.

Linux? Freeware was going to destroy the mainframe. Oops, Linux partitions run on mainframes.

Now we know the large growth of the cloud. Much of it has been on commodity boxes. However, as data gets larger, analytics more powerful and networks become more robust, there’s clearly space for a company with such a strong history in hardware, services and adapting to changes.

After all, too many people still think of IBM as a hardware company. While it’s too early for the 2014 report, you can check the 2013 Annual Report and check page 7. Look at what a tiny percentage of the bar is hardware. Software and services are fairly even in splitting the vast majority of the revenue stream.

It’s a strong goal and will take a lot of pushing. How many politely phrased “re-orgs” will happen to lay off staff? Who knows? Will they succeed? No clue. All I expect is that they’ll continue to grow and nobody should count them out.

Datameer and Altiscale: A Tale of Two Startups

A webinar I watched was titled for the announcement of Datameer Professional, but that’s not what caught my interest. I previously blogged about Datameer’s presentation to the BBBT, so you can see where I think they’re a good, basic company who seemed to have a better grasp on the market than most startups. Sadly, that wasn’t shown today. What was most interesting was the part of the presentation that covered Altiscale, a company partnering with Datameer to provide Datameer Professional.

The Basics

The presentation was a tag team between Stefan Groschupf, CEO of Datameer, and Mike Maciag, COO of Altiscale.

It began with Stefan giving a very generic overview of the market. There was little content and that which was there wasn’t a surprise. The main point is that the demand for Hadoop has moved from IT doing behind the scenes work to business users wanting quick analysis. The lack of technical knowledge makes staffing an issue.

That led directly to Mike Maciag talking about how Altiscale provides Hadoop as a service. They’re a cloud provider of Hadoop. With founders from Google, Yahoo and LinkedIn, they understand Hadoop, the cloud and the need of companies to quickly leverage external resources to get to Hadoop with better TCO than would be the case trying to build in-house.

The reason for that quick introduction became apparent as soon as Mike tossed back to Stefan. Datameer Professional is a cloud version of their product that runs on Altiscale in the US. During Q&A, it came out that they’re using a different, unnamed provider in the EU for compliance issues over there.

One feature is they’re claiming they’ll provide three sets of basic analytics with Professional, for customer analysis, operations and cyber security – in that order. However, there was the qualification from Mr. Groschupf of “over the next few months,” so no clue how much is there now or when they’ll really be available.

Case Studies

One thing I noticed placed a clear difference between the two companies as for how far they understand business customers. The Datameer customers mentioned were never named. None are approved. Therefore, I have to question the veracity of claims.

The case study presented by Altiscale was a named client. Business users want to know reality and also want to confirm that a customer has enough confidence in a vendor to attach a name.

Summary

The summary combines what I heard today and what I heard at the BBBT presentation.

Datameer is a good startup, but is still focused on technology. BI has a short life cycle and companies need to rapidly prepare for the chasm. As they showed no updates in their UI, don’t have strong market messages and didn’t have a referenced customer, Datameer is a company to look at for technology, but I again wonder about the long term strategy and think the technology will be acquisition bait. Look to them for a basic interface and strong underpinnings, but they don’t seem focused clearly on end-user business messages.

Altiscale is also a good startup, but they seem better focused on a business message. Perhaps that’s because they are a cloud service company so they know it is important to do so. From the little I saw, they seem to be positioning themselves well as an alternative to Amazon and other choices to help business quickly take advantage of big data on Hadoop. I’d like to hear more.

Both Datameer and Altiscale are worth looking at, but for different reasons and they don’t have to be a set.

TDWI Webinar: Embedded Analytics

The latest TDWI webinar was on embedded analytics. The speakers were Fern Halper, the director of TDWI research for advanced analytics, and Mark Gamble from OpenText. For those of you who hadn’t heard, Actuate was acquired by OpenText and is being rebranded but, according to Mark, will remain an independent division for now.

Ms. Halper’s main point is that embedded has a lot of different meanings for different audiences and that she wants to create a clear framework for understanding the terminology within the analytics space. She’s clear that what’s meant isn’t just the mass market idea of wearable software, but that analytics can be embedded in specific applications, broader systems and, yes, devices such as mobile and wearable items.

Early in the presentation she presented a two axis image comparing structured and unstructured data combined with human and machine generated data. While I think the coloring should rotate, to emphasize that the difference between machine versus human generated information is a bigger issue than structured v unstructured, it’s a nice way of understanding some of the data streams.

TDWI Embedded Analytics - Data Sources

That, however, was a definitional slide and discussion. The real meat of Fern Halper’s presentation was the framework she described to help understand the steps of embedding analytics.

TDWI Embedded Analytics - Framework

Operationalized analytics are those that are involved in the full process of decision making. For instance, a call center employee might be talking to a prospect whose finances are flagged as a question mark. That prospect must be sent to another person to process the decision based on analytics.

Integrated analytics are those that allow the call center operator to see the analysis and immediately make decisions based upon guidelines.

Automated analytics are those that provide the operator with a decision tree response based on analytics done behind the scenes.

The only issue I take with the framework is it doesn’t necessarily mean true real time. The example discussed shows that the integrated approach can be real time for what humans think of as real time within our own interactions. Meanwhile, real-time might not be a necessary component to some automated decisions. Real-time is a separate issue and I think Fern’s framework would be better served by eliminating that item.

Fern Halper followed the framework with the usual and interesting TDWI survey numbers. This time, the questions were focused on the adoption of analytics tied to the framework. The numbers showed the unsurprising fact that analytics adoption is still in its infancy. One of the great parts of TDWI’s numbers is they show the reality which contradicts the industry’s hype.

One set of numbers I’d like to see wasn’t included. The responses were only IT responses in general, showing who has started using what analytics. I would have loved to see one slide that clearly showed only the sub-segment of companies who are already using analytics tools and where those companies are within Ms. Halper’s framework. How are the bleeding edge folks doing at moving through the framework to automated solutions?

OpenText

The rest of the program was a fast presentation by Mark Gamble, pointing to OpenText’s (Actuate’s) main benefit claim of enterprise scalability and the other factors. One of the phrases I liked was his reference stating they “adhered to a low code methodology.” It’s nice to hear folks admitting that as much as we want to eliminate coding, some of that is still required. Honesty isn’t a negative in marketing and I liked that turn of phrase.

In the other direction, he mentioned there were over fourteen million downloads of BIRT and that the company “believes” they have over three million users. I’m not interested in belief but they don’t seem to have a clear figure on adoption.

The main problem I had was the demo. Mark showed experimental work purporting to show live acquisition of basic automotive information such as speed and RPM displayed on a computer, phone and watch. It was not only not a business case but one that seemed to go back to the misunderstanding about the meaning of embedded which was addressed by Fern. Yes, it was embedded on two devices, but the demo didn’t show how it might be embedded in business applications. It stuck with the flashy concept of wearables.

OpenText might have something good with their analytics portability, but I don’t think the demo presents it to a business audience. Yes, techies will understand the underpinnings that make it cool, but the business folks writing checks need to see something that justifies the expenditure and I don’t think that’s shown.

Summary

Fern Halper did another good job of putting the adoption of analytics into perspective. This time, with a framework for better understanding embedded analytics.

Mark Gamble did a passable job of presenting OpenText’s solution but I feel he must do a better job of figuring out a business message.

TDWI’s data shows the early state of adoption that exists in the market. Fern Halper’s framework will help companies better understand how to move into the arena, but only if the companies providing those solutions can better present how they’ll help solve business issues.

MapR at BBBT: Supporting Hadoop and still learning

I’ve probably used this in other columns, but that’s life. MapR’s presentation to the BBBT reminded me of Yogi Berra’s statement that it feels like déjà vu all over again. Wait, if I think I’ve done this before, am I stuck in a déjà vu loop?

The presentation was a tag team effort of Steve Wooledge, VP Product Marketing, and Tomer Shiran, VP Product Management.

The Products and Their Aim

The first part of the déjà vu was good. People love to talk about freeware, but mission critical solutions won’t be trusted to such. Even before Linux, before Unix, software came out and it took companies to package it with service and support to provide constancy and trust for widespread IT adoption. MapR is a key company doing that with Apache Hadoop, the primary open source technology for big data applications.

They’ve done the job well, putting together a strong company that, quite reasonably, has attracted some great investors and customers. Of course, because Hadoop is still in its infancy, even a leading company such as MapR only mentions 700 customers, companies paying for licenses; but that’s a statement about big data’s still fairly limited impact in operational systems, not a knock on MapR.

Their vision statement is simple: “Empowering the As-it-happens business by speeding up the data-to-action cycle.” Note the key: Hadoop is batch oriented and all the players realize that real-time analysis matters for some key sales and marketing applications. Companies are now focusing on how fast they can get information out of the databases, not what it takes to get data in. A smart move but only half the equation.

One key part of the move to package open source into something trusted was pointed out by Steve Wooledge. When the company polled customers about why they chose MapR, the largest response was availability, the up time of the system. Better performance wasn’t far behind, but it’s clear that the company understands that availability is a critical business issue and they seem to be addressing it well.

Where the déjà vu hits in a not-so-positive way is the regular refrain of technologists still not quite getting business – even when they try. This isn’t a technology problem but an innovator’s problem. When you get so wrapped up in the cool things you’re doing, you think that you need to lead with the cool things, not necessarily what the market wants.

One example was when they were describing the complexity of the MapR packaging. Almost all the focus was on the cool buzzwords of open source. Almost lost in the mix was the mention that their software supports NFS. NFS, the Network File System, was developed more than 30 years ago and lets systems access files across a network. That MapR helps link both the latest and the still voluminous data in existing file systems is a key point, something that can help businesses understand that Hadoop can be integrated into existing systems and infrastructure. However, it’s not cool so the information is buried.

The final thing I’ll mention about the existing products is that MapR has built a nice three product suite, providing open source, mid-tier and full enterprise versions. That’s the perfect way to address the open source conundrum and move folks along the customer curve.

Apache Drill: Has it Bitten Off Too Much?

Sorry, couldn’t help the drill bit reference. Tomer Shiran took the latter part of the presentation to show off Apache’s latest data toy, Apache Drill, intended to bridge the two worlds of data. The problem I saw was one not limited to Tomer, MapR or even Apache, but common to all folks with what they think of as new technology: Over hype and an addiction to revolutionary rather than evolutionary words and messages. There were far too many phrases that denigrated IT and existing technology and implied Drill would replace things that weren’t needed. When questioned, Tomer admitted that it’s a complement; but the unthinking words of many folks in the industry set out a pattern inimical to rapid adoption into the Global 1000’s critical information paths.

Backing that up was a reply given to one questioner: “CIO of one of the largest tech companies said they can’t keep doing things the same way.” Tech companies tend to be bleeding edge by nature; they do not represent the fuller business world. More importantly, a CIO saying she needs to change doesn’t mean the CIO is planning on throwing out existing tools that work. It means she wants to expand and extend in a way to leverage all technology to provide better decision making capabilities to the rest of the CxO suite.

Another area of his talk finally brought forward, through a very robust discussion, one terminology issue that many are having. Big data folks like to talk about “no schema” but that’s not really true. Even when they modify the statement to be “schema on read” it’s missing the point.

They seem to be confusing fixed layout, relational records with the theory of schemas. XML is a schema for data exchange. It’s very flexible and can be self-defined, but it’s a schema. As it came from SGML, it’s not even the first iteration of flexible schemas. The example Mr. Shiran gave was just like an XML schema. Both data source and data recipient have to know some basic information such as field names in order to make sense of data, so there’s a schema.

Flexible schemas not only aren’t new, they don’t obviate the need for fixed schemas. They’re just another technique for managing the wide variety of data that business wishes to turn into information. The longer big data folks keep misusing a term and acting as if they have something revolutionary, the longer they’ll retard their needed incursion into IT and business information.
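The point that “schema on read” still involves a schema can be shown with a small sketch. The records, field names and the `read_with_schema` helper below are all hypothetical and have nothing to do with how Drill or MapR work internally; the sketch only illustrates that the reader has to bring its own expectations to loosely structured data.

```python
import json

# Hypothetical records with differing fields, as a "schemaless" store might hold
raw_lines = [
    '{"customer": "Acme", "amount": 250, "channel": "web"}',
    '{"customer": "Globex", "amount": 90}',
    '{"cust_name": "Initech", "total": 40}',
]

# The reader's expectations: this is the schema, applied at read time
EXPECTED_FIELDS = ("customer", "amount")

def read_with_schema(lines, fields=EXPECTED_FIELDS):
    """Split records into those the reader can interpret and those it cannot."""
    usable, rejected = [], []
    for line in lines:
        record = json.loads(line)
        if all(name in record for name in fields):
            # Keep only the fields the reader knows how to use
            usable.append({name: record[name] for name in fields})
        else:
            rejected.append(record)
    return usable, rejected

usable, rejected = read_with_schema(raw_lines)
print("usable:", usable)
print("rejected:", rejected)
```

The third record is perfectly valid data, but a reader that expects `customer` and `amount` can’t make sense of it. The agreement on field names hasn’t gone away; it has only moved from the time of writing to the time of reading.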

Summary

Hadoop and big data aren’t going anywhere except forward. The question is at what speed. There are some great things happening in both the Apache open source world and MapR’s licensed support for that world, but the lack of understanding of existing IT and business is retarding adoption of the new and exciting technologies.

When statements such as “But the sales guy won’t do X” are used by folks who have never been in and don’t understand sales, they’re missing the market. Today’s sales person is looking for faster and more accurate information, and is using many tools people would have said the same thing about only ten years earlier. In the meantime, sales management and the CxO suite who provide guidance for the sales force are even more interested in big picture information coming from massaging large data sources.

The folks in the new arenas such as Hadoop need to realize that they are complementary to existing technologies and that can help both IT and business. When I pointed that out, one of the presenters asked if that meant he should do two case studies, one with Hadoop and flexible schema and one with old line uses. I gave a clear no. It should be one with new and one that shows new and existing data sources combining to give management a more holistic picture than previously possible.

Evolution is good. MapR can help. They need to do the tough part of technology and move their view from what they think is cool to what the market thinks is needed.