Gaining control over far-flung and disparate data has been a decades’ old struggle, but now as hybrid and public clouds join the mix of legacy and distributed digital architectures, new ways of thinking are demanded if comprehensive analysis of relevant data is going to become practical.
Stay with us now as we examine the latest strategies for making the best use of data integration, data catalogs and indices, as well highly portable data analytics platforms.
To learn more about closing the analysis gap between data and multiple -- and probably changeable -- cloud models, we are joined by Mike Potter, Chief Technology Officer (CTO) at Qlik. The discussion is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions.
Here are some excerpts:
Gardner: Mike, businesses are adopting cloud computing for very good reasons. The growth over the past decade has been strong and accelerating. What have been some of the -- if not unintentional -- complicating factors for gaining a comprehensive data analysis strategy amid this cloud computing complexity?
Potter: The biggest thing is recognizing that it’s all about where data lives and where it's being created. Obviously, historically most data have been generated on-premises. So, there is a strong pull there, but you are seeing more and more cases now where data is born in the cloud and spends its whole lifetime in the cloud.
And so now the use cases are different because you have a combination of those two worlds, on-premises and cloud. To add further complexity, data is now being born in different cloud providers. Not only are you dealing with having some data and legacy systems on-premises, but you may have to reconcile that you have data in Amazon, Google, or Microsoft.
Our whole strategy around multicloud and hybrid cloud architectures is being able to deploy Qlik where the data lives. It allows you to leave the data where it is, but gives you options so that if you need to move the data, we can support the use cases on-premises to cloud or across cloud providers.
Gardner: And you haven’t just put on the patina of cloud-first or software as a service (Saas) -first. You have rearchitected and repositioned a lot of what your products and technologies do. Tell us about being “SaaS-first” as a strategy.
Scaling the clouds
Potter: We began our journey about 2.5 years ago, when we started converting our monolith architecture into a microservices-based architecture. That journey struck to the core of the whole product.
Qlik’s heritage was a Windows Server architecture. We had to rethink a lot of things. As part of that we made a big bet 1.5 years ago on containerization, using Docker and Kubernetes. And that’s really paid off for us. It has put us ahead of the technology curve in many respects. When we did our initial release of our multicloud product in June 2018, I had conversations with customers who didn’t know what Kubernetes was.
One enterprise customer had an infrastructure team who had set up an environment to provision Kubernetes cluster environments, but we were only the second vendor that required one, so we were ahead of the game quite a bit.
Gardner: How does using a managed container platform like Kubernetes help you in a multicloud world?
Potter: The single biggest thing is it allows you to scale and manage workloads at a much finer grain of detail through auto-scaling capabilities provided by orchestration environments such as Kubernetes.
More importantly it allows you to manage your costs. One of the biggest advantages of a microservice-based architecture is that you can scale up and scale down to a much finer grain. For most on-premises, server-based, monolith architectures, customers have to buy infrastructure for peak levels of workload. We can scale up and scale down those workloads -- basically on the fly -- and give them a lot more control over their infrastructure budget. It allows them to meet the needs of their customers when they need it.
Gardner: Another aspect of the cloud evolution over the past decade is that no one enterprise is like any other. They have usually adopted cloud in different ways.
Has Qlik’s multicloud analytics approach come with the advantage of being able to deal with any of those different topologies, enterprise by enterprise, to help them each uniquely attain more of a total data strategy?
Potter: Yes, I think so. The thing we want to focus on is, rather than dictate the cloud strategy – often the choice of our competitors -- we want to support your cloud strategy as you need it. We recognize that a customer may not want to be on just one cloud provider. They don’t want to lock themselves in. And so we need to accommodate that.
There may be very valid reasons why they are regionalized, from a data sovereignty perspective, and we want to accommodate that. There will always be on-premises requirements, and we want to accommodate that.
The reality is that, for quite a while, you are not going to see as much convergence around cloud providers as you are going to see around microservices architectures, containers, and the way they are managed and orchestrated.
You are not going to see as much convergence around cloud providers as you are going to see around microservices architectures, containers, and the way they are managed and orchestrated.
Gardner: And there is another variable in the mix over the next years -- and that’s the edge. We have an uncharted, immature environment at the edge. But already we are hearing that a private cloud at the edge is entirely feasible. Perhaps containers will be working there.
At Qlik, how are you anticipating edge computing, and how will that jibe with the multicloud approach?
Running at the edge
Potter: One of the key features of our platform architecture is not only can we run on-premises or in any cloud at scale, we can run on an edge device. We can take our core analytics engine and deploy it on a device or machine running at the edge. This enables a new opportunity, which is taking analytics itself to the edge.
A lot of Internet of Things (IoT) implementations are geared toward collecting data at the sensor, transferring it to a central location to be processed, and then analyzing it all there. What we want to do is push the analytics problem out to the edge so that the analytic data feeds can be processed at the edge. Then only the analytics events are transmitted back for central processing, which obviously has a huge impact from a data-scale perspective.
But more importantly, it creates a new opportunity to have the analytic context be very immediate in the field, where the point of occurrence is. So if you are sitting there on a sensor and you are doing analytics on the sensor, not only can you benefit at the sensor, you can send the analytics data back to the central point, where it can be analyzed as well.
Gardner: It’s auspicious, the way that Qlik’s catalog, indexing, and abstracting out the information about where data is approach can now be used really well in an edge environment.
Potter: Most definitely. Our entire data strategy is intricately linked with our architectural strategy in that respect, yes.
Gardner: Analytics and being data-driven across an organization is the way of the future. It makes sense to not cede that core competency of being good at analytics to a cloud provider or to a vendor. The people, process, and tribal knowledge about analytics seems essential.
Do you agree with that, and how does Qlik’s strategy align with keeping the core competency of analytics of, by, and for each and every enterprise?
Potter: Analytics is a specialization organizationally within all of our customers, and that’s not going to go away. What we want to do is parlay that into a broader discussion. So our focus is enabling three key strategies now.
It's about enabling the analytics strategy, as we always have, but broadening the conversation to enabling the data strategy. More importantly, we want to close the organizational, technological, and priority gaps to foster creating an integrated data and analytics strategy.
By doing that, we can create what I describe as a raw-to-ready analytics platform based on trust, because we own the process of the data from source to analysis, and that not only makes the analytics better, it promotes the third part of our strategy, which is around data literacy. That’s about creating a trusted environment in which people can interact with their data and do the analysis that they want to do without having to be data scientists or data experts.
So owning that whole end-to-end architecture is what we are striving to reach.
Gardner: As we have seen in other technology maturation trend curves, applying automation to the problem frees up the larger democratization process. More people can consume these services. How does automation work in the next few years when it comes to analytics? Are we going to start to see more artificial intelligence (AI) applied to the problem?
Automated, intelligent analytics
Potter: Automating those environments is an inevitability, not only from the standpoint of how the data is collected, but in how the data is pushed through a data operations process. More importantly, automating enables on the other end, too, by embedding artificial and machine learning (ML) techniques all the way along that value chain -- from the point of source to the point of consumption.
Gardner: How does AI play a role in the automation and the capability to leverage data across the entire organization?
Potter: How we perform analytics within an analytic system is going to evolve. It’s going to be more conversational in nature, and less about just consuming a dashboard and looking for an insight into a visualization.
The analytics system itself will be an active member of that process, where the conversation is not only with the analytics system but the analytics system itself can initiate the conversation by identifying insights based on context and on other feeds. Those can come from the collective intelligence of the people you work with, or even from people not involved in the process.
The analytics system itself will be an active member of that process, where the conversation is not only with the analytics system but it will initiate the conversation by identifying insights based on context and other feeds.
Gardner: I have been at some events where robotic process automation (RPA) has been a key topic. It seems to me that there is this welling opportunity to use AI with RPA, but it’s a separate track from what's going on with BI, analytics, and the traditional data warehouse approach.
Do you see an opportunity for what’s going on with AI and use of RPA? Can what Qlik is doing with the analytics and data assimilation problem come together with RPA? Would a process be able to leverage analytic information, and vice versa?
Potter: It gets back to the idea of pushing analytics to the edge, because an edge isn’t just a device-level integration. It can be the edge of a process. It can be the edge of not only a human process, but an automated business process. The notion of being able to embed analytics deep into those processes is already being done. Process analytics is an important field.
But the newer idea is that analytics is in service of the process, as opposed to the other way around. The world is getting away from analytics being a separate activity, done by a separate group, and as a separate act. It is as commonplace as getting a text message, right?
Gardner: For the organization to get to that nirvana of total analytics as a common strategy, this needs to be part of what the IT organization is doing, with full stack architecture and evolution. So AIOps and DataOps also getting closer over time.
How does DataOps in your thinking relate to what the larger IT enterprise architects are doing, and why should they be thinking about data more?
Optimizing data pipelines
Potter: That’s a really good question. From my perspective, when I get a chance to talk to data teams, I ask a simple question: “You have this data lake. Is it meeting the analytic requirements of your organization?”
And often I don’t get very good answers. And a big reason why is because what motivates and prioritizes the data team is the storage and management of data, not necessarily the analytics. And often those priorities conflict with the priorities of the analytics team.
What we are trying to do with the Qlik integrated data and analytic strategy is to create data pipelines optimized for analytics, and data operations optimized for analytics. And our investments and our acquisitions in Attunity and Podium are about taking that process and focusing on the raw-to-ready part of the data operations.
Gardner: Mike, we have been talking at a fairly abstract level, but can you share any use cases where leading-edge organizations recognize the intrinsic relationship between DataOps and enterprise architecture? Can you describe some examples or use cases where they get it, and what it gets for them?
Potter: One of our very large enterprise customers deals in medical devices and related products and services. They realized an essential need to have an integrated strategy. And one of the challenges they have, like most organizations, is how to not only overcome the technology part but also the organizational, cultural, and change-management aspects as well.
They recognized the business has a need for data, and IT has data. If you intersect that, how much of that data is actually a good fit? How much data does IT have that isn't needed? How much of the remaining need is unfulfilled by IT? That's the problem we need to close in on.
Gardner: Businesses need to be thinking at the C-suite level about outcomes. Are there some examples where you can tie together such strategic business outcomes back to the total data approach, to using enterprise architecture and DataOps?
Data decision-making, democratized
Potter: The biggest ones center on end-to-end governance of data for analytics, the ability to understand where the data comes from, and building trust in the data inside the organization so that decisions can be made, and those decisions have traceability back to results.
The other aspect of building such an integrated system is a total cost of ownership (TCO) opportunity, because you are no longer expending energy managing data that isn't relevant to adding value to the organization. You can make a lot more intelligent choices about how you use data and how you actually measure the impact that the data can have.
Gardner: On the topic of data literacy, how do you see the behavior of an organization -- the culture of an organization -- shifting? How do we get the chicken-and-egg relationship going between the data services that provide analytics and the consumers to start a virtuous positive adoption pattern?
One of the biggest puzzles a lot of IT organizations face is around adoption and utilization. They build a data lake and they don't know why people aren't using it.
Potter: One of the biggest puzzles a lot of IT organizations face is around adoption and utilization. They build a data lake and they don't know why people aren’t using it.
For me, there are a couple of elements to the problem. One is what I call data elitism. When you think about data literacy and you compare it to literacy in the pre-industrial age, the people who had the books were the people who were rich and had power. So church and state, that kind of thing. It wasn't until technology created, through the printing press, a democratization of literacy that you started to see interesting behavior. Those with the books, those with the power, tried to subvert reading in the general population. They made it illegal. Some argue that the French Revolution was, in part, caused by rising rates of literacy.
If you flash-forward this analogy to today in data literacy, you have the same notion of elitism. Data is only allowed to be accessed by the senior levels of the organization. It can only be controlled by IT.
Ironically, the most data-enabled organizations are typically oriented to the Millennials or younger users. But they are in the wrong part of the organizational chart to actually take advantage of that. They are not allowed to see the data they could use to do their jobs.
The opportunity from a democratization-of-data perspective is understanding the value of data for every individual and allowing that data to be made available in a trusted environment. That’s where this end-to-end process becomes so important.
Gardner: How do we make the economics of analytics an accelerant to that adoption and the democratization of data? I’ll use another historical analogy, the Model T and assembly line. They didn't sell Model Ts nearly to the degree they thought until they paid their own people enough to afford one.
Is there a way of looking at that and saying, “Okay, we need to create an economic environment where analytics is paid for-on-demand, it's fit-for-purpose, it's consumption-oriented.” Wouldn’t that market effect help accelerate the adoption of analytics as a total enterprise cultural activity?
Think positive data culture
Potter: That’s a really interesting thought. The consumerization of analytics is a product of accessibility and of cost. When you build a positive data culture in an organization, data needs to be as readily accessible as email. From that perspective, turning it into a cost model might be a way to accomplish it. It's about a combination of leadership, of just going there and making occur at the grassroots level, where the value it presents is clear.
And, again, I reemphasize this idea of needing a positive data culture.
Gardner: Any added practical advice for organizations? We have been looking at what will be happening and what to anticipate. But what should an enterprise do now to be in an advantageous position to execute a “positive data culture”?
Potter: The simplest advice is to know that technology is not the biggest hurdle; it's change management, culture, and leadership. When you think about the data strategy integrated with the analytics strategy, that means looking at how you are organized and prioritized around that combined strategy.
Finally, when it comes to a data literacy strategy, define how you are going to enable your organization to see data as a positive asset to doing their jobs. The leadership should understand that data translates into value and results. It's a tool, not a weapon.