Analytics India Magazine caught up with Ravi Shankar, Chief Marketing Officer of Denodo, a market leader in data virtualization software, who gives us the lowdown on how data virtualization has emerged as a game changer for enterprises. From lending business agility and increased ROI through a unified virtual data layer to enabling business users to run day-to-day operations without interruption, the Silicon Valley-headquartered company helps its customers across BFSI, health and life sciences, and the public sector, among others, achieve significant gains by modernizing their data landscape.
In a freewheeling chat with AIM, Shankar reveals how data virtualization is now much more than data federation and is the key for integrating data from big data sources and the existing enterprise data, to provide a holistic view. Read the full interview below:
Analytics India Magazine: How has the data virtualization market evolved since it first emerged in 2006? Has it become a mandate for companies that want to stay agile, or is it just another capability in the data integration toolbox?
Ravi Shankar: Data virtualization used to be referred to as enterprise information integration (EII) and then as data federation. Data virtualization is now much more than data federation because it includes data abstraction, a semantic layer, features that optimize its performance, and it supports many more use cases.
Data virtualization has enabled companies to gain agility beyond what was possible using the traditional data integration toolbox. Today, business moves so fast that business users do not have time to wait for IT to integrate the data by replicating it into another data store just so they can access it. They need to be able to get the data in real time, at the speed of business.
AIM: Could you share with us how data virtualization has emerged as the game changer in 2017, enabling real time big data analytics without data movement?
RS: Big data technologies are maturing, as companies adopt them for data discovery and next-generation architectures like data warehouse offloading and data lakes. But in many cases, big data systems have evolved into yet another silo, separate from the traditional enterprise data. So, data virtualization has become key for integrating the data from big data sources and the existing enterprise data, to provide a holistic view. Also, given the volume and velocity of big data, it does not make sense to try to collect it into another repository; analysts, such as Gartner, advise their clients to leave the data where it is, but connect to it using technologies like data virtualization.
AIM: How is Denodo presenting new data infrastructure opportunities to clients and helping them rethink data integration? What are the day-to-day data integration challenges for organizations today?
RS: Modern data architectures are evolving to include cloud technologies, in addition to big data technologies. Many companies are moving their infrastructure from on-premises to the cloud, but they want to do so without disturbing business operations. In this scenario, data virtualization has become a core technology, since it provides the abstraction that enables IT teams to modernize their infrastructure while also enabling business users to run day-to-day operations without interruption.
For today’s organizations, data integration is too time-consuming and expensive. Using ETL processes, it takes four developers about four weeks to deliver data to business users for a particular business flow, whereas using data virtualization, it would only take one developer one week to deliver the same data. Today’s organizations also want the data in real time, which is not possible if the IT teams are spending an enormous amount of time moving the data into a separate repository.
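To put the comparison above in numbers, the short sketch below works out the effort implied by the figures quoted in the answer. The function and variable names are illustrative assumptions; only the four-developer, four-week and one-developer, one-week figures come from the interview.

```python
def developer_weeks(developers, weeks):
    """Total effort, expressed in developer-weeks."""
    return developers * weeks

# Figures quoted in the interview: ETL vs. data virtualization.
etl_effort = developer_weeks(developers=4, weeks=4)
dv_effort = developer_weeks(developers=1, weeks=1)

# How many times more effort and elapsed time the ETL route implies.
effort_ratio = etl_effort / dv_effort
elapsed_ratio = 4 / 1

print(f"ETL: {etl_effort} developer-weeks, data virtualization: {dv_effort}")
print(f"Effort ratio: {effort_ratio:.0f}x, elapsed-time ratio: {elapsed_ratio:.0f}x")
```

On these figures, the ETL route costs 16 developer-weeks against 1, and takes four times as long on the calendar.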
AIM: Can you share in detail how Denodo’s data virtualization services are helping clients to “connect, combine, and publish” data? What goes into ensuring that your data-driven solution can overcome market competition? Are there any use cases that you can share with us?
RS: First, the Denodo Platform connects with disparate structured and unstructured data sources using connectors. The platform introspects the metadata of each source, and exposes the data as normalized views in the data virtualization layer. Next, the platform combines, integrates, transforms, and cleanses these source views into canonical, model-driven business views of the data. This can be done in a GUI or through documented scripting procedures. Finally, any of the views can be secured and published as SQL views or in dozens of data services formats.
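As a rough illustration of the connect, combine, and publish flow described above, the following minimal Python sketch joins two sources at query time without copying either into a new store. It is not the Denodo Platform's API: the sources, the view functions, and the `publish` helper are all hypothetical, and an in-memory SQLite database stands in for a real enterprise system.

```python
import sqlite3

# Hypothetical source 1: a relational system (in-memory SQLite stands in for it).
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
crm.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

# Hypothetical source 2: JSON-like records, as a web service might return them.
orders_feed = [
    {"customer_id": 1, "amount": 250.0},
    {"customer_id": 1, "amount": 100.0},
    {"customer_id": 2, "amount": 75.5},
]

def customers_view():
    """Connect: expose the relational source as a list of plain dicts."""
    cur = crm.execute("SELECT id, name FROM customers ORDER BY id")
    return [{"id": row[0], "name": row[1]} for row in cur]

def orders_view():
    """Connect: expose the feed with normalized names and types."""
    return [
        {"customer_id": o["customer_id"], "amount": float(o["amount"])}
        for o in orders_feed
    ]

def customer_revenue_view():
    """Combine: join both source views on demand; no data is replicated."""
    totals = {}
    for order in orders_view():
        key = order["customer_id"]
        totals[key] = totals.get(key, 0.0) + order["amount"]
    return [
        {"customer": c["name"], "revenue": totals.get(c["id"], 0.0)}
        for c in customers_view()
    ]

def publish(view):
    """Publish: hand consumers the current result of a view."""
    return view()

result = publish(customer_revenue_view)
print(result)
```

Because the business view is computed when it is requested, a change in either source shows up in the next call to `publish`, which is the property the interview contrasts with batch replication.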
The Denodo Platform introduced three key capabilities to advance data virtualization and differentiate from other equivalent technologies: (1) Dynamic Query Optimizer, which determines the best query execution plan to deliver the optimal performance; (2) Self-service data discovery and search, which enables business users to develop a more intuitive, immediate relationship with the data without relying on IT; and (3) Cloud deployment options: the Denodo Platform is the first and only data virtualization solution to be provisioned as a virtual image on Amazon AWS Marketplace.
Autodesk is a multinational software corporation that develops software for the architecture, engineering, construction, manufacturing, media and entertainment industries. The company needed to modernize its systems, to support greater agility and the move to a new revenue model, without significantly impacting business operations. Autodesk leveraged the Denodo Platform to establish a logical data lake, which complemented its existing business intelligence platform and its operational enterprise data warehouse reporting model. Supported by the Denodo Platform, Autodesk was able to successfully change its business revenue model from a conventional perpetual license model to a subscription-based license model. Also, Autodesk was able to implement single-point security enforcement and a uniform data layer for access. Autodesk now uses the data virtualization layer to combine data from big data and cloud sources with other enterprise data to provide unified data access to consuming applications in an agile manner.
AIM: The new Denodo Platform 6.0 accelerates your fast data strategy with data virtualization by delivering breakthrough performance in big data, logical data warehouses, and operational scenarios. Please walk us through this statement, highlighting the key capabilities integrated within the platform.
RS: The Denodo Platform 6.0 employs several optimization strategies (some of which are unique to Denodo) such as full and partial aggregation pushdown, partition pruning, join reordering, join pruning, and on-the-fly data movement. To optimize performance, the Dynamic Query Optimizer gathers many statistics about each data source, such as the average row size and the number of distinct values, to determine the best optimization strategies to apply for each query. For big data sources, the platform also takes into account the block size for I/O and the number of processing units in a parallel database.
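To make one of these strategies concrete, the sketch below contrasts a query that pulls every row into the virtual layer with one that pushes the filter and aggregation down to the source. It is a simplified illustration of aggregation pushdown in general, not Denodo's optimizer: the `sales` table, the function names, and the use of in-memory SQLite as the "remote" source are all assumptions.

```python
import sqlite3

# Hypothetical remote source; an in-memory SQLite database stands in for it.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
source.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("EMEA", 2016, 10.0),
        ("EMEA", 2017, 20.0),
        ("APAC", 2017, 5.0),
        ("APAC", 2017, 7.0),
    ],
)

def total_by_region_naive(year):
    """No pushdown: fetch every row, then filter and aggregate locally."""
    rows = source.execute("SELECT region, year, amount FROM sales").fetchall()
    totals = {}
    for region, row_year, amount in rows:
        if row_year == year:
            totals[region] = totals.get(region, 0.0) + amount
    return totals, len(rows)

def total_by_region_pushdown(year):
    """Pushdown: the source filters and aggregates; one row per region comes back."""
    rows = source.execute(
        "SELECT region, SUM(amount) FROM sales WHERE year = ? GROUP BY region",
        (year,),
    ).fetchall()
    return dict(rows), len(rows)

naive_totals, naive_rows = total_by_region_naive(2017)
pushed_totals, pushed_rows = total_by_region_pushdown(2017)
print(naive_totals == pushed_totals, naive_rows, pushed_rows)
```

Both functions return the same totals, but the pushdown version transfers two rows instead of four. With real data volumes that gap is what makes statistics such as row size and distinct-value counts useful to an optimizer when it chooses a plan.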
AIM: Denodo’s product ecosystem comprises numerous solutions and services spanning agile business intelligence, data governance, data warehouses, big data, web automation, master data management, and more. Could you describe the key solutions and share use cases for them?
RS: Our three key solutions are agile BI, big data, and logical data warehousing:
Agile BI: The Denodo Platform enables IT to be much more agile in responding to the almost insatiable demands of business users for actionable information. All data remains in its source data stores, since replication, with its accompanying staging, transformation, and batch copying tools and processes, is not required. Business users gain access to the data through virtual views in the Denodo Platform, which are quick to create, lightweight, and easily discarded if necessary. The most common use case takes the form of real-time reports and dashboards that are easily accessed by business users.
Big Data: The modern data stores of big data implementations do not typically provide standard JDBC/ODBC-based SQL interfaces, which makes them difficult to use for BI and reporting, and for data analysts who are mostly familiar with SQL. The Denodo Platform provides an abstraction and federation layer that hides the complexities of the big data stores, making it easy to integrate data from these stores with other data within the enterprise. The Denodo Platform supports many use cases in its solution for big data, including hybrid data warehouses, which combine traditional data warehouses with cloud-based clusters; data lakes and enterprise data hubs; and analytical integration, in which analytical data from Hadoop is seamlessly integrated with traditional data sources stored in a data warehouse.
Logical Data Warehousing: Data Virtualization is a critical part of the logical data warehouse architecture, as it enables queries to be federated across traditional structured data sources, such as databases, data warehouses, etc., and modern data sources like Hadoop, NoSQL, web services, and SaaS applications, while still appearing as a single ‘logical’ data source to the user. The most common use cases are virtual data marts, data warehouses integrated in real time with master data from MDM systems, data warehouses extended with cloud sources, the integration of multiple traditional data warehouses, data warehouses with historical data offloaded to Hadoop, and data warehouses partitioned to move seldom-used data to less expensive storage.
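One of the use cases above, a warehouse with historical data offloaded to cheaper storage, can be sketched as a single logical view that routes each query to the stores that could hold matching rows. This is a minimal sketch under stated assumptions: the cutoff year, table layout, and `logical_orders` function are hypothetical, and two in-memory SQLite databases stand in for the warehouse and the archive.

```python
import sqlite3

CUTOFF_YEAR = 2015  # hypothetical: rows older than this live in the archive

warehouse = sqlite3.connect(":memory:")  # stands in for the traditional warehouse
archive = sqlite3.connect(":memory:")    # stands in for cheaper storage, e.g. Hadoop
for db in (warehouse, archive):
    db.execute("CREATE TABLE orders (year INTEGER, amount REAL)")
warehouse.executemany("INSERT INTO orders VALUES (?, ?)", [(2015, 40.0), (2016, 60.0)])
archive.executemany("INSERT INTO orders VALUES (?, ?)", [(2013, 10.0), (2014, 30.0)])

QUERY = "SELECT year, amount FROM orders WHERE year BETWEEN ? AND ?"

def logical_orders(first_year, last_year):
    """One logical 'orders' view; a store is skipped if it cannot hold matches."""
    sources_used, rows = [], []
    if first_year < CUTOFF_YEAR:
        sources_used.append("archive")
        rows += archive.execute(QUERY, (first_year, last_year)).fetchall()
    if last_year >= CUTOFF_YEAR:
        sources_used.append("warehouse")
        rows += warehouse.execute(QUERY, (first_year, last_year)).fetchall()
    return sorted(rows), sources_used

recent_rows, recent_sources = logical_orders(2016, 2016)
mixed_rows, mixed_sources = logical_orders(2014, 2015)
print(recent_rows, recent_sources)
print(mixed_rows, mixed_sources)
```

The caller sees one `orders` view in both cases; only the second query, whose range crosses the cutoff, touches the archive.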
AIM: Finance, government, insurance, and oil and gas are some of the verticals where Denodo furnishes its solutions. Can you tell us how and why data integration will come to play a key role in IoT data? Are Denodo’s solutions leveraged in IoT?
RS: Internet of Things (IoT) is becoming a new trend, and many companies are using data virtualization to perform predictive analytics on streaming IoT data. Denodo solutions are leveraged by a global equipment manufacturer to analyze IoT data. This manufacturer leveraged the Denodo Platform to establish a logical data lake that takes streaming sensor data from equipment and combines it with parts, maintenance, and dealer information from traditional systems to support both analytical and operational needs. After implementing our solution, the manufacturer realized numerous benefits:
- Improved asset performance and proactive maintenance
- Reduced warranty costs due to proactive maintenance of parts, preventing part failure
- Optimized pricing for services and parts among global service providers
- Increased revenue from sale of services and parts
- New business model opportunities based on real-time analyses of detailed sensor data
AIM: What is the roadmap for Denodo in 2017-18?
RS: Given that big data use cases are becoming mainstream, performance is a key consideration for integrating very large volumes of data at very high speeds. So, the Denodo Platform is evolving to support an in-memory fabric for processing very large volumes of data rapidly, providing instant answers to business users. Parallel, in-memory computation will work with our existing Dynamic Query Optimizer to further optimize query performance by handling heavy processing in the in-memory grid.