Over the last few years, we have seen how data analytics has gained a crucial role in organisations. Companies, regardless of their sizes, are utilising data to gain strategic information as well as operational decisions.
SQL is one of the oldest players in the data management system. This mainstream language includes a number of important commands which allow a data analyst to interact with the large volumes of data.
However, with the growth of data, organisations are utilising less than half of the structured data for decision-making purposes. One issue is that data, both structured as well as unstructured is stored across multiple databases, data lakes, etc. and the other issue is that in SQL, there is an issue of partial control due to which the data analysts don’t have complete control over the databases. In such cases, the analysts face issues between the query language and the data format in which it is stored.
Researchers from AWS have now introduced PartiQL, a new SQL-compatible query language where data can be queried efficiently despite its location and format. With the help of this query language, one can not only process structured data from relational databases but also semi-structured and nested data in open data formats, schema-less data in NoSQL or document databases.
Why This Language
The main reason for introducing this language is that the researchers from Amazon are facing challenges while transforming not just SQL tabular data but also nested and semi-structured data that is found in a variety of formats and storage engines. Besides using SQL, the existing database research community in Amazon wanted to expand into accessing the non-relational data of the data lake, while maintaining strict backward compatibility with SQL. This led to the creation of PartiQL which offers strict SQL compatibility, achieves nested and semi-structured processing with minimal extensions.
- SQL Compatibility: This language adopts the compatibility with SQL such that the existing SQL queries will continue to work in SQL query processors that are extended to provide PartiQL and makes it easy for the developers to leverage PartiQL.
- First-Class Nested Data: This language provides syntax and semantics which comprehensively access and query nested data.
- Optional Schema And Query Stability: This language is designed to be usable by database engines which assume the presence of a schema as well as schema-less engines.
- Minimal Extensions: PartiQL has a minimum number of extensions over SQL which includes intuitive filtering, joining, aggregation, and windowing on the combination of structured, semi-structured, and nested datasets.
- Format Independence: This language is independent of any data format. A query can be written in various formats such as JSON, Parquet, ORC, CSV, Ion, among others.
- Data Store Independence: This language is independent of any particular underlying data store.
How It Is Better Than SQL Data Model
The PartiQL data model extended the SQL data model as mentioned below
- In PartiQL data model, the elements of an array can be any kind of value—not just tuples. Unlike SQL where the values are tables that have homogeneous tuples that have scalars, this language complex values are arbitrary compositions of arrays, bags and tuples.
- PartiQL’s data model extends SQL to Ion’s type system to cover schema-less and nested data.
- Unlike SQL, PartiQL allows the possibility of duplicate attribute names, in the interest of compatibility with Ion datasets.
- There are two absent values in this language, they are NULL and MISSING. Unlike SQL where a query which refers to a non-existent attribute name is expected to fail during compilation. The absent value of PartiQL, MISSING plays a role in such cases where navigation fails to bind to any information.
- Unlike traditional SQL, the PartiQL query language also meets the needs of NoSQL and non-relational databases.
- This language operates not only on SQL tables but also on data that may have union types, different attributes across different tuples, and many other features.