Use Case: Mix of transactional and reporting data - slow-running queries
Queries are running slowly.
Must be able to speed them up without moving data to a data warehouse or changing the existing schema.
Overview - Two methods of storing data
Rowstore - table rows and indices are stored in data pages, row by row. Stored horizontally.
Non-Clustered Columnstore Index - the columns that are part of the index are stored separately, each in its own data pages. Stored vertically. Row Group = up to about 1 million rows (1,048,576), Segment = a single compressed column from the rowgroup.
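A minimal sketch of adding such an index, assuming SQL Server 2016 or later. The table dbo.Orders, its columns, and the index name are hypothetical, chosen only for illustration. The existing table and its rowstore indexes stay as they are.

```sql
-- Hypothetical table, column, and index names, for illustration only.
-- Assumes SQL Server 2016 or later, where a table with a nonclustered
-- columnstore index remains updatable.
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_orders_reporting
    ON dbo.Orders (OrderDate, CustomerId, Amount);

-- A reporting query that touches only the indexed columns is a candidate
-- for the columnstore; the optimizer decides whether to actually use it.
SELECT CustomerId, SUM(Amount) AS total_amount
FROM dbo.Orders
WHERE OrderDate >= '2023-01-01'
GROUP BY CustomerId;
```

Only the columns the reporting queries read need to be listed in the index, which keeps it small and leaves the transactional workload on the original rowstore structures.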
Once the table is queryable in a columnar format, a query that selects only the columns it needs reads only those columns, so the amount of data read is drastically reduced (as much as 99%) - e.g.
select first_name, last_name from bigquery_public_data.wikipedia_authors
as opposed to
select * from bigquery_public_data.wikipedia_authors
So - the basic idea is to use columnar indices to start speeding up the queries. This requires neither a schema change nor moving the data to a data warehouse.
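To see how a columnstore index is laid out, the rowgroups can be listed from a system view. This is a sketch assuming SQL Server 2016 or later and a hypothetical table dbo.Orders that already has a columnstore index. Substitute your own table name.

```sql
-- dbo.Orders is a hypothetical table name, for illustration only.
-- Lists each rowgroup of the table's columnstore index with its state,
-- row count, and compressed size.
SELECT i.name AS index_name,
       rg.row_group_id,
       rg.state_desc,
       rg.total_rows,
       rg.size_in_bytes
FROM sys.dm_db_column_store_row_group_physical_stats AS rg
JOIN sys.indexes AS i
    ON i.object_id = rg.object_id
   AND i.index_id = rg.index_id
WHERE rg.object_id = OBJECT_ID(N'dbo.Orders')
ORDER BY rg.row_group_id;
```

Rowgroups in the COMPRESSED state are the ones stored column by column. Rows that arrived recently may still sit in an OPEN delta rowgroup until they are compressed.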