Dataflow to BigQuery
Also read: BigQuery Partitioning Overview - Clustering versus Partitioning, and File Formats for Loading Data into BigQuery
Overview
Dataflow provides a powerful foundation for transforming data (ETL), and BigQuery provides fast, ad hoc analysis of that data.
Dataflow integrates with BigQuery through the BigQueryIO reader and writer, so a pipeline can use BigQuery tables as both a source and a sink.
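As a rough illustration of the reader and writer working together, here is a minimal sketch using the Apache Beam Python SDK, where the BigQueryIO reader and writer are exposed as ReadFromBigQuery and WriteToBigQuery. The function names to_output_row and run, the column names, and the schema string are assumptions made up for this example; they are not from any real dataset.

```python
from typing import Dict, Optional


def to_output_row(row: Dict[str, object]) -> Optional[Dict[str, object]]:
    """Hypothetical transform: normalize a 'name' column, or return None to drop the row."""
    name = row.get("name")
    if not isinstance(name, str) or not name.strip():
        return None
    cleaned = name.strip()
    return {"name": cleaned.lower(), "name_length": len(cleaned)}


def run(source_query: str, destination_table: str, pipeline_args=None) -> None:
    """Read rows from BigQuery, transform them, and write them back to BigQuery."""
    # Imported here so that to_output_row can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(pipeline_args)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            # Reader: each element is a dict keyed by column name.
            | "Read" >> beam.io.ReadFromBigQuery(query=source_query, use_standard_sql=True)
            | "Transform" >> beam.Map(to_output_row)
            | "DropEmpty" >> beam.Filter(lambda row: row is not None)
            # Writer: the schema below is an assumption for this sketch.
            | "Write" >> beam.io.WriteToBigQuery(
                destination_table,
                schema="name:STRING,name_length:INTEGER",
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )
```

To try it, call run with a query, a destination such as "project:dataset.table", and the usual pipeline arguments. Reading from BigQuery also needs a Cloud Storage temp location, typically passed as --temp_location.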
Bounded and Unbounded Data
BigQueryIO automatically adapts how it writes to BigQuery based on whether the pipeline is processing bounded (batch) or unbounded (streaming) data. For bounded datasets, BigQueryIO performs inserts using batch file uploads, which run as BigQuery load jobs.
For unbounded datasets, inserts are performed using streaming insert API calls.
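The selection rule described above can be modeled in a few lines. This is only a sketch of the decision, not BigQueryIO's actual implementation; the helper choose_write_method is hypothetical, and the two string values are chosen to mirror the FILE_LOADS and STREAMING_INSERTS method names used by the Beam Python SDK's WriteToBigQuery.

```python
# Method names mirroring those used by WriteToBigQuery in the Beam Python SDK.
FILE_LOADS = "FILE_LOADS"
STREAMING_INSERTS = "STREAMING_INSERTS"


def choose_write_method(is_bounded: bool) -> str:
    """Hypothetical helper: pick a write method from the boundedness of the data."""
    # Bounded (batch) data is staged as files and loaded in bulk;
    # unbounded (streaming) data is sent through streaming insert calls.
    return FILE_LOADS if is_bounded else STREAMING_INSERTS


if __name__ == "__main__":
    for is_bounded in (True, False):
        kind = "bounded" if is_bounded else "unbounded"
        print(f"{kind} -> {choose_write_method(is_bounded)}")
```

In a real pipeline this choice normally does not need to be written by hand, since the writer makes it by default. In the Beam Python SDK the default can be overridden by passing method= to WriteToBigQuery when a specific write path is required.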
Summary
BigQueryIO is a powerful, adaptive library that makes it easy for Dataflow to work with BigQuery.