File Formats for Loading Data into BigQuery
Also read Bigquery Partitioning Overview - Clustering versus Partitioning
Both the file format and the storage system (Cloud Storage, BigQuery, Bigtable, Cloud SQL...) affect load and query performance.
Formats suitable for Cloud Storage
If your dataset sits in Cloud Storage, use either Parquet or Avro.
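As a minimal sketch of what such a load could look like, the snippet below uses the `google-cloud-bigquery` Python client. The bucket, file, and table names are placeholders, and `source_format_for` and `load_from_gcs` are hypothetical helpers written for this example, not part of the library.

```python
from pathlib import PurePosixPath

# File extension -> BigQuery source format name. These strings match the
# values defined on google.cloud.bigquery.SourceFormat.
SOURCE_FORMATS = {
    ".parquet": "PARQUET",
    ".avro": "AVRO",
    ".json": "NEWLINE_DELIMITED_JSON",
    ".csv": "CSV",
}


def source_format_for(uri: str) -> str:
    """Pick a source format from the file extension (hypothetical helper)."""
    suffix = PurePosixPath(uri).suffix.lower()
    try:
        return SOURCE_FORMATS[suffix]
    except KeyError:
        raise ValueError(f"unsupported file extension: {suffix!r}") from None


def load_from_gcs(uri: str, table_id: str) -> None:
    """Load one Cloud Storage object (or wildcard) into a BigQuery table."""
    # Imported lazily so the helper above works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(source_format=source_format_for(uri))
    load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    load_job.result()  # block until the load job finishes


# Example call (placeholder names, requires credentials and the library):
# load_from_gcs("gs://example-bucket/sales/*.parquet", "my_project.my_dataset.sales")
```

Parquet and Avro files carry their own schema, so this sketch passes none. A CSV or JSON load would also need a schema or schema auto-detection in the job config.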
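Column Formats
Parquet - compressed, columnar format
Row Formats
Avro - row-based, compact binary format with an embedded schema
JSON (newline-delimited) - row format; field names are repeated in every record. Binary values are accepted as base64-encoded strings
CSV - row format; plain text with no embedded schema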
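To make the cost of repeated field names in JSON concrete, here is a small sketch that serializes the same rows as newline-delimited JSON and as CSV and compares the sizes. The rows and column names are made up for illustration, and it uses only the Python standard library.

```python
import csv
import io
import json

# Made-up sample rows, for illustration only.
rows = [
    {"user_id": i, "country": "US", "amount": 10 * i}
    for i in range(1000)
]


def to_ndjson(rows):
    """One JSON object per line; every line repeats the field names."""
    return "".join(json.dumps(r) + "\n" for r in rows)


def to_csv(rows):
    """A single header line, then one line of values per row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


ndjson_bytes = len(to_ndjson(rows).encode("utf-8"))
csv_bytes = len(to_csv(rows).encode("utf-8"))
print(f"NDJSON: {ndjson_bytes} bytes, CSV: {csv_bytes} bytes")
```

The field names appear once in the CSV output and once per row in the JSON output, which is where the extra bytes come from.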
Summary
The data format matters for both loading and querying. If your dataset sits in Cloud Storage, use either Parquet (columnar) or Avro (row-based binary). Both are compact, self-describing formats that typically load more efficiently than CSV or JSON.