BigQuery - Ingestion Time Partitioning - Multi Cloud Architect, Security Specialist

Also read Partitioning Overview in BigQuery , Bigquery Partitioning Overview - Clustering versus Partitioning and File Formats for Loading Data into BigQuery

Ingestion time partitioning

Step 1 - Define the partition boundary - hourly, daily, weekly, monthly, yearly. These are all in UTC time.

Step 2 - When you create a table partitioned by ingestion time, BigQuery automatically assigns rows to partitions based on the time when BigQuery ingests the data.

An ingestion-time partitioned table has a pseudocolumn named _PARTITIONTIME.
The value of this column is the ingestion time for each row, truncated to the partition boundary (such as hourly or daily).

For example, suppose that you create an ingestion-time partitioned table with hourly partitioning and send data at the following times:

Ingestion time	`_PARTITIONTIME`	Partition (hourly)
2021-05-07 17:22:00	2021-05-07 17:00:00	`2021050717`
2021-05-07 17:40:00	2021-05-07 17:00:00	`2021050717`
2021-05-07 18:31:00	2021-05-07 18:00:00	`2021050718`

Because the table in this example uses hourly partitioning, the value of _PARTITIONTIME is truncated to an hour boundary. BigQuery uses this value to determine the correct partition for the data.

You can also write data to a specific partition. For example, you might want to load historical data or adjust for time zones. You can use any valid date between 0001-01-01 and 9999-12-31.

Instead of using _PARTITIONTIME, you can also use _PARTITIONDATE. The _PARTITIONDATE pseudocolumn contains the UTC date corresponding to the value in the _PARTITIONTIME pseudocolumn.

Summary

Partitioning bigquery tables can improve query performance.

BigQuery – Ingestion Time Partitioning

No Comments Yet

Leave a Reply Cancel reply