GA 360 to BigQuery
Overview - Why Link Google Analytics with BigQuery
- Sampling and Reporting API limitations - if you use the Analytics Reporting API, you are limited to 10,000 requests per view per day.
- No more need for sampling - the BigQuery export contains unsampled, hit-level data.
- Unlimited dimensions - you are not bound by the Reporting API's cap on dimensions and metrics per query.
- Create your own Channel Grouping (see the sketch after this list).
- Combine Google Analytics data with third-party data sources and visualization tools.
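To make the Channel Grouping point concrete, here is a minimal sketch of a custom grouping in standard SQL, run from R with bigQueryR. The project ID ('my-project'), dataset ID ('123456789' - in practice your view ID), and table date are placeholders, and the CASE rules are illustrative assumptions, not Google's default channel definitions.

library(bigQueryR)
bqr_auth()  ## OAuth2 flow with an email that can read the dataset

sql <- "
SELECT
  CASE
    WHEN trafficSource.medium = 'organic'       THEN 'Organic Search'
    WHEN trafficSource.medium IN ('cpc', 'ppc') THEN 'Paid Search'
    WHEN trafficSource.medium = 'referral'      THEN 'Referral'
    WHEN trafficSource.source = '(direct)'      THEN 'Direct'
    ELSE 'Other'
  END AS channel,
  COUNT(*) AS sessions  -- each row in a ga_sessions table is one session
FROM `my-project.123456789.ga_sessions_20230101`
GROUP BY channel
ORDER BY sessions DESC"

channels <- bqr_query(projectId = 'my-project', datasetId = '123456789',
                      query = sql, useLegacySql = FALSE)

Because the grouping is just a CASE expression, you can revise the rules and re-run it over historical data - something the built-in Channel Grouping does not allow.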
Linking Google Analytics to BigQuery and Exporting the Data
- Step 1 - You will need a project (preferably inside a GCP folder named something like Development or Analytics), with billing enabled and the BigQuery API enabled.
- Step 2 - With the above in place, go into Google Analytics Admin and link the property to the project containing BigQuery.
- Step 3 - Once linked, enable the BigQuery Export from Analytics Admin.
Sampling and Reporting - aka Querying the data via the API
If queries need to be run against the Reporting API (e.g., via PHP or Java code), you run into the quota noted above - 10,000 requests per view per day - and, on high-traffic views, sampled results. A sketch of such an API call from R follows.
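As an illustration of the API path (as opposed to the BigQuery export), a Reporting API call in R might look like this; googleAnalyticsR is just one client library, and the view ID and dates are placeholders.

library(googleAnalyticsR)
ga_auth()  ## browser-based OAuth2 flow

## one Reporting API request; counts against the 10,000/view/day quota
## and may return sampled data on high-traffic views
df <- google_analytics(
  viewId     = "123456789",                    ## placeholder view ID
  date_range = c("2023-01-01", "2023-01-31"),
  metrics    = "sessions",
  dimensions = c("date", "medium")
)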
What is the cost associated with linking BigQuery to GA 360?
Free Tier - the BigQuery sandbox provides up to 1 TB per month of query capacity and 10 GB of free storage. All tables and partitions have a 60-day retention policy, and some BigQuery features (DML, streaming, the Data Transfer Service) are not included in the sandbox.
Paid Tier - GA 360 pricing reportedly starts at a whopping $150,000 per year (though I could not get Google to confirm this).
Step 1 - Using a Service Account to authenticate to Google APIs
Service accounts are special Google accounts that can be used by applications to access Google APIs programmatically via OAuth 2.0. A service account uses an OAuth 2.0 flow that does not require human authorization. Instead, it uses a key file that only your application can access.
Note: Default application credentials are obtained by running the following command after installing the gcloud SDK on your workstation:
gcloud auth application-default login
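For unattended server-side code, the service-account flow in R might look like the sketch below (via googleAuthR, which bigQueryR builds on); the key file name is a placeholder for the JSON key you download from the Cloud Console.

library(googleAuthR)
library(bigQueryR)

## request only the BigQuery scope
options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/bigquery")

## authenticate with the downloaded key file - no human authorization step
gar_auth_service(json_file = "my-service-key.json")

bqr_list_projects()  ## verify the service account can reach BigQuery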
Step 3 - Linking Analytics to BigQuery and Exporting the data
Once the linkage is complete, data should start flowing into your BigQuery project within 24 hours. One file is exported each day containing the previous day's data (generally during the morning of the time zone you set for reporting), and three files are exported each day containing the current day's data. Google will also provide a historical export of the smaller of 10 billion hits or 13 months of data within 4 weeks after the integration is complete.
Each 'View' (which can be the analytics for a single website) maps to a single dataset in BigQuery, named after the view ID; each day's data lands in a table named ga_sessions_YYYYMMDD.
Here is a sample using R (Python would be similar):
library(bigQueryR)

## go through the Google OAuth2 flow;
## needs an email that has access to the BigQuery dataset
bqr_auth()

## get lists of your projects and datasets
bqr_list_projects()
bqr_list_datasets("project-id")

## list the daily export tables in the view's dataset
bqr_list_tables("project-id", "dataset-id")
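Once the dataset is visible, you can query the daily export tables directly. A minimal sketch, again with placeholder project and dataset IDs, that totals sessions per day over a week of tables:

sql <- "
SELECT date, SUM(totals.visits) AS sessions
FROM `my-project.123456789.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20230101' AND '20230107'
GROUP BY date
ORDER BY date"

## returns a data.frame with one row per day
daily <- bqr_query(projectId = 'my-project', datasetId = '123456789',
                   query = sql, useLegacySql = FALSE)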
Appendix - BigQuery IAM Roles
Basic roles for projects are granted or revoked through the Google Cloud Console. When a project is created, the Owner role is granted to the user who created the project.
When BigQuery receives a call from an identity (either a user, a group, or a service account) that is assigned a basic role, BigQuery interprets that basic role as a member of a special group. Special group membership grants the identity permissions that are assigned to a BigQuery IAM role. The following chart identifies the special group and role for each basic role:
Basic role | Assigned group | BigQuery IAM Role |
---|---|---|
roles/viewer | projectViewers | roles/bigquery.dataViewer |
roles/editor | projectEditors | roles/bigquery.dataEditor |
roles/owner | projectOwners | roles/bigquery.dataOwner |
Basic roles for BigQuery datasets
The following basic roles apply at the dataset level.
Role | Capabilities |
---|---|
READER | Can read, query, copy, and export tables in the dataset (equivalent to roles/bigquery.dataViewer) |
WRITER | Everything READER can do, plus edit or append data in the dataset (equivalent to roles/bigquery.dataEditor) |
OWNER | Everything WRITER can do, plus update and delete the dataset (equivalent to roles/bigquery.dataOwner) |

Note: A dataset must have at least one entity with the OWNER role.