An architectural guide to scaling BigQuery while slashing costs, featuring a 73% compute cost reduction via intermediate tiering and a 12x performance boost by rethinking CTEs.
The Data Processing Bottleneck
In many enterprise data warehouse environments, reporting tools are often pointed directly at complex,
multi-level hierarchical datasets involving immense JOIN operations across massive
dimension and fact tables. In one specific initiative, weekly refresh procedures were executing
split-logic aggregations that scanned more than 800 GB of data per run, taking up to 3 hours to complete and
generating massive compute bills.
Architecture 1: The Intermediate Processing Table
To curb query complexity and resource consumption, we decoupled the reporting layer from the raw datasets by introducing a centralized, partitioned and clustered intermediate processing table.
The Strategy:
- Partitioning: We partitioned the intermediate table by fiscal_year_week. This limited each scheduled refresh to scanning only the relevant week's partition, avoiding full table scans entirely.
- Clustering: Within those partitions, we clustered by the hierarchical reporting levels (e.g., category, sub-category, store). This enabled fast block pruning for downstream queries.
- Automated Maintenance: We configured partition expiration to automatically drop data older than 12 weeks, ensuring the table never bloated unnecessarily.
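The strategy above can be sketched as BigQuery DDL. Table and column names here are illustrative, not the production schema; note that partition expiration requires time-based partitioning, so this sketch partitions on a DATE column holding the fiscal week's start date (an integer-range partition on fiscal_year_week would prune just as well, but would need a scheduled job for expiry instead of partition_expiration_days):

```sql
-- Sketch only: hypothetical project, dataset, and column names.
CREATE TABLE `project.reporting.intermediate_weekly`
(
  fiscal_week_start DATE,      -- start date of the fiscal week
  category          STRING,
  sub_category      STRING,
  store             STRING,
  units_sold        INT64,
  net_revenue       NUMERIC
)
PARTITION BY fiscal_week_start              -- one partition per weekly refresh
CLUSTER BY category, sub_category, store    -- hierarchical reporting levels
OPTIONS (
  partition_expiration_days = 84            -- 12 weeks x 7 days; stale partitions drop automatically
);
```

Downstream queries that filter on fiscal_week_start touch a single partition, and filters on the clustered columns prune blocks within it.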
The Impact: Data scanned per run plummeted from 800+ GB to under 150 GB. The weekly pipeline duration dropped from almost 3 hours to just 46 minutes. Most importantly, the monthly query cost for this pipeline was reduced by 73%.
Architecture 2: Consolidating CTEs
Common Table Expressions (CTEs) are fantastic for code readability, but they can become silent performance killers if misunderstood. In a major retail analytics pipeline, developers had created multiple separate CTEs executing at the exact same granularity (e.g., product and store level) before eventually joining them together.
BigQuery inlines non-recursive CTEs at each point they are referenced, so a complex CTE referenced in several places can be recomputed each time. As a result, this pipeline was taking over 2 hours to execute.
The Fix: We re-engineered the SQL logic to consolidate these separate steps into a single, cohesive CTE that handled both product and store attributes simultaneously. This approach drastically reduced temporary data shuffling, eliminated unnecessary join operations, and simplified the query execution graph.
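The consolidation can be sketched as follows. The table and column names are hypothetical stand-ins for the retail pipeline's actual schema; the point is the shape of the rewrite, not the specific metrics:

```sql
-- Before (illustrative): two CTEs at the same product x store grain,
-- each scanning the fact table, then joined back together.
WITH product_units AS (
  SELECT product_id, store_id, SUM(units) AS units
  FROM `project.sales.fact_sales`
  GROUP BY product_id, store_id
),
product_revenue AS (
  SELECT product_id, store_id, SUM(revenue) AS revenue
  FROM `project.sales.fact_sales`
  GROUP BY product_id, store_id
)
SELECT u.product_id, u.store_id, u.units, r.revenue
FROM product_units u
JOIN product_revenue r USING (product_id, store_id);

-- After: one CTE computes every metric at that grain in a single pass,
-- eliminating the duplicate scan and the self-join.
WITH product_store AS (
  SELECT product_id, store_id,
         SUM(units)   AS units,
         SUM(revenue) AS revenue
  FROM `project.sales.fact_sales`
  GROUP BY product_id, store_id
)
SELECT * FROM product_store;
```

Whenever two CTEs group by the same keys over the same source, they are candidates for this merge.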
The Impact: This single architectural shift reduced the pipeline runtime from 2 hours down to just 10 minutes (a 12x speed increase).
Optimization Heuristics Map
From these deployments and dozens of others, I've developed a strict decision tree for data modeling in BigQuery:
- Materialized Views: Unbeatable for predictable, frequently repeated aggregations over data that changes hourly or daily. A poor fit for highly volatile data or logic that needs constructs materialized views don't support, such as window functions.
- Temporary Tables: For performance-critical, multi-step pipelines where a result set is referenced 3+ times downstream, explicitly writing to a temporary table is often vastly cheaper than re-evaluating a CTE at each reference.
- BI Engine: Best deployed as an acceleration layer for sub-second dashboard requirements over structured, optimized intermediate tables, rather than pointing it at raw telemetry.
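The temporary-table heuristic looks like this in a BigQuery multi-statement script (names are illustrative). The aggregation runs once; each downstream statement reads the materialized result instead of re-triggering the CTE:

```sql
-- Sketch: materialize a result set that is referenced multiple times downstream.
CREATE TEMP TABLE weekly_rollup AS
SELECT category, store_id, SUM(revenue) AS revenue
FROM `project.sales.fact_sales`
WHERE fiscal_week_start = DATE '2024-01-01'   -- illustrative partition filter
GROUP BY category, store_id;

-- Each of these reads the temp table; the heavy scan above happens only once.
SELECT * FROM weekly_rollup WHERE revenue > 10000;

SELECT store_id, COUNT(*) AS category_count
FROM weekly_rollup
GROUP BY store_id;
```

CREATE TEMP TABLE is only valid inside a session or multi-statement script; the temp table is billed for the storage it uses but disappears when the script ends.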