I build reliable data systems for executive decision-making.

I design scalable data models, canonical metrics, and warehouse-first pipelines. I care about explicit grain, modular transformations, reliability, and trust. I also build practical interfaces and prototypes that sit on top of those systems.

dbt · BigQuery · Snowflake · Python · Streamlit · APIs · Semantic Modeling

Selected Data Systems

A few projects that show how I think about modeling, reliability, and applied analytics engineering.

Canonical Paid Media Mart

Unified disparate paid media performance data to deliver standardized, cross-platform metrics for the executive team.

dbt · BigQuery · Fivetran

Architecture Flow

Ingestion (Fivetran) → BigQuery → dbt (Staging > Int > Marts) → BI Layer

Data Model & Grain

  • Grain: Explicit day-platform-campaign
  • Fact Table: fct_daily_campaign_performance
  • Metrics: Blended ROAS, CPA, Spend, Conversions

Key Design Decisions

  • Isolated platform nuances (Meta vs. Google conversion logic) within the staging layer.
  • Enforced a universal schema in the intermediate layer using strict UNION ALL, ensuring downstream marts are platform-agnostic.
  • Implemented robust dbt tests on the composite key to guarantee grain integrity.
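
The staging → union → grain-test pattern above lives in dbt SQL; as a rough illustration, the same three ideas can be sketched in plain Python. All field and function names here (stage_meta, stage_google, spend/conversions columns) are hypothetical stand-ins, not the actual models.

```python
# Sketch of the staging -> union -> grain-test pattern in plain Python.
# Field names and platform nuances are illustrative; the real logic lives in dbt SQL.

UNIVERSAL_SCHEMA = {"date", "platform", "campaign_id", "spend", "conversions"}

def stage_meta(row):
    # Platform nuance isolated in staging: Meta reports 'purchases'.
    return {"date": row["day"], "platform": "meta",
            "campaign_id": row["campaign_id"],
            "spend": row["spend"], "conversions": row["purchases"]}

def stage_google(row):
    # Google names the date field differently and reports 'cost'.
    return {"date": row["segments_date"], "platform": "google",
            "campaign_id": row["campaign_id"],
            "spend": row["cost"], "conversions": row["conversions"]}

def union_all(*staged_tables):
    # Intermediate layer: enforce one universal schema across platforms,
    # so downstream marts stay platform-agnostic.
    rows = []
    for table in staged_tables:
        for row in table:
            assert set(row) == UNIVERSAL_SCHEMA, f"schema drift: {set(row)}"
            rows.append(row)
    return rows

def check_grain(rows, key=("date", "platform", "campaign_id")):
    # Equivalent of a dbt uniqueness test on the composite key.
    seen = set()
    for row in rows:
        k = tuple(row[c] for c in key)
        if k in seen:
            raise ValueError(f"grain violation: duplicate key {k}")
        seen.add(k)
    return True
```

In dbt itself, the last step would be a generic `unique` test (or `dbt_utils.unique_combination_of_columns`) on the composite key of `fct_daily_campaign_performance`.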

Why It Matters & Scaling

This canonical approach future-proofs the reporting layer. When marketing launches a new channel, only a new staging model and a union update are required; executive reporting remains untouched. Centralizing the logic in the warehouse establishes a single, trusted definition for every metric.

View Data Model

ProductDW — Mini ELT System

A modern ELT pipeline and dashboard built entirely in code, transforming raw GA4 event data into analytics-ready tables and interactive KPIs.

Python · BigQuery · Streamlit · SQL

Architecture Flow

Raw GA4 Data → Python Orchestration → BigQuery (Staging > Core > Marts) → Streamlit Dashboard

Data Model & Grain

  • Modeling Layers: Designed dimensional models via staging, core, and mart tables with clean joins and incremental logic.
  • Transformations: Governed using complex SQL (CTEs, window functions) stored as version-controlled code.
  • DataOps: Established a reproducible pipeline using virtual environments and Git.

Key Design Decisions

  • Built a warehouse-native ELT pipeline using Python to orchestrate parameterized BigQuery jobs.
  • Enforced an "everything-as-code" philosophy to eliminate UI-based configuration drift.
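
A minimal sketch of what "parameterized BigQuery jobs as code" can look like: an ordered, templated job list rendered per run. The table names, layer split, and SQL are hypothetical; in the real pipeline each statement would be submitted through the google-cloud-bigquery client rather than returned as strings.

```python
# Sketch of everything-as-code orchestration: render an ordered list of
# parameterized SQL jobs, one per warehouse layer. Table/column names are
# illustrative placeholders, not the actual project schema.
from string import Template

LAYERS = [
    ("staging", Template(
        "CREATE OR REPLACE TABLE ${dataset}.stg_events AS "
        "SELECT * FROM ${dataset}.raw_ga4_events WHERE event_date = '${run_date}'")),
    ("core", Template(
        "CREATE OR REPLACE TABLE ${dataset}.core_sessions AS "
        "SELECT user_pseudo_id, COUNT(*) AS events "
        "FROM ${dataset}.stg_events GROUP BY user_pseudo_id")),
    ("marts", Template(
        "CREATE OR REPLACE TABLE ${dataset}.mart_daily_kpis AS "
        "SELECT '${run_date}' AS date, COUNT(*) AS active_users "
        "FROM ${dataset}.core_sessions")),
]

def build_jobs(dataset, run_date):
    """Render the ordered, parameterized job list for one pipeline run."""
    return [(layer, tpl.substitute(dataset=dataset, run_date=run_date))
            for layer, tpl in LAYERS]
```

Because the job list is deterministic and version-controlled, every run is reproducible from Git alone, with no UI-configured state to drift.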

Why It Matters

This project demonstrates end-to-end analytics engineering: translating raw events into a trusted dimensional model entirely through code, and finishing with an interactive consumption layer where non-technical users can explore KPIs.

View Application Source

Live API Feed to BigQuery Ingestion

Automates ingestion of event data from the Camera Event API into Google BigQuery as a scheduled Cloud Run Job, providing set-and-forget synchronization.

Python · REST APIs · Cloud Run · BigQuery

Architecture Flow

Camera Event API → Python Serverless (Cloud Run) → BigQuery (Raw)

Data Model & Schema

  • Schema Mapping: Flattens nested Camera Event JSON into a tabular BigQuery schema, preserving variable payloads in JSON columns.
  • Pagination: Efficiently handles high-volume event data using offset-based pagination.
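
The offset-based pagination loop can be sketched as below. `fetch_page` is a hypothetical stand-in for the authenticated API call; the short-page check is what lets the loop stop without a separate total-count request.

```python
def fetch_all_events(fetch_page, page_size=500):
    """Drain an offset-paginated endpoint.

    `fetch_page(offset, limit)` is a stand-in for the real API request and
    should return a list of events (shorter than `limit` on the last page).
    """
    events, offset = [], 0
    while True:
        page = fetch_page(offset, page_size)
        events.extend(page)
        if len(page) < page_size:  # short (or empty) page => last page
            break
        offset += page_size
    return events
```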

Key Design Decisions

  • Idempotency: Uses BigQuery-based watermarking to resume from the last ingested event, preventing gaps and duplicates.
  • Automatic Auth: Automatically refreshes the API access token using client credentials for uninterrupted syncing.
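
The watermark-resume idea can be sketched as follows. `query_watermark`, `fetch_events`, and `load_rows` are hypothetical stand-ins for the real BigQuery watermark query, the API call, and the append-only load; the filter on the watermark is what makes re-runs safe.

```python
# Sketch of warehouse-based watermarking: resume from the last ingested
# event timestamp so re-runs create neither gaps nor duplicates.

def sync_incremental(query_watermark, fetch_events, load_rows):
    """One idempotent sync pass; returns the number of rows loaded."""
    watermark = query_watermark()  # e.g. SELECT MAX(event_ts) FROM raw.events
    new_events = [e for e in fetch_events(since=watermark)
                  if watermark is None or e["event_ts"] > watermark]
    if new_events:
        load_rows(new_events)      # append-only load into BigQuery
    return len(new_events)
```

Running the same pass twice loads nothing the second time, which is exactly the property a "set and forget" scheduled job needs.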

Why It Matters & Scaling

This project demonstrates practical systems engineering capabilities. By packaging the Python script into a Docker container and deploying it as a serverless Cloud Run job triggered by Cloud Scheduler, it produces an autonomous, scalable upstream pipeline feeding directly into the data warehouse.

Source unavailable (proprietary work product)

Applied Engineering

I don't just model data. I build lightweight systems that use it.

Analytics Interfaces

I build simple, targeted interfaces on top of warehouse logic, enabling self-serve access for non-technical users and reducing dependency on ad-hoc SQL pulls.

Streamlit · Tailwind

Internal Tools & Prototypes

From API endpoints to internal admin panels, I use modern engineering stacks to quickly prototype solutions for real workflows and internal business bottlenecks.

FastAPI · Postgres · Docker · Fly.io

Practical Product Thinking

When a workflow is repetitive, brittle, or blocked by tooling, I can architect the system around it—connecting data, logic, and UI into something usable and valuable.

GitHub Actions · APIs

Systems Thinking

Core principles for scalable analytics.

Explicit Grain

I define and defend explicit grain in every model. Ambiguity causes fan-out traps. Strict grain ensures downstream consistency.
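
A toy illustration of the fan-out trap, with hypothetical tables: joining day-grain spend to an ad-grain table silently multiplies every spend row, so aggregates double-count.

```python
# Fan-out trap: spend is at day x campaign grain, but the join target is at
# ad grain (two ads per campaign), so each spend row appears twice.

spend = [  # grain: day x campaign
    {"day": "2024-01-01", "campaign": "c1", "spend": 100.0},
]
ads = [    # grain: ad
    {"campaign": "c1", "ad": "a1"},
    {"campaign": "c1", "ad": "a2"},
]

joined = [{**s, **a} for s in spend for a in ads if s["campaign"] == a["campaign"]]
total = sum(r["spend"] for r in joined)  # 200.0, not the true 100.0
```

Declaring the grain of both sides up front, and testing it, is what prevents this class of silent inflation.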

Modular Transformations

Staging, intermediate, and mart layers are strictly separated. Logic is applied once and referenced modularly. DRY over copy-paste.

Warehouse-First Logic

I leverage the raw compute of modern data warehouses to centralize metrics logic, keeping reporting and application layers functionally lightweight.

Canonical Metrics

Different BI tools shouldn't yield different answers. I engineer single sources of truth so leadership can trust the numbers they see.

Stakeholder Empathy

Technical execution means nothing if it doesn't map to the mental model of the executive team. I optimize architectures for business comprehension.

Reliability and Trust

Data is a product. Tests, lineage, and append-only logic are not optional—they are the prerequisite for organizational trust.


Let's build reliable systems.

I am looking to join a high-leverage team as a Senior Analytics Engineer.