Leading product development for the OpenEPA emissions data platform and two specialized microservices that transform how energy companies and environmental researchers access, analyze, and trust industrial data, from AI-powered client intelligence to transparent OpenEPA analytics.
One enterprise platform and two cross-cutting microservices that serve multiple domains: EPA emissions data for researchers, energy sector intelligence for AEs, and custom calculations for analysts across industries.
EPA GHGRP emissions data platform (2010-2024) with AI Q&A, benchmarking views, and verifiable certificates for researchers and journalists.
2,800+ facilities • 15 years of data • 99.9% uptime
AI-powered client onboarding and chat intelligence for the energy sector. Vue 3 + TypeScript frontend with GPT-4 integration.
2+ hours to 5 min prep • $85M pipeline • 200+ hrs/mo saved
Universal calculation engine with formula transparency, methodology docs, and audit trails for verifiable analytics.
Academic-grade reproducibility • Cross-platform
Conducted multi-phase user research across energy sector AEs, environmental researchers, and sustainability analysts. Used shadowing sessions, workflow analysis, and journey mapping to identify pain points and automation opportunities.
One-on-one interviews with AEs, researchers, and journalists to understand current workflows, tools, and pain points. These sessions surfaced 2+ hours of manual prep time per client for AEs and black-box calculations for researchers.
Observed real workflows in action: AE client research across scattered sources, PhD researchers building emissions models in Excel with zero reproducibility, journalists unable to cite data that lacked clear attribution.
Mapped end-to-end user journeys for each persona. Identified automation opportunities: AI-powered onboarding (95% time reduction), transparent calculations (academic reproducibility), data provenance (journalistic citations).
After enabling industrial digital twins through our Knowledge Graph technology in Immutably™, we realized the same capabilities could power far broader insights. This sparked a new direction: extending our platform to open up EPA emissions data, and building the microservices to power it.
After building Immutably™'s Knowledge Graph technology to create industrial digital twins, we had a powerful insight infrastructure. Energy clients were using it to model entire facilities, but the KG capabilities (ontology mapping, sub-second queries, data provenance) could do so much more.
We spotted the opportunity: EPA's GHGRP emissions data was public but difficult to use. Researchers spent weeks wrangling Excel files, journalists couldn't cite sources, and custom calculations had zero reproducibility. Our KG technology could solve all three problems at once.
I led user research with environmental researchers, journalists, and sustainability analysts. The pain points were clear: 2+ hours of manual prep work per analysis, black-box calculations, and no way to verify results. We had the technology—now we needed the product.
I built the OpenEPA POC using Claude to prototype the Q&A interface, connecting it directly to our existing KAG (Knowledge-Augmented Generation) infrastructure. The pilot validated our hypothesis: researchers could ask natural language questions and get cited, verifiable answers from 15 years of EPA data.
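For flavor, here is a minimal sketch of that POC loop: retrieve facts from the Knowledge Graph, then ask Claude to answer only from those facts with EPA citations. The `retrieveFacts` helper, the record shape, and the prompt are illustrative assumptions, not the production code.

```ts
// Minimal sketch of the POC Q&A loop. The KAG retrieval helper, the
// EmissionsFact shape, and the prompt format are illustrative assumptions.
import Anthropic from "@anthropic-ai/sdk";

interface EmissionsFact {
  facilityName: string;
  year: number;
  co2eMetricTons: number;
  epaSourceUrl: string; // provenance carried through to the answer
}

// Hypothetical retrieval over the Knowledge Graph; stubbed for the sketch.
declare function retrieveFacts(question: string): Promise<EmissionsFact[]>;

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

export async function answerWithCitations(question: string): Promise<string> {
  const facts = await retrieveFacts(question);

  // Ground the model in retrieved facts and require EPA citations.
  const context = facts
    .map((f) => `${f.facilityName} (${f.year}): ${f.co2eMetricTons} tCO2e [${f.epaSourceUrl}]`)
    .join("\n");

  const response = await anthropic.messages.create({
    model: "claude-3-5-sonnet-latest", // illustrative model alias
    max_tokens: 1024,
    messages: [
      {
        role: "user",
        content:
          `Answer using ONLY the EPA GHGRP facts below and cite the source URL for every figure.\n\n` +
          `Facts:\n${context}\n\nQuestion: ${question}`,
      },
    ],
  });

  return response.content
    .filter((block) => block.type === "text")
    .map((block) => ("text" in block ? block.text : ""))
    .join("");
}
```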
In parallel, we built Context AI, a microservice for energy sector client intelligence. Using Vue 3 + TypeScript, I created an AI-powered onboarding flow that extracted company context in 60 seconds instead of 2+ hours. The frontend was built with mock services to unblock development while the backend caught up, saving 3 weeks.
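A minimal sketch of the mock-service pattern that unblocked the frontend; the `ClientIntelligenceService` interface and sample payload are hypothetical stand-ins for the real API contract.

```ts
// Sketch of the mock-service pattern used to develop the Vue 3 frontend before
// the backend existed. The interface and sample payload are illustrative.
export interface ClientProfile {
  companyName: string;
  sector: string;
  keyFacilities: string[];
  talkingPoints: string[];
}

export interface ClientIntelligenceService {
  getOnboardingProfile(companyName: string): Promise<ClientProfile>;
}

// Mock implementation with realistic latency and data, so UI states
// (loading, success, empty) could be built and demoed without the backend.
export class MockClientIntelligenceService implements ClientIntelligenceService {
  async getOnboardingProfile(companyName: string): Promise<ClientProfile> {
    await new Promise((resolve) => setTimeout(resolve, 800)); // simulate network latency
    return {
      companyName,
      sector: "Oil & Gas",
      keyFacilities: ["Gulf Coast Refinery", "Permian Basin Gas Plant"],
      talkingPoints: ["Emissions intensity trending down year over year", "Flaring reduction program"],
    };
  }
}

// The real HTTP-backed implementation later dropped in behind the same
// interface, so components never changed; only the service binding did.
```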
Impact: Context AI generated $85M in attributed pipeline for our AE team. OpenEPA pilot attracted 100+ early users from research institutions and NGOs.
With the POC validated, we built the full OpenEPA platform infrastructure. I designed the EPA data ingestion pipeline with version tracking where every emissions report is timestamped with its EPA source URL and release date. This wasn't just about loading data; it was about creating an audit trail that researchers and journalists could trust.
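Roughly, each ingested report carried a provenance record along these lines; the field names are illustrative, not the production schema.

```ts
// Illustrative shape of the versioned, provenance-carrying record created for
// every ingested EPA GHGRP report (field names are assumptions).
interface VersionedEmissionsReport {
  facilityId: string;           // EPA GHGRP facility identifier
  reportingYear: number;        // e.g. 2023
  co2eMetricTons: number;
  epaSourceUrl: string;         // exact EPA file or API endpoint the value came from
  epaReleaseDate: string;       // ISO date of the EPA data release
  ingestedAt: string;           // ISO timestamp of our pipeline run
  pipelineVersion: string;      // lets us roll back or re-derive any number
  previousVersionId?: string;   // link to the superseded record, if any
}

// Delta sync: only re-ingest rows whose EPA release date is newer than what we
// already hold; never overwrite, append a new version instead.
function shouldIngest(incomingReleaseDate: string, storedReleaseDate?: string): boolean {
  return !storedReleaseDate || new Date(incomingReleaseDate) > new Date(storedReleaseDate);
}
```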
We integrated ArcadeDB as our Knowledge Graph store, leveraging the same ontology-mapping capabilities we'd built for Immutably™. The result: sub-second queries across 2,800+ facilities and 15 years of emissions history. Researchers could compare facility emissions trends, identify geospatial hotspots, and benchmark Top/Bottom 5 emitters, all with full data provenance showing exactly where each number came from.
Tech Stack: ArcadeDB for Knowledge Graph storage, custom ontology for emissions data modeling, EPA GHGRP API integration with delta sync, version-controlled data pipeline with rollback capabilities. Deployed on cloud infrastructure with 99.9% availability targets.
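As an illustration of the benchmarking queries this enabled, here is a hedged sketch of a Top 5 emitters lookup via ArcadeDB's HTTP command API; the type and property names are assumptions, and the endpoint and auth details should be checked against the ArcadeDB docs for your deployment.

```ts
// Sketch of a Top 5 emitters benchmarking query over the Knowledge Graph store.
// Type/property names ("EmissionsReport", "co2eMetricTons") and the HTTP
// command endpoint are assumptions for illustration only.
const ARCADE_URL = "http://localhost:2480/api/v1/command/openepa"; // assumed dev endpoint

export async function topEmitters(state: string, year: number) {
  const command = `
    SELECT facilityName, co2eMetricTons, epaSourceUrl, epaReleaseDate
    FROM EmissionsReport
    WHERE state = :state AND reportingYear = :year
    ORDER BY co2eMetricTons DESC
    LIMIT 5`;

  const res = await fetch(ARCADE_URL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // dev-only Basic auth; production used managed credentials
      Authorization: "Basic " + Buffer.from(`root:${process.env.ARCADE_PASSWORD}`).toString("base64"),
    },
    body: JSON.stringify({ language: "sql", command, params: { state, year } }),
  });

  const { result } = await res.json();
  return result; // every row carries its EPA source URL and release date for provenance
}
```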
With the EPA data pipeline running, we deployed Context AI on Azure to power the intelligence layer. The challenge was connecting AI models to our Knowledge Graph so they could answer complex emissions queries with full citations. We used Azure OpenAI Service with GPT-4, building a custom integration layer that translated natural language questions into KG queries and synthesized responses with EPA source attribution.
The Context AI microservice became the bridge between our KAG infrastructure and both OpenEPA (for emissions Q&A) and the energy sector AE dashboard. For OpenEPA, researchers could ask "Which Texas refineries increased emissions most from 2020-2023?" and get cited answers with EPA report URLs. For the AE dashboard, Context AI analyzed client KG data to generate company intelligence in under 60 seconds.
Deployment Details: Azure OpenAI Service (GPT-4 model), Azure Kubernetes Service for microservice orchestration, Redis for session caching, custom KAG integration layer for query translation. P95 latency: 2.1s for complex emissions queries. Scaled to support 100+ concurrent users with auto-scaling policies.
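A simplified sketch of the synthesis step in that integration layer, calling an Azure OpenAI chat deployment over REST; the endpoint, deployment name, and api-version are placeholders, and the KG rows come from a query like the one sketched above.

```ts
// Sketch of the KAG integration layer's synthesis step: GPT-4 on Azure OpenAI
// answers from Knowledge Graph results, always with EPA source attribution.
// Endpoint, deployment name, and api-version are placeholders.
const AZURE_ENDPOINT = process.env.AZURE_OPENAI_ENDPOINT!; // e.g. https://<resource>.openai.azure.com
const DEPLOYMENT = "gpt-4";                                 // assumed deployment name
const API_VERSION = "2024-02-01";                           // adjust to your service version

interface KgRow { facility: string; co2e: number; source: string }

export async function answerEmissionsQuestion(question: string, rows: KgRow[]): Promise<string> {
  const context = rows.map((r) => `${r.facility}: ${r.co2e} tCO2e (source: ${r.source})`).join("\n");

  const res = await fetch(
    `${AZURE_ENDPOINT}/openai/deployments/${DEPLOYMENT}/chat/completions?api-version=${API_VERSION}`,
    {
      method: "POST",
      headers: { "Content-Type": "application/json", "api-key": process.env.AZURE_OPENAI_API_KEY! },
      body: JSON.stringify({
        temperature: 0,
        messages: [
          { role: "system", content: "Answer strictly from the supplied EPA GHGRP rows and cite each source URL." },
          { role: "user", content: `Rows:\n${context}\n\nQuestion: ${question}` },
        ],
      }),
    }
  );

  const data = await res.json();
  return data.choices[0].message.content; // cited answer returned to OpenEPA or the AE dashboard
}
```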
Automated data access is powerful, but we found an even bigger opportunity: letting researchers build their own analyses with their own methodologies. I designed the Calculation Editor (Abacus) to let users build custom emissions intensity calculations (per MWh, per ton of product, per capita) with full formula transparency and methodology documentation.
The key was our custom ontology. Drawing on industrial best practices and client methodologies from energy companies, we designed an ontology that mapped EPA emissions data to production metrics, financial data, and operational parameters. Researchers could now normalize emissions by facility output and compare apples to apples, something EPA's raw data didn't support out of the box.
Technical Implementation: Custom ontology with 120+ industrial metrics, formula parser with validation and unit conversion, methodology documentation system with versioning, integration with OpenEPA benchmarking views and Context AI Q&A. Built with TypeScript, deployed as microservice with REST API. Researchers can export verifiable calculation certificates showing formula, methodology, and EPA source data, critical for academic reproducibility.
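To make the certificate idea concrete, here is an illustrative sketch of an exported calculation certificate for an emissions intensity formula; the field names and version labels are assumptions, not the Abacus schema.

```ts
// Illustrative shape of an exportable calculation certificate and a tiny
// intensity calculation; field names and formula format are assumptions.
interface CalculationCertificate {
  formula: string;                    // exact expression the user defined
  methodologyVersion: string;         // versioned methodology documentation
  inputs: Record<string, { value: number; unit: string; epaSourceUrl?: string }>;
  result: { value: number; unit: string };
  generatedAt: string;                // ISO timestamp for the audit trail
}

// Example: emissions intensity per MWh, with every input carrying its unit and
// (where applicable) its EPA provenance.
function emissionsIntensityCertificate(co2eTons: number, sourceUrl: string, mwh: number): CalculationCertificate {
  return {
    formula: "co2e_metric_tons / net_generation_mwh",
    methodologyVersion: "intensity-per-mwh v1.2", // hypothetical version label
    inputs: {
      co2e_metric_tons: { value: co2eTons, unit: "tCO2e", epaSourceUrl: sourceUrl },
      net_generation_mwh: { value: mwh, unit: "MWh" },
    },
    result: { value: co2eTons / mwh, unit: "tCO2e/MWh" },
    generatedAt: new Date().toISOString(),
  };
}
```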
As we prepared for OpenEPA's public launch, we formalized our team structure. We established an R&D office in Amsterdam with a small, focused team tracking work through a Kanban board and running bi-weekly retros. This wasn't just about process; it was about culture. We transitioned from an Engineering-led approach (build features, ship code) to a Product-led one (solve user problems, measure impact).
This transition actually started earlier with my work on the Immutably™ platform, where I shifted the company from feature-driven development to user-research-driven product strategy. With OpenEPA, Context AI, and Calculation Editor, we applied those same lessons: validate with users, ship MVPs fast, iterate based on feedback, and measure everything.
Launch Metrics: OpenEPA deployed with 99.9% availability SLA, P95 query latency at 2.1s (beating our 2.5s target), 100+ concurrent users supported, <24 hour data freshness from EPA. Context AI generated $85M in attributed pipeline. Calculation Editor enabled reproducible research with verifiable certificates for academic publications.
The result: three platforms—OpenEPA, Context AI, and Calculation Editor—working together to democratize access to industrial data and power data-driven decisions across energy, sustainability, and compliance sectors.
Real-time dashboard showing AI automation impact across revenue generation, cost savings, and delivery acceleration
*Data based on internal Context Labs metrics (Q1 2025 - Q1 2026). Pipeline influence from AE attribution surveys (n=12 AEs). Cost savings vs baseline without AI automation.
Introduced agile product practices to Context Labs: sprint planning, iterative delivery, user feedback loops, and cross-functional collaboration. Transformed from ad-hoc feature development to systematic product delivery.
Before: 12+ weeks average from ideation to production
After: 6 to 8 weeks for MVP, iterative enhancements every 2 weeks
Before: 65% feature adoption rate (internal estimate)
After: 87% adoption for research-validated features
Before: 30% rework due to unclear requirements
After: 8% rework with written PRDs and acceptance criteria
AEs said "we need better client data" but shadowing revealed they spent 2+ hours manually researching companies. Built AI onboarding that extracts context in 60 seconds—solving the real workflow problem, not the stated feature request. Observation beats self-reported needs every time.
Set OpenEPA performance targets first (99.9% availability, P95 < 2.5s), then chose ArcadeDB to hit them. For Context AI, picked Pinia over Vuex for faster TypeScript state updates. Never choose technology first—let requirements guide technical decisions. Start with user needs, end with architecture.
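As a small illustration of that Pinia choice, a typed store for the onboarding flow might look like this sketch; the store name, state shape, and service call are illustrative.

```ts
// Minimal sketch of the typed Pinia store pattern used in the onboarding flow;
// store name, state shape, and service are illustrative assumptions.
import { defineStore } from "pinia";

interface OnboardingState {
  companyName: string;
  status: "idle" | "loading" | "ready" | "error";
  talkingPoints: string[];
}

// Hypothetical service (mock or real) behind the interface sketched earlier.
declare const service: {
  getOnboardingProfile(name: string): Promise<{ companyName: string; talkingPoints: string[] }>;
};

export const useOnboardingStore = defineStore("onboarding", {
  state: (): OnboardingState => ({ companyName: "", status: "idle", talkingPoints: [] }),
  actions: {
    async loadProfile(companyName: string) {
      this.status = "loading";
      try {
        const profile = await service.getOnboardingProfile(companyName);
        this.companyName = profile.companyName;
        this.talkingPoints = profile.talkingPoints;
        this.status = "ready";
      } catch {
        this.status = "error";
      }
    },
  },
});
```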
Built Context AI with mock services for 4 energy clients before backend APIs existed. Enabled parallel frontend development, saving 3+ weeks of waiting time. Shipped onboarding MVP in 6 weeks total instead of 9+ weeks serial development. Unblock your team with realistic test data and stubs.
Researchers won't use black-box calculations. Every OpenEPA query cites EPA source, version, timestamp. Calculation Editor displays exact formulas and methodology. For industrial data, users need verifiable provenance. Transparency isn't a feature; it's table stakes for trust.
I turn ambiguous problems into shipped products through user research, Azure AI deployments, and agile transformation. If you're building industrial data platforms, energy infrastructure systems, or AI-powered analytics, let's connect.