Albuquerque · 2026
Available  ·  EU remote

João Pedro Albuquerque

Data & Analytics Engineer

🌐 Brazil  ·  UTC−3  ·  EU business hours overlap

  • Python
  • SQL
  • dbt
  • Airflow
  • BigQuery
  • Kafka
  • GCP

I build production data infrastructure. Five live systems — from a factory lakehouse on BigQuery + dbt + Airflow to a multi-agent incident platform. Each is deployed, monitored, and documented.

[Figure: Factory Data Platform · System Architecture (ALBUQR · Drawing 01 · 2025–26). Blueprint-style diagram: batch + stream sources and ingest; [A] Lakehouse (BigQuery · dbt · Airflow); [B] Stream (Kafka · Redis · 28 streams · rule-based detection); [C] Platform API; outputs: Dashboards (D), Alerts (A), Reports (R). Numbered callouts [1] to [5] are captioned below.]
[1] TWO INGEST PATHS
Batch path: operational and financial data from the factory lands in BigQuery bronze tables. Stream path: events from 28 machine streams flow into Kafka topics in real time. Separate producers, separate contracts.
[2] LAKEHOUSE
BigQuery + dbt + Airflow modeling layer. Tested, freshness-monitored, scheduled daily at 6am.
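As an illustration of what "freshness-monitored" can mean in practice, here is a minimal sketch in plain Python. It assumes a daily 06:00 load (cron `0 6 * * *`) and a hypothetical 26-hour staleness threshold. The function name, table names, and threshold are illustrative assumptions, not the platform's actual configuration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: one daily 06:00 load plus a two-hour grace period.
MAX_AGE = timedelta(hours=26)


def stale_tables(last_loaded, now, max_age=MAX_AGE):
    """Return the names of tables whose latest load is older than max_age."""
    return sorted(
        name for name, loaded_at in last_loaded.items()
        if now - loaded_at > max_age
    )


# Illustrative timestamps only; the table names are made up for the example.
now = datetime(2026, 1, 10, 9, 0, tzinfo=timezone.utc)
last_loaded = {
    "bronze_orders": datetime(2026, 1, 10, 6, 5, tzinfo=timezone.utc),
    "bronze_invoices": datetime(2026, 1, 8, 6, 2, tzinfo=timezone.utc),
}
print(stale_tables(last_loaded, now))
```

Here only `bronze_invoices` is flagged, because its last load is more than two days old.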
[3] PLATFORM API · DEPENDENCY INVERSION
FastAPI unification layer. The dashboard never calls BigQuery or Redis directly — all reads route through this API. Swap either underlying service without touching the presentation layer.
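The dependency inversion described above can be sketched in plain Python (no FastAPI, so the example runs anywhere). All class and method names are hypothetical, and the in-memory classes are stand-ins for BigQuery and Redis, not real clients.

```python
from typing import Protocol


class ReadStore(Protocol):
    """Anything the API can read from. Hypothetical interface."""

    def read(self, key: str) -> dict: ...


class InMemoryWarehouse:
    """Stand-in for the batch store (BigQuery in the real system)."""

    def __init__(self, rows: dict):
        self._rows = rows

    def read(self, key: str) -> dict:
        return self._rows[key]


class InMemoryCache:
    """Stand-in for the low-latency store (Redis in the real system)."""

    def __init__(self, latest: dict):
        self._latest = latest

    def read(self, key: str) -> dict:
        return self._latest[key]


class PlatformAPI:
    """The only surface the dashboard talks to."""

    def __init__(self, history: ReadStore, live: ReadStore):
        self._history = history
        self._live = live

    def machine_view(self, machine_id: str) -> dict:
        # Merge both reads; the caller never learns which store served what.
        return {
            "machine_id": machine_id,
            "daily": self._history.read(machine_id),
            "live": self._live.read(machine_id),
        }


api = PlatformAPI(
    history=InMemoryWarehouse({"m01": {"units": 1200}}),
    live=InMemoryCache({"m01": {"temp_c": 71.5}}),
)
print(api.machine_view("m01"))
```

Because `PlatformAPI` depends only on the `ReadStore` shape, either store can be replaced by any object with a matching `read` method without changing the caller.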
[4] STREAM · RULE-BASED DETECTION
Kafka + Redis over 28 machine streams. Anomaly detection is deliberately rule-based: with zero labeled anomaly history, a supervised model is untrainable at this stage. Redis serves reads in <1ms; BigQuery's ~2s query latency kills the sub-30s SLA. p95 freshness under 6 minutes.
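A rule-based detector of this kind can be sketched in a few lines. The rule names, metrics, and limits below are hypothetical and chosen only for illustration. The real system's rules and its Kafka and Redis wiring are not shown.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    metric: str
    violated: Callable[[float], bool]


# Hypothetical rules and limits, for illustration only.
RULES = [
    Rule("overheat", "temp_c", lambda v: v > 90.0),
    Rule("low_pressure", "pressure_bar", lambda v: v < 1.5),
]


def detect(event: dict, rules=RULES) -> list:
    """Return the names of all rules the event violates."""
    return [
        rule.name
        for rule in rules
        if rule.metric in event and rule.violated(event[rule.metric])
    ]


print(detect({"machine_id": "m07", "temp_c": 95.2, "pressure_bar": 2.1}))
print(detect({"machine_id": "m07", "temp_c": 70.0, "pressure_bar": 1.2}))
```

Rules like these need no labeled history: each one is an explicit, reviewable threshold, and a metric missing from an event is skipped rather than treated as a violation.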
[5] OUTPUTS
Three consumer surfaces — Dashboards (D), Alerts (A), Reports (R). Each one is its own SLA contract.
02 / 05

CRISIS Platform 🥇 1st Place

Multi-agent incident response built with IBM watsonx Orchestrate. 1st Place, Hackathon IA Descomplicada (UNASP + IBM). Agents handle real routing decisions, not a demo workflow.

IBM watsonx · Python · Flask · Docker · VPS
03 / 05

Credit Risk Scoring System

XGBoost classifier predicting credit default risk with full SHAP explainability. AUC-ROC 0.908 across 1.5M rows. Three models compared; trained on synthetic Brazilian credit profile data.
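For readers less familiar with the metric: AUC-ROC is the probability that a randomly chosen defaulter is scored higher than a randomly chosen non-defaulter. The sketch below computes it by direct pairwise comparison on toy data. It is an illustration of the metric only, not the project's evaluation code or data.

```python
def auc_roc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. O(n^2), fine for a toy example.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels (1 = default) and scores, not the project's data.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.5, 0.8, 0.1]
print(auc_roc(labels, scores))
```

In this toy set, 8 of the 9 positive/negative pairs are ordered correctly, so the result is 8/9, about 0.89.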

XGBoost · SHAP · FastAPI · Streamlit · Docker · SQLite
04 / 05

Bling vs Octalink ERP Analysis

AI-assisted comparative ERP analysis for a real Brazilian manufacturer: 834 lines of Python, 6 analysis pages, 4 verified PDFs. The finding changed the procurement decision.

Python · Streamlit · ReportLab · Claude · VPS
05 / 05

Educathon B2B Pipeline

Production B2B automation at Instituto Educathon. Discovers prospects via the Google Maps API, qualifies them, and routes outreach through WhatsApp. First run: 86 businesses, 34% contact rate on day 1.

n8n · RapidAPI · Evolution API · Baserow · Easypanel

Notes I take while studying.

Build it like it has to last.

ADR Every decision is written down — future-me always forgets.
TESTS Gate merges. CI is the floor, not the ceiling.
OBSERVABILITY Before features. If I can't see it, I haven't shipped it.
README README last — written once the system is understood, not before.
SCOPE Data platforms · multi-agent systems · applied ML.
STANCE Engineering as engineering — not as a reporting bolt-on.

Mail me.

Email is fastest; I check it constantly. I work from Hortolândia, SP.

Brazil is UTC−3, so EU business hours run until early afternoon my time, and I'm at my desk well before they end. Best window: 10:00–13:00 BRT. Write then and I'll reply fast.

Recruiters: if you're sending a templated InMail, please at least say which company. I read everything, but I respond to the ones that read me first.

João Pedro Castro Albuquerque
Data & Analytics Engineer