Alex Peczon

Hey, I'm Alex

Data software engineer. I like turning messy APIs, logs, reviews, surveys, and spreadsheets into pipelines people can actually use.

Favorite language: Python

Where I moved data around for real

Work that sits somewhere between data engineering, applied ML, and product automation.

Future Tilt

Software Engineer
Aug 2025 - Present
San Francisco
  • Built ETL workflows with Airbyte, BigQuery, and AWS Lambda for analytics workloads supporting 50+ ecommerce clients and 20M+ daily queries.
  • Developed an AI email generation platform that turns campaign calendar data into editable marketing emails while pulling assets from Google Drive.
Airbyte BigQuery AWS Lambda Klaviyo API Google Sheets Trello API
Future Tilt: actionable data, high-impact campaigns, and measurable ROI for DTC brands.

Superlinked (Series A)

SIE Demo Software Developer (Contract)
Mar 2026 - May 2026
San Francisco Bay Area · Hybrid
  • Developed a paid launch demo for Superlinked's SIE engine during early access, working with Valentin Marek and Eric Taylor to showcase explainable wine recommendations.
  • Built a RAG wine recommender that used Vivino data, small inference models, OCR, and text embeddings to surface similar wines with clearer reasoning.
  • Shipped a React UI and containerized Python monorepo in Docker, keeping OCR and embedding modules cleanly separated for documentation users.
Superlinked SIE React Python Docker RAG Small Models

USF MAGIC Lab

NLP Research Assistant
Mar 2025 - May 2026
San Francisco
  • Built parallelized ETL pipelines across 5 virtual machines to generate entity-aware NLP datasets from 20,000+ news articles.
  • Developed an explainable entity-based sentiment framework that preserved relationship paths instead of flattening everything into a mystery score.
  • Used graph-based relationship extraction to connect articles, entities, sentiment, and evidence into datasets researchers could inspect.
Python Go NLP ETL Graph Extraction DuckDB

Alaris Security (Pre-seed)

Junior Fullstack Engineer
Aug 2025 - Nov 2025
San Francisco
  • Designed orchestration workflows with Prefect and Airflow to normalize CrowdStrike, Elastic, and Microsoft Defender telemetry for SOC analytics.
  • Built an agentic cybersecurity analysis system that searched across millions of security records and generated customer-specific NIS2 compliance reports.
  • Created customer-facing report tooling that linked AI-generated findings back to supporting telemetry, because security summaries are not useful if nobody can trace them.
Prefect Airflow CrowdStrike Elastic Microsoft Defender SOC Analytics

Future Tilt

Software Engineering Intern
Jun 2025 - Aug 2025
San Francisco
  • Built a Lambda campaign orchestration service that syncs Google Sheets planning calendars with Klaviyo campaigns and Trello production tasks, cutting setup time by 50%.
  • Worked across Google Sheets, Klaviyo, Trello, and AWS Lambda to turn campaign planning data into production-ready automation.
AWS Lambda Google Sheets Klaviyo API Trello API Automation

Candle Stories

Production Assistant
Apr 2025 - Aug 2025
San Francisco
  • Supported documentary shoots, equipment handling, and on-set logistics. Less data pipeline, more real-world pipeline.
Production Logistics

USF Strategic Enrollment Management

Data Analyst / Web Intern
Jul 2024 - Jul 2025
San Francisco
  • Analyzed 500,000+ student records from SLATE, turning raw SQL exports into datasets, dashboards, and PCA models for admissions strategy.
  • Evaluated student interactions across admissions events and geographic regions to surface recruitment and marketing insights.
  • Automated recurring web and reporting updates with Python and Jinja2 so the data work did not become manual copy-paste theater.
SQL Pandas PCA SLATE Jinja2 Admissions Analytics

iD Tech Camps (Stanford)

Machine Learning Instructor
Jun 2024 - Aug 2024
Stanford, California
  • Taught high school students Python, neural networks, NumPy, Pandas, Keras, and practical AI workflows through project-based lessons.
  • Rebuilt check-in/out analysis with Seaborn heatmaps to understand traffic flow and improve camp operations.
Python PyTorch Keras NumPy Pandas Seaborn

UC Merced - SATAL

Data Analyst Intern
Aug 2023 - May 2024
Merced
  • Analyzed thousands of Qualtrics survey responses and focus-group notes from 500+ students to identify drivers of engagement and academic performance.
  • Used Pandas and OpenAI-assisted categorization to turn open-ended feedback into structured themes faculty could act on.
  • Presented research on methodology at the Fresno State Exemplary Practices in Higher Education Conference.
Pandas Qualtrics OpenAI Survey Analysis Research Methods

Acme Builders Incorporated

Construction Worker → Accounting Assistant
May 2021 - Dec 2023
Oakland · On-site · Part-time
  • Built internal data systems in Python with NumPy and Pandas to clean, organize, and standardize records across departments.
  • Updated, organized, and archived company documents to support payroll cycles, budgeting, and reliable business data management.
  • Used OCR workflows to reduce manual document sorting and make scanned account records easier to organize.
Python Pandas NumPy OCR Business Data Accounting Construction

Projects

These are mostly passion projects that I made with friends.

nextsteamgame.com

I built this because most game recommenders stop at "pick a game you like" and then hand you a mystery list. I wanted something that lets people say what actually matters: Persona 5 for the jazz fusion OST and modern Tokyo setting, not just because it is an RPG. The goal is to help niche games surface for the right reasons, and to show users why a match made sense instead of hiding the recommendation behind a black box. It is free, open source, and very much a passion project; the implementation details live in the repo for anyone who wants the deeper data-flow tour.

Long-term PostgreSQL ChromaDB Qdrant ModernBERT FastAPI Docker

Superlinked Wine Recommender

An explainable wine recommender developed with the Superlinked team during early access to their SIE engine. Instead of only returning similar bottles, it shows which attributes shaped the match, like fizz, cherry notes, body, or acidity. The demo uses document processing, vector embeddings, and small-model inference to make wine discovery feel more transparent.

Long-term Superlinked SIE Vector Search OCR Small Models Chroma PostgreSQL

Maldemic Simulator

A stochastic pandemic simulator that models disease spread across city coordinates using Markov-chain mobility and SIR dynamics. Python computes the population state transitions, then Godot renders the spread on a 3D globe so people can see the model instead of just reading equations. It earned 2nd place at BLOOM Hackathon and later received grant support for neural-network forecasting work.
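The state-transition step described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the function names `move` and `simulate`, the rates `beta` and `gamma`, and the two-city mobility matrix in the usage note are all assumptions made for the example. Each step first redistributes people between cities with a Markov transition matrix, then applies stochastic SIR infection and recovery inside each city.

```python
import numpy as np


def move(rng, counts, mobility):
    """Redistribute people between cities using one multinomial draw per origin city."""
    moved = np.zeros_like(counts)
    for city, count in enumerate(counts):
        # Row `city` of the matrix holds the probabilities of ending up in each city.
        moved += rng.multinomial(count, mobility[city])
    return moved


def simulate(mobility, s0, i0, beta=0.3, gamma=0.1, steps=50, seed=0):
    """Stochastic SIR across cities with Markov-chain mobility (hypothetical sketch).

    mobility[a, b] is the assumed probability that a person in city a
    is in city b at the next step; every row must sum to 1.
    """
    rng = np.random.default_rng(seed)
    mobility = np.asarray(mobility, dtype=float)
    s = np.asarray(s0, dtype=np.int64).copy()
    i = np.asarray(i0, dtype=np.int64).copy()
    r = np.zeros_like(s)
    history = [(s.copy(), i.copy(), r.copy())]

    for _ in range(steps):
        # 1. Mobility: every compartment moves independently between cities.
        s = move(rng, s, mobility)
        i = move(rng, i, mobility)
        r = move(rng, r, mobility)

        # 2. Infection: each susceptible is infected with a probability that
        #    grows with the infected fraction of their current city.
        n = s + i + r
        infected_fraction = np.divide(i, n, out=np.zeros(len(n)), where=n > 0)
        new_infections = rng.binomial(s, 1.0 - np.exp(-beta * infected_fraction))

        # 3. Recovery: each infected person recovers with a fixed probability.
        new_recoveries = rng.binomial(i, 1.0 - np.exp(-gamma))

        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
        history.append((s.copy(), i.copy(), r.copy()))

    return history
```

Calling `simulate([[0.9, 0.1], [0.2, 0.8]], s0=[990, 500], i0=[10, 0])` returns one `(S, I, R)` snapshot per step. A renderer, such as the Godot globe, would consume snapshots like these.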

Long-term Python NumPy SciPy Godot Markov Chains SIR

Next Chapter

A hackathon project built to help people make sense of retirement planning questions like "Can I retire in Asia?" or "How much should I start saving?" The goal was not to automate away accountants, but to make the first pass through retirement variables faster, clearer, and less intimidating. We built a modular RAG system in about five hours with no WiFi, with finance guidance from Richard Vo at AMEX and teammates Eric Taylor and Faadil Shaik.

RAG LLMs FinTech Personal Finance AI for Good

USF Search Engine Crawler

A high-throughput crawler built around data movement more than page scraping. 300 extract workers download and parse pages while 300 database workers batch writes into SQLite, with queue buffers keeping the system fed without falling over. It was a fun way to think through throughput, batching, worker coordination, and what happens when your pipeline moves faster than your database wants it to.
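The worker layout described above can be sketched in a few dozen lines. The original project is written in Go. This Python version is only an illustration of the same shape, using threads, much smaller worker counts, and a caller-supplied `fetch` function in place of real downloading and parsing. The names `crawl`, `extract_worker`, `db_worker`, and the `pages` table are assumptions for the example.

```python
import queue
import sqlite3
import threading

STOP = None  # sentinel telling a worker to shut down


def extract_worker(urls, rows, fetch):
    """Pull URLs, fetch and parse them, and hand the results to the writers."""
    while True:
        url = urls.get()
        if url is STOP:
            break
        # put() blocks while the buffer is full, which applies backpressure
        # whenever extraction outruns the database.
        rows.put((url, fetch(url)))


def db_worker(rows, conn, lock, batch_size):
    """Collect rows into batches and write each batch in a single transaction."""
    batch = []

    def flush():
        if batch:
            with lock:  # one shared connection, so writes are serialized
                conn.executemany(
                    "INSERT OR REPLACE INTO pages (url, body) VALUES (?, ?)", batch
                )
                conn.commit()
            batch.clear()

    while True:
        row = rows.get()
        if row is STOP:
            break
        batch.append(row)
        if len(batch) >= batch_size:
            flush()
    flush()  # write whatever is left over


def crawl(seed_urls, fetch, conn, n_extract=4, n_db=2, batch_size=50, buffer_size=200):
    """Run extract workers and database workers connected by a bounded queue."""
    conn.execute("CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, body TEXT)")
    urls = queue.Queue()
    rows = queue.Queue(maxsize=buffer_size)
    lock = threading.Lock()

    extractors = [
        threading.Thread(target=extract_worker, args=(urls, rows, fetch))
        for _ in range(n_extract)
    ]
    writers = [
        threading.Thread(target=db_worker, args=(rows, conn, lock, batch_size))
        for _ in range(n_db)
    ]
    for thread in extractors + writers:
        thread.start()

    for url in seed_urls:
        urls.put(url)
    for _ in extractors:
        urls.put(STOP)
    for thread in extractors:
        thread.join()

    # Writers are stopped only after every extractor has finished producing.
    for _ in writers:
        rows.put(STOP)
    for thread in writers:
        thread.join()
```

The connection has to be opened with `sqlite3.connect(path, check_same_thread=False)` so that writer threads can share it. The sketch leaves out link discovery, retries, and error handling, so it shows only the queue-and-batch data flow.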

Long-term Go SQLite Concurrency Batch Writes Queues

Antidote Intelligence

An open-source ML security tool for finding suspicious or poisoned content in training datasets before that data quietly becomes model behavior. The system uses a multi-agent analysis pipeline to inspect dataset content, generate hypotheses, and surface examples worth investigating. I like this project because it sits right at the point where data quality, automation, and model safety stop being abstract and start being operational.

Long-term Python OpenAI ML Security Data Quality Agent Pipeline

Dreamville

A gamified Canvas LMS tracker, still in progress, that pulls assignments into a game loop and scores urgency using completion patterns and difficulty signals. The fun part was not the coins; it was turning school workflow data into something students could act on without needing another dashboard yelling at them.

Long-term Godot Go Canvas API Regression Workflow Data

Hyper Rosen

A hackathon-built 3D Godot experiment in procedural space generation. We used swirled Perlin noise for planet placement so the world can keep expanding with an infinite-feeling galaxy structure, wave function collapse for city placement, and procedural systems for enemies and asteroids. It was not continued after the event, but I keep it here because it shows my interest in systems that evolve from local rules.

Godot Hackathon Procedural Generation Perlin Noise Wave Function Collapse

Cake Walk

A one-day game jam project we made and demoed at GDC Festival of Gaming. It was my first time at GDC, and showing a tiny game to a room full of people who love making things was genuinely special. Built with Keriya Son on 3D, Angie Peczon on art, Eric Taylor on shaders, and Ilce Perez on music. It was not continued after the jam, but the artifact captures the speed and joy of the event.

Godot Game Jam 3D Shaders Team Project

Old Man Climbs

A small vertical climber built over a weekend for a UC Merced game jam in 2022. It was my first project and was not continued after the jam, but it is still here as a reminder that shipping small, complete things is its own skill.

Godot Game Jam 2022

Quick Autocorrect

A small Obsidian community plugin for cleaning up writing without leaving the note. It helps catch repeated misspellings, apply quick corrections, and keep a personal dictionary for words Obsidian should stop fighting you on. It is less glamorous than a big data system, but it has the same shape I like: take a messy stream of text and make it easier for people to work with.

Long-term TypeScript Obsidian Plugin Text UX

NutriFinder

A dietary search app that filters restaurant menu items by nutrition and constraint data. It is a smaller project, but it still has the shape I keep coming back to: pull in messy real-world information, normalize it enough to search, and give people a cleaner way to make a decision.

Long-term React Flask Python Search Filters

Spiral Visualizer

A small visualization for spiral growth using queued directions. It is a compact teaching project, but I keep it around because visualizing state changes is often the fastest way to understand a system.
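One way to read "queued directions" is a rotating queue of headings: walk in the direction at the front, then rotate the queue to turn. The sketch below assumes that interpretation and a square spiral whose run lengths grow as 1, 1, 2, 2, 3, 3, and so on. The function name `spiral_points` is hypothetical and the project itself may work differently.

```python
from collections import deque


def spiral_points(n_steps):
    """Return the origin plus the first n_steps points of a square spiral."""
    # Queue of headings: right, up, left, down. Rotating it performs a left turn.
    directions = deque([(1, 0), (0, 1), (-1, 0), (0, -1)])
    x = y = 0
    points = [(x, y)]
    run_length = 1

    while len(points) <= n_steps:
        # Each run length is used for two consecutive headings before it grows.
        for _turn in range(2):
            dx, dy = directions[0]
            for _step in range(run_length):
                x, y = x + dx, y + dy
                points.append((x, y))
                if len(points) > n_steps:
                    return points
            directions.rotate(-1)  # move the used heading to the back of the queue
        run_length += 1

    return points
```

`spiral_points(8)` walks once around the origin and covers the full 3x3 grid. With Matplotlib installed, the path can be drawn with `plt.plot(*zip(*spiral_points(200)))`.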

Long-term Python Matplotlib Queues