Alex Peczon | Full Stack Software Engineer for Data Systems and NLP Pipelines

Data Systems, NLP, and Product Engineering

Experience building full stack platforms, ETL workflows, retrieval systems, and applied NLP tools that turn messy operational data into useful products.

Future Tilt

Software Engineer

Jul 2025 to Present

San Francisco

AI template builder: developed a React and FastAPI platform that converted campaign plans into editable Klaviyo compatible marketing emails, using AI generated copy and reusable email components to produce branded first drafts from designer maintained assets in Google Drive, reducing email preparation time by 40%.
Built an AWS Lambda and BigQuery alerting system that compared month to date campaign and flow revenue against the same date range from the prior year, posting daily Slack reports to surface seasonal underperformance across ecommerce clients.

React FastAPI BigQuery AWS Lambda Klaviyo API Slack API Google OAuth Docker

Superlinked (Series A)

SIE Demo Software Developer (Contract)

Mar 2026 to May 2026

San Francisco Bay Area · Hybrid

Developed a paid launch demo for Superlinked's SIE engine during early access, working with Valentin Marek and Eric Taylor to showcase explainable wine recommendations.
Built a RAG wine recommender that used Vivino data, small inference models, OCR, and text embeddings to surface similar wines with clearer reasoning.
Shipped a React UI and containerized Python monorepo in Docker, keeping OCR and embedding modules cleanly separated for documentation users.

Superlinked SIE React Python Docker RAG Small Models

MAGICS Lab

NLP Research Assistant

Mar 2025 to May 2026

San Francisco

Built parallelized ETL pipelines in Go across 5 virtual machines, using DuckDB for analytical queries, to generate entity aware NLP datasets from 20,000+ news articles.
Implemented a modular explainable ABSA framework that outperformed VADER on SemEval 2014 Restaurant by 5 percentage points (78.58% vs. 73.57% accuracy), with transparent reasoning traces.

Go Python NLP DuckDB SemEval VADER ABSA

Alaris Security (Pre seed)

Junior Fullstack Engineer

Aug 2025 to Nov 2025

San Francisco

Resolved frontend and data consistency issues across an internal React platform, eliminating duplicate data displays, fixing UI rendering bugs, and establishing reusable patterns for frontend to backend communication.
Consolidated 50+ frontend database queries into typed tRPC procedures backed by Drizzle ORM, improving maintainability and reducing redundant data access across a Next.js + React monorepo.

Next.js React tRPC Drizzle ORM TypeScript PostgreSQL

Future Tilt

Software Engineering Intern

Jun 2025 to Aug 2025

San Francisco

Engineered a campaign synchronization service using AWS Lambda, BigQuery, and webhook based APIs to coordinate campaign data across Trello, Klaviyo, and internal systems, reducing campaign maintenance overhead by 50%.

AWS Lambda BigQuery Klaviyo API Trello API Webhooks Automation

Candle Stories

Production Assistant

Apr 2025 to Aug 2025

San Francisco

Supported documentary shoots, equipment handling, and on set logistics. Less data pipeline, more real world pipeline.

Production Logistics

USF Strategic Enrollment Management

Data Analyst / Web Intern

Jul 2024 to Jul 2025

San Francisco

Conducted year over year exploratory analysis on 500,000+ prospective student records from SLATE, identifying macro enrollment trends and analyzing acceptance and declination rates by geography. Found that recruitment events where students from the same region met had significantly higher conversion rates.
Automated recurring website content updates for prospective students by building Python and Jinja2 templating workflows over a legacy SLATE hosted system running raw HTML and React, eliminating manual copy paste updates across enrollment marketing pages.

Go SQL Pandas SLATE Jinja2 React Admissions Analytics

iD Tech Camps (Stanford)

Machine Learning Instructor

Jun 2024 to Aug 2024

Stanford, California

Taught project based Python and machine learning lessons to high school students at Stanford, covering neural networks, NumPy, Pandas, and Keras.

Python PyTorch Keras NumPy Pandas

UC Merced, SATAL

Data Analyst Intern

Aug 2023 to May 2024

Merced

Worked as a Data Analyst Intern at SATAL UC Merced, using survey data and student feedback to improve enrollment support, course experiences, and faculty decision making.
Built a survey response normalization pipeline using Pandas and OpenAI assisted categorization to bucket thousands of open ended Qualtrics responses into structured themes, transforming unstructured student feedback into analyzable datasets for faculty action.
Analyzed survey responses tied to student grade outcomes across lab sections and lectures. Split fieldwork across a team of 6 to conduct weekly focus groups and large scale surveys, then compiled findings into weekly reports delivered to 5+ faculty members, contributing to measurable improvements in student outcomes and faculty relationships.
Presented research on methodology at the Fresno State Exemplary Practices in Higher Education Conference.

Pandas Qualtrics OpenAI Survey Analysis Research Methods

Acme Builders Incorporated

Construction Worker → Accounting Assistant

May 2021 to Dec 2023

Oakland · On site · Part time

Built internal data systems in Python with NumPy and Pandas to clean, organize, and standardize records across departments.
Updated, organized, and archived company documents to support payroll cycles, budgeting, and reliable business data management.
Used OCR workflows to reduce manual document sorting and make scanned account records easier to organize.

Python Pandas NumPy OCR Business Data Accounting Construction

Projects

These are mostly passion projects that I made with friends.

Live

Bay Area Gov Jobs

A free, open source search tool that collects Bay Area government jobs from public APIs, normalizes listings from multiple providers, and indexes them in memory for fast filtering. Its explainable ranking engine combines token frequency, skill aliases, title and department matches, and parsed experience requirements, while Leaflet maps jobs across configurable city and county boundaries.

Go Explainable Search Government APIs Server Rendered HTML Leaflet OpenStreetMap JavaScript Open Source

Visit Site View Code

Live

nextsteamgame.com

A semantic Steam recommender for 80,000+ games built around the idea that games should match by what they are, not only by player overlap. The pipeline filters up to 2,000 reviews per game, classifies useful review pools with ModernBERT, extracts identity tags and focus vectors, canonicalizes noisy generated tags, then precomputes candidate relationships so users can rerank recommendations by soundtrack, setting, mechanics, narrative, pacing, and vibe.

Long term PostgreSQL ChromaDB Qdrant ModernBERT FastAPI Docker

Visit Site Diagnostic View Code

Live Game

Package Panic

A closed source live browser game where players place and upgrade machines, react to special package types, build combos, survive boss events, and send pressure in versus matches. The client is written in Go and compiled to WebAssembly with Ebitengine style canvas rendering, while static web assets, account UI, chat, audio, and deployment glue live in vanilla HTML, CSS, and TypeScript. Supabase stores account and leaderboard data, and SpacetimeDB powers PvP matchmaking, friend invites, package send events, player state sync, open lobby chat logs, and moderated realtime chat.

Long term Ebitengine Go WebAssembly SpacetimeDB Supabase PvP TypeScript ebitdock CI/CD

Play Game

Open Source

ebitdock

A Go native development orchestrator for Ebitengine browser games. ebitdock builds the Ebitengine WASM target, starts a Docker Compose stack, serves the browser client, watches files, and exposes a local dashboard for ports, logs, build status, and service health. It keeps the game code, HTML shell, JS bridge, assets, APIs, and databases inside the project while owning the orchestration around them, including optional backend presets like Nakama and live service examples using APIs, realtime services, admin tools, Postgres, and SpacetimeDB.

Long term Go Ebitengine WebAssembly Docker Compose Nakama CI/CD

View Code

Open Source

PixARoss

A small open source Go and Ebitengine vertical slice for a cozy Picross style puzzle game. PixARoss loads puzzle JSON from assets, generates clues from solution art, supports fill and X mark drawing tools, drag painting, undo, reset, and reveals pixel art when a puzzle is solved. The level pipeline can generate self contained puzzle JSON from two panel spritesheets, which keeps the game data easy to extend while leaving the core puzzle loop Go native.

Go Ebitengine Puzzle Game WebAssembly Pixel Art Open Source

View Code

Open Source

USF Search Engine

An open source search engine written in Go for crawling and searching the USF website. The crawler splits work across extraction workers that download pages and discover links, then database workers batch page content into SQLite. It was designed around tunable concurrency, with worker counts, batch size, and queue size exposed as parameters so the crawler can scale to hundreds of workers while keeping the local search app simple to run.

Go SQL SQLite Concurrency Crawler Worker Pools Search

View Code

Series A

Superlinked Wine Recommender

A wine recommender developed with the Superlinked team during early access to their SIE engine. It uses document processing, vector embeddings, and small model inference to explain why a result appears, whether the match came from fizz, cherry notes, body, acidity, or other wine attributes.

Long term Superlinked SIE Vector Search OCR Small Models Chroma PostgreSQL

View Code SIE Example Docs My Post Superlinked Post

2nd Place

Maldemic Simulator

We built Maldemic to help close the gap between researchers and the public. Disease models can feel locked behind papers and equations, so we turned SIR dynamics and Markov chain mobility into a 3D globe people can watch, question, and reason about. Python computes the stochastic population transitions, then Godot makes the spread visible for public education.

Long term Python NumPy SciPy Godot Markov Chains SIR

View Project Research Paper

Hackathon

Next Chapter

A hackathon project built to make retirement questions feel less foggy. Users can ask things like "Can I retire in the Philippines?" or "How much should I start saving?" and the system answers with retrieved context and visible data instead of pretending a prompt is a financial plan.

RAG LLMs FinTech Personal Finance AI for Good

Watch Demo Vimeo Demo View Code

Open Source

Antidote Intelligence

An open source ML security project that treats training data as the place where model risk often starts. The system uses a multi agent analysis pipeline to inspect dataset content, generate hypotheses, and surface examples worth investigating before bad data becomes expensive behavior.

Long term Python OpenAI ML Security Data Quality Agent Pipeline

Open Source Code

In Progress

Dreamville

A gamified Canvas LMS tracker that pulls assignments into a game loop, then scores urgency from completion patterns and difficulty signals. The useful part is turning school workflow data into a next action system students can act on without another dashboard yelling at them.

Long term Godot Go Canvas API Regression Workflow Data

View Code

Hackathon

Hyper Rosen

A hackathon built Godot experiment in systems that can keep expanding. Swirled Perlin noise places planets, wave function collapse handles city placement, and procedural rules create enemies and asteroids, making the project feel like a small galaxy generated from reusable data rules.

Godot Hackathon Procedural Generation Perlin Noise Wave Function Collapse

View Project

GDC Jam

Cake Walk

A fast game jam pitch: make a tiny character readable, charming, and playable in a single day. We built and demoed Cake Walk at GDC Festival of Gaming with Keriya Son on 3D, Angie Peczon on art, Eric Taylor on shaders, and Ilce Perez on music.

Godot Game Jam 3D Shaders Team Project

Play on itch.io

First Project, 2022

Old Man Climbs

A small vertical climber built over a weekend for a UC Merced game jam in 2022. It is here less as a technical flex and more as the first shipped artifact: a reminder that finishing a small loop teaches more than endlessly planning a bigger one.

Godot Game Jam 2022

View Project

Obsidian

Quick Autocorrect

A small community plugin for reducing friction while writing in Obsidian. It catches repeated misspellings, applies quick corrections, and keeps a personal dictionary for words Obsidian should stop fighting you on: a tiny version of the same pattern I like, cleaning a messy text stream into something easier to use.

Long term TypeScript Obsidian Plugin Text UX

Plugin Page

NutriFinder

A small dietary search project with a practical pitch: pull in messy menu and nutrition information, normalize it enough to filter, and give people a cleaner way to decide what they can eat.

React Flask Python Search Filters

View Project

Spiral Visualizer

A compact teaching visualization for spiral growth using queued directions. The pitch is simple: when a system changes step by step, showing the state often teaches faster than another paragraph of explanation.

Python Matplotlib Queues

View Code

Blog

Hey hey! Didn't think many people would see this haha. These are basically leftover thoughts from projects I made: recommendation systems, explainable AI, data poisoning, and the parts that did not fit cleanly on a resume.

Comic strip reference for editorial layout — Since I am using a newspaper style here, I put a Garfield strip. It is in the public domain.

Usage Note June 2026

Steam Diagnostics, Interesting Usage, and a 34% Conversion Win

I added a Steam recommender diagnostic report so I could look at what people were actually doing inside NextSteamGame. I am keeping this to the parts that matter most: conversion, what people picked, what they clicked, and what the genre signals suggest.

Conversion

The funnel is the cleanest success metric. People picked 2,652 games from search and made 913 Steam clicks, which works out to a 34.4% click conversion rate. That means the site was not just getting curiosity traffic. A decent share of people found a recommendation interesting enough to leave for Steam.
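As a quick check on that arithmetic, here is a minimal Python sketch. The function name `conversion_rate` is made up for illustration and is not taken from the diagnostic report's code; only the two counts come from the report.

```python
def conversion_rate(clicks: int, picks: int) -> float:
    """Return clicks as a percentage of picks, or 0.0 when there were no picks."""
    if picks == 0:
        return 0.0
    return 100.0 * clicks / picks


# Counts quoted in the report: 2,652 games picked from search, 913 Steam clicks.
rate = conversion_rate(clicks=913, picks=2652)
print(f"{rate:.1f}%")  # 34.4%
```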

Search to Steam click funnel from the report: 2,652 games picked from search, 913 Steam clicks, 34.4% conversion.

Picked games

The games people started with mostly reflect Steam popularity: highly rated games with huge player counts and strong word of mouth. Persona 5 Royal, ELDEN RING, Stardew Valley, Baldur's Gate 3, and Factorio showing up near the top makes sense because these are the kinds of games people already search for most.

Top games picked from search (from the report chart):

Persona 5 Royal: 53
ELDEN RING: 40
Stardew Valley: 39
Baldur's Gate 3: 36
Factorio: 34

Subgenre and identity signal

The subgenre and specific genre charts are where the audience gets more interesting. Open world, roguelike, turn based RPG, resource management, and deckbuilding game all show up in clicked source subgenres. On the more specific side, strategic card battler leads the list. My read is simple: card battler and deck builder people are actively looking for new games, and they are probably nerdy enough to try a vector based recommendation website if it gives them better explanations than Steam tags.

Subgenre and identity signal (from the report charts):

open world: 44
roguelike: 39
turn based RPG: 37
strategic card battler: 15
deckbuilding game: 10

Open the full telemetry report.

Steam Diagnostics PostgreSQL Usage Analytics Conversion

Case Study May 2026

How I Built a Semantic Recommendation Engine for 80,000 Steam Games

NextSteamGame is a Steam recommendation project built around a simple complaint: most game recommenders know that two games are related, but they rarely explain why. Player overlap signals are useful, but they flatten intent. Someone may like Persona 5 for the jazz fusion soundtrack and modern Tokyo setting; another person may like it for social simulation and dungeon crawling. Those are different reasons, and a good recommendation system should let users separate them.

Games as vectors

I think games can be represented as weighted profiles: not just genre, but the parts that actually make the game feel like itself.

Persona 5 Royal: jazz fusion, modern urban fantasy, social simulation, dungeon crawling, stylish UI.

Micro tags normal genres miss

The problem

Most recommendation systems lean on the pattern "players who liked X also liked Y." That works well for popular games, but it struggles with niche tastes and gives weak explanations. I wanted a system that could represent a game as a shape: soundtrack, setting, systems, narrative, vibe, and the small micro tags that genre labels leave behind.

The pipeline

Collect Steam metadata, appids, genres, tags, descriptions, release data, and storefront artwork.
Pull up to 2,000 reviews per game, then remove spam and low signal reviews with regex filters, word diversity scoring, quality heuristics, and descriptive phrase detection.
Classify useful reviews with ModernBERT into pools for gameplay, art, soundtrack, systems depth, narrative, and general description.
Generate semantic identity data: focus vectors, mechanics, narrative, vibe, structure loop, signature tags, niche anchors, music tags, and micro tags.
Canonicalize noisy generated tags with heuristics, fuzzy matching, embedding similarity, and vector search so tags like fast action, quick action, and high speed combat can be grouped without losing useful distinctions.
Precompute candidate relationships offline, then let the live FastAPI and React app apply user controlled reranking at runtime.
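The review filtering step can be sketched roughly as below. This is an illustration only: the spam patterns, the thresholds, and the use of a simple unique word ratio as the "word diversity score" are all assumptions, not the filters the real pipeline uses.

```python
import re

# Hypothetical spam patterns; the real regex filters are not listed in this post.
SPAM_PATTERNS = [
    re.compile(r"(.)\1{5,}"),                  # one character repeated 6+ times
    re.compile(r"https?://", re.IGNORECASE),   # links
]


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z']+", text.lower())


def word_diversity(text: str) -> float:
    """Unique words divided by total words; 0.0 for text with no words."""
    words = tokenize(text)
    if not words:
        return 0.0
    return len(set(words)) / len(words)


def is_useful(text: str, min_words: int = 8, min_diversity: float = 0.5) -> bool:
    """Keep reviews that are long enough, not spammy, and not repetitive."""
    if len(tokenize(text)) < min_words:
        return False
    if any(pattern.search(text) for pattern in SPAM_PATTERNS):
        return False
    return word_diversity(text) >= min_diversity


reviews = [
    "good game good game good game good game good game",
    "The combat is fast, the soundtrack goes hard, and the boss fights feel like rhythm puzzles.",
    "10/10",
]
kept = [review for review in reviews if is_useful(review)]
print(kept)  # only the descriptive second review survives
```

In this sketch the repetitive review fails the diversity threshold and the "10/10" review fails the length check, so only descriptive text moves on to classification.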

Why the architecture is cool

The key design choice is splitting expensive semantic work from cheap interactive reranking. Computing every similarity at runtime would be wasteful, so candidate relationships are built offline. When a user searches, the app retrieves candidates, applies the user's weights, and reranks recommendations based on the profile dimensions they care about.
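A minimal sketch of that split, assuming each precomputed candidate carries a per dimension similarity score between 0 and 1. The `Candidate` class, the `rerank` function, the game titles, and the scores are hypothetical; the dimension names come from the project description, and a weighted average is just one plausible way to apply user weights.

```python
from dataclasses import dataclass

DIMENSIONS = ("soundtrack", "setting", "mechanics", "narrative", "pacing", "vibe")


@dataclass
class Candidate:
    title: str
    scores: dict  # per-dimension similarity in [0, 1], assumed precomputed offline


def rerank(candidates, weights):
    """Sort candidates by the weighted average of their per-dimension scores."""
    total = sum(weights.get(dim, 0.0) for dim in DIMENSIONS)
    if total <= 0:
        return list(candidates)  # no preferences given: keep the offline order

    def weighted(candidate):
        return sum(
            weights.get(dim, 0.0) * candidate.scores.get(dim, 0.0) for dim in DIMENSIONS
        ) / total

    return sorted(candidates, key=weighted, reverse=True)


# Hypothetical offline output for one source game.
candidates = [
    Candidate("Game A", {"soundtrack": 0.9, "mechanics": 0.2}),
    Candidate("Game B", {"soundtrack": 0.3, "mechanics": 0.8}),
]

print([c.title for c in rerank(candidates, {"soundtrack": 1.0})])  # ['Game A', 'Game B']
print([c.title for c in rerank(candidates, {"mechanics": 1.0})])   # ['Game B', 'Game A']
```

The runtime work here is a handful of multiplications per candidate, which is why the expensive similarity computation can stay offline.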

From review to recommendation

A raw review like "the combat is fast, the soundtrack goes hard, and the boss fights feel like rhythm puzzles" becomes structured signals: fast combat, high energy soundtrack, boss focused structure, rhythm like timing, and mechanical precision. Those signals can then be weighted independently by the user.

What it demonstrates

80,000+ Steam games indexed
Up to 2,000 reviews analyzed per game
Semantic vectors, identity tags, and canonicalized genre/tag relationships
30,000+ users and discovery across 8,000+ unique games
A retrieval design cheap enough to run on constrained cloud infrastructure

What I learned

Review text is noisy, so filtering before embeddings matters. LLM generated tags are useful, but raw generated tags need canonicalization. Most importantly, recommendations feel better when users can inspect and control the reason behind a match instead of accepting a mystery list.

Recommendation Systems Vector Search ModernBERT FastAPI Semantic Retrieval Steam

Friends + Games GDC Game Jam

Cake Walk: A One Day Game Jam at GDC

Cake Walk at GDC — Cake Walk started as a tiny joke and became a playable floor demo by the end of the day.

Cake Walk was a one day game jam project I made at GDC with friends. The whole thing was intentionally small: make a little cake cross the street, make it readable, make it charming, and ship something people could actually try.

The shape of the day

Everyone had a lane. We split up character work, art, shaders, music, and gameplay, then kept cutting scope until the core loop was visible. That is the best part of game jams: you cannot hide behind architecture for too long. Either the thing plays or it does not.

Cake Walk group photo — The real artifact was less the game and more the tiny production pipeline we built under pressure.

Why it matters

I keep these projects on the site because they show a different kind of engineering. Hackathons are messy, but they force prioritization, communication, and taste. You learn how much polish can come from a few good decisions when the team is moving fast.

GDC Game Jam Godot Team Project Friends

Friends + Games Hackathon Notes

Hyper Rosen: A Tiny Galaxy From a Hackathon Weekend

Hyper Rosen hackathon photo — Hyper Rosen was one of those weekend builds where the idea was bigger than the time limit, which is kind of the whole point.

The same silent gameplay capture from the project card, dropped into the newsletter so the build feels alive instead of only described.

Hyper Rosen was a hackathon game I made with friends. The pitch was simple and probably too ambitious: build a procedural space game where planets, cities, enemies, and asteroid fields come from generation rules instead of hand placement.

What we tried

The fun part was treating the game like a small systems experiment. Swirled noise placed planets, procedural rules filled out the galaxy, and wave function collapse style logic helped with city layout. It was not polished in the normal product sense, but it had that good hackathon feeling where every hour made the world a little more alive.

Why I still like it

I like projects like this because they make constraints obvious. You learn what actually matters when the deadline is close: readable movement, a loop people can understand, and enough visual feedback that the system feels real even if half of it is held together by deadline energy.

One day

Long term, I still want to make a full Mario Galaxy style procedural game from this idea: tiny planets, playful gravity, generated worlds, and a sense that the level is wrapping around you. That is probably a post college version of the project, though. The kind you build when breakfast is no longer mostly oats and coffee.

Hackathon Godot Procedural Generation Friends Game Dev

Update October 2025

Turns Out We Weren't Crazy About Data Poisoning

In December 2024, I built Antidote Intelligence around a simple belief: training data is infrastructure, and poisoned examples can become model behavior if nobody inspects the dataset early enough.

Anthropic, the UK AI Security Institute, and The Alan Turing Institute later published a large scale poisoning study that makes that concern feel a lot less speculative. Their result: in their experimental setup, as few as 250 malicious documents were enough to introduce a backdoor across models from 600M to 13B parameters.

250 docs: enough to backdoor tested models in Anthropic's denial of service setup

Why this reinforced the project

A lot of people think poisoning only matters if an attacker controls a meaningful percentage of the training set. The Anthropic result challenges that. Their finding suggests the absolute number of poisoned documents can matter more than the percentage of the corpus, at least for the narrow backdoor they tested.

What Antidote was trying to do

Antidote was not trying to solve all model security. It was a dataset inspection tool: look at examples before they become model behavior, generate hypotheses about suspicious content, and make data quality visible enough for a human to investigate.

The larger lesson

This is the same theme as my recommender and ABSA work: AI systems can be powerful without becoming completely opaque. If model behavior depends on messy upstream data, then inspection, provenance, and explainable intermediate artifacts are not extras. They are part of the system.

Read Anthropic's research post.

Data Poisoning ML Security Dataset Inspection AI Safety Antidote Intelligence

Research Note October 2025

Building Explainable ABSA Without Hiding the Reasoning

AeVAA is a research project about a question I keep coming back to: machine learning and AI are powerful, but can we build systems where the important reasoning stays inspectable?

Aspect based sentiment analysis usually tries to predict whether a sentence is positive, negative, or neutral toward a target. That is useful, but it often hides the path from text to judgment. AeVAA takes a different route: split the problem into modules, keep intermediate artifacts, and use survey derived formulas to explain how sentiment moves between entities.

Σ(x)^k = σ(s^k, i^k, r^k)

Sentiment as a function of local score, interaction, and relation context.

The core idea

Instead of asking one model for one answer, AeVAA builds a trace. It extracts clauses, resolves entities, identifies relationships, constructs a graph, and then calculates valence aware sentiment over that graph. The model can still use black box components, but the system around them exposes what each component contributed.

Why this matters

Document level sentiment can miss the point. In a sentence like "the person was bad, but the child was good," the total sentiment is not enough. The meaningful question is who the sentiment is aimed at and why it changed. That becomes even more important for media bias, long form narrative, and texts where framing matters.
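To make the trace idea concrete on the sentence above, here is a toy Python sketch. It is not AeVAA's pipeline: the string splitting, the two word lexicon, and every function name are stand-ins chosen so the example runs on its own. The only point is that each stage's output is recorded, so a wrong final score can be traced back to the stage that produced it.

```python
LEXICON = {"bad": -1, "good": 1}  # toy lexicon, an assumption for illustration


def split_clauses(text):
    """Stand-in for clause extraction: split on a contrastive ', but '."""
    return [clause.strip() for clause in text.split(", but ")]


def attach_targets(clauses):
    """Stand-in for entity resolution: treat the words before ' was ' as the target."""
    items = []
    for clause in clauses:
        target, _, _ = clause.partition(" was ")
        items.append({"target": target, "clause": clause})
    return items


def score_targets(items):
    """Stand-in for sentiment scoring: sum lexicon values over the clause's words."""
    return [
        {**item, "score": sum(LEXICON.get(word, 0) for word in item["clause"].split())}
        for item in items
    ]


def run_with_trace(text, stages):
    """Run each stage in order and record every intermediate output."""
    trace = []
    value = text
    for name, stage in stages:
        value = stage(value)
        trace.append((name, value))
    return value, trace


stages = [("clauses", split_clauses), ("targets", attach_targets), ("scores", score_targets)]
result, trace = run_with_trace("the person was bad, but the child was good", stages)

for name, output in trace:
    print(name, output)
```

The final result gives the person a score of -1 and the child a score of 1, while a single document level sum would have reported 0 and hidden both.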

What we built

A modular pipeline for constituency clause extraction, entity/coreference resolution, relation and modifier extraction, graph construction, and sentiment aggregation.
A human annotation study with 36 participants and 3,900+ sentiment judgments across action, association, ownership, and temporal aggregation cases.
Survey fitted formulas for action, target, association, ownership, and aggregate sentiment dynamics.
Explanatory traces that show where errors came from instead of only reporting a final label.

Results

The fitted formulas explained roughly half of the variance in pilot sentiment judgments. On SemEval 2014, AeVAA reached 78.58% restaurant accuracy and 68.52% laptop accuracy. It did not beat state of the art DeBERTa systems, but that was not the point of the prototype. The point was to show that a modular, inspectable ABSA system can produce plausible results and make debugging easier.

The bigger theme

I like projects that score well without becoming total black boxes. The goal is not to reject ML; it is to use ML where it helps, then design the surrounding system so people can inspect the evidence, the intermediate state, and the reason a result appeared.

Explainable AI ABSA NLP Human Annotation SemEval Research

Earlier Post Feb 2025

Maldemic: A Pandemic Model You Can Watch

Maldemic is a stochastic disease spread simulator that turns SIR equations and city mobility into a live 3D globe. I like projects where the math becomes something you can inspect with your eyes.

Data flow

Python computes population movement with a Markov matrix, updates local susceptible/infected/recovered states, then passes the evolving state into a Godot visualization.

Technical shape

Markov chain mobility between cities
SIR disease dynamics for local spread
Population cleanup to keep totals consistent
3D globe rendering for real time visual feedback
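The first two items can be sketched in plain Python as below. This is a simplified illustration under assumptions, not the project's code: the real simulator uses NumPy and SciPy, and the two city mobility matrix, the rates, and the function names here are all hypothetical. Because the sketch moves whole individuals, totals stay consistent without a separate cleanup step.

```python
import random

# Hypothetical two-city Markov matrix: row i gives where people in city i go next.
MOBILITY = [
    [0.95, 0.05],
    [0.10, 0.90],
]


def move(counts, mobility, rng):
    """Send each individual to a city drawn from their current city's row."""
    moved = [0] * len(counts)
    for origin, people in enumerate(counts):
        for dest in rng.choices(range(len(counts)), weights=mobility[origin], k=people):
            moved[dest] += 1
    return moved


def sir_step(s, i, r, beta, gamma, rng):
    """One stochastic SIR update inside a single city."""
    n = s + i + r
    p_infect = min(1.0, beta * i / n) if n else 0.0
    new_infections = sum(rng.random() < p_infect for _ in range(s))
    recoveries = sum(rng.random() < gamma for _ in range(i))
    return s - new_infections, i + new_infections - recoveries, r + recoveries


def simulate(S, I, R, mobility, beta, gamma, days, seed=0):
    """Alternate mobility and local SIR updates for a number of days."""
    rng = random.Random(seed)
    for _ in range(days):
        S, I, R = move(S, mobility, rng), move(I, mobility, rng), move(R, mobility, rng)
        updated = [sir_step(s, i, r, beta, gamma, rng) for s, i, r in zip(S, I, R)]
        S, I, R = (list(column) for column in zip(*updated))
    return S, I, R


S, I, R = simulate([990, 1000], [10, 0], [0, 0], MOBILITY, beta=0.3, gamma=0.1, days=30, seed=1)
print(S, I, R)
```

The per city lists returned by `simulate` are the kind of evolving state a visualization layer could read each day.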

The project won 2nd place at BLOOM Hackathon and received grant support for neural network forecasting work.

NumPy SciPy Godot 3D Markov Chains SIR Model Simulation

Data Poisoning Detection at Continue DX

Continue DX presentation header — Continue DX demo: inspecting training data before it becomes model behavior.

I demoed Antidote Intelligence at Continue DX, showing a content aware data poisoning detection system for ML training datasets. The basic pitch: before we argue about the model, let's look harder at the data we fed it.

Why this matters

Training data quality is one of those problems that hides until it becomes expensive. Bad examples, poisoned content, or subtle distribution weirdness can leak into model behavior long before anyone notices.

What I built

Multi agent review pipeline for suspicious dataset content
Hypothesis generation and validation around poisoned examples
Content aware checks instead of only metadata based filtering
Reports aimed at making data issues inspectable, not magical

I am interested in this space because it treats data quality as infrastructure. The model gets the attention, but the dataset is where a lot of the story starts.

ML Security Data Quality Training Data Agent Pipeline

Journal Entry August 2024

USF, Sentiment, and Moving Into the City

View from my USF dorm — The view from my dorm at USF. This was the point where school started feeling connected to the city instead of separate from it.

This one is more of a journal entry than a project breakdown.

I transferred to USF in 2024 after UC Merced because I wanted to be in San Francisco. Merced felt too far away from the people and companies I wanted to learn from. I wanted to make a name for myself, be around real builders, and learn from people actually working in tech.

Before I even got to USF, I applied to more than 100 jobs. That search eventually turned into a Data Analyst / Web Intern role, where I worked on enrollment analytics, SLATE data, admissions event cleanup, and prospective student web updates. It was not glamorous, but it taught me something important: useful software usually starts as messy data, weird processes, and people who need better tools.

Where sentiment came in

Later, at the MAGICS Lab, I worked on AeVAA, an explainable sentiment project. The technical version is about aspect based sentiment analysis, graphs, coreference, relation extraction, and survey derived formulas. The personal version is simpler: I was trying to understand how a system could make a judgment without hiding the reasons.

That thread shows up in a lot of my projects. Recommenders should explain why a game matches. Sentiment systems should explain who a sentence is about and why the score moved. Data poisoning tools should show what suspicious examples look like before they become model behavior.

USF campus — USF became the place where those ideas started turning into real projects instead of just things I was reading about.

The pattern

I like AI systems, but I do not like when the answer is the only artifact. The projects I keep coming back to are the ones where the intermediate state matters: vectors, tags, traces, formulas, records, examples, and the evidence behind a prediction.

USF was where that started to become a theme instead of a coincidence.

USF Journal Sentiment Analysis Research San Francisco

Journal Entry August 2023

UC Merced Game Dev Club and My First Data Internship

UC Merced Game Dev Club event — UC Merced Game Dev Club, back when I was trying to get more students to actually start making games.

In 2023, I was the secretary of the UC Merced Game Dev Club. A lot of the work was not glamorous: planning, messaging people, getting rooms, keeping events moving, and making sure students felt like they could show up even if they had never shipped anything before.

We hosted a successful showcase where people brought in games they had been building, talked through what worked, and got to see other students care about the same weird problems: controls, art, music, level design, scope, and how to make a tiny idea feel playable.

Game jams and mixers

We also hosted a game jam that produced some genuinely cool student projects. The best part was watching people form teams quickly and make something real under a deadline. I also helped host a mixer for students who wanted to get started with game development but did not know who to work with yet.

That year also overlapped with the start of my Data Analyst internship at SATAL UC Merced. I was using survey data and student feedback to improve enrollment support, course experiences, and faculty decision making. Looking back, both roles were about the same thing: turning scattered student energy into something organized enough that people could act on it.

SATAL Fresno State presentation poster — We ended up presenting this SATAL work at Fresno State, showing how student feedback could become faculty facing evidence instead of disappearing into end of term forms.

That presentation mattered to me because it made the internship feel real. We were not just cleaning survey data for a class assignment; we were turning student perspectives into something instructors could discuss, revise around, and bring back into their courses.

UC Merced Game Dev Club Game Jam SATAL Journal

Journal Entry Summer 2022

ACME Builders: Construction, Payroll Scans, and Starting College

ACME Builders building — ACME Builders, around the time I was just starting college and trying to find any useful way to spend the summer.

In 2022, I had just started college and I could not get an internship yet. I was still early, still figuring out what counted as experience, and honestly just trying to keep moving instead of waiting around for someone to hand me a clean first opportunity.

So I worked with ACME Builders. Some of it was construction work. Some of it was office work: scanning accounting documents, payroll records, and old archives so they were easier to store and reference. It was not software engineering, but it gave me a closer look at the kind of messy operational work that every business quietly depends on.

Why I still count it

Looking back, this was one of the first places I saw how much value exists in boring process cleanup. Paper records, payroll files, old folders, and construction logistics are not glamorous, but someone still has to make them usable. That lesson shows up later in my data work: useful systems often start by organizing the unorganized.

At the time, it was mostly a way to pass the summer and stay busy. But it also taught me that not every important experience looks like a polished internship. Sometimes the early work is just learning how real businesses keep track of things.

ACME Builders Construction Accounting Payroll Archives Journal
