Alex Peczon

Hello, I'm Alex

Enjoyer of Software Development, Data Science and Simulations

Side Projects I like

nextsteamgame.com
Live

Gameplay-first game recommender using ETL + vector similarity. 20k+ titles, user-weighted tags.

Go FastAPI HTMX SQLite3 SpaCy VADER
Antidote Intelligence
Open Source

Open source tool to help make LLMs more secure by detecting data poisoning.

OpenAI GPT-4 6-Agent Pipeline ML Security Python Statistical Analysis
Dreamville
In Progress

Gamified Canvas LMS tracker. Earn coins by completing tasks; urgency scored via regression.

Godot Go Canvas API
Maldemic Simulator

Real-time SIRD with Markov chains + grant-funded NN work; 3D globe viz in Godot.

Godot NumPy SciPy
USF Search Engine Crawler

High-performance concurrent crawler with 300 extract workers + 300 DB workers.

Go SQLite
Hyper Rosen

3D galaxy toy with Perlin fields and vector-field enemies in Godot.

Godot 3D Perlin
NutriFinder

Dietary search over restaurant menus; Flask API + React client.

React Flask Python
Old Man Climbs

Mini jam game built in a weekend for UC Merced GDC.

Godot Game Jam
Spiral Visualizer

Queue-driven spiral plotting with Matplotlib.

Python Matplotlib

Experiences

Alaris Security

Junior Fullstack Engineer
Aug 2025 – Nov 2025
San Francisco
  • Designed scalable data pipeline with Prefect + Airflow, processing 500K+ security events daily and improving analytics reliability by 60%.
  • Resolved critical UI bugs in React-based security platform, reducing system downtime by 90% and improving user experience for 1000+ enterprise clients.
  • Authored comprehensive platform-wide data flow documentation, enabling leadership to evaluate scaling solutions and reducing onboarding time for new engineers by 50%.
Prefect Airflow React Data Pipelines Security Platform

Future Tilt (Ecommerce Marketing Agency)

Software Developer
Aug 2025 – Present
San Francisco
  • Built an AI template builder that auto-generates email boilerplates, cutting design prep time by 40%.
  • Developed BigQuery dashboards analyzing 10M+ daily records, improving brand forecasting.
  • Maintained an AWS Lambda + BigQuery alerting pipeline, reducing manual checks by 80%.
  • Automated Trello board updates via Lambda, streamlining campaign tracking for 15+ clients.
  • Deployed ECS Dockerized Lambda for real-time revenue alerts, cutting lag from 24h to <1h.
AI Templates AWS Lambda BigQuery ECS Docker Trello API

Future Tilt (Ecommerce Marketing Agency)

Software Development Intern
Jul 2025 – Aug 2025
San Francisco
  • Built microservices with AWS Lambda and BigQuery to take the busywork out of eCommerce and improve sales forecasting.
  • Automated CRM processes and set up the BigQuery integration.
AWS Trello API Automation

USF MAGIC Lab

NLP Research Assistant
Mar 2025 – Present
San Francisco
  • Built an ETL pipeline (BeautifulSoup) scraping 20k+ news articles for sentiment analysis.
  • Developed SpaCy + NetworkX models to map sentiment and reveal bias trends.
Python SpaCy NetworkX ETL

iD Tech Camps (Stanford)

Machine Learning Instructor
Jun 2024 – Aug 2024
Stanford, California
  • Taught high school students Python, neural networks, and key tools like NumPy, Pandas, Keras, and ChatGPT through project-based learning.
  • Rebuilt the check-in/out system using Seaborn heatmaps to optimize traffic flow, improving efficiency by 40%; the system was later adopted by 2 other iD Tech camps.
Python PyTorch Keras NumPy Pandas Seaborn

USF Strategic Enrollment Management

Predictive Analytics / Web Intern
Jul 2024 – Jul 2025
San Francisco
  • Created predictive models that improved enrollment forecasting accuracy by 15%.
  • Developed semantic search + Pandas system, reducing record reconciliation time by 50%.
  • Automated website updates with Python + Jinja2, cutting update time from hours to minutes.
SQL Pandas OpenAI Jinja2

UC Merced - SATAL

Data Research Analyst
Jun 2023 – Sep 2024
California
  • Designed and conducted statistical analysis on 50+ classroom feedback surveys for educational improvement.
  • Built ML classification system using LLMs + TensorFlow, achieving 99% accuracy in response categorization.
  • Processed complex XML-based Qualtrics survey data and created automated reporting systems for faculty.
  • Applied NLP techniques to extract insights from open-ended survey responses.
PyTorch LLM Flask

Candle Stories

Production Assistant
Apr 2025 – Aug 2025 · Completed
San Francisco
  • Supported on-set operations and equipment handling across documentary shoots.
Production Logistics

Acme Builders Incorporated

Data Analyst
May 2021 – Dec 2024
Oakland
  • Started off as a construction laborer then worked on business logistics.
  • Built business data systems in Python using NumPy and Pandas to clean and organize records across departments.
  • Organized, updated, and archived company records to support accurate data management.
Data Maintenance Business Data Management Google Workspace Data Control

Engineering Blog

Latest insights from AI security and data science research.

Demoing Data Poisoning Detection at Continue DX

Latest Post • December 2024

Recently had the opportunity to demo Antidote Intelligence at Continue DX, showcasing our content-aware data poisoning detection system to industry professionals. The response was incredibly encouraging, with several attendees expressing interest in the practical applications for enterprise ML pipelines.

Current Outreach Efforts

I'm actively reaching out to vector embedding marketing vendors and MIT researchers to explore collaboration opportunities around data quality assessment. The core methodology we've developed—using AI agents for hypothesis generation and systematic validation—has broader applications beyond just poisoning detection.

Key Technical Innovations

  • 6-agent pipeline for comprehensive content analysis
  • Statistical validation using sample size significance algorithms
  • Content-aware filtering that goes beyond metadata
  • Real-time detection for high-stakes ML applications
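
As a rough illustration of the statistical validation idea, here is a minimal sketch of a sample-size calculation. It is an assumption on my part that something like Cochran's formula with a finite population correction fits here; the function name `sample_size` and its parameters are hypothetical and not taken from the Antidote Intelligence code.

```python
import math
from statistics import NormalDist


def sample_size(population, margin=0.05, confidence=0.95, proportion=0.5):
    """Records to sample so an estimated rate lands within `margin`.

    Uses Cochran's formula with a finite population correction.
    `proportion=0.5` is the most conservative guess for the true rate.
    """
    # Two-sided z-score for the requested confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively infinite population
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    # Shrink the sample when the dataset itself is small
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


if __name__ == "__main__":
    # How many of 10,000 documents to inspect for a +/-5% estimate
    print(sample_size(10_000))
```

A tighter margin or a higher confidence level raises the required sample, which is the trade-off a validation step like this has to balance against agent cost.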

Industry Impact Potential

The conversations at Continue DX reinforced something important: data quality is the silent crisis in ML. While everyone focuses on model architecture and training techniques, corrupted training data can undermine even the most sophisticated systems. This is especially critical in financial services, healthcare, and autonomous systems where the stakes are highest.

Looking forward to sharing more updates as these partnerships develop. If you're working on data quality challenges or RAG system validation, I'd love to connect and discuss potential applications.

ML Security Data Quality Research Collaboration Industry Demo

Maldemic: Create a Virus, Watch it Spread

Earlier Post • Feb 2025

Create a virus yourself and watch it spread. See live SIRD data between cities and learn why certain viruses spread more than others through our interactive real-time simulation.

Stochastic Markov Chain Algorithm

A stochastic Markov chain algorithm, written with NumPy and SciPy, uses the distance between cities to determine how likely the virus is to spread from one city to another. We then apply the Markov matrix to shuffle populations between cities and update the SIR counts accordingly.

Once the Python program computes this information, we visualize the simulation in Godot with a beautiful 3D globe interface.
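
To make the distance idea concrete, here is a minimal sketch of how a travel matrix could be built from city coordinates. The exponential decay with distance, the `stay_prob` and `scale` parameters, and the function name `distance_markov_matrix` are all assumptions for illustration; the simulator's actual weighting may differ.

```python
import numpy as np


def distance_markov_matrix(coords, stay_prob=0.9, scale=1000.0):
    """Build a column-stochastic travel matrix from city coordinates.

    Entry M[i, j] is the fraction of city j's population that moves
    to city i in one step. Needs at least two cities.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all cities
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Assumed weighting: nearer cities receive more travellers
    weights = np.exp(-dist / scale)
    np.fill_diagonal(weights, 0.0)
    # Travellers leaving each city sum to (1 - stay_prob)
    M = (1.0 - stay_prob) * weights / weights.sum(axis=0, keepdims=True)
    # Everyone else stays home
    np.fill_diagonal(M, stay_prob)
    return M


if __name__ == "__main__":
    cities = [[0, 0], [100, 0], [0, 2000]]  # made-up coordinates
    M = distance_markov_matrix(cities)
    infected = np.array([10.0, 0.0, 0.0])
    print(M @ infected)  # infected counts after one travel step
```

Because every column sums to 1, multiplying a population vector by `M` moves people between cities without changing the global total.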

Mathematical Framework

The state of the global population is distributed across N cities, with each city's population split into susceptible, infected, and recovered groups.

Evolution equations:

x₀ = (s₀, i₀, r₀)

xₙ₊₁ = S(xₙ)

Where S(x) consists of:

xₙ₊₁ = C₂ ∘ Σ ∘ C₁(Msₙ, Miₙ, Mrₙ)

Technical Implementation

  • Markov Matrix M: Probabilistically shuffles population between cities
  • Global SIR Function Σ: Locally spreads diseases according to SIR model
  • Cleaning Functions C₁, C₂: Maintain population constants using the Hare-Niemeyer method
  • 3D Visualization: Real-time Godot globe showing spread patterns
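
The components above can be sketched as a single simulation step, xₙ₊₁ = C₂ ∘ Σ ∘ C₁(Msₙ, Miₙ, Mrₙ). This is a minimal illustration rather than the project's actual code: the function names (`largest_remainder`, `clean`, `sir_step`), the rates `beta` and `gamma`, and the choice to apportion the global total across all 3N compartments are assumptions.

```python
import numpy as np


def largest_remainder(values, total):
    """Round non-negative values to integers that sum to `total`.

    This is the Hare-Niemeyer (largest remainder) method.
    Assumes `values` has a positive sum.
    """
    values = np.asarray(values, dtype=float)
    quotas = values * total / values.sum()
    counts = np.floor(quotas).astype(int)
    shortfall = int(total - counts.sum())
    # Leftover units go to the largest fractional parts
    order = np.argsort(-(quotas - counts), kind="stable")
    counts[order[:shortfall]] += 1
    return counts


def clean(s, i, r, total):
    """Cleaning function: integer counts with the global total fixed."""
    flat = largest_remainder(np.concatenate([s, i, r]), total)
    return tuple(part.astype(float) for part in np.split(flat, 3))


def sir_step(s, i, r, M, beta=0.3, gamma=0.1):
    """One step: travel with M, clean (C1), local SIR, clean (C2)."""
    total = int(round(float(s.sum() + i.sum() + r.sum())))
    # Markov matrix shuffles each compartment between cities
    s, i, r = clean(M @ s, M @ i, M @ r, total)
    # Local SIR update inside every city (assumed rates, at most 1)
    n = np.maximum(s + i + r, 1.0)
    new_infections = beta * s * i / n
    new_recoveries = gamma * i
    s = s - new_infections
    i = i + new_infections - new_recoveries
    r = r + new_recoveries
    return clean(s, i, r, total)


if __name__ == "__main__":
    M = np.array([[0.9, 0.2], [0.1, 0.8]])  # made-up two-city matrix
    s = np.array([990.0, 500.0])
    i = np.array([10.0, 0.0])
    r = np.zeros(2)
    for _ in range(5):
        s, i, r = sir_step(s, i, r, M)
    print(s, i, r)
```

Cleaning both before and after the local update keeps every compartment a whole number of people, so fractional individuals never accumulate over long runs.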

Grant Recognition

This project received grant funding to incorporate Neural Networks for enhanced predictive modeling. The combination of stochastic modeling with machine learning creates a powerful tool for understanding pandemic dynamics.

The simulation allows users to experiment with different virus parameters and observe how mathematical models translate into real-world spread patterns, making complex epidemiological concepts accessible and interactive.

NumPy SciPy Godot 3D Markov Chains SIRD Model Grant Funded