Dive deep into my projects — click to explore the depths of each creation.
Live
nextsteamgame.com
Gameplay‑first game recommender using ETL + vector similarity. 20k+ titles, user‑weighted tags.
Go · FastAPI · HTMX · SQLite3
Open Source
Antidote Intelligence
Open source tool to help make LLMs more secure by detecting data poisoning. Read the blog post →
OpenAI GPT-4 · 6-Agent Pipeline · ML Security
In progress
Dreamville
Gamified Canvas LMS tracker. Earn coins by completing tasks; urgency scored via regression.
Godot · Go · Canvas API
Maldemic Simulator
Real‑time SIRD with Markov chains + grant‑funded NN work; 3D globe viz in Godot.
Godot · NumPy · SciPy
USF Search Engine Crawler
High-performance concurrent crawler with 300 extract workers + 300 DB workers. Batch processing (150 docs/batch) with 9000-job queue buffer for lightning-fast website indexing.
Go · SQLite
Hyper Rosen
3D galaxy toy with Perlin fields and vector‑field enemies in Godot.
Godot · Perlin
NutriFinder
Dietary search over restaurant menus; Flask API + React client.
React · Flask
Old Man Climbs
Mini jam game built in a weekend for UC Merced GDC.
Godot · Game Jam
Spiral Visualizer
Queue‑driven spiral plotting with Matplotlib.
Python · Matplotlib
Experiences
My voyage through professional waters, charting courses through real‑world challenges.
Alaris Security
Junior Fullstack Engineer
Aug 2025 – Present
San Francisco
Designed scalable data pipeline with Prefect + Airflow, processing 500K+ security events daily and improving analytics reliability by 60%.
Resolved critical UI bugs in React-based security platform, reducing system downtime by 90% and improving user experience for 1000+ enterprise clients.
Authored comprehensive platform-wide data flow documentation, enabling leadership to evaluate scaling solutions and reducing onboarding time for new engineers by 50%.
Created predictive models that improved enrollment forecasting accuracy by 15%.
Developed semantic search + Pandas system, reducing record reconciliation time by 50%.
Automated website updates with Python + Jinja2, cutting update time from hours to minutes.
SQL · Pandas · OpenAI · Jinja2
UC Merced — SATAL
Data Research Analyst
Jun 2023 – Sep 2024
California
Designed and conducted statistical analysis on 50+ classroom feedback surveys for educational improvement.
Built ML classification system using LLMs + TensorFlow, achieving 99% accuracy in response categorization.
Processed complex XML-based Qualtrics survey data and created automated reporting systems for faculty.
Applied NLP techniques to extract insights from open-ended survey responses.
PyTorch · LLM · Flask
Candle Stories
Production Assistant
Apr 2025 – Aug 2025 · Completed
San Francisco
Supported on‑set operations and equipment handling across documentary shoots.
Production · Logistics
Acme Builders Incorporated
Data Analyst
May 2021 – Dec 2024
Oakland
Started as a construction laborer, then moved into business logistics.
Built business data systems in Python using NumPy and Pandas to clean and organize records across departments.
Organized, updated, and archived company records to support accurate data management.
Data Maintenance · Business Data Management · Google Workspace · Data Control
Engineering Blog
Latest insights from AI security and data science research.
Demoing Data Poisoning Detection at Continue DX
Latest Post • December 2024 • ML Security Research Update
Recently had the opportunity to demo Antidote Intelligence at Continue DX, showcasing our content-aware data poisoning detection system to industry professionals. The response was incredibly encouraging, with several attendees expressing interest in the practical applications for enterprise ML pipelines.
Current Outreach Efforts
I'm actively reaching out to vector embedding marketing vendors and MIT researchers to explore collaboration opportunities around data quality assessment. The core methodology we've developed—using AI agents for hypothesis generation and systematic validation—has broader applications beyond just poisoning detection.
Key Technical Innovations
6-agent pipeline for comprehensive content analysis
Statistical validation using sample size significance algorithms
Content-aware filtering that goes beyond metadata
Real-time detection for high-stakes ML applications
Industry Impact Potential
The conversations at Continue DX reinforced something important: data quality is the silent crisis in ML. While everyone focuses on model architecture and training techniques, corrupted training data can undermine even the most sophisticated systems. This is especially critical in financial services, healthcare, and autonomous systems where the stakes are highest.
Looking forward to sharing more updates as these partnerships develop. If you're working on data quality challenges or RAG system validation, I'd love to connect and discuss potential applications.
ML Security · Data Quality · Research Collaboration · Industry Demo
Maldemic: Create a Virus, Watch it Spread
Earlier Post • Feb 2025 • Realtime Pandemic Simulation
Create a virus yourself and watch it spread. See live SIRD data between cities and learn how certain viruses spread more than others through our interactive realtime simulation.
Stochastic Markov Chain Algorithm
Using a stochastic Markov chain algorithm written in NumPy and SciPy, we use the distance between cities to determine how likely the virus is to spread from one city to another. We then apply the resulting matrix to shuffle populations between cities and update the SIR counts accordingly.
Once the Python program has computed this data, we visualize the simulation in Godot with a 3D globe interface.
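As a rough illustration of the distance-based spreading step described above, here is a minimal sketch. It is not the project's code: it uses plain Python instead of NumPy and SciPy to stay dependency-free, and the exponential distance kernel, the `stay` probability, the `scale_km` parameter, and the function names are all assumptions made for the example. It also moves expected (fractional) populations rather than sampling individual moves.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def transition_matrix(coords, stay=0.9, scale_km=1000.0):
    """Column-stochastic matrix: M[i][j] is the chance of moving from city j to city i.

    Assumption: travel probability decays exponentially with distance.
    """
    n = len(coords)
    M = [[0.0] * n for _ in range(n)]
    for j in range(n):
        weights = [0.0 if i == j
                   else math.exp(-haversine_km(coords[i], coords[j]) / scale_km)
                   for i in range(n)]
        total = sum(weights)
        for i in range(n):
            if i == j:
                # With nowhere to go, everyone stays put.
                M[i][j] = stay if total > 0 else 1.0
            elif total > 0:
                M[i][j] = (1.0 - stay) * weights[i] / total
    return M

def move(M, population):
    """Apply M to one compartment (for example the infected count per city)."""
    n = len(population)
    return [sum(M[i][j] * population[j] for j in range(n)) for i in range(n)]

# Hypothetical three-city example.
CITIES = {
    "San Francisco": (37.77, -122.42),
    "Los Angeles": (34.05, -118.24),
    "New York": (40.71, -74.01),
}
M = transition_matrix(list(CITIES.values()))
infected = move(M, [1000.0, 0.0, 0.0])  # all infections start in San Francisco
```

Because each column of `M` sums to 1, the total population is preserved by the move, and nearby cities receive a larger share than distant ones.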
Mathematical Framework
The total state of the global population is distributed among N cities, with each city's population split into susceptible, infected, and recovered groups.
Evolution equations:
x₀ = (s₀, i₀, r₀)
xₙ₊₁ = S(xₙ)
Where S(x) consists of:
xₙ₊₁ = C₂ ∘ Σ ∘ C₁(Msₙ, Miₙ, Mrₙ)
Technical Implementation
Markov Matrix M: Probabilistically shuffles population between cities
Global SIR Function Σ: Locally spreads diseases according to SIR model
Cleaning Functions C₁, C₂: Maintain population constants using the Hare-Niemeyer (largest remainder) method
3D Visualization: Real-time Godot globe showing spread patterns
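One way to read the update xₙ₊₁ = C₂ ∘ Σ ∘ C₁(Msₙ, Miₙ, Mrₙ) is sketched below in plain Python. This is an interpretation, not the project's implementation: it assumes the cleaning functions round fractional head counts back to whole people while keeping the global total fixed, and the `beta` and `gamma` rates and all function names are hypothetical.

```python
import math

def largest_remainder(values, total):
    """Round non-negative values to integers that sum exactly to `total`.

    Hare-Niemeyer: floor every quota, then give the leftover units to the
    entries with the largest fractional remainders.
    """
    s = sum(values)
    if s == 0:
        if total != 0:
            raise ValueError("cannot distribute a positive total over zeros")
        return [0] * len(values)
    quotas = [v * total / s for v in values]
    result = [math.floor(q) for q in quotas]
    leftover = total - sum(result)
    by_remainder = sorted(range(len(values)),
                          key=lambda k: quotas[k] - result[k], reverse=True)
    for k in by_remainder[:leftover]:
        result[k] += 1
    return result

def clean(s, i, r, total):
    """Assumed role of C1 and C2: whole people, constant global population."""
    n = len(s)
    flat = largest_remainder(list(s) + list(i) + list(r), total)
    return flat[:n], flat[n:2 * n], flat[2 * n:]

def sir_local(s, i, r, beta, gamma):
    """One discrete SIR update inside a single city (the Sigma step)."""
    n = s + i + r
    if n == 0:
        return s, i, r
    new_infections = beta * s * i / n
    recoveries = gamma * i
    return s - new_infections, i + new_infections - recoveries, r + recoveries

def mat_vec(M, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]

def step(M, s, i, r, beta=0.3, gamma=0.1):
    """x_{n+1} = C2(Sigma(C1(M s, M i, M r))) for integer city populations."""
    total = sum(s) + sum(i) + sum(r)
    s, i, r = clean(mat_vec(M, s), mat_vec(M, i), mat_vec(M, r), total)  # C1
    cities = [sir_local(a, b, c, beta, gamma) for a, b, c in zip(s, i, r)]
    s, i, r = (list(col) for col in zip(*cities))                        # Sigma
    return clean(s, i, r, total)                                         # C2

# Hypothetical two-city run with a column-stochastic movement matrix.
M = [[0.9, 0.2],
     [0.1, 0.8]]
state = ([990, 1000], [10, 0], [0, 0])
for _ in range(50):
    state = step(M, *state)
```

Rounding with the largest remainder method after both the movement and the infection step keeps every compartment an integer without letting the global head count drift.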
Grant Recognition
This project received grant funding to incorporate Neural Networks for enhanced predictive modeling. The combination of stochastic modeling with machine learning creates a powerful tool for understanding pandemic dynamics.
The simulation allows users to experiment with different virus parameters and observe how mathematical models translate into real-world spread patterns, making complex epidemiological concepts accessible and interactive.
A game recommendation app backed by a custom ETL pipeline. It pulls data from Steam, YouTube, and the web, using VADER sentiment, regex, gameplay keyword frequency, and spam filters to extract the most insightful reviews. From these, it procedurally generates tags and places 20K+ games in a hierarchical genre tree. Each game becomes a vector capturing gameplay, tone, art, and music; searches prioritize gameplay similarity with user‑weighted tags.
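The gameplay-first ranking can be pictured as a weighted cosine similarity across the four aspects. The sketch below is only an illustration under assumptions: the per-aspect vector layout, the default weights, and the `recommend` helper are hypothetical, and the real system's embedding and tag weighting may differ.

```python
import math

# Assumed default: gameplay dominates; a user could override these weights.
DEFAULT_WEIGHTS = {"gameplay": 0.6, "tone": 0.15, "art": 0.15, "music": 0.1}

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def similarity(game_a, game_b, weights=DEFAULT_WEIGHTS):
    """Weighted average of per-aspect cosine similarities."""
    total = sum(weights.values())
    return sum(w * cosine(game_a[k], game_b[k]) for k, w in weights.items()) / total

def recommend(query, catalog, weights=DEFAULT_WEIGHTS, top_k=5):
    """Return the names of the top_k catalog entries most similar to `query`."""
    ranked = sorted(catalog,
                    key=lambda name: similarity(query, catalog[name], weights),
                    reverse=True)
    return ranked[:top_k]

# Hypothetical toy catalog with two-dimensional aspect vectors.
query = {"gameplay": [1, 0], "tone": [1, 0], "art": [1, 0], "music": [1, 0]}
catalog = {
    "same_gameplay": {"gameplay": [1, 0], "tone": [0, 1], "art": [0, 1], "music": [0, 1]},
    "same_look": {"gameplay": [0, 1], "tone": [1, 0], "art": [1, 0], "music": [1, 0]},
}
picks = recommend(query, catalog)
```

With the assumed weights, a game that plays the same but looks different outranks one that looks the same but plays differently. Raising the `art` weight would flip that order.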
Gamified assignment tracker for Canvas LMS. Removes the need for self-accountability by letting you check off only tasks you've actually finished. Build a city with coins earned from tasks. Assignment urgency is estimated with a linear regression over completion patterns and difficulty.
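To make the urgency idea concrete, here is a small sketch of one possible scoring scheme. Everything specific in it is an assumption for illustration: the feature (a difficulty rating), the target (hours a past task took), and the ratio used as the urgency score are hypothetical and are not taken from the project.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x

def urgency(difficulty, hours_left, model):
    """Predicted hours of work divided by hours remaining until the deadline."""
    slope, intercept = model
    hours_needed = max(slope * difficulty + intercept, 0.0)
    return hours_needed / max(hours_left, 1.0)

# Hypothetical history: difficulty rating and hours each finished task took.
past_difficulty = [1, 2, 3, 4]
past_hours = [2.0, 4.0, 6.0, 8.0]
model = fit_line(past_difficulty, past_hours)

score = urgency(difficulty=3, hours_left=12, model=model)
```

A score near or above 1 would mean the predicted work no longer fits in the time left, so that assignment should be surfaced first.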
Stochastic Markov‑based SIRD simulator with a Godot 3D globe visualization. Models virus spread from city coordinates. Secured a grant to integrate neural networks for predictive spread analysis.
Scalable concurrent Go web crawler built for speed. Uses dual worker architecture: 300 extract workers download pages while 300 DB workers batch-upload to SQLite (150 docs/batch). 9000-job queue buffer enables lightning-fast crawling of entire domains like USF's website.
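The dual-worker layout can be sketched as two thread pools joined by a bounded queue. The crawler itself is written in Go; the version below is a Python analogue meant only to show the shape of the pipeline. The batch size and queue size come from the description above, while `fake_fetch`, the `pages` table schema, the write lock, and the small default worker counts are assumptions for the example.

```python
import queue
import sqlite3
import threading

BATCH_SIZE = 150    # documents per database batch
QUEUE_SIZE = 9000   # bounded buffer between stages
STOP = object()     # sentinel telling a worker to exit
WRITE_LOCK = threading.Lock()  # SQLite allows only one writer at a time

def fake_fetch(url):
    """Stand-in for an HTTP download; a real crawler would request the page."""
    return url, f"<html>{url}</html>"

def extract_worker(urls, docs):
    """Download pages and hand them to the database stage."""
    while True:
        url = urls.get()
        if url is STOP:
            break
        docs.put(fake_fetch(url))

def db_worker(db_path, docs):
    """Collect documents and write them to SQLite in batches."""
    conn = sqlite3.connect(db_path)
    batch = []

    def flush():
        if batch:
            with WRITE_LOCK, conn:  # one transaction per batch
                conn.executemany(
                    "INSERT OR IGNORE INTO pages (url, body) VALUES (?, ?)", batch)
            batch.clear()

    while True:
        doc = docs.get()
        if doc is STOP:
            break
        batch.append(doc)
        if len(batch) >= BATCH_SIZE:
            flush()
    flush()  # write the final partial batch
    conn.close()

def crawl(url_list, db_path, n_extract=8, n_db=4):
    """Run both worker pools to completion over a fixed list of URLs."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, body TEXT)")
    conn.commit()
    conn.close()

    urls = queue.Queue(maxsize=QUEUE_SIZE)
    docs = queue.Queue(maxsize=QUEUE_SIZE)
    extractors = [threading.Thread(target=extract_worker, args=(urls, docs))
                  for _ in range(n_extract)]
    writers = [threading.Thread(target=db_worker, args=(db_path, docs))
               for _ in range(n_db)]
    for t in extractors + writers:
        t.start()
    for url in url_list:
        urls.put(url)
    for _ in extractors:
        urls.put(STOP)
    for t in extractors:
        t.join()
    for _ in writers:  # stop writers only after all downloads are queued
        docs.put(STOP)
    for t in writers:
        t.join()
```

The bounded queues provide backpressure: if the database stage falls behind, extract workers block on `docs.put` instead of piling up pages in memory.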