CUDA / ML systems
Fused-Linear-Attention
The repository behind the fused attention case study, focused on custom CUDA execution and profiling.
Code
I do not treat GitHub as a dump of everything I have touched. The repositories here matter because they connect directly to the projects and notes elsewhere on the site.
Training, evaluation, deployment, and feedback-loop code for deadline and expiry extraction.
FastAPI serving, retrieval workflows, and recommendation-system logic from the StyleSync project.