At Netflix, I focus on the intersection of ML modeling and hardware efficiency. I work across the entire inference stack, using techniques like model compression and architectural adaptations to make foundation models run as fast and efficiently as possible for 300 million users. I pivoted into this role after doubling our core evaluation throughput and running weight ablation studies to systematically diagnose model overparameterization.
My career is built on the belief that model performance is inseparable from system design. I have spent years resolving the economic and scaling friction points of ML in production to bridge the gap between modeling and systems. My work spans stabilizing homepage recommendation engines at scale, optimizing AI assistant workloads to slash compute costs, and co-designing ML platforms. Moving forward, I am focusing on inference research to build hardware-optimized architectures.
My background includes a master’s degree in Artificial Intelligence from the University of Amsterdam and early professional experience in edge computer vision in Spain.
MSc Artificial Intelligence, 2020
University of Amsterdam
BSc Biomedical Engineering, 2018
College of New Jersey