About Me

I’m a data scientist and computer scientist with expertise in algorithm development, machine learning, and production software engineering. With more than 20 years of experience in genomics and bioinformatics, I have a proven ability to design algorithms and translate them into highly optimised, maintainable, and professionally engineered software that handles data at petabyte scale.

Throughout my career, I’ve been dedicated to finding innovative ways to make vast datasets accessible to biologists. This journey began at the Sanger Institute and continued at Solexa, where I played a key role in pioneering the informatics of next-generation sequencing and developed the first program for aligning “short read” DNA sequences. My subsequent work at Illumina encompassed a wide range of DNA sequencing applications, including leading a collaboration with Genomics England on methods for population-scale genomics and applying machine learning to comparative analysis between species.

I’m skilled in applying professional software engineering practices to tackle complex data challenges in both high-performance (C/C++) and data science (Python) environments. Having worked in distributed international teams throughout my career, I excel at collaborative development across disciplines and organisations.