John Healy

Lead Strategic Researcher

I'm a mathematician and data scientist at the Tutte Institute for Mathematics and Computing (TIMC). I enjoy identifying the fundamental mathematical problems that underlie a variety of real-world problems and working with a team to design algorithms to solve them. I then enjoy closing the loop by bringing those solutions back to clients and helping them understand how to use them to make a difference.

I have worked with a wide variety of machine learning and statistical techniques over the years, from neural networks to relational or graph analytics. Most recently, my work has focused on unsupervised learning, specifically clustering, outlier detection, dimension reduction and interactive data visualization.

Current research and/or projects

Most recently, the projects I've been most heavily involved in are the development of a fast version of the density-based clustering algorithm HDBSCAN, which is currently in scikit-learn-contrib; the invention and development of a dimension reduction algorithm called Uniform Manifold Approximation and Projection (UMAP); and a Python library for vectorizing variable-length sequences called Vectorizers.

The theme of my current work is developing a solid, practical pipeline for vectorizing, exploring and labelling data within low-dimensional interactive maps. I have a particular interest in cyber defense data.

Professional activities / interests

I am actively involved in promoting and assisting with data science during the cyber defense Geek Week workshops.

Key publications