Mike Arpaia

About Me

I’m a computer scientist working on machine learning and distributed systems. I’m passionate about deep learning approaches to natural language problems, deep reinforcement learning, and the effective design, implementation, and operation of distributed systems in high-throughput production environments.

Professionally, I’m currently a Machine Learning Architect at Workday where I work across the organization in an effort to improve the scalability and stability of our software and infrastructure while also increasing the velocity, rigor, and reproducibility of our research.

Before Workday, I was the Co-Founder and CTO of an infrastructure analytics startup where I built and led a completely remote engineering organization and raised 9.6 million dollars in venture funding. I’ve also worked as an Engineering Manager and Software Engineer at Facebook as well as a Software Engineer at Etsy.

While working at Facebook in 2014, I created and open-sourced a tool called osquery which exposes a SQL interface to a fleet of computers for fast, flexible operating system monitoring. Osquery has since become a foundational tool in the information security and compliance industries.

Outside of work, I’m an avid outdoors enthusiast. I like to train for and participate in a variety of mountain sports including rock climbing, skiing, and adventure motorcycling. I also play the electric and upright bass. The bass plays a foundational role in the rhythmic structure of music, and that has always drawn me to the instrument.

Professional Experience

Machine Learning Architect

Workday

April 2020 – Present Boulder, CO
After helping to build, launch, and scale a distributed embedding-based search and recommendation system, I was promoted to Architect (Workday’s Director-level IC role). My responsibilities shifted toward establishing and unifying the architecture and strategy for wider data science and engineering initiatives, with the goal of improving the velocity and rigor of methodological inquiry as well as the long-term scalability and stability of the platform across all research and engineering groups.

Principal Machine Learning Engineer

Workday

March 2019 – April 2020 Boulder, CO
At Workday, I joined the Machine Learning organization to lead the architecture, implementation, and productionalization of a distributed embedding-based search and multi-document matching engine. I led the design and delivery of several different parts of the stack which we successfully launched and scaled in support of several generally available Workday products with the Recruiting, Learning, and Talent product organizations.

Co-Founder & CTO

Kolide

July 2016 – January 2019 Boulder, CO
As the Co-Founder and CTO of a small venture-backed infrastructure analytics startup, I built and led a high-performing, fully remote engineering organization with engineers in every US time zone. I also acted as the lead architect and developer for almost all of our backend, infrastructure, and operating system software. As Co-Founder, my role also allowed me to spend time as a frequent author of blog articles, part-time salesman, periodic financial negotiator, persistent pedagogue, etc.

Engineering Manager

Facebook

October 2015 – June 2016 Menlo Park, CA
After working as an individual contributor at Facebook, I became the Engineering Manager of the intrusion detection infrastructure team. In addition to deepening my technical knowledge of intrusion detection and large-scale data analytics, I learned how to be an effective people leader for a team of extremely high-performing individuals. Facebook provides a great deal of support to individual contributors who transition to engineering management, and I took advantage of these resources as much as possible.

Software Engineer

Facebook

February 2014 – October 2015 Menlo Park, CA
I joined the team at Facebook to work on improving host intrusion detection capabilities, specifically on macOS and Linux, where the available tooling was falling behind the Windows tools offered by vendors. To accomplish this across all of Facebook’s environments, I created the osquery project and widely deployed it throughout corp and production with enormous help from an amazing team. Osquery is the most starred security project on all of GitHub!

Senior Software Engineer

Etsy

July 2013 – February 2014 New York, NY
After helping to establish several aspects of Etsy’s infrastructure and application security practices, I was promoted to Senior Software Engineer. I was the youngest Senior Engineer in the history of the company as well as the only engineer to have ever been an active participant in both the Engineering and Operations on-call rotations. Being passionate about data infrastructure and data analytics, I also became the designated security lead for data infrastructure and voluntarily maintained some analytics infrastructure for our team of Data Analysts.

Software Engineer

Etsy

October 2012 – July 2013 New York, NY
While at Etsy, I was a Software Engineer on the Security team working on a wide range of engineering efforts to ensure the security of Etsy’s infrastructure and application. One of the things that I worked on was a custom host intrusion detection system, which I deployed and managed across Etsy’s corporate infrastructure. I was fortunate enough to participate in several red team exercises designed to test its effectiveness. This effort provided experience and domain expertise that guided many design decisions in osquery.

Security Engineer

iSEC Partners

August 2011 – October 2012 New York, NY
At iSEC Partners, I was a penetration tester and security researcher, specializing in infrastructure security, mobile operating system security, and mobile application security. I did research on mobile device exploitation, PHP application security, and mobile application security.

Security Engineer

Gotham Digital Science

January 2011 – August 2012 New York, NY
I performed security assessments for GDS while also attending university. I participated in infrastructure and application assessments for a variety of large financial and technology companies.

Network Technician

Stevens Institute of Technology

June 2010 – January 2011 Hoboken, NJ
During university, I worked as a Network Technician for the Stevens IT department, where I performed a variety of maintenance, debugging, and repair tasks on networking equipment throughout the campus.

Research Experience

Research Engineer

Mila - Québec Artificial Intelligence Institute

May 2019 – December 2019 Montréal, QC
At Mila, I worked as a volunteer on software and infrastructure engineering objectives for a project which aimed to raise awareness and conceptual understanding of climate change by depicting accurate and personalized outcomes of climate change using cutting-edge techniques from artificial intelligence and climate modeling.

Open-Source Leadership

Alpaca Trading API C++ Client Project Lead

Alpaca Trading API C++ Client Project Founder

Kubernetes Release Team Member

Kubernetes Multi-Tenancy Working Group Member

Osquery Project Lead

Osquery Project Founder

Conference Talks

Using a Kubernetes Operator to Manage Tenancy in a B2B SaaS App

Companies that create products for other companies or teams often have to reason about how to deal with the application-level tenancy of each team. This presentation will discuss how Kolide has approached the problem of application tenancy by building a Kubernetes Operator to manage the complete lifecycle of each tenant as an isolated instance of a single-tenant application.

Behind The Scenes: Kubernetes Release Notes Tips and Tricks

This session aims to shed more light on the release note process from the Kubernetes contributor’s point of view. We will briefly …

Instrumenting Dynamic Environments with Source Control, Peer Review, and Decentralized Intelligence Distribution

Osquery configurations often start simple and static, but, as the complexity of an osquery deployment grows, the level of dynamism …

Starting, Growing, and Scaling Your Host Intrusion Detection Efforts

Osquery is a lightweight host intrusion detection tool that organizations can use to monitor extremely large production environments as …

Building Successful Open Source Security Software

Released in 2014 by Facebook, osquery is an open source operating system instrumentation framework and toolset. In this talk, I will …

Publications

RTFn: Enabling Cybersecurity Education Through a Mobile Capture the Flag Client

Cybersecurity is one of the most highly researched and studied fields in computer science. It has made its way into numerous accredited …

Contact