I’m a data scientist & ML engineer.
I build scalable data solutions, solve natural-language processing problems, and teach Python and software engineering best practices to other data scientists.
I’m currently a machine learning engineer in Pluralsight’s Boston office, focusing on data-driven and personalized approaches that help our learners explore and unlock new tech skills on the Pluralsight platform. Previously, I was a manager / technical lead at Wayfair on the Merchandising Ops Data Science team, providing data-driven solutions for managing, maintaining, and capitalizing on structured & unstructured information about millions of Wayfair.com products.
Prior to Wayfair, I held my first data science role at Cinch Financial, a small startup building a one-of-a-kind personal financial advisor to level the playing field with banks and credit card issuers. I joined Cinch through the Insight Data Science program, an intensive postdoctoral fellowship for STEM PhDs transitioning to data-driven industry. The Boston session I attended focused on healthcare & biomedical applications – I was particularly drawn to the opportunity for data science in socially-beneficial applications.
My academic background is in nuclear engineering. Before transitioning to data science, I was a researcher in nuclear fusion energy at the MIT Plasma Science & Fusion Center’s Alcator C-Mod tokamak experiment, where I had worked in various capacities since my sophomore year. I completed my doctorate in September 2014.
Outside of work, I spend much of my time hiking, rock & ice climbing, and mountaineering. While most of my experience is in the excellent White Mountains of New Hampshire, I’ve also logged ascents of Mt. Rainier (14,411’) in Washington, and Ixtaccíhuatl (17,159’) and Pico de Orizaba (18,491’) in Mexico. I also enjoy board and tabletop games, and have worked for several years as an Enforcer at PAX East and Unplugged, teaching new & unreleased board games.
September 2020 ~ Synthwave Styling Data Visualizations in Python with Altair
June 2020 ~ Data Processing with Dask
February 2020 ~ Python CLI Utilities with Poetry and Typer
January 2020 ~ Managing Python Environments
Data Processing & Pipelines: Apache Airflow, Apache Spark, Dask, Pandas, Luigi
Database Technologies: SQL (Postgres, MS SQL Server), Hive, Vertica, AWS Aurora, AWS Athena
Neural Networks & Deep Learning: TensorFlow, Keras, PyTorch
Machine Learning & Statistics: scikit-learn, XGBoost, H2O, PyMC3, Pomegranate, LightFM
Natural Language Processing: FastText, spaCy, NLTK, Gensim, CoreNLP
Visualization & Dashboards: Matplotlib, Seaborn, Bokeh, Altair, D3.js, Metabase, Superset
Hosting & Deployment: FastAPI, Flask, Docker, Kubernetes, AWS SageMaker