Mike McCarty and Gil Forsyth work at the Capital One Center for Machine Learning, where they are building internal PyData libraries that scale with Dask and RAPIDS. For this webinar, they’ll join Hugo Bowne-Anderson and Matthew Rocklin to discuss their journey to scale data science and machine learning in Python.
In 2020, Capital One completed its transition to the cloud and left its data centers behind. The company now uses Dask in the cloud to scale data science and machine learning. We’ll take a whirlwind tour through what this looked like and dive into several key specifics, such as how to deploy Dask and RAPIDS on AWS, the ins and outs of scaling your XGBoost workflows, and how Capital One leverages the scikit-learn API to scale with custom estimators.
We’ll also hit some more cultural notes, such as how Capital One is building internal communities that are knowledgeable about best practices for using these OSS tools, and why it’s important for an enterprise company to contribute to the open-source community today.
After attending, you’ll know:
- How Dask has grown at Capital One and some of the challenges they faced
- How (and why) to scale XGBoost training
- How to leverage the scikit-learn API to build your own custom estimators that scale
- Why it is important for institutions to participate in the open-source projects they use
Join us on Tuesday, February 23rd, at 5:00 PM US Eastern time by signing up here, and dive into the wonderful world of all things Dask and scalable Python at Capital One!