In this live code-along session, Hugo Bowne-Anderson, Data Scientist and Head of Evangelism and Marketing at Coiled, will teach you everything you wanted to know about scaling your data science work to larger datasets and larger models, while staying in the comfort of the PyData ecosystem (numpy, pandas, scikit-learn, Jupyter notebooks). This live code-along session is a collaboration between DataCamp and Coiled.
You’ll come away knowing:
- How to reason about when you need to scale your data and machine learning work and when not to;
- How to leverage distributed computation on your local workstation (such as your laptop) to analyze larger datasets and build larger, more complex models;
- How to harness the power of clusters to support larger-than-memory computation, all from the comfort of your own laptop;
- How to do all of this while writing code similar to the numpy, pandas, and/or scikit-learn code you already write.
When
Tuesday, September 15th, 2020, 4 PM EDT
Where
Live on our Facebook page. Register to be reminded!
Instructions
Follow the steps found in this GitHub repository if you wish to code along with Hugo! All relevant materials will be in the repository by the end of the day, Monday, September 14th, so make sure to check it out again the morning of the live coding session.
Prerequisites
Not a lot. It would help if you knew:
- programming fundamentals and the basics of the Python programming language (e.g., variables, for loops);
- a bit about pandas, numpy, and scikit-learn (although not strictly necessary);
- a bit about Jupyter Notebooks;
- your way around the terminal/shell.
However, Hugo finds that the most important and beneficial prerequisite is a will to learn new things, so if you have this quality, you’ll definitely get something out of this code-along session.
If you’d like to just watch and not code along, you’ll also have a great time, and these notebooks will be downloadable afterward.
If you are going to code along and use the Anaconda distribution of Python 3 (see below), Hugo asks that you install it before the session.
Note: Live submissions to Kaggle may happen during the event, so if you want to do that, make sure to register for an account before the session.
If you have any thoughts, comments, or questions, feel free to reach out to Hugo on Twitter: @hugobowne.
Content retrieved from:
https://www.datacamp.com/community/blog/data-science-machine-learning-at-scale.