So you want to become a data scientist, or perhaps you already are one and want to expand your toolbox. You have come to the right place. The aim of this page is to provide a comprehensive learning path for people who are new to Python for data science. It walks through the steps you need to take, in order. If you already have some background, or don't need all of the components, feel free to adapt the path to your needs and let us know how you changed it.
You can also look at the condensed version of this learning path in this infographic: Quick Guide to Learning Data Science in Python.
We have created a new learning path for you! Check it out on our courses page and start your data science journey right away.
Step 1: Set up your machine.
Now that you've made up your mind, it's time to set up your machine. The easiest way to get started is to download Anaconda from Continuum.io. It comes packaged with most of the things you will ever need. The main drawback of this route is that you have to wait for Continuum to update their packages, even when a newer version of an underlying library is already available. If you are just starting out, that should hardly matter.
If you have any problems installing, you can find more thorough instructions for various operating systems here.
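Once the installation finishes, a quick sanity check can save you some confusion later. The snippet below is a minimal sketch that uses only the standard library to report which of the libraries mentioned in this learning path can be imported. The `PACKAGES` list and the `check_packages` helper are illustrative names, not part of Anaconda.

```python
import importlib.util
import sys

# Libraries covered later in this learning path; edit the list as you like.
PACKAGES = ["numpy", "scipy", "matplotlib", "pandas"]


def check_packages(names):
    """Return a dict mapping each package name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Show which Python interpreter version is running.
    print("Python", sys.version.split()[0])
    for name, found in check_packages(PACKAGES).items():
        print(f"{name:<12} {'found' if found else 'MISSING'}")
```

If anything is reported as missing, the installation instructions for your operating system are the place to look first.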
Step 2: Learn the fundamentals of the Python programming language.
You should begin by learning the fundamentals of the language, its libraries, and its data structures. One of the best places to start your journey is Analytics Vidhya's free Python course. It focuses on how to get started with Python for data science, and by the end you should be comfortable with the basic concepts of the language.
Assignment: Take Analytics Vidhya's fantastic free Python course.
Alternative resources: If interactive coding isn't your thing, check out The Google Class for Python. It is a two-day class series that covers some of the topics mentioned later.
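To give you a feel for what "fundamentals" means here, the following is a small sketch of the core building blocks you will meet in any introductory course: lists, dictionaries, a list comprehension, and a function. The scores and the student name are made-up illustration data.

```python
# A list keeps values in order; these are hypothetical exam scores.
scores = [72, 88, 95, 61]

# A dictionary maps keys to values.
student = {"name": "Asha", "scores": scores}

# A list comprehension builds a new list from an existing one.
passed = [s for s in scores if s >= 70]


def average(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


avg = average(student["scores"])

print(student["name"], "passed", len(passed), "exams with an average of", avg)
```

If every line here makes sense to you, you are ready to move on to the next step.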
Step 3: Become acquainted with Python's Regular Expressions.
You will use them a lot for data cleaning, especially if you are working with text data. The best way to learn regular expressions is to go through the Google class and keep this cheat sheet handy.
Assignment: Do the baby names exercise.
If you still need more practice, follow this text cleaning tutorial. It will challenge you on the various steps involved in data wrangling.
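As a taste of what regular expressions do in a cleaning task, here is a minimal sketch using Python's built-in `re` module. The sample string is invented, and the email pattern is deliberately simple; real-world addresses need a more careful pattern.

```python
import re

# A messy, made-up record of the kind you often meet in raw text data.
raw = "  Contact:  JOHN.DOE@Example.com , phone: (555) 123-4567!!  "

# Collapse runs of whitespace into single spaces and trim both ends.
cleaned = re.sub(r"\s+", " ", raw).strip()

# Find anything that looks like an email address (simplified pattern).
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", cleaned)

# Locate the phone number, then strip every non-digit character from it.
phone_match = re.search(r"\(\d{3}\)\s*\d{3}-\d{4}", cleaned)
digits = re.sub(r"\D", "", phone_match.group()) if phone_match else None

print(cleaned)
print(emails)
print(digits)
```

The three functions used here, `re.sub`, `re.findall`, and `re.search`, cover a large share of everyday cleaning work.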
Step 4: Learn Python scientific libraries such as NumPy, SciPy, Matplotlib, and Pandas.
This is where the fun begins! Here is a brief introduction to the various libraries. Let's start practicing some common operations.
Work through the NumPy tutorial thoroughly, especially the parts on NumPy arrays. This will form a good foundation for things to come.
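To show why arrays deserve the attention, here is a short sketch of the operations you will use most often: creating an array, inspecting its shape, slicing, element-wise arithmetic, aggregation along an axis, and boolean filtering. It assumes NumPy is installed, which is the case with Anaconda; the values are arbitrary.

```python
import numpy as np

# A 2-D array with 2 rows and 3 columns.
a = np.array([[1, 2, 3],
              [4, 5, 6]])

shape = a.shape            # (rows, columns)
second_col = a[:, 1]       # every row, column index 1
doubled = a * 2            # arithmetic applies element by element
col_means = a.mean(axis=0) # mean of each column
large = a[a > 3]           # boolean mask keeps values greater than 3

print(shape)
print(second_col)
print(doubled)
print(col_means)
print(large)
```

Notice that none of these operations needs an explicit loop; that is the habit the NumPy tutorial is meant to build.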
Then, take a look at the SciPy tutorials. Go over the introduction and the essentials, then do the rest based on your needs.
If you guessed that the Matplotlib tutorials come next, you are wrong! They are too comprehensive for our needs here. Instead, work through this IPython notebook up to Line 68 (i.e. up to the animations section).