Learning Python For Data Science

For those of you who wish to begin learning Python for Data Science, here is a list of resources that will get you up and running. Included are online tutorials and short interactive courses, MOOCs, newsletters, books, useful tools and more. We decided to put this together so that you can begin learning Data Science with Python right off the bat, without having to spend hours surfing the web in search of resources. Please note that while we believe the list is extensive, it is by no means exhaustive. We have probably missed a couple of nice resources, so feel free to mention them in the comments if you are so inclined. :)


Intro to Python for Data Science by DataCamp: This free and interactive tutorial focuses on Python skills and tools specifically for Data Science use. Through this course you will learn the foundations of Python as well as the most essential data science tools. This is a very easy and friendly way to introduce yourself to the Python syntax and get off to a positive start.

Python Programming by Codecademy: While this course from Codecademy doesn’t teach Python in the context of data science, it’s still a fantastic resource. In this free course you can get exposure to more of the fundamentals of Python programming and maybe pick up some web development skills along the way. Most importantly, you will be getting practice with Python syntax.

A Byte of Python: This is a collection of very friendly tutorials on all the basics of Python that can help you get started and unstuck, especially with simple tasks that you need to do when working with Python.

Learn Python: This is a platform where you will find a series of programming tutorials and in-browser exercises that go along with them. This could be useful as a set of examples of how to go about completing certain programming tasks, as well as an exposure to some basic Python programming topics.

Code Mentor Python Tutorials: You will be able to find a number of nice tutorials varying in type and scope on codementor. Some articles will teach you tricks or best practices when it comes to using Python, while others are full case studies of Python projects and applications of data science to various domains.

Dataquest: Dataquest teaches you data science interactively in your browser, using Python. They advertise themselves as a company that teaches you all the skills you need to be a well-rounded data scientist or data analyst. They help you build your portfolio with projects after teaching you the theory. Dataquest members have been hired at companies like Fitbit and 3M. Dataquest offers beginner and intermediate content on Python for free. The rest of the content is available with a monthly subscription. I have not used them personally, but this seems intriguing.

Intermediate Python for Data Science by DataCamp: In this sequel to the Intro to Python for Data Science you will carry on learning the key tools for plotting and visualization, working with data, and basic Python programming, and finish with a full hands-on case study where you use all of your new skills together. In addition, you receive a certificate that you can share in your social and professional networks.


Massive Open Online Courses (MOOCs)

Introduction to Python for Data Science by Microsoft: In this course you start with the true basics, including variables and arithmetic, and work your way up to NumPy arrays and Pandas DataFrames. Gradually you begin to cover topics central to data science, including visualization using Matplotlib, and control flow. This open course consists of video tutorials, and what sets it apart are the interactive in-browser exercises that you complete as you learn.

Python for Everybody by University of Michigan: This course focuses on the behind-the-scenes part of data science, namely retrieval and processing of data, as well as some visualization. You begin by learning basic Python, and then learn how to work with data structures and how to access data from databases and the web.
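
To give a flavour of what "working with data structures and databases" can look like in practice, here is a minimal sketch using only the Python standard library. It is not taken from the course; the function names (`word_counts`, `store_counts`, `top_words`), the table layout and the sample sentence are all made up for illustration.

```python
import sqlite3
from collections import Counter


def word_counts(text):
    """Count how often each lowercase word appears in a piece of text."""
    return Counter(text.lower().split())


def store_counts(conn, counts):
    """Save word counts into a (hypothetical) 'counts' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS counts (word TEXT PRIMARY KEY, n INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO counts (word, n) VALUES (?, ?)",
        counts.items(),
    )
    conn.commit()


def top_words(conn, limit):
    """Return the most frequent words, ties broken alphabetically."""
    cur = conn.execute(
        "SELECT word, n FROM counts ORDER BY n DESC, word ASC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    # An in-memory database keeps the example self-contained.
    conn = sqlite3.connect(":memory:")
    sample = "the quick brown fox jumps over the lazy dog the end"
    store_counts(conn, word_counts(sample))
    print(top_words(conn, 2))
```

The dictionary-like `Counter` does the processing and SQLite does the storage, which is roughly the division of labour you can expect to practice when retrieving and processing data.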

Data Analysis and Interpretation Specialization: This series of courses focuses on the analysis of data implemented with Python, and the interpretation of results. Once you become acquainted with tools for analyzing data in Python, you may participate in a capstone course testing your skills through a sponsored project.

Data Analysis with Python and Pandas on Udemy: After getting some basic Python knowledge you may want to explore a specific topic in more depth, and this course is a great way to do so. If you are brand new to Python, this course is not recommended. For everyone else, it is a great way to learn the Pandas library in greater detail.

Data Analysis with Python and Matplotlib on Udemy: Similar to the course above, this is not for complete novices. Rather, this course will allow you to delve deeper into visualization in Python with the Matplotlib library.

Intro to Data Science on Udacity: This course is about Data Science, not Python. Python is used as a tool, and a good amount of Python experience is necessary. After learning the basics of Python programming, however, this is a fantastic course to take to apply and expand your Python abilities.


Resources and Newsletters

Bite Python: Bite Python is a great newsletter to be signed up to, especially if you are always on the lookout for Python tips, tutorials, and essential news. As a novice you will find this relevant right off the bat.

PyData.org: PyData is a community of Python data tools users. They organize and host conferences dedicated to Python data tools. From their web page you can learn about all of the essential Python libraries, technologies, and tools built specifically for data analysis and data science.

Python documentation: This is the main source of Python documentation. As a data scientist it is very important to learn how to consult the documentation while working on a project or task. When you need explanations for certain functions or operations, this is the place to go.

This is a site dedicated to everything Python. On the site you will see numerous articles and news about Python, and frequent posts with data science and data analysis as topics.

Python Weekly: This is one of the most popular and essential newsletters dedicated to Python. While you might not find everything in the newsletter useful or relevant from the beginning, this is a good way of stepping into the community and being up-to-date with what’s going on in the Python universe.

Pycoder’s Weekly: This newsletter is somewhat more advanced, and is not entirely dedicated to Data Science. Still, you might stumble upon some topic that will be relevant to what you are doing with Python, particularly if you are also interested in computer science and development.

It becomes easier to take advantage of all of the open online courseware with this website. The site hosts extensive lists of resources for learning data science theory, as well as technologies including Python. If you are confident in your self-studying abilities, completing the curriculum can prepare you for a career in Data Science.



Books

In case you like to learn from books as well, here are a couple of good texts dedicated to learning Python as a tool for data analysis and data science. Most of these books consist of examples and exercises, and some are accompanied by actual data with which you can get your hands dirty while reading.

Intermediate Python: This online book is free to read and contains intermediate Python concepts which are usually not taught in beginner books. This is a must read if you have already finished beginner books.

Learn Python The Hard Way: This online book is free to read, and contains a ton of examples, exercises and demonstrations that will get you started and move you through most of the Python programming topics you are likely to need.

Practical Data Analysis with Python: This book is all about data analysis with Python playing the role of the data analysis tool. The coolest thing about this product is that you may purchase the book with the data used throughout which allows you to reproduce the analysis done in the text.

Python for Data Analysis: As the title suggests, this book is all about Python as a data analysis tool. The text covers many of the essential topics such as … . It’s always nice to read, but the best way to use this book is by consulting sections as needed.

Data Science from Scratch: First Principles with Python: There is a reason why this text has the phrase “from scratch” right in the title. The book is dedicated to teaching you HOW the data science techniques work in principle, using Python. You won’t see much of NumPy or Pandas, rather you will see the Python code for the essential algorithms used in data science.
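
To illustrate what "from scratch" means here, below is a minimal sketch of a least-squares line fit written in plain Python, with no NumPy or Pandas. This is not code from the book; the function names and the toy data are assumptions chosen for illustration.

```python
def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)


def variance(xs):
    """Sample variance (divides by n - 1)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def covariance(xs, ys):
    """Sample covariance of two equally long lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    slope = covariance(xs, ys) / variance(xs)
    intercept = mean(ys) - slope * mean(xs)
    return intercept, slope


if __name__ == "__main__":
    # Toy data lying exactly on y = 1 + 2x, so the fit should
    # recover an intercept close to 1 and a slope close to 2.
    xs = [1, 2, 3, 4]
    ys = [3, 5, 7, 9]
    intercept, slope = fit_line(xs, ys)
    print(f"intercept={intercept:.3f} slope={slope:.3f}")
```

Writing the pieces out like this is slower than calling a library, but it makes it obvious that a regression line is nothing more than a ratio of a covariance to a variance plus a shift by the means.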


Work Space

While analyzing your data and organizing data science projects with Python, you will need a work space where you will be writing your code and executing your analysis. There are a couple of good options designed for Data Science specifically.

Rodeo IDE: A product developed by Yhat, Inc. This is a relatively new IDE, but it deserves a mention because it has been designed specifically for data analysis projects, rather than for the general-purpose programming that Python is also capable of. For those familiar with the RStudio IDE for R, Rodeo is a very similar tool for working with Python.

Anaconda: Anaconda is a platform developed by Continuum Analytics, whose founders and developers are creators of and contributors to some of the most popular Python-based data science tools. Through Anaconda you will be able to get a package consisting of Python with the essential data analysis libraries (NumPy, SciPy, Pandas, ...), Jupyter notebooks, as well as a number of other tools for visualization and analysis.
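
Whichever way you install Python, it can be handy to check which of these libraries are actually available in your environment. The following is a small sketch using only the standard library; the helper name `check_packages` and the package list are assumptions for illustration and are not part of Anaconda.

```python
from importlib.util import find_spec

# Top-level import names of some commonly used data science libraries.
PACKAGES = ["numpy", "scipy", "pandas", "matplotlib"]


def check_packages(packages):
    """Map each package name to True if it can be imported here."""
    return {name: find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    for name, ok in check_packages(PACKAGES).items():
        print(f"{name:<12}{'found' if ok else 'missing'}")
```

`find_spec` only looks a module up without importing it, so the check is quick and has no side effects.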


12 thoughts on “Learning Python For Data Science”

  2. Hi, I found your post very useful and in line with our mission. I’m a member of a team building an online data science workspace for Python, R, and Julia. We are at the beginning of our journey, but I hope this information will be valuable for blog readers.

    PS. If you want more info please write to us at hello at

  3. Very nice compilation of learning resources. When it comes to workspaces, I think that one could mention the distribution Python(x, y), which comes with the IDE Spyder. I would say that Spyder is quite similar to RStudio. Of course, it does not look like RStudio in the way that Rodeo does.

    PyCharm is also a good one, I think.

    Again, thank you for a great compilation of resources.

  4. Fantastic post! I use Python to do data science all the time, on the job and at home on pet projects. I love it. I love the flexibility it gives you. And well done on recommending Anaconda; it’s unbelievably useful for managing the scientific Python packages. Some, like Matplotlib Basemap, can’t be pip installed, but are on Anaconda.

    Pandas is probably my favourite Python package! I wrote a tutorial on pandas on my blog (and numpy and matplotlib and basemap + more to come) you might want to check out too.
