Setting Up the Python Data Science Environment

The data science quest begins with a single step. Mine started by installing the right software.

If you want to start exploring radiology informatics in Python, you’ll first need to set up your environment. Anaconda by Continuum is free and one of the best ways to get started.

Anaconda

Anaconda is package management software for data science that makes it easy to get your computer set up the right way. Anaconda's documentation has detailed download instructions, but I’ll provide an abbreviated set of steps here.

I chose the Python 3.5 version for 64-bit Windows. You should pick the one that works for your computer.

Let’s install this just for the local user for now.

Both adding to PATH and setting Anaconda as default Python are good ideas. However, seasoned Python developers may have their own preferred settings.
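If you want to confirm which Python your system actually picks up after the install, a short check like the one below can help. It is only a sketch: the function name `describe_interpreter` is my own, and looking for a `conda-meta` folder is a heuristic for spotting an Anaconda install, not an official test.

```python
import os
import shutil
import sys


def describe_interpreter():
    """Collect basic facts about the Python interpreter running this code."""
    return {
        # Full path of the interpreter that is executing this script.
        "executable": sys.executable,
        # Version as major.minor.micro, e.g. the 3.5 series chosen above.
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # First "python" found on PATH, or None if there is none.
        "python_on_path": shutil.which("python"),
        # Heuristic (an assumption): conda installs keep a conda-meta folder.
        "has_conda_meta": os.path.isdir(os.path.join(sys.prefix, "conda-meta")),
    }


for key, value in sorted(describe_interpreter().items()):
    print("{}: {}".format(key, value))
```

If `python_on_path` points somewhere other than your Anaconda folder, another Python installation is taking priority on PATH.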

You have now just set up a powerful data analysis platform! Anaconda is best thought of as a package management system for data science. Through it you can keep all your analytic packages up to date in a centralized fashion.
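To see what the install gave you, you can ask Python which packages it can import and what versions they report. This is a minimal sketch: the helper `installed_versions` and the list of package names are my own choices for illustration, and a package that does not expose a `__version__` attribute is simply reported as "unknown".

```python
import importlib


def installed_versions(package_names):
    """Map each package name to its version string, or None if it is missing."""
    versions = {}
    for name in package_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is not installed in this environment.
            versions[name] = None
        else:
            # Most scientific packages expose __version__, but not all do.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


# Example list (an assumption): common packages bundled with Anaconda.
for name, version in sorted(installed_versions(
        ["numpy", "pandas", "matplotlib", "scipy"]).items()):
    print("{}: {}".format(name, version if version else "not installed"))
```

Anything listed as not installed can be added later through Anaconda itself.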

Jupyter

I highly recommend the Jupyter notebook for data science because it keeps the source code and the output in the same place and allows you to share them easily. You’ll see that many posts in this blog are written using Jupyter as the backend.

First, let’s launch Anaconda Navigator. I’m assuming a Windows 10 environment, but with minimal adjustments this works just as well on Mac or Linux. On Windows, you can find it in the Start Menu.

Anaconda comes with a lot of very cool tools, but we’ll be working with Jupyter. Click Launch to run Jupyter.

After a command line runs some initialization processes, a web browser will launch. Jupyter is now up and running!

Jupyter runs a local web server and uses a local browser as a way to access it. This might seem a little awkward at first, but it has many advantages, including cross-platform compatibility, minimalist design, and remote access with minor configuration changes.
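Because Jupyter is just a web server on your own machine, you can check whether it is listening with a few lines of Python. The sketch below assumes Jupyter's default port of 8888 on localhost; if that port was busy when you launched, Jupyter may have picked a different one, so treat the defaults here as assumptions. The helper name `is_port_open` is mine.

```python
import socket


def is_port_open(host="localhost", port=8888, timeout=1.0):
    """Return True if something accepts TCP connections at host:port."""
    try:
        # create_connection raises OSError if nothing is listening.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 8888 is Jupyter's default port (assumed here; yours may differ).
if is_port_open():
    print("Something is listening on localhost:8888")
else:
    print("Nothing is listening on localhost:8888")
```

Note that this only tells you a server is answering on that port, not that the server is Jupyter.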

You are now all set to start some analytical goodness! (What, did you think it was going to be more complicated than that?)


(Howard) Po-Hao Chen, MD MBA is the Associate Informatics Officer at the Cleveland Clinic Imaging Institute and a musculoskeletal radiology subspecialist. He has an interest in data-driven radiology, quality improvement, and innovation.

Howard will finish training with fellowships in musculoskeletal radiology and nuclear medicine in June 2018 at the University of Pennsylvania.