Module 1: Setting Up

Module 1 Objectives:
By the end of this module, you will be able to:

  • Set up Python in various work environments (e.g., terminal, VS Code, Anaconda).

  • Understand how to execute Python scripts and work within Jupyter notebooks.

  • Test the setup with simple Python code using the print() function.
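As a rough idea of what such a setup test can look like, here is a minimal sketch that assumes Python 3 is already installed. The message text and the variable name `version` are arbitrary choices made for illustration.

```python
import sys

# print() is a built-in function in Python 3
print("Hello, Python!")

# Build a short "major.minor" string for the interpreter running this code
version = f"{sys.version_info.major}.{sys.version_info.minor}"
print("Running Python", version)
```

If both lines appear in the output, the interpreter is working and you can see which version it is.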

Why Python?

Python is a beginner-friendly programming language with clean, readable syntax that resembles natural human language. It’s used for a wide range of tasks, including:

  • Data analysis and manipulation (using libraries like Pandas, NumPy).

  • Data visualization (using libraries like Matplotlib, Seaborn, Plotly).

  • Statistical analysis and modeling (using libraries like SciPy, Statsmodels, Scikit-learn).

  • Building and training machine learning models (using libraries like Scikit-learn, TensorFlow, Keras, PyTorch).
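To give a feel for the readability mentioned above, here is a small sketch that uses only Python's standard library. The scores and the passing mark of 70 are made up for illustration and do not come from any real dataset.

```python
from statistics import mean

# Hypothetical exam scores, invented for this example
scores = [72, 88, 95, 61, 79]

# Keep only the scores at or above a passing mark of 70
passing = [score for score in scores if score >= 70]

# Average the scores that passed
average = mean(passing)

print("Passing scores:", passing)
print("Average passing score:", average)
```

Even without knowing Python, the filtering line reads close to plain English: keep each score in scores if the score is at least 70.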

Python versus R

Python and R are both widely used in data science. We often see debates in the field about whether to use one or the other, as if they were mutually exclusive. For beginners, this choice might seem more critical, but in the long term, learning both languages is often necessary. Instead of viewing them as competing options, it’s better to consider them as complementary tools, each suited to different tasks or use cases.

For instance, R is highly specialized in statistical tests and models like mixed-effects models, survival analysis, or time-series forecasting, while Python is ideal for tasks related to deep learning, web development, automation, or deploying models [DataCamp, 2022].

Python + Jupyter Notebook = …?

There are two main file formats used for Python code.

Python scripts (with a .py extension) are plain text files containing Python code. They are ideal for production code, for writing applications and libraries/packages, and for reusing functions and classes. However, they have no built-in output visualization and are executed as one whole script.
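As an illustration, a small script might look like the sketch below. The file name `greet.py` and the functions `greet` and `main` are hypothetical and exist only for this example.

```python
# greet.py (hypothetical file name)

def greet(name):
    """Return a greeting for the given name."""
    return f"Hello, {name}!"


def main():
    # Runs from top to bottom when the file is executed as a script
    for name in ["Ada", "Grace"]:
        print(greet(name))


# True only when the file is run directly, not when it is imported
if __name__ == "__main__":
    main()
```

Saved as `greet.py`, this would typically be run from a terminal with `python greet.py` (or `python3 greet.py` on some systems). Because `greet` lives in a plain .py file, other scripts could also import and reuse it.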

If you’re interested in seeing what your code is doing, you can instead use Jupyter Notebook files (with a .ipynb extension). Jupyter Notebook files (previously known as IPython notebooks) let you create computational notebooks and write code interactively, in Python or in other languages. Quoting the official documentation, “A computational notebook is a shareable document that combines computer code, plain language descriptions, data, rich visualizations like 3D models, charts, graphs and figures, and interactive controls”.

In other words, it’s interactive, which allows for step-by-step code execution and for viewing your outputs. It’s great for data exploration and analysis since you can easily experiment with Python code and visualize results, as well as add text-based explanations (Markdown).

We’ll be using Jupyter Notebook files for exercises, and we’ll eventually learn how to use both file formats when required [Jupyter, 2015].

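N.B. Jupyter Notebook refers to both a file format and an application. You are not tied to the Jupyter Notebook application: .ipynb files can also be opened and edited in other code editors and IDEs that support notebooks, such as VS Code or JupyterLab.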

| Feature | .py (Python Script) | .ipynb (Jupyter Notebook) |
| --- | --- | --- |
| File Format | Plain text file containing Python code. | JSON file that contains code, text (Markdown), and outputs. |
| Execution | Executed as a whole script. | Executed in chunks (cells), output shown after each cell. |
| Usage | Suitable for production code, apps, and large projects. | Best for interactive exploration, analysis, and learning. |
| Interactivity | Linear execution, no interactive output. | Interactive execution with immediate feedback and outputs. |
| Output Visualization | No built-in output visualization. | Directly supports rich visual outputs like plots and tables. |
| Documentation | Limited to comments in the code. | Supports Markdown for rich documentation and formatting. |
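To make the "JSON file" point in the comparison concrete, the sketch below builds a tiny notebook structure in memory and reads it back using only the standard `json` module. The field names follow the nbformat version 4 layout, but treat the exact fields as an assumption and check the nbformat documentation before relying on them. In practice, the notebook application writes these files for you.

```python
import json

# A minimal notebook structure, assuming the nbformat 4 layout
notebook = {
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# My first notebook"],
        },
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print('Hello from a cell')"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# Serialize to the JSON text that a .ipynb file would contain
text = json.dumps(notebook, indent=1)

# Parse the text again and list the type of each cell
loaded = json.loads(text)
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)
```

Each cell records its type, its source, and (for code cells) its outputs, which is why a notebook can keep text, code, and results together in a single file.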

Where can I write my Python code?

You can write and edit Python code through your terminal. However, you might find it easier and more manageable to use applications like Jupyter Notebook, JupyterLab, VS Code, or even Google Colab.

Code editors and Integrated Development Environments (IDEs) let you write Python code in .py or .ipynb format. The key difference between the two is that code editors are primarily used for writing and editing code, while IDEs integrate a suite of programming and software development tools in one interface. However, the line can blur, as some code editors like VS Code are highly customisable thanks to community plugins. Ultimately, both code editors and IDEs can significantly boost your coding efficiency, simplify your workflow, and make debugging more effective. We’re using Jupyter notebooks to run code, so it doesn’t matter which editor or environment you end up choosing.

We’ll first talk about how to work with Python in a terminal before moving on to more convenient tools like VS Code and Anaconda.

References:

[Dat22]

DataCamp. Python vs R for data science: which should you learn? https://www.datacamp.com/blog/python-vs-r-for-data-science-whats-the-difference, 2022.

[Jup15]

Project Jupyter. What is a notebook. https://docs.jupyter.org/en/latest/#what-is-a-notebook, 2015.

[Mic25]

Microsoft. Windows Subsystem for Linux. https://learn.microsoft.com/en-us/windows/wsl/about, 2025.

[Pan25]

Pandas. pandas.Series - pandas documentation. https://pandas.pydata.org/docs/reference/api/pandas.Series.html, 2025. Accessed: 2025-06-25.

[NumPyDevelopers]

NumPy Developers. What is NumPy? https://numpy.org/doc/stable/user/whatisnumpy.html. Accessed: 2025-04-27.

[PythonSFoundation25]

Python Software Foundation. Beginner's guide to Python. https://wiki.python.org/moin/BeginnersGuide, 2025.