Hands-on 1: Environment Setup for Data Analysis
Introduction
Welcome to the first hands-on exercise! In this notebook, we'll guide you through setting up your Python development environment for data analysis. By the end of this tutorial, you'll have a fully functional setup ready for the course.
Prerequisites
During this TA session, we'll help you set up your Python development environment. Don't worry if you run into any issues; that's what we're here for! This tutorial covers all major operating systems (Windows, macOS, and Linux), so just follow the instructions for your system.
Note: If you encounter any problems during the installation process, raise your hand and a TA will come help you.
Part 1: Installing Visual Studio Code
Visual Studio Code (VS Code) is a powerful, lightweight code editor that we'll use throughout the course.
- Visit the VS Code download page
- Download the appropriate version for your operating system:
  - Windows: .exe installer
  - macOS: .dmg file (Intel) or Apple Silicon version
  - Linux: .deb/.rpm package or snap store installation
Installation instructions by operating system:
Windows
- Run the downloaded .exe file
- Follow the installation wizard
- Make sure to check "Add to PATH" during installation
macOS
- Open the downloaded .dmg file
- Drag VS Code to the Applications folder
- Launch VS Code from Applications
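Linux
These commands install the code package from Microsoft's package repository, so they work only after that repository has been added to your system. If apt or dnf cannot find the package, install the .deb or .rpm file you downloaded instead, or use the snap store.
Ubuntu/Debian:
sudo apt update
sudo apt install code
Fedora/RHEL:
sudo dnf install code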
Part 2: Setting Up Package Management with Mamba
We'll use Mamba for managing our Python environment. It's a faster alternative to Conda that helps manage Python packages and dependencies.
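Installing Mamba
- First, download the Miniforge installer for your operating system from the Miniforge releases page (the conda-forge/miniforge project on GitHub). Miniforge includes Mamba.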
Installation commands by OS:
Windows
- Run the downloaded .exe file
- Follow the installation wizard
- Important: Select "Add to PATH" when prompted
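macOS/Linux
Run the downloaded installer script (its file name starts with Miniforge3) and follow the prompts:
bash ~/Downloads/Miniforge3-$(uname)-$(uname -m).sh
After installation, open a new terminal and verify that mamba is installed: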
mamba --version
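If the version check fails, the usual cause is that the command is not on your PATH. The following is a minimal sketch, using only the Python standard library, that reports where each tool was found. The helper name find_tools and the list of command names are assumptions made for illustration; in particular, the code command is only available on some systems after it has been added to PATH.

```python
import shutil


def find_tools(names=("mamba", "conda", "code")):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name:6} -> {path or 'not found on PATH'}")
```

A tool reported as not found may still be installed; it usually means the installer's "Add to PATH" option was skipped or the terminal was opened before installation finished.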
Part 3: Creating Our Python Environment
Now let's create a dedicated environment for data analysis:
# Create a new environment named 'dataanalysis' with Python 3.10
mamba create -n dataanalysis python=3.10
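# Activate the environment (the same command works on Windows, macOS, and Linux)
mamba activate dataanalysis
# If the shell reports that it has not been initialized, run `conda init`,
# open a new terminal, and activate again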
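To confirm that activation worked, you can ask Python itself which interpreter and environment it is running in. This is a small sketch under the assumption that activation sets the CONDA_DEFAULT_ENV environment variable, as conda-style activation normally does; the function name describe_environment is hypothetical.

```python
import os
import sys


def describe_environment():
    """Report which Python interpreter and conda/mamba environment are active."""
    return {
        "python_version": sys.version.split()[0],
        "interpreter": sys.executable,
        "prefix": sys.prefix,
        # Normally set by conda/mamba activation; None if no environment is active.
        "active_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

After activating dataanalysis, the interpreter path should point inside that environment's folder and the Python version should start with 3.10.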
Part 4: Installing Required Packages
Let's install the core packages we'll need. Make sure the dataanalysis environment is activated first, so the packages are installed into it:
mamba install -c conda-forge jupyter numpy pandas matplotlib seaborn scikit-learn
Install additional packages with pip (inside the same activated environment):
pip install plotly
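Once both commands finish, a quick way to see what actually landed in the environment is to query the installed package metadata. The sketch below uses importlib.metadata from the standard library (Python 3.8+). The REQUIRED list is an assumption copied from the install commands above, and installed_versions is a hypothetical helper name. A value of None means no metadata was found under that name, which usually, but not always, means the package is missing.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as passed to mamba/pip in this tutorial (adjust as needed).
REQUIRED = ["jupyter", "numpy", "pandas", "matplotlib", "seaborn", "scikit-learn", "plotly"]


def installed_versions(packages=REQUIRED):
    """Return {package: version string}, with None where no metadata is found."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    for name, found in installed_versions().items():
        print(f"{name:14} {found or 'NOT FOUND'}")
```

Run it with the dataanalysis environment activated; running it from another interpreter reports on that interpreter's packages instead.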
Part 5: Setting Up VS Code Extensions
Install these essential VS Code extensions:
- Python (Microsoft)
- Jupyter (Microsoft)
- GitHub Copilot (if you have access)
To install extensions:
- Click the Extensions icon in the left sidebar (or press Ctrl+Shift+X, Cmd+Shift+X on macOS)
- Search for each extension
- Click "Install"
Part 6: Verifying Your Setup
Let's verify everything is working correctly. Create a new notebook in VS Code:
- Press Ctrl+Shift+P (Cmd+Shift+P on macOS)
- Type "Create: New Jupyter Notebook" and select the command
- Select your 'dataanalysis' kernel using the kernel picker in the top-right corner of the notebook
Test your setup with this code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
print("NumPy version:", np.__version__)
print("Pandas version:", pd.__version__)
# Create a simple plot (set the seaborn style first so it applies to the new figure)
sns.set_style("whitegrid")
plt.figure(figsize=(8, 6))
x = np.linspace(0, 10, 100)
plt.plot(x, np.sin(x))
plt.title("Test Plot")
plt.show()
Troubleshooting
Common issues and solutions:
Environment not found in VS Code
- Restart VS Code
- Ensure Mamba is in your PATH
- With the dataanalysis environment activated, register it as a Jupyter kernel:
python -m ipykernel install --user --name dataanalysis
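To check whether the kernel registration took effect, you can list the kernels Jupyter knows about. The following sketch calls jupyter kernelspec list --json and extracts the kernel names; the function names are hypothetical, and it assumes the jupyter command is on your PATH (it returns None otherwise).

```python
import json
import subprocess


def kernel_names(kernelspec_json):
    """Extract sorted kernel names from `jupyter kernelspec list --json` output."""
    return sorted(json.loads(kernelspec_json).get("kernelspecs", {}))


def list_registered_kernels():
    """Run the Jupyter CLI and return kernel names, or None if that fails."""
    try:
        done = subprocess.run(
            ["jupyter", "kernelspec", "list", "--json"],
            capture_output=True, text=True, check=True, timeout=60,
        )
        return kernel_names(done.stdout)
    except (OSError, subprocess.SubprocessError, json.JSONDecodeError):
        return None


if __name__ == "__main__":
    names = list_registered_kernels()
    if names is None:
        print("Could not run the jupyter command; is the environment activated?")
    else:
        print("Registered kernels:", ", ".join(names) or "(none)")
```

If dataanalysis appears in the list but VS Code still does not offer it, restarting VS Code is usually the next thing to try.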
Package installation fails
- Check your internet connection
- Try installing packages one by one
- Use pip install as a fallback
Matplotlib plots not showing
- Restart the kernel
- Run %matplotlib inline at the start of your notebook
Next Steps
Congratulations! You now have a fully configured Python environment for data analysis. In the next hands-on session, we'll start exploring data manipulation with Python.
Additional Resources
Remember to:
- Keep your packages updated using mamba update --all
- Create different environments for different projects
- Save your environment configuration using mamba env export > environment.yml