Data Science in Everyday Practice
Course Motivation
This course is offered as part of the programme "University in the Colleges" and is hosted by Collegio Nuovo - Sezione Laureati (Graduate Section).
It is aimed at a medical and life science audience with no prior background in data analysis and a minimal background in statistics. The goal of the course is to give students the most important tools and decision criteria to import and visualise data from different sources (structured medical data, laboratory measures, biological experiments), and to explore and understand the key elements of the datasets they might encounter in their studies or everyday practice.
Where
Via E. Tibaldi, 4, 27100 Pavia PV
When
The course will take place during the week of 19 to 23 January 2026.
Requirements
None. This course is designed for students with no prior background in data analysis and minimal knowledge of statistics.
CFU (university credits): 3
Intended Learning Outcomes
1. Understand the role of data science in their discipline
2. Use Python and Jupyter to explore different datasets
3. Apply basic criteria and tools to transform and visualise their data
4. Interpret their data based on the results of an exploratory data analysis (EDA)
Contents and Structure
The course focuses on practical data science techniques using Python, leveraging Jupyter Notebooks and/or Google Colab for hands-on practice. Key topics include:
- Introduction to Python for Data Science:
  - Overview of Python's relevance to life sciences.
  - Introduction to Jupyter Notebooks and Google Colab.
  - Setting up your environment and working with online/cloud-based tools.
- Python Basics for Data Analysis:
  - Data types and basic operations in Python.
  - Using pandas to manipulate tabular data.
  - NumPy for numerical computations.
- Data Wrangling and Preparation:
  - Understanding and cleaning messy datasets.
  - The concept of “tidy” (tabular) data in Python.
  - Merging, grouping, and transforming datasets.
- Data Visualization:
  - Exploratory plots using matplotlib and seaborn.
  - Interactive visualizations with Plotly.
  - Visualization techniques for biological and medical datasets.
- Introduction to Statistics in Python:
  - Descriptive statistics and their applications.
  - Hypothesis testing using Python’s scipy and statsmodels libraries.
- Data Lab:
  - Hands-on exercises using Google Colab or local Jupyter Notebooks.
  - (BONUS TRACK) Use of ChatGPT in the data science workflow.
- Applied Examples of Data Analysis:
  - Real-world case studies with medical and biological datasets.
  - Exploratory data analysis workflow from data import to visualization.
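To give a flavour of the wrangling topics listed above (cleaning, merging, grouping), here is a minimal sketch using pandas. It is not taken from the course material: the two small tables, their column names (`patient_id`, `glucose_mg_dl`, `group`) and all values are invented purely for illustration.

```python
import pandas as pd

# Hypothetical laboratory measurements (values invented for illustration).
measurements = pd.DataFrame({
    "patient_id": [1, 2, 3, 4, 5, 6],
    "glucose_mg_dl": [92.0, 110.0, None, 101.0, 135.0, 98.0],
})

# Hypothetical patient metadata stored in a separate table.
patients = pd.DataFrame({
    "patient_id": [1, 2, 3, 4, 5, 6],
    "group": ["control", "treated", "control", "treated", "treated", "control"],
})

# Cleaning: drop rows where the measurement is missing.
clean = measurements.dropna(subset=["glucose_mg_dl"])

# Merging: attach the group label to each remaining measurement.
merged = clean.merge(patients, on="patient_id", how="inner")

# Grouping: count and average the measurements within each group.
summary = merged.groupby("group")["glucose_mg_dl"].agg(["count", "mean"])
print(summary)
```

Dropping missing rows is only one possible cleaning choice; depending on the dataset, imputing values or flagging them may be more appropriate.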
Methods
Class activities will focus on demonstrations, discussion and interactive problem solving: demos, group work, quizzes and real-time feedback.
Jupyter notebooks and Google Colab will be used to give easy access to coding and command-line tools; no prior experience is required.
Textbooks
Detailed Schedule