8. Practice Sheet
Congratulations, you have made it to the end of the second module!
This module covered a lot of information very quickly, so this exercise sheet will help you practise what you have learned. Ideally, you should either download this page or open it in Google Colab / Binder (by clicking the spaceship icon on the top right), so that you can interact freely with it and add as many cells as you want. Otherwise, you can use the Live Button as before to run your code directly here.
Tip
As a recap, here is the most important information that was covered in this module:
Values can be stored in variables, making them easy to recall, use, and manipulate.
Variables are defined using the equals sign = and can be changed by reassigning a new value.
Each variable has a data type defining what type of information it contains and how it can be manipulated. Python automatically infers the data type from the value.
Common data types are numerical data (integers and floats), booleans, and strings.
Other data types are collection types/data structures which store multiple data points, e.g. tuples (), lists [], sets {}, and dictionaries {key: value}.
Each data type has specific functions, methods, and operators that can be used to manipulate them, e.g. arithmetic operators, comparison operators, and logical operators.
To make manipulations easier, one can define conditional blocks and loops, which specify which block of code should be carried out, how often, and with which variable values.
Running into errors is normal – clear variable names, comments, print statements, and try-except statements are your best friends in identifying where they come from.
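The recap above can be condensed into a few lines of code. The following is only a small sketch with made-up values (a hypothetical shop item, unrelated to the patient data), so all names in it are assumptions chosen for illustration:

```python
# Hypothetical example values, not part of the patient data
item_name = 'notebook'     # str
quantity = 3               # int
price = 2.5                # float
in_stock = True            # bool

# Reassigning a variable replaces its previous value
quantity = 4

# Data structures hold several values at once
size_cm = (21.0, 29.7)                          # tuple: ordered, cannot be changed
colours = ['blue', 'green']                     # list: ordered, can be changed
tags = {'paper', 'school'}                      # set: unordered, unique entries
details = {'pages': 80, 'ruled': True}          # dictionary: key-value pairs

# Python infers each data type from the value
for value in (item_name, quantity, price, in_stock, size_cm, colours, tags, details):
    print(f'{value} is of type {type(value)}')

# Operators combine values; a conditional block decides what happens next
total = quantity * price
if in_stock and total > 5:
    print(f'The order costs {total}')

# try-except catches an error instead of stopping the program
try:
    result = price / 0
except ZeroDivisionError:
    result = None
    print('Cannot divide by zero.')
```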
You will use the knowledge you have gained to analyse synthesised patient data. For each of our ten patients, we will know their gender, age, height, weight, smoking status, and any diseases they have. You will create data structures that contain information about different patients, and then use conditional blocks and operations to gain insights into them.
Let’s start with patient 1: she is female, 25 years old, 1.75 m tall, weighs 70.9 kg, does not smoke, and has asthma and diabetes. For each of these properties, create a variable and store the corresponding value in it. The diseases she has should be stored in a tuple, and the smoking status should be saved as True or False.
# Your code goes here
Can you infer what type each of the variables has? First, make a guess, then print the value and the type of each variable. What type does each of the entries in the tuple (i.e. the diseases) have? Do you have an idea of how to check the type of an entry in a data structure? (Remember how we access a specific value in a tuple.)
# Your code goes here
Oh no, I made a mistake when I told you the information about the participant — their age is actually 35. Could you please update the age variable and print the new value? I’m also curious about how many diseases patient 1 has. Print this information using a function to access the number of entries in the tuple.
# Your code goes here
Amazing, we have all the information about participant 1! However, saving each piece of information in an individual variable is quite a hassle. Let’s use a data structure to get a better overview of the values. Create a list where each entry is one of the variables in the following order: gender, age, weight, height, and smoking status.
Tip
You can either use the actual values directly when defining the list, or use the variable names — the list will automatically include their current values.
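One consequence of this is worth seeing once: the list stores the values the variables had at the moment the list was created, not a link to the variables themselves. A minimal sketch, using made-up variables (a hypothetical room booking, not the patient data):

```python
# Hypothetical example variables, unrelated to the patient data
room_number = 12
room_free = True

room = [room_number, room_free]   # the list stores the current values

room_number = 14                  # reassigning the variable afterwards...
print(room)                       # ...leaves the list unchanged: [12, True]
print(room_number)                # 14
```

So if you correct a variable after building the list, you also need to update the matching entry in the list (or build the list again).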
# Your code goes here
I’m sorry, I completely forgot about the diseases. Please append the diseases tuple to the list. I also realised that height and weight were switched around. Assign the height value to the entry that currently holds the weight (index 2), and the weight value to the entry that currently holds the height (index 3).
The list should now consist of the following entries: gender, age, height, weight, smoking status, and diseases. Print the full list to make sure everything is correct.
💡 Solution
# Defining clinical information for participant 1
participant1_gender = 'female'
participant1_age = 25
participant1_weight = 70.9
participant1_height = 1.75
participant1_smoking = False
participant1_diseases = ('Asthma', 'Diabetes')
# Print the data type of each variable
print('Information about participant 1:')
print(f'Their gender is {participant1_gender} which is of type {type(participant1_gender)}')
print(f'Their age is {participant1_age} which is of type {type(participant1_age)}')
print(f'Their weight is {participant1_weight} which is of type {type(participant1_weight)}')
print(f'Their height is {participant1_height} which is of type {type(participant1_height)}')
print(f'Their smoking status is {participant1_smoking} which is of type {type(participant1_smoking)}')
print(f'Their diseases are {participant1_diseases} which is of type {type(participant1_diseases)} with the diseases being type {type(participant1_diseases[0])} and {type(participant1_diseases[1])}')
# Changing participant's age to 35
participant1_age = 35
print(f'Their age is now {participant1_age} which is {type(participant1_age)}')
# Print the number of diseases
print(f'The number of diseases is {len(participant1_diseases)}')
# Put all the variables in a list
participant1_information = [participant1_gender, participant1_age, participant1_weight, participant1_height, participant1_smoking]
print(f'Participant 1 information is: {participant1_information} which is of type {type(participant1_information)}')
# Append diseases to the list
participant1_information.append(participant1_diseases)
# Swap the height and weight values
participant1_information[3] = participant1_weight
participant1_information[2] = participant1_height
# Print the corrected list
print(f'participant 1 information is now fixed: {participant1_information}')
Amazing — we have all the information we need from participant 1!
Let’s get the information for participant 2: he is male, 40 years old, 1.79 m tall, weighs 103.4 kg, smokes, and does not have any diseases.
Create a list with all the information for participant 2 in the same order as for participant 1 (i.e. gender, age, height, weight, smoking status, diseases). You can choose whether to insert the values directly into the list or to create individual variables first. Even though participant 2 does not have any diseases, we still want to include a tuple so that the list has the same number of entries as participant 1. Do you have an idea how to create an empty tuple? (See the hint if not.)
Print the final list.
💡 Hint
You can create an empty data structure by simply defining it without any values.
For example, the diseases for participant 2 would be defined as:
participant2_diseases = ()
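The same idea works for the other data structures, with one exception that is easy to trip over. A short sketch (the variable names are chosen for illustration only):

```python
empty_tuple = ()
empty_list = []
empty_dict = {}       # curly braces alone create a dictionary, not a set
empty_set = set()     # an empty set needs the set() function

# Each structure has a different type, but all of them have length 0
for structure in (empty_tuple, empty_list, empty_dict, empty_set):
    print(type(structure), len(structure))
```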
# Your code goes here
💡 Solution
# Defining clinical information for participant 2
participant2_information = ['male', 40, 1.79, 103.4, True, ()]
# Printing the final list
print(f'Participant 2 information is: {participant2_information}')
Great, we have recorded the information for participants 1 and 2. We have ten participants in total, so eight more to go:
| Participant | Gender | Age | Height (m) | Weight (kg) | Smoking? | Diseases |
|---|---|---|---|---|---|---|
| 3 | male | 18 | 1.75 | 85.1 | True | Lung cancer |
| 4 | female | 83 | 1.63 | 55.9 | False | Cardiovascular disease, Alzheimer’s |
| 5 | female | 55 | 1.68 | 50.0 | False | Asthma, Anxiety |
| 6 | female | 32 | 1.59 | 64.0 | True | Diabetes |
| 7 | male | 21 | 1.90 | 92.9 | False | Asthma, Colon cancer |
| 8 | male | 46 | 1.71 | 75.4 | False | |
| 9 | female | 32 | 1.66 | 90.7 | True | Depression |
| 10 | male | 67 | 1.78 | 82.3 | False | Anxiety, Diabetes, Cardiovascular disease |
You can either create a list for each participant in the same way you did for participant 2, or you can copy the code in the next hidden solution segment if you’d prefer not to input all the values manually.
Note
If a participant has only one disease, make sure to include a comma after the disease name when creating the tuple — for example:
('Lung cancer',)
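To see why the comma matters, you can compare both versions. This is a small illustrative check; the two variable names are made up for the example:

```python
without_comma = ('Lung cancer')    # parentheses only group the value: this is a string
with_comma = ('Lung cancer',)      # the trailing comma makes it a tuple

print(type(without_comma), len(without_comma))   # <class 'str'> 11 (counts characters)
print(type(with_comma), len(with_comma))         # <class 'tuple'> 1 (counts entries)
```

Without the comma, later steps that expect a tuple of diseases would treat the disease name as a sequence of single characters instead.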
💡 Solution
'''
Overview of the clinical information for each participant:
| Participant | Gender | Age | Height | Weight | Smoking? | Diseases |
|-------------|--------|-----|--------|--------|----------|----------|
| 3 | male | 18 | 1.75 | 85.1 | True | Lung cancer |
| 4 | female | 83 | 1.63 | 55.9 | False | Cardiovascular disease, Alzheimer’s |
| 5 | female | 55 | 1.68 | 50.0 | False | Asthma, Anxiety |
| 6 | female | 32 | 1.59 | 64.0 | True | Diabetes |
| 7 | male | 21 | 1.90 | 92.9 | False | Asthma, Colon cancer |
| 8 | male | 46 | 1.71 | 75.4 | False | |
| 9 | female | 32 | 1.66 | 90.7 | True | Depression |
| 10 | male | 67 | 1.78 | 82.3 | False | Anxiety, Diabetes, Cardiovascular disease |
'''
# Defining clinical information for each participant
participant3_information = ['male', 18, 1.75, 85.1, True, ('Lung cancer',)]
participant4_information = ['female', 83, 1.63, 55.9, False, ('Cardiovascular disease', 'Alzheimer’s')]
participant5_information = ['female', 55, 1.68, 50.0, False, ('Asthma', 'Anxiety')]
participant6_information = ['female', 32, 1.59, 64.0, True, ('Diabetes',)]
participant7_information = ['male', 21, 1.90, 92.9, False, ('Asthma', 'Colon cancer')]
participant8_information = ['male', 46, 1.71, 75.4, False, ()]
participant9_information = ['female', 32, 1.66, 90.7, True, ('Depression',)]
participant10_information = ['male', 67, 1.78, 82.3, False, ('Anxiety', 'Diabetes', 'Cardiovascular disease')]
# Example output
print(participant9_information)
Wow, that’s a lot of information! Even though we’ve put the properties of each patient into a list, we still have ten separate lists. Let’s create a dictionary to make it easier to access the information for each participant.
Create a dictionary where each key is the participant (e.g. participant1) and the value is the corresponding list. Print the complete dictionary. Then, print only the information for participant 7.
# Your code goes here
💡 Solution
# Create the dictionary from the individual lists
participants_clinical_information = {
'participant1': participant1_information,
'participant2': participant2_information,
'participant3': participant3_information,
'participant4': participant4_information,
'participant5': participant5_information,
'participant6': participant6_information,
'participant7': participant7_information,
'participant8': participant8_information,
'participant9': participant9_information,
'participant10': participant10_information
}
# Print the full dictionary
print("All participants' clinical information:")
print(participants_clinical_information)
# Now we can access each participant's information using their key
print("Participant 7's information:")
print(participants_clinical_information['participant7'])
Great! We now have the data for all our participants saved in an easy-to-access format. Now we can finally start learning something from it.
Let’s begin with something simple: we want to access and print the age of each participant. To do this, we’ll loop through our dictionary. When we use the for variable_name in data_structure: statement with dictionaries, we loop through each key, which in our case is a participant identifier.
We can then use the key to access the corresponding value in the dictionary (which is the list containing the participant’s medical information).
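Here is what this pattern looks like on a small made-up dictionary (hypothetical opening hours, unrelated to the patient data), as a sketch you can adapt:

```python
# Hypothetical dictionary: each key is a day, each value is a list [opens, closes]
opening_hours = {
    'monday': [9, 17],
    'tuesday': [9, 17],
    'saturday': [10, 14],
}

for day in opening_hours:            # day takes the value of each key in turn
    hours = opening_hours[day]       # use the key to look up the matching value
    print(f'{day}: opens at {hours[0]}, closes at {hours[1]}')
```

The keys are visited in the order in which they were added to the dictionary.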
To get a feel for this, start by writing a for loop. Inside the block, print the variable (i.e. the key) and the corresponding dictionary entry (i.e. the participant’s list of clinical information).
# Your code goes here (should only be 2 or 3 lines long)
💡 Solution
for participant in participants_clinical_information:
    print(f"{participant}'s information is: {participants_clinical_information[participant]}")
Now that this is working, replace the print statement of the dictionary entry with a new variable called participant_information, to which you assign the dictionary entry for the current key. This variable now represents the list of clinical information for that participant. You can then print the age, which is the second item in the list (i.e. index 1 of the participant_information variable).
# Your code goes here
💡 Solution
for participant in participants_clinical_information:
    participant_information = participants_clinical_information[participant]
    print(f"{participant} is {participant_information[1]} years old")
Well done! Let’s try something a bit more challenging: we want to find out which diseases are common in our population. To do this, we need to identify all the unique diseases and count how often each one appears in our data.
As a first step, repeat what we did above — but this time, instead of printing the age, print each participant’s diseases.
# Your code goes here
Secondly, we will now save all of the diseases in a single list. To do this, create an empty list above the for loop. Then, inside the for loop, after printing the diseases, add them to the list you’ve just created.
You can use the .extend(data_structure) method instead of .append(), so that each entry in the tuple is added as an individual element to your list (rather than the entire tuple being added as one item).
Once the loop is complete, print the final list of diseases.
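If the difference between the two methods is hard to picture, this short comparison may help. It uses a made-up tuple of colours rather than the diseases:

```python
colours = ('red', 'blue')       # hypothetical example tuple

appended = []
appended.append(colours)        # the whole tuple becomes one entry
print(appended)                 # [('red', 'blue')]

extended = []
extended.extend(colours)        # each entry of the tuple is added separately
print(extended)                 # ['red', 'blue']

extended.extend(())             # extending with an empty tuple adds nothing
print(len(appended), len(extended))   # 1 2
```

The last step shows that participants without any diseases cause no problems: extending with an empty tuple simply leaves the list as it is.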
# Your code goes here
Amazing — we now have all the diseases stored in a list! Next, we want to find out which unique diseases participants have overall.
Can you remember which data structure only contains unique entries? That’s right — sets.
Create a new variable that is a set created from the list of diseases.
💡 Tip
You can convert one data structure into another using the syntax data_structure_name(data_structure_variable).
In this case, data_structure_name is set, and data_structure_variable is your list of all diseases.
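As an illustration of this syntax, here is a sketch with a made-up list of drink votes (all names are assumptions for the example):

```python
votes = ['tea', 'coffee', 'tea', 'water', 'coffee', 'tea']   # hypothetical list

unique_votes = set(votes)        # duplicates are dropped; the order is not guaranteed
print(unique_votes)
print(len(votes), len(unique_votes))    # 6 3

votes_tuple = tuple(votes)       # list -> tuple keeps every entry and the order
print(votes_tuple)

sorted_unique = sorted(unique_votes)    # sorted() returns a list in alphabetical order
print(sorted_unique)                    # ['coffee', 'tea', 'water']
```

Because a set has no fixed order, the first print may list the entries differently each time you run it. If you need a stable order for printing, sorted() is one option.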
# Your code goes here
Now that we have our unique set of diseases, let’s loop through it. For each entry (i.e. each disease), print its name and how often it appears in the list of all diseases.
To do this, use the .count(value) method, where value is the name of the disease. This method returns the number of times the specified value appears in the list.
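A quick sketch of how .count() behaves, using a made-up list of letters:

```python
letters = ['a', 'b', 'a', 'c', 'a']   # hypothetical list

print(letters.count('a'))   # 3
print(letters.count('c'))   # 1
print(letters.count('z'))   # 0: a value that is missing is not an error

# Sets have no .count() method, because every entry appears only once,
# so always count in the full list, not in the set.
```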
# Your code goes here
💡 Solution
# Create an empty list to store all diseases
diseases_list = []
# Loop through participants and collect diseases
for participant in participants_clinical_information:
    participant_information = participants_clinical_information[participant]
    print(f"{participant} has the following diseases: {participant_information[5]}")
    diseases_list.extend(participant_information[5])  # use extend to unpack the tuple
# Print all diseases collected
print("\nThe following diseases are present among the participants:")
print(diseases_list)
# Create a set to get unique diseases
unique_diseases = set(diseases_list)
# Loop through the unique diseases and count occurrences
print("\nOccurrences of each unique disease:")
for disease in unique_diseases:
    print(f"- {disease} occurs {diseases_list.count(disease)} time(s)")
Amazing, you’re already becoming an expert in for loops, which is fantastic as you’ll be using them a lot!
Let’s return to our dictionary containing information for all participants. This time, we want to compute the BMI for each participant using the formula
\[\text{BMI} = \frac{\text{weight}}{\text{height}^2},\]
with the weight in kilograms and the height in metres.
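As a worked example of the formula, here is the calculation for patient 1, using the height and weight given at the start of this sheet. The variable names are chosen for this sketch only:

```python
weight = 70.9   # kg, patient 1
height = 1.75   # m, patient 1

# ** is evaluated before /, so height is squared first: 70.9 / 3.0625
bmi = weight / height ** 2
print(f'BMI: {bmi:.1f}')        # BMI: 23.2
```

The format specifier :.1f rounds the printed value to one decimal place; the variable bmi itself keeps its full precision.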
Loop through each participant, calculate their BMI, and check whether it falls into one of the following categories:
• Underweight: BMI less than 18.5
• Normal range: BMI between 18.5 and 25 (inclusive)
• Overweight: BMI over 25
Use if-elif-else statements (you should have three conditions in total).
For each participant, print their number, gender, BMI value, and whether they are underweight, in the normal range, or overweight.
# Your code goes here
💡 Solution
for participant in participants_clinical_information:
    # Get the participant's information
    participant_information = participants_clinical_information[participant]
    # Compute the BMI (BMI = weight / height²)
    bmi = participant_information[3] / (participant_information[2] ** 2)
    # Check BMI category and print result
    if bmi < 18.5:
        print(f"{participant} ({participant_information[0]}) is underweight with a BMI of {bmi:.1f}.")
    elif 18.5 <= bmi <= 25:
        print(f"{participant} ({participant_information[0]}) is in the normal range with a BMI of {bmi:.1f}.")
    else:
        print(f"{participant} ({participant_information[0]}) is overweight with a BMI of {bmi:.1f}.")
We’ve already used a for loop to go through all participants and calculate their BMI. Now let’s make our program a bit more flexible: we’ll let it look up and print the BMI for one participant at a time, and we’ll use a while loop to keep going until we decide to stop.
Use a while loop to repeatedly:
• Ask the user for a participant number (1–10).
• Look up their information in the dictionary.
• Calculate and print their BMI and weight category (underweight, normal range, or overweight).
Stop the loop when the user types "exit".
💡 New: Asking for input
You can ask the user to type something in using the input() function.
The value returned by input() is always a string, even if the user types a number.
You can combine it with other code like this:
participant_number = input("Enter a number: ")
key = "participant" + participant_number
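Because input() waits for the keyboard, the sketch below uses a hypothetical variable typed_text that stands in for whatever input() would return. It shows why the string can be joined to "participant" directly, and how you could convert it to a number if you ever need to calculate with it:

```python
typed_text = '7'    # stands in for the string that input() would return

# Strings are joined with +, so '7' can be attached to 'participant' directly
key = 'participant' + typed_text
print(key)                       # participant7

# For arithmetic or numeric comparisons, convert the string first
try:
    number = int(typed_text)
    print(1 <= number <= 10)     # True
except ValueError:
    # int() raises a ValueError for text such as 'seven' or 'exit'
    print('That was not a whole number.')
```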
💡 Solution
while True:
    user_input = input("Enter a participant number (1–10), or type 'exit' to stop: ")
    if user_input == 'exit':
        break  # stop the loop
    key = 'participant' + user_input
    if key in participants_clinical_information:
        info = participants_clinical_information[key]
        bmi = info[3] / (info[2] ** 2)
        if bmi < 18.5:
            category = 'underweight'
        elif bmi <= 25:
            category = 'in the normal range'
        else:
            category = 'overweight'
        print(f"{key} ({info[0]}) has a BMI of {bmi:.1f} and is {category}.")
    else:
        print("That participant number doesn’t exist. Please enter a number from 1 to 10.")