5. Data Structures

5.1. What is a Data Structure?

A data structure is a data type used to organise and store multiple values so that they can be accessed and worked with efficiently. In Python, some of the most commonly used built-in data structures are lists, tuples, dictionaries, and sets. Of these, only lists and tuples are sequence types; dictionaries are mappings and sets are unordered collections.

5.2. Lists

A list is an ordered, changeable collection of items. Lists can contain items of any type (even a mixture of types).

Lists are created using square brackets [ ]:

fruits = ["apple", "banana", "cherry"]
print(fruits)
['apple', 'banana', 'cherry']

You can access elements of the list using indexing (remember that Python indexing starts at 0). To access the last entry of the list you can also use the -1 index:

print(fruits[0]) # Output: apple
print(fruits[1]) # Output: banana
print(fruits[2]) # Output: cherry
print(fruits[-1]) # Output: cherry
apple
banana
cherry
cherry
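
As a small additional sketch of indexing, the example below shows that other negative indices also count from the end of the list, and what happens when an index lies outside the list. The list berries and the other variable names are made up for this illustration.

```python
# Hypothetical example list, separate from the fruits list above
berries = ["strawberry", "raspberry", "blackberry"]

# Negative indices count backwards from the end of the list
second_to_last = berries[-2]
print(second_to_last)  # Output: raspberry

# An index that does not exist raises an IndexError
try:
    berries[3]
except IndexError as error:
    out_of_range_message = str(error)
    print("IndexError:", out_of_range_message)  # Output: IndexError: list index out of range
```

The try / except block is only used here so that the example keeps running after the error.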

You can also modify lists, i.e. replace one entry with another, by assigning a new value to the list entry (accessed through its index). You can add an entry at the end of the list using the method .append(value) or at a specific position with the method .insert(index, value). To remove an entry, use either the method .remove(value), which removes the first entry with that value, or .pop(index), which removes the entry at a specific position:

fruits[1] = "blueberry" #change the entry at index 1
print(fruits) # Output: ['apple', 'blueberry', 'cherry']

fruits.append("orange") # Add to the end of the list
print(fruits) # Output: ['apple', 'blueberry', 'cherry', 'orange']

fruits.insert(1, "kiwi") # Add to list at the index 1
print(fruits) # Output: ['apple', 'kiwi', 'blueberry', 'cherry', 'orange']

fruits.remove("kiwi") # Remove the entry with value "kiwi"
print(fruits) # Output: ['apple', 'blueberry', 'cherry', 'orange']

fruits.pop(-1) # Remove the last entry in the list
print(fruits) # Output: ['apple', 'blueberry', 'cherry']
['apple', 'blueberry', 'cherry']
['apple', 'blueberry', 'cherry', 'orange']
['apple', 'kiwi', 'blueberry', 'cherry', 'orange']
['apple', 'blueberry', 'cherry', 'orange']
['apple', 'blueberry', 'cherry']
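
Two details of the removal methods are easy to miss, so here is a minimal sketch of them: .remove(value) only removes the first matching entry, and .pop() returns the entry it removes (and removes the last entry if no index is given). The list snacks is a made-up example.

```python
# Hypothetical list that contains "apple" twice
snacks = ["apple", "kiwi", "apple", "plum"]

snacks.remove("apple")     # only the first "apple" is removed
print(snacks)              # Output: ['kiwi', 'apple', 'plum']

last_snack = snacks.pop()  # without an index, pop() removes the last entry
print(last_snack)          # Output: plum
print(snacks)              # Output: ['kiwi', 'apple']
```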

If you have a list of values and want to add it to a different list, you have two options. The method new_list.append(other_list) adds the whole list as a single element of new_list. In contrast, the method new_list.extend(other_list) adds each element of other_list as its own entry in new_list:

print(fruits) # Output: ['apple', 'blueberry', 'cherry']
more_fruit = ["mango", "peach", "pear"]
fruits.extend(more_fruit) # Add the entries in more_fruit to the end
print(fruits) # Each fruit is added individually: ['apple', 'blueberry', 'cherry', 'mango', 'peach', 'pear']

even_more_fruit = ["kiwi", "plum"]
fruits.append(even_more_fruit) # Add the list even_more_fruit to the end
print(fruits) # The list even_more_fruit is added as a single entry: ['apple', 'blueberry', 'cherry', 'mango', 'peach', 'pear', ['kiwi', 'plum']]
['apple', 'blueberry', 'cherry']
['apple', 'blueberry', 'cherry', 'mango', 'peach', 'pear']
['apple', 'blueberry', 'cherry', 'mango', 'peach', 'pear', ['kiwi', 'plum']]
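
To make the effect of .append() with a list more concrete, the following sketch shows that the appended list counts as one entry and that its items can be reached by indexing twice. The name basket is an assumption made for this example.

```python
# Hypothetical example list
basket = ["apple", "blueberry"]
basket.append(["kiwi", "plum"])  # the whole list becomes one entry

print(len(basket))    # Output: 3
print(basket[-1])     # Output: ['kiwi', 'plum']
print(basket[-1][0])  # Output: kiwi
```

The first index selects the nested list, the second index selects an item inside it.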

5.3. Tuples

A tuple is similar to a list, but it is immutable — meaning its entries cannot be changed after creation.

Tuples are defined using round brackets ( ):

coordinates = (4, 5)
print(coordinates)
(4, 5)

You can access tuple elements just like lists:

print(coordinates[0]) # Output: 4
4

You cannot modify a tuple once it is created: changing a value, adding a new entry, and removing an entry all fail. For example:

coordinates[0] = 10 # This will raise an error!
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[7], line 1
----> 1 coordinates[0] = 10 # This will raise an error!

TypeError: 'tuple' object does not support item assignment
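
If you need a changed version of a tuple, one common approach is to build a new tuple from the old values. The following is only a sketch of that idea; the name new_coordinates is made up for this example.

```python
coordinates = (4, 5)

# Trying to change an entry raises a TypeError
try:
    coordinates[0] = 10
except TypeError as error:
    print("Cannot modify a tuple:", error)

# Instead, create a new tuple that reuses the unchanged value
new_coordinates = (10, coordinates[1])
print(new_coordinates)  # Output: (10, 5)
print(coordinates)      # Output: (4, 5) -> the original tuple is unchanged
```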

5.4. Dictionaries

A dictionary stores data as key–value pairs. To access a specific value in a list or tuple, you use its index. With a dictionary, you choose what is used to access each value: for every stored value, you define a key (usually a string or integer) through which that value is accessed.

Dictionaries are created using curly braces { }, where each entry is key: value:

person = {"name": "Ross", "age": 22, 3: "three"}
print(person)
{'name': 'Ross', 'age': 22, 3: 'three'}

You can access and modify values by their keys in the same way as you use indexes for lists. To delete an entry, use .pop(key); unlike lists, dictionaries have no .remove() method. To create a new entry, assign a value to a key that has not been used before. All keys in the dictionary can be listed using the method .keys() and all values using .values().

print(person["name"]) # Output: Ross
print(person[3]) # Output: three
print(person['age']) # Output: 22

#change the age
person["age"] = 31
print(person) # Output: {'name': 'Ross', 'age': 31, 3: 'three'}

#add a new key-value pair
person["height"] = 5.75
print(person) # Output: {'name': 'Ross', 'age': 31, 3: 'three', 'height': 5.75}

#remove a key-value pair by its key
person.pop(3)
print(person) # Output: {'name': 'Ross', 'age': 31, 'height': 5.75}

#get all keys
print(person.keys()) # Output: dict_keys(['name', 'age', 'height'])
#get all values
print(person.values()) # Output: dict_values(['Ross', 31, 5.75])
Ross
three
22
{'name': 'Ross', 'age': 31, 3: 'three'}
{'name': 'Ross', 'age': 31, 3: 'three', 'height': 5.75}
{'name': 'Ross', 'age': 31, 'height': 5.75}
dict_keys(['name', 'age', 'height'])
dict_values(['Ross', 31, 5.75])
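
The sketch below illustrates a few details about deleting dictionary entries: .pop(key) returns the removed value, the del statement removes an entry without returning it, and accessing a key that does not exist raises a KeyError. The dictionary pet and its contents are invented for this example.

```python
# Hypothetical example dictionary
pet = {"name": "Rex", "species": "dog", "age": 3}

# .pop(key) removes the entry and returns its value
removed_value = pet.pop("age")
print(removed_value)  # Output: 3
print(pet)            # Output: {'name': 'Rex', 'species': 'dog'}

# The del statement also removes an entry, without returning its value
del pet["species"]
print(pet)            # Output: {'name': 'Rex'}

# Accessing a key that is not in the dictionary raises a KeyError
try:
    pet["age"]
except KeyError as error:
    missing_key = error.args[0]
    print("No such key:", missing_key)  # Output: No such key: age
```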

5.5. Sets

A set is an unordered collection of unique items. Sets automatically remove duplicate entries.

Sets are created using curly braces { }, but without key–value pairs and just the value for each entry:

animals = {"cat", "dog", "bird", "dog"}
print(animals) # note that the duplicate "dog" is removed
{'dog', 'bird', 'cat'}
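
Because sets drop duplicates, a common use is to find the distinct values in a list. A minimal sketch, assuming a made-up list called votes:

```python
# Hypothetical list with repeated entries
votes = ["cat", "dog", "cat", "bird", "dog", "cat"]

# set() builds a set from the list, keeping each value only once
unique_votes = set(votes)

print(len(votes))            # Output: 6
print(len(unique_votes))     # Output: 3
print(sorted(unique_votes))  # Output: ['bird', 'cat', 'dog']
```

sorted() is used for the last line so that the printed order is predictable; printing the set directly may show the entries in a different order.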

Since sets are unordered, they are not subscriptable, i.e. we can't access specific entries using an index or key as before. We can still add new entries using the .add(value) method and delete entries using .remove(value).

#add a new entry
animals.add("fish")
print(animals) # Output: {'cat', 'dog', 'bird', 'fish'} (the order may differ, since sets are unordered)

#remove an entry
animals.remove("cat")
print(animals) # Output: {'dog', 'bird', 'fish'}


#try to access by index
print(animals[0]) # This will raise an error!
{'dog', 'bird', 'fish', 'cat'}
{'dog', 'bird', 'fish'}
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[48], line 11
      7 print(animals) # Output: {'dog', 'bird', 'fish'}
     10 #try to access by index
---> 11 print(animals[0]) # This will raise an error!

TypeError: 'set' object is not subscriptable
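
If you do need positional access to the entries of a set, one option is to convert it to a list first. The sketch below uses sorted(), which returns a new sorted list; the name animal_list is an assumption made for this example.

```python
animals = {"dog", "bird", "fish"}

# sorted() returns a list, which can be indexed
animal_list = sorted(animals)
print(animal_list)     # Output: ['bird', 'dog', 'fish']
print(animal_list[0])  # Output: bird

# The original set is left unchanged
print(len(animals))    # Output: 3
```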

5.6. Data structure operators

Besides what has been introduced so far, there are various functions and operators that can be used with all of these data structures. Here are some examples:

  • in operator: check whether a value is an entry in the structure. For dictionaries this will check for the key and not for the value.

  • len() function: get the number of entries in the data structure

#define and print different data structures
example_list = [1, 2, 3]
example_tuple = (1, 2, 3)
example_set = {1, 2, 3}
example_dict = {"a": 1, "b": 2, "c": 3}

print('Example data structures:')
print(example_list) # Output: [1, 2, 3]
print(example_tuple) # Output: (1, 2, 3)
print(example_set) # Output: {1, 2, 3}
print(example_dict) # Output: {'a': 1, 'b': 2, 'c': 3}

# check for 1 in each of them
print('Checking for 1 in each data structure:')
print(1 in example_list) # Output: True
print(1 in example_tuple) # Output: True
print(1 in example_set) # Output: True
print(1 in example_dict) # Output: False --> 1 is a value, not a key

# check for "a" in each of them
print('Checking for "a" in each data structure:')
print("a" in example_list) # Output: False
print("a" in example_tuple) # Output: False
print("a" in example_set) # Output: False
print("a" in example_dict) # Output: True --> key in dict

# get length of each of them
print('Length of each data structure:')
print(len(example_list)) # Output: 3
print(len(example_tuple)) # Output: 3
print(len(example_set)) # Output: 3
print(len(example_dict)) # Output: 3
Example data structures:
[1, 2, 3]
(1, 2, 3)
{1, 2, 3}
{'a': 1, 'b': 2, 'c': 3}
Checking for 1 in each data structure:
True
True
True
False
Checking for "a" in each data structure:
False
False
False
True
Length of each data structure:
3
3
3
3
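
Since the in operator only looks at the keys of a dictionary, checking for a value needs one extra step. A minimal sketch, using the .values() method from the dictionary section:

```python
example_dict = {"a": 1, "b": 2, "c": 3}

key_check = 1 in example_dict             # looks at the keys only
value_check = 1 in example_dict.values()  # looks at the values

print(key_check)    # Output: False
print(value_check)  # Output: True
```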

5.7. Quick Practice

Try these small challenges:

  • Create a list called colours containing at least three colour names.

  • Create a tuple called dimensions with three numbers inside.

  • Create a dictionary called student with keys “name” and “age” and give it reasonable values.

  • Create a set called numbers_set with some numbers, including duplicates.

Print each data structure to see the result.

# Put your code here
💡 Solution
colours = ["red", "blue", "green"]
dimensions = (10, 20, 30)
student = {"name": "Bob", "age": 20}
numbers_set = {1, 2, 2, 3, 4}

print(colours)
print(dimensions)
print(student)
print(numbers_set)