Data structures in Python allow you to store and access data more efficiently. In this tutorial, we’ll cover the four basic inbuilt data structures in Python – lists, tuples, sets, and dictionaries. These inbuilt data structures are commonly used not just by programmers but also by data science practitioners in their day-to-day tasks.
This is our fifth and final article in our series, Python for Data Science.
- In the first article, we introduced how data science is changing the world and why Python is preferred by a majority of data science practitioners.
- The second article explained some of the fundamental building blocks of programming in Python – expressions, variables, data types, operators, comments, input/output functions, etc.
- In our third article, we looked at the flow of control in Python using constructs like conditionals and loops.
- The fourth article covered the fundamentals of functions in Python – function definition, arguments, the scope of variables, the return statement, etc.
Table of Contents
- Inbuilt Data Structures in Python
- Lists
  - Creating a list
  - Accessing list items
  - Slicing a list
  - Updating list elements
  - Adding an element to a list
  - Removing elements from a list
  - Concatenating lists
- Tuples
  - Creating a tuple
  - Accessing and slicing tuple items
  - Concatenating tuples
  - Immutable but potentially changing
- Set
  - Creating a set
  - Adding and removing elements from set
  - Set Operations
- Dictionary
  - Creating a dictionary
  - Updating a dictionary
- Recommended Reading
Inbuilt Data Structures in Python
While storing and accessing data efficiently is important, there’s no one-size-fits-all way of doing so. Different use cases may require the data to be stored differently. This is why Python offers four different inbuilt data structures – lists, tuples, sets, and dictionaries – each with its own utility and use cases.
Lists
Lists are used to store an ordered collection of items. These items can be any type of object, from numbers to strings or even another list. This makes lists one of the most versatile data structures in Python for storing a collection of objects.
Creating a list
In Python, lists can be created using square brackets [] with individual items inside the square brackets separated by commas.
# syntax for creating a list
# list_name = [item1, item2,..., itemn]

# example
ls = [1, 2, 3]
print(ls)
Output:
[1, 2, 3]
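To illustrate the point that list items can be objects of any type, here is a minimal sketch. The variable names `mixed` and `item_types` are made up for this illustration.

```python
# A single list holding an int, a string, a float, and another list
mixed = [42, "hello", 3.14, [1, 2, 3]]

# Collect the type name of each item to show that the types differ
item_types = [type(item).__name__ for item in mixed]
print(item_types)   # ['int', 'str', 'float', 'list']

# The nested list counts as one item, so the length is 4
print(len(mixed))   # 4
```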
Accessing list items
Items in a list are ordered. Meaning, they are present in a specific sequence and can be accessed by their index which denotes their position in the list. The items of a list are indexed starting from 0 all the way to n-1 where n is the length of the list. This indexing is called positive indexing.
# indexing example
ls = ['a', 'b', 'c']
print("Item at the 0th index:", ls[0])
print("Item at the 1st index:", ls[1])
print("Item at the 2nd index:", ls[2])
Output:
Item at the 0th index: a
Item at the 1st index: b
Item at the 2nd index: c
Items in a list can also be accessed by a negative index. These indices refer to items from the end of the list. The negative index starts from -1. For example, ls[-1] will give the last item, ls[-2] the second last item, and so on.
# indexing example
ls = ['a', 'b', 'c']
print("Item at the -1 index:", ls[-1])
print("Item at the -2 index:", ls[-2])
print("Item at the -3 index:", ls[-3])
Output:
Item at the -1 index: c
Item at the -2 index: b
Item at the -3 index: a
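The relationship between positive and negative indexing can be sketched as follows: for a list of length n, ls[-k] refers to the same item as ls[n - k]. The names `pairs` and `error_message` are only for this illustration. The sketch also shows what happens when an index falls outside the valid range.

```python
ls = ['a', 'b', 'c']
n = len(ls)

# For k from 1 to n, ls[-k] and ls[n - k] refer to the same item
pairs = [(ls[-k], ls[n - k]) for k in range(1, n + 1)]
print(pairs)  # [('c', 'c'), ('b', 'b'), ('a', 'a')]

# An index outside the range -n to n-1 raises an IndexError
try:
    ls[n]
except IndexError as err:
    error_message = str(err)
    print("IndexError:", error_message)  # IndexError: list index out of range
```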
Slicing a list
We can access individual items in a list through their index, but what if we want to select a range of items inside the list? In Python, this can be done through slicing, which uses the : symbol. The syntax to slice elements in a list is:
list_name[start_index:end_index]
The above syntax returns the list sub-segment starting from the start_index up to but not including the end_index. Example:
# slicing example
ls = ['India', 'USA', 'Canada', 'Australia', 'UK']
print("Slicing [1:3] gives", ls[1:3])
# If starting index is not provided, it's assumed to be 0
print("Slicing [:3] gives", ls[:3])
# If end index is not provided, it's assumed to be the list's length
print("Slicing [3:] gives", ls[3:])
# slicing using [:] gives the entire list
print("Slicing [:] gives", ls[:])
Output:
Slicing [1:3] gives ['USA', 'Canada']
Slicing [:3] gives ['India', 'USA', 'Canada']
Slicing [3:] gives ['Australia', 'UK']
Slicing [:] gives ['India', 'USA', 'Canada', 'Australia', 'UK']
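One detail worth noting about the slices above is that a slice is a new list, not a view into the original. The short sketch below, with the illustrative name `first_three`, changes an item in a slice and shows that the original list is left untouched.

```python
ls = ['India', 'USA', 'Canada', 'Australia', 'UK']

# Slicing builds a new list containing the selected items
first_three = ls[:3]

# Changing the slice does not affect the original list
first_three[0] = 'Brazil'
print(first_three)  # ['Brazil', 'USA', 'Canada']
print(ls)           # ['India', 'USA', 'Canada', 'Australia', 'UK']
```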
Updating list elements
Lists are mutable. You can add, remove, or update the elements in a list. This ability makes lists quite flexible when it comes to storing data. Example:
# update the list
ls = ['a', 'b', 'c']
# original list
print("Original list:", ls)
# change the second element to 'd'
ls[1] = 'd'
# updated list
print("Updated list:", ls)
Output:
Original list: ['a', 'b', 'c']
Updated list: ['a', 'd', 'c']
Adding an element to a list
Items can be added to a list using the append() or insert() method.
The append() method adds an element to the end of the list.
The insert() method inserts an element at a specific index inside the list. Example:
# adding elements to a list example
ls = ['India', 'USA', 'Canada', 'Australia', 'UK']
# append is used to add the element to the end of the list
ls.append("South Africa")
print(ls)
# insert is used to add the element at a specific index
ls.insert(1, "South Korea")
print(ls)
Output:
['India', 'USA', 'Canada', 'Australia', 'UK', 'South Africa']
['India', 'South Korea', 'USA', 'Canada', 'Australia', 'UK', 'South Africa']
Removing elements from a list
Items can be removed from a list using the remove() or pop() method.
The remove() method removes the first occurrence of the value passed from the list.
The pop() method removes the element at the specified index from the list and returns it.
Note: If you don’t pass any index to pop(), the last element of the list is removed (or popped out). Example:
# removing elements from a list example
ls = ['India', 'USA', 'Canada', 'Australia', 'UK']
# remove the element based on the value passed
ls.remove("Australia")
print(ls)
# remove the element based on the index passed
ls.pop(1)
print(ls)
# remove the last element from the list
ls.pop()
print(ls)
Output:
['India', 'USA', 'Canada', 'UK']
['India', 'Canada', 'UK']
['India', 'Canada']
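The example above does not contain duplicate values, so the "first occurrence" behavior of remove() is not visible. The sketch below uses a list with a repeated value, and also shows that pop() hands back the removed item and that remove() raises a ValueError when the value is absent. The names `popped` and `remove_failed` are made up for this illustration.

```python
ls = ['USA', 'India', 'USA', 'UK']

# remove() deletes only the first matching value; the second 'USA' stays
ls.remove('USA')
print(ls)       # ['India', 'USA', 'UK']

# pop() returns the item it removes
popped = ls.pop(0)
print(popped)   # India
print(ls)       # ['USA', 'UK']

# remove() raises a ValueError if the value is not in the list
try:
    ls.remove('Canada')
    remove_failed = False
except ValueError:
    remove_failed = True
print(remove_failed)  # True
```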
Concatenating lists
We can concatenate lists using the + operator. Just as using the + operator on strings concatenates them, using it on lists results in a combined list. Example:
# concatenate lists
ls1 = [1, 2, 3]
ls2 = [4, 5]
ls3 = ls1 + ls2
print(ls3)
Output:
[1, 2, 3, 4, 5]
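As a small supplement to the example above, the sketch below shows that the + operator builds a new list and leaves both operands unchanged. For contrast, it also uses the list method extend(), which this tutorial does not otherwise cover, to add the items to an existing list in place.

```python
ls1 = [1, 2, 3]
ls2 = [4, 5]

# + creates a new list; ls1 and ls2 are not modified
ls3 = ls1 + ls2
print(ls1, ls2)    # [1, 2, 3] [4, 5]
print(ls3 is ls1)  # False

# extend() modifies the existing list instead of creating a new one
ls1.extend(ls2)
print(ls1)         # [1, 2, 3, 4, 5]
```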
Tuples
Tuples are similar to lists in that they’re used to store ordered data. The key difference is that tuples are immutable. Meaning, once a tuple is created, its items cannot be added, removed, or reassigned.
Creating a tuple
A tuple can be created using parentheses () with individual items inside the parentheses separated by commas. Example:
# syntax for creating a tuple
# tuple_name = (item1, item2,..., itemn)

# example
tup = (1, 2, 3)
print(tup)
Output:
(1, 2, 3)
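The immutability of tuples can be seen in a short sketch. Assigning to a position in a tuple raises a TypeError, and tuples have no append() method. The flag name `assignment_failed` is only for this illustration.

```python
tup = (1, 2, 3)

# Assigning to an index of a tuple is not allowed
try:
    tup[0] = 10
    assignment_failed = False
except TypeError as err:
    assignment_failed = True
    print("TypeError:", err)

# Tuples do not provide list-style methods for adding items
print(hasattr(tup, 'append'))  # False

# The tuple is unchanged
print(tup)  # (1, 2, 3)
```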
Accessing and slicing tuple items
Like lists, tuple items can be accessed and sliced using their indices enclosed in square brackets []. Example:
# tuple accessing and slicing example
tup = ('India', 'USA', 'Canada', 'Australia', 'UK')
# accessing tuple elements
print("Item at 0 index:", tup[0])
print("Item at 2 index:", tup[2])
# negative indexing
print("Item at -1 index:", tup[-1])
# slicing tuple
print("Slicing [1:3] gives", tup[1:3])
# if starting index is not provided, it's assumed to be 0
print("Slicing [:3] gives", tup[:3])
# if end index is not provided, it's assumed to be the tuple's length
print("Slicing [3:] gives", tup[3:])
# slicing using [:] gives the entire tuple
print("Slicing [:] gives", tup[:])
Output:
Item at 0 index: India
Item at 2 index: Canada
Item at -1 index: UK
Slicing [1:3] gives ('USA', 'Canada')
Slicing [:3] gives ('India', 'USA', 'Canada')
Slicing [3:] gives ('Australia', 'UK')
Slicing [:] gives ('India', 'USA', 'Canada', 'Australia', 'UK')
Concatenating tuples
Like lists, tuples can be concatenated using the + operator. Example:
# concatenate tuples
tup1 = (1, 2, 3)
tup2 = (4, 5)
tup3 = tup1 + tup2
print(tup3)
Output:
(1, 2, 3, 4, 5)
Immutable but potentially changing
Since tuples are immutable we cannot add or remove elements from a tuple. But, what if a tuple contains a mutable item, for example, a list? Can we change the entries in the list?
This is an interesting question. A tuple, by definition, is a collection of objects. This collection is immutable, that is, it cannot be changed. But, if it contains a mutable element, for example, a list, it can potentially change.
This happens because a tuple, like other data structures, stores references to its items rather than the items themselves. Replacing an item in the tuple (for example, assigning a new integer value to one of its positions) would mean changing the stored reference, which is not allowed. But changing a mutable object in place (for example, updating an entry in a list) does not change the reference to that object, hence the tuple (which is basically an immutable collection of such references) remains unchanged. The example below depicts this behavior:
# tuple with a mutable object
tup = ('red', 'blue', [1,2,3])
print("The original tuple:", tup)
# print the memory location of the list
print("Memory location of tuple's third item(the list):", id(tup[2]))
# updating the list inside the tuple
tup[2][0] = 7
# tuple after updating the list
print("The updated tuple:", tup)
# print the memory location of the list
print("Memory location of tuple's third item(the list):", id(tup[2]))
Output:
The original tuple: ('red', 'blue', [1, 2, 3])
Memory location of tuple's third item(the list): 140065732921856
The updated tuple: ('red', 'blue', [7, 2, 3])
Memory location of tuple's third item(the list): 140065732921856
For more on this behavior of tuples, refer to this article.
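The other half of this behavior can be sketched as well: changing the list in place is fine, but replacing the list with a different object is an assignment to the tuple itself and is rejected. The names `inner_id`, `same_object`, and `rebinding_failed` are made up for this illustration.

```python
tup = ('red', 'blue', [1, 2, 3])
inner_id = id(tup[2])

# Changing the list in place keeps the same list object inside the tuple
tup[2].append(4)
same_object = id(tup[2]) == inner_id
print(tup)          # ('red', 'blue', [1, 2, 3, 4])
print(same_object)  # True

# Replacing the list with a new object would change the stored reference,
# so the tuple raises a TypeError
try:
    tup[2] = [7, 8, 9]
    rebinding_failed = False
except TypeError:
    rebinding_failed = True
print(rebinding_failed)  # True
```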
Set
Sets are used to store a collection of unordered and unique elements. Sets are unordered, meaning you cannot use them for storing sequences as there is no inherent ordering of elements inside the set. Hence, sets do not support indexing, slicing, or other sequence-like behavior. Also, sets are mutable.
Creating a set
A set can be created using curly braces {}. Example:
# set example
sample_set = {'red', 'red', 'blue', 'green'}
# print the set
print(sample_set)
Output:
{'red', 'green', 'blue'}
As you can see in the above example, the duplicate value for ‘red’ was not considered in the set.
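Two related details can be sketched here. First, empty curly braces create an empty dictionary, not an empty set, so an empty set is created with set(). Second, since sets are unordered, trying to index one raises a TypeError, while membership tests with the in operator work as expected. The names `colors` and `indexing_supported` are only for this illustration.

```python
# {} is an empty dictionary; set() gives an empty set
empty_set = set()
empty_dict = {}
print(type(empty_set).__name__, type(empty_dict).__name__)  # set dict

colors = {'red', 'blue', 'green'}

# Sets have no positions, so indexing is not supported
try:
    colors[0]
    indexing_supported = True
except TypeError as err:
    indexing_supported = False
    print("TypeError:", err)

# Membership testing is the usual way to query a set
print('red' in colors)  # True
```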
Adding and removing elements from set
Sets are mutable and thus we can add and remove elements from a set.
To add an element to a set, we use the add() method. Example:
# add element to set example
sample_set = {'red', 'blue', 'green'}
# print the set
print("Original set:", sample_set)
# add 'yellow' to the set
sample_set.add('yellow')
# print the updated set
print("Updated set:", sample_set)
Output:
Original set: {'red', 'green', 'blue'}
Updated set: {'red', 'green', 'blue', 'yellow'}
To remove an element from a set, we can use the remove() or discard() method:
1. remove(): This method raises a KeyError if the element is not present in the set.
2. discard(): This method does not raise an error if the element is not present in the set.
Example:
# remove element from set example
sample_set = {'red', 'blue', 'green', 'yellow'}
# print the set
print("Original set:", sample_set)
# use the remove() function
sample_set.remove('yellow')
# print the updated set
print("Updated set after removing yellow:", sample_set)
# use the discard() function
sample_set.discard('blue')
# print the updated set
print("Updated set after removing blue:", sample_set)
Output:
Original set: {'red', 'green', 'blue', 'yellow'}
Updated set after removing yellow: {'red', 'green', 'blue'}
Updated set after removing blue: {'red', 'green'}
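The example above removes elements that exist, so the difference between the two methods does not show up. The sketch below tries both on an element that is not in the set. The flag name `remove_raised` is made up for this illustration.

```python
sample_set = {'red', 'green'}

# discard() silently does nothing when the element is missing
sample_set.discard('purple')
print(sample_set == {'red', 'green'})  # True

# remove() raises a KeyError when the element is missing
try:
    sample_set.remove('purple')
    remove_raised = False
except KeyError as err:
    remove_raised = True
    print("KeyError:", err)  # KeyError: 'purple'
```

As a rule of thumb, discard() suits cases where the element may or may not be present, while remove() is useful when a missing element should be treated as a mistake.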
Set Operations
Python sets are similar to sets in mathematics and many of the mathematical set operations like union, intersection, difference, etc. can be performed on sets in python. Example:
# common set operations
a = {1, 2, 3}
b = {2, 3, 4}
# print the sets
print("Set a:", a)
print("Set b:", b)
# union operation
print("Union of sets a and b:", a.union(b))
# intersection operation
print("Intersection of sets a and b:", a.intersection(b))
# difference operation
print("Elements of set a not in b (a-b):", a.difference(b))
print("Elements of set b not in a (b-a):", b.difference(a))
# symmetric difference operation
print("Elements present in either set a or set b but not in both:", a.symmetric_difference(b))
print("Elements present in either set a or set b but not in both:", b.symmetric_difference(a))
Output:
Set a: {1, 2, 3}
Set b: {2, 3, 4}
Union of sets a and b: {1, 2, 3, 4}
Intersection of sets a and b: {2, 3}
Elements of set a not in b (a-b): {1}
Elements of set b not in a (b-a): {4}
Elements present in either set a or set b but not in both: {1, 4}
Elements present in either set a or set b but not in both: {1, 4}
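The same operations are also available as operators, which some readers may find closer to mathematical notation. The sketch below repeats the example using |, &, -, and ^. The result names (`union_ab` and so on) are only for this illustration.

```python
a = {1, 2, 3}
b = {2, 3, 4}

union_ab = a | b          # same as a.union(b)
intersection_ab = a & b   # same as a.intersection(b)
difference_ab = a - b     # same as a.difference(b)
symmetric_ab = a ^ b      # same as a.symmetric_difference(b)

print(union_ab)         # {1, 2, 3, 4}
print(intersection_ab)  # {2, 3}
print(difference_ab)    # {1}
print(symmetric_ab)     # {1, 4}
```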
Dictionary
Dictionaries are used to store key-to-value mappings in Python. Unlike sequences (for example, lists and tuples), which are indexed by a range of numbers, dictionaries are indexed by keys. Dictionaries are mutable, but their keys must be hashable, which in practice means immutable types such as strings, numbers, or tuples of immutable items.
Creating a dictionary
A dictionary in Python can be created using curly braces {} with individual key: value pairs inside the curly braces separated by commas. The value corresponding to a key can easily be accessed by using square brackets []. Example:
# creating a sample dictionary
sample_dict = {'USA': 'Washington', 'India': 'New Delhi', 'UK': 'London'}
# print the dictionary
print("Sample dictionary:", sample_dict)
# print the value corresponding to the key India
print("Value corresponding to India:", sample_dict['India'])
Output:
Sample dictionary: {'USA': 'Washington', 'India': 'New Delhi', 'UK': 'London'}
Value corresponding to India: New Delhi
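The restriction on key types mentioned at the start of this section can be seen in a short sketch. A tuple works as a dictionary key, while a list, being mutable, is rejected with a TypeError. The dictionary `coordinates` and the flag `list_key_allowed` are made up for this illustration.

```python
# A tuple of numbers is immutable, so it can be used as a key
coordinates = {(28.61, 77.21): 'New Delhi', (51.51, -0.13): 'London'}
print(coordinates[(28.61, 77.21)])  # New Delhi

# A list is mutable, so using it as a key raises a TypeError
try:
    invalid = {[28.61, 77.21]: 'New Delhi'}
    list_key_allowed = True
except TypeError as err:
    list_key_allowed = False
    print("TypeError:", err)
```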
Updating a dictionary
Dictionaries are mutable and hence we can update the dictionary by adding new key: value pairs, removing existing key: value pairs, or changing the value corresponding to a key.
The example below shows how to update the value of an existing key and also add a new key: value pair to the dictionary.
# updating a dictionary example
sample_dict = {'USA': 'Washington', 'India': 'New Delhi', 'UK': 'London'}
# print the dictionary
print("Original dictionary:", sample_dict)
# update the value corresponding to the key 'USA'
sample_dict['USA'] = 'Washington, D.C.'
# print the updated dictionary
print("Updated dictionary:", sample_dict)
# add another key:value pair
sample_dict['France'] = 'Paris'
# print the updated dictionary
print("Updated dictionary:", sample_dict)
Output:
Original dictionary: {'USA': 'Washington', 'India': 'New Delhi', 'UK': 'London'}
Updated dictionary: {'USA': 'Washington, D.C.', 'India': 'New Delhi', 'UK': 'London'}
Updated dictionary: {'USA': 'Washington, D.C.', 'India': 'New Delhi', 'UK': 'London', 'France': 'Paris'}
To remove a key: value pair from a dictionary, we can use the pop() method, which returns the value that has been removed. Example:
# removing a key from a dictionary example
sample_dict = {'USA': 'Washington', 'India': 'New Delhi', 'UK': 'London'}
# print the dictionary
print("Original dictionary:", sample_dict)
# remove the key UK
sample_dict.pop("UK")
# print the updated dictionary
print("Updated dictionary:", sample_dict)
Output:
Original dictionary: {'USA': 'Washington', 'India': 'New Delhi', 'UK': 'London'}
Updated dictionary: {'USA': 'Washington', 'India': 'New Delhi'}
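The example above discards the value that pop() returns. The sketch below captures it, and also shows the optional second argument of pop(), a default that is returned instead of raising a KeyError when the key is missing. The names `removed_value` and `missing_value` are only for this illustration.

```python
sample_dict = {'USA': 'Washington', 'India': 'New Delhi', 'UK': 'London'}

# pop() returns the value that was stored under the removed key
removed_value = sample_dict.pop('UK')
print(removed_value)  # London

# With a default, pop() does not raise a KeyError for a missing key
missing_value = sample_dict.pop('Japan', None)
print(missing_value)  # None

print(sample_dict)    # {'USA': 'Washington', 'India': 'New Delhi'}
```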
Recommended Reading
In this tutorial, we looked at the different inbuilt data structures in Python – lists, tuples, sets, and dictionaries – their characteristics, and how each of them differs when it comes to storing data. If you’d like to dive deeper, we recommend the following (openly available) resources to supplement the topics covered –
- Chapter-4 and Chapter-5 of the book Automate the Boring Stuff with Python. This book is specially designed to have an implementation first approach. The online version of the book is freely available.
Here’s the complete list of articles of our five-part tutorial series on Python for Data Science:
If you found this article useful do give it a share! For more such articles subscribe to us.
With this, we come to the end of our Python for Data Science series. In the series, we covered some of the fundamentals of the Python programming language which are essential for Data Science. Stay curious and happy learning!