Set
A set is an unordered collection with no duplicate elements. You can create a set with curly braces { } or with the set() function. Because empty curly braces { } create an empty dictionary, you have to use the set() function to create an empty set.
Here is an example:
americas = {'Canada', 'United States', 'Mexico', 'Mexico'}
print(americas)
Output: {'United States', 'Canada', 'Mexico'}
Notice that the duplicate 'Mexico' is not added. Your output may list the elements in a different order from the one shown, because a set does not keep its elements in any particular order; hence the term unordered collection. If you need the order to be maintained, use an ordered collection type such as a list or a tuple.
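As a quick check of the points above, the following sketch compares the empty braces with set() and confirms that a repeated element is stored only once. The names empty_braces and empty_set are illustrative and do not come from the text.

```python
# Empty braces create a dictionary, not a set.
empty_braces = {}
empty_set = set()

print(type(empty_braces))  # <class 'dict'>
print(type(empty_set))     # <class 'set'>

# Duplicates are dropped when the set is built.
americas = {'Canada', 'United States', 'Mexico', 'Mexico'}
print(len(americas))       # 3
```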
You can also construct a set using the set() function. Here we create a set out of list elements.
americas = set(['Canada', 'United States', 'Mexico'])
print(americas)
Output: {'United States', 'Canada', 'Mexico'}
Here is another example, this time passing a tuple to the set() function:
americas = set(('Canada', 'United States', 'Mexico'))
print(americas)
Output: {'United States', 'Canada', 'Mexico'}
If you leave out the square brackets or parentheses and also remove the commas between the words, Python joins the adjacent string literals into a single string, and set() then builds a set from the individual characters of that string.
americas = set('Canada' 'United States' 'Mexico')
print(americas)
Output: {'c', 'e', ' ', 'i', 'd', 'n', 'M', 'o', 'S', 't', 's', 'a', 'U', 'C', 'x'}
Notice that the result is case sensitive ('C' and 'c' are separate elements) and that duplicate letters are eliminated: only one 'a' is present in the set, and the same goes for the other repeated letters.
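This behaviour follows from two Python rules: adjacent string literals are joined into one string, and set() iterates over its argument, which for a string means one character at a time. A minimal sketch, with letters and one_country as illustrative names:

```python
# Adjacent string literals are concatenated before set() sees them.
letters = set('Canada' 'United States' 'Mexico')
print(letters == set('CanadaUnited StatesMexico'))  # True

# Case matters: 'U' is in the set, 'u' is not.
print('U' in letters, 'u' in letters)  # True False

# To keep a whole string as a single element, wrap it in a collection.
one_country = set(['Canada'])
print(one_country)  # {'Canada'}
```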
Set Operations
The rows below are applied in order, starting from americas = {'Canada', 'United States', 'Mexico'}.
Operation | Example | Result | Comments |
---|---|---|---|
Add an element: add | americas.add('Puerto Rico') | {'United States', 'Canada', 'Mexico', 'Puerto Rico'} | The given element is added. The order could be different |
Add multiple elements: update | americas.update(('Puerto Rico','Cuba')) | {'United States', 'Canada', 'Mexico', 'Puerto Rico', 'Cuba'} | Every element of the given collection is added. 'Puerto Rico' is already present, so only 'Cuba' is new |
Remove an element: remove | americas.remove('Puerto Rico') | {'United States', 'Canada', 'Mexico', 'Cuba'} | The given element is removed. If the item to remove does not exist, remove() raises a KeyError |
Remove an element: discard | americas.discard('Puerto Rico') | {'United States', 'Canada', 'Mexico', 'Cuba'} | The given element is removed if present. If the item to discard does not exist, as here, discard() does not raise an error |
Remove and return an arbitrary element: pop | americas.pop() | Any element may be removed | The removed element is returned from this function |
Empty the set: clear | americas.clear() | set() | The set is emptied. An empty set prints as set(), not {} |
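The operations in the table can be tried in one short run. This is a sketch that assumes americas starts with the three countries from the first example; the variable removed is illustrative.

```python
americas = {'Canada', 'United States', 'Mexico'}

americas.add('Puerto Rico')                # add one element
americas.update(('Puerto Rico', 'Cuba'))   # add every element of an iterable
print(len(americas))                       # 5

americas.remove('Puerto Rico')             # present, so it is removed
try:
    americas.remove('Puerto Rico')         # absent now, so remove() raises
except KeyError as err:
    print('KeyError:', err)

americas.discard('Puerto Rico')            # absent, but discard() stays silent

removed = americas.pop()                   # removes and returns an arbitrary element
print(removed in americas)                 # False

americas.clear()
print(americas)                            # set()
```

Prefer discard() when a missing element is not an error in your program, and remove() when it signals a bug you want to hear about.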
Operations on Two Sets
Let us consider another set, this one holding four countries that were once English colonies:
old_eng_col = {'Canada', 'India', 'Australia', 'United States'}
With americas holding its original three countries, {'Canada', 'United States', 'Mexico'}, here are some common operations you can perform on the two sets, along with their results:
Operation | Example | Result | Comments |
---|---|---|---|
Intersection: & | americas & old_eng_col or americas.intersection(old_eng_col) | {'United States', 'Canada'} | A new set is created with the elements which are common to both sets. Note the two ways of getting the same result |
Union: \| | americas \| old_eng_col | {'United States', 'India', 'Australia', 'Canada', 'Mexico'} | A new set with all unique elements from both sets. Here also you can use the union function instead of \| |
Difference: - | americas - old_eng_col | {'Mexico'} | Elements in americas but not in old_eng_col. Here also you can use the difference function instead of - |
Subset test: issubset | americas.issubset(old_eng_col) | False | Returns True if americas is a subset of old_eng_col |
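The sketch below runs the operations from the table and checks that each operator gives the same result as its method form. The names common, combined and only_americas are illustrative.

```python
americas = {'Canada', 'United States', 'Mexico'}
old_eng_col = {'Canada', 'India', 'Australia', 'United States'}

common = americas & old_eng_col          # intersection
combined = americas | old_eng_col        # union
only_americas = americas - old_eng_col   # difference

print(common == americas.intersection(old_eng_col))       # True
print(combined == americas.union(old_eng_col))            # True
print(only_americas == americas.difference(old_eng_col))  # True
print(only_americas)                                      # {'Mexico'}
print(americas.issubset(old_eng_col))                     # False
print(common.issubset(americas))                          # True
```

Each of these operations returns a new set and leaves both operands unchanged.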
Here are a few more functions on sets:
- copy() - Returns a copy of the set
- difference_update() - Removes the items in this set that are also included in another, specified set
- intersection_update() - Removes the items in this set that are not present in other, specified set(s)
- isdisjoint() - Returns True if the two sets have no elements in common
- issubset() - Returns whether another set contains this set or not
- issuperset() - Returns whether this set contains another set or not
- symmetric_difference() - Returns a new set with the elements that are in exactly one of the two sets
- symmetric_difference_update() - Updates this set to hold only the elements that are in exactly one of the two sets
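A few of these functions in action, as a minimal sketch. The sets left and right and the name backup are made up for illustration. Note that the functions ending in _update change the set in place instead of returning a new one.

```python
left = {1, 2, 3}
right = {3, 4}

print(left.isdisjoint(right))        # False, both contain 3
print(left.isdisjoint({7, 8}))       # True, nothing in common
print(left.issuperset({1, 2}))       # True
print(left.symmetric_difference(right) == {1, 2, 4})  # True

backup = left.copy()                 # independent copy of left
left.intersection_update(right)      # left keeps only elements also in right
print(left)                          # {3}

backup.difference_update(right)      # backup drops the elements found in right
print(backup == {1, 2})              # True
```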
Common Operations across All Collections
The table below uses the following collections:
country_codes = {'US':'United States', 'UK':'United Kingdom',
'CA':'Canada', 'MX':'Mexico'} # a dictionary
a = [10, 4, 1, 11] # a list
b = (5, 1, 9, 5) # a tuple
c = {5, 9, 1} # a set
Operations | Example | Output | Comments |
---|---|---|---|
sorted on a dictionary | sorted_keys = sorted(country_codes); print(sorted_keys) | ['CA', 'MX', 'UK', 'US'] | A sorted list of the keys is returned |
sorted on a list | sorted_list = sorted(a); print(sorted_list) | [1, 4, 10, 11] | A new sorted list is returned |
sorted on a tuple | sorted_tuple = sorted(b); print(sorted_tuple) | [1, 5, 5, 9] | A sorted list is returned, not a tuple |
sorted on a set | sorted_set = sorted(c); print(sorted_set) | [1, 5, 9] | A sorted list is returned, not a set |
Many other built-in functions besides sorted can be applied across all types of collections, e.g. len (to find the size of the collection), min (to get the minimum value) and max (to get the maximum value). For a dictionary, these operations are performed on the keys.
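The sketch below applies sorted, len, min and max to the four collections defined earlier. It is only meant to show that sorted always hands back a list and that a dictionary is treated as its keys.

```python
country_codes = {'US': 'United States', 'UK': 'United Kingdom',
                 'CA': 'Canada', 'MX': 'Mexico'}  # a dictionary
a = [10, 4, 1, 11]   # a list
b = (5, 1, 9, 5)     # a tuple
c = {5, 9, 1}        # a set

# sorted() returns a new list, whatever the input type.
print(sorted(country_codes))  # ['CA', 'MX', 'UK', 'US']
print(sorted(b))              # [1, 5, 5, 9]
print(sorted(c))              # [1, 5, 9]

# len, min and max work on every collection; a dictionary is treated as its keys.
print(len(b), min(a), max(c))                  # 4 1 9
print(len(country_codes), min(country_codes))  # 4 CA
```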
Unpacking Elements of a Collection
The elements of any collection can be unpacked into individual variables, which is very useful when you want to extract all the elements at once. The code below unpacks a tuple holding the top student's details into separate variables:
top_student = ('jane', 'doe', 99, 21, 'f')
first_name, last_name, score, age, gender = top_student
print(first_name)
Output: jane
The same operation works on a list, a set and a dictionary. For a dictionary, the unpacked values are the keys only, not the key/value pairs. For a set, keep in mind that the order in which the elements are assigned is not predictable.
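A short sketch of unpacking with a tuple and a dictionary follows. The dictionary codes and the variables first_code, second_code, x and y are illustrative. The dictionary example relies on Python 3.7 or later, where dictionaries keep insertion order.

```python
top_student = ('jane', 'doe', 99, 21, 'f')
first_name, last_name, score, age, gender = top_student
print(first_name, score)  # jane 99

# Unpacking a dictionary yields its keys, in insertion order.
codes = {'US': 'United States', 'CA': 'Canada'}
first_code, second_code = codes
print(first_code, second_code)  # US CA

# The number of variables must match the number of elements.
try:
    x, y = top_student
except ValueError as err:
    print('ValueError:', err)
```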