Ordered Collections - Sequence

Python provides three basic data structures for ordered collections, also called sequences: lists, tuples and ranges.

In this lesson we will learn all three.

Lists

Lists are mutable, ordered collections of objects. The objects can be primitive values or compound objects. Here is an example of a list of numbers called scores.

scores = [50, 80, 90, 100]

Notice the square brackets used to represent the collection. A list can also be created with the list function. In that case the above statement would be:

scores = list([50, 80, 90, 100])
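The list function accepts any iterable, not only another list. The following is a small sketch of that behaviour; the variable names other than scores are illustrative and do not come from the lesson.

```python
scores = [50, 80, 90, 100]

# list() builds a new list from any iterable.
letters = list('abc')             # a string yields its characters
from_tuple = list((50, 80, 90))   # a tuple yields its elements
empty = list()                    # no argument gives an empty list

# Passing an existing list makes a separate copy of it.
scores_copy = list(scores)
scores_copy.append(0)

print(letters)      # ['a', 'b', 'c']
print(from_tuple)   # [50, 80, 90]
print(empty)        # []
print(scores)       # [50, 80, 90, 100] (the original is unchanged)
```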

Here are some of the commonly used operations on lists:

Operation | Example | Output | Comments
Positive positional index | scores[0] | 50 | the first position in the list holds the value 50
Negative positional index | scores[-1] | 100 | -1 refers to the last position in the list
Slicing | scores[0:2] | [50, 80] | new list with the elements from position 0 (included) to position 2 (excluded)
Slicing with default start position | scores[:2] | [50, 80] | new list with the elements from position 0 up to position 2 (excluded)
Slicing with default end position | scores[2:] | [90, 100] | new list with the elements from position 2 to the end of the list
Built-in function len | len(scores) | 4 | len can be applied to a list to obtain its size
Modify an element | scores[0] = 60; print(scores) | [60, 80, 90, 100] | lists can be modified, unlike strings
Add a new element | scores.append(90); print(scores) | [60, 80, 90, 100, 90] | adds a new element at the end of the list
Use a slice to replace more than one element | scores[2:4] = [89, 99]; print(scores) | [60, 80, 89, 99, 90] | the elements at positions 2 and 3 are replaced
Use a slice with a step value | scores[::-1] | [90, 99, 89, 80, 60] | using -1 as the step value returns a reversed copy of the list
Use a slice to remove all elements | scores[:] = []; print(scores) | [] | removes all elements of the list!
List of lists | scores = [50, 60]; names = ['joe', 'john']; term = [scores, names]; print(term) | [[50, 60], ['joe', 'john']] | term is a list of lists
Sort a list in place | scores = [60, 30, 80]; scores.sort(); print(scores) | [30, 60, 80] | scores is sorted in place using the default order, smallest to largest. The default can be changed by passing the optional 'key' and 'reverse' keyword arguments to the sort method.
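The optional 'key' and 'reverse' arguments of the sort method can be sketched as follows. The names list is made-up sample data, used here only to show sorting by length.

```python
scores = [60, 30, 80]
scores.sort(reverse=True)   # sort from largest to smallest
print(scores)               # [80, 60, 30]

names = ['john', 'al', 'joseph']
names.sort(key=len)         # compare the length of each name, not the name itself
print(names)                # ['al', 'john', 'joseph']
```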

A few other useful methods: replace 'list' below with the name of your list.

  • list.count(x) - returns the number of times the item x appears in the list
  • list.index(x[, start[, end]]) - returns the index of the first item whose value is x. Raises ValueError if there is no such item. The optional start and end limit the search to a part of the list, similar to a slice.
  • list.insert(index, element) - inserts the specified element before the specified index
  • list.remove(x) - removes the first item whose value is x. Raises ValueError if there is no such item. Note that it takes a value, not an index.
  • list.pop([index]) - removes and returns the last element if no index is specified; otherwise it removes and returns the element at the specified index.
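The methods above can be tried out on a small list. This is only an illustrative walk-through; the variable names used for the results are assumptions, not part of the lesson.

```python
scores = [60, 80, 90, 100, 90]

count_90 = scores.count(90)   # 90 appears twice
first_90 = scores.index(90)   # the first 90 is at index 2

scores.insert(1, 70)          # [60, 70, 80, 90, 100, 90]
scores.remove(90)             # removes the first 90: [60, 70, 80, 100, 90]
last = scores.pop()           # returns 90, leaving [60, 70, 80, 100]
first = scores.pop(0)         # returns 60, leaving [70, 80, 100]

# index raises ValueError when the value is not in the list.
try:
    scores.index(55)
    found = True
except ValueError:
    found = False

print(count_90, first_90, last, first, found)   # 2 2 90 60 False
print(scores)                                   # [70, 80, 100]
```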

Built-in functions

Python provides a rich set of built-in functions that can be applied to lists. In the first lesson you calculated the mean of 3 individual numbers. If you put the same three numbers in a list, you can use the sum function to add up all the numbers in the list and the len function to count its elements. The mean of the numbers in the list can then be obtained as shown below:

my_numbers = [15, 35, 55]
mean = sum(my_numbers) / len(my_numbers)
print(mean)

Output:

35.0

You can also apply the built-in sorted function to any list. It returns a new sorted list and leaves the original list untouched. This is different from the sort method of a list, which sorts the list in place. The default sort order is smallest to largest. You can change this default by passing the 'key' and 'reverse' keyword arguments to the sorted function. More on this: https://docs.python.org/3/howto/sorting.html
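A short sketch of the difference: sorted returns a new list, while the original keeps its order. The names ascending and descending are illustrative.

```python
scores = [60, 30, 80]

ascending = sorted(scores)                  # new list, smallest to largest
descending = sorted(scores, reverse=True)   # new list, largest to smallest

print(ascending)    # [30, 60, 80]
print(descending)   # [80, 60, 30]
print(scores)       # [60, 30, 80] (the original list is untouched)
```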

Official reference

The complete list of built-in functions can be found at: https://docs.python.org/3/library/functions.html

Tuples

Tuples are immutable collections of objects. As a result, the elements of a tuple cannot be modified, deleted or inserted once the tuple has been created. Tuples are typically used to represent a collection of heterogeneous values that make sense as a whole, when all the elements are together.

For example, in a class, if you have to represent the complete details of a student who scored the highest marks in the class, you could represent the student record as a tuple as shown below:

top_student = ('jane', 'doe', 99, 21, 'f')

This tuple contains the student's first name, last name, total score, age and gender. The items are a mix of strings and numbers, hence the term heterogeneous, but together they make up the record of one student, the one with the highest score. That is why a tuple is a good fit for representing the record.

Operations on tuples are similar to those on lists, except that modification operations are not allowed. Here is the run-down:

Operation | Example | Output | Comments
Positive positional index | top_student[0] | 'jane' | the first position in the tuple
Negative positional index | top_student[-1] | 'f' | -1 refers to the last position in the tuple
Slicing | top_student[1:3] | ('doe', 99) | new tuple with the elements from position 1 (included) to position 3 (excluded)
Slicing with default start position | top_student[:2] | ('jane', 'doe') | new tuple with the elements from position 0 up to position 2 (excluded)
Slicing with default end position | top_student[2:] | (99, 21, 'f') | new tuple with the elements from position 2 to the end of the tuple
Built-in function len | len(top_student) | 5 | len can also be applied to a tuple to obtain its size
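The following sketch shows what happens when code tries to modify a tuple, and one way to get a changed record instead: build a new tuple from slices. The names message and updated_student are illustrative.

```python
top_student = ('jane', 'doe', 99, 21, 'f')

# Assigning to a position of a tuple raises TypeError.
try:
    top_student[2] = 100
    message = 'modified'
except TypeError as error:
    message = str(error)
print(message)

# A changed record has to be a new tuple, built here from slices.
updated_student = top_student[:2] + (100,) + top_student[3:]
print(updated_student)   # ('jane', 'doe', 100, 21, 'f')
print(top_student)       # ('jane', 'doe', 99, 21, 'f')
```

Note the trailing comma in (100,): it is what makes a one-element tuple.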

Unpack a tuple

Elements of a tuple can be unpacked into individual variables. This is very useful when you want to unpack all the elements at once. The number of variables on the left must match the number of elements in the tuple. The code below unpacks the top student's individual values into separate variables:

top_student = ('jane', 'doe', 99, 21, 'f')
first_name, last_name, score, age, gender = top_student
print(first_name)

Output:

jane
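Unpacking also works in the header of a for loop, which is handy when a list holds several tuples. A minimal sketch, assuming a made-up students list of three-element records:

```python
# Hypothetical sample data: (first name, last name, score)
students = [('jane', 'doe', 99), ('joe', 'smith', 85)]

full_names = []
for first_name, last_name, score in students:
    # Each tuple is unpacked into three variables on every iteration.
    full_names.append(first_name + ' ' + last_name)

print(full_names)   # ['jane doe', 'joe smith']
```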

Range

The range type represents an immutable sequence of numbers. It is mostly used in for loops to repeat a block a specific number of times.

range([start,]stop [,step])

start and step are optional.

start

The sequence starts from this value. The default is 0 when it is not supplied.

step

The amount by which the value changes on each iteration. The default is 1 when it is not supplied. It can be negative, but it cannot be 0.

stop

The sequence ends before this value: stop itself is never included. As soon as the next value would reach or pass stop, the loop is not executed any further.

Here are a few examples:

Definition | Output | Comments
list(range(5)) | [0, 1, 2, 3, 4] | there are 5 numbers in total, but the last number is 4
list(range(1, 5)) | [1, 2, 3, 4] | instead of starting at the default 0, it starts at 1 since a start value is given
list(range(0, 20, 5)) | [0, 5, 10, 15] | when step is given, start must be given too. The values go up in increments of 5, the step value
list(range(0, -5, -1)) | [0, -1, -2, -3, -4] | the same rules apply for negative numbers as well
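Since a range is mostly used in for loops, here is a small sketch of that use. The scores list is reused from the Lists section; the other names are illustrative.

```python
# Add up the numbers 1 to 4 (stop value 5 is excluded).
total = 0
for number in range(1, 5):
    total += number
print(total)   # 10

# Visit every position of a list by looping over range(len(...)).
scores = [50, 80, 90, 100]
positions = []
for position in range(len(scores)):
    positions.append(position)
    print(position, scores[position])

# A range supports len and membership tests without building a list.
print(len(range(0, 20, 5)))    # 4
print(10 in range(0, 20, 5))   # True
print(20 in range(0, 20, 5))   # False
```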

Points to note

  • Tuples are very popular in Data Analytics and should be understood thoroughly.
  • Tuples are generally somewhat more memory-efficient than lists. Whenever the elements do not need to change, you should prefer a tuple.
  • Lists are not as popular for Data Analytics as NumPy arrays or Pandas Series. NumPy and Pandas are Python libraries heavily used in Data Analytics. We use NumPy arrays or Pandas Series in place of Python lists because these data structures are more efficient for handling large datasets and also provide many convenient functions for analytics.
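One rough way to look at the memory difference between a list and a tuple is sys.getsizeof. This is only a sketch: the exact numbers depend on the Python version and platform, and the comparison assumes the standard CPython interpreter.

```python
import sys

as_list = [15, 35, 55]
as_tuple = tuple(as_list)   # same elements, stored in a tuple

# getsizeof reports the size of the container itself, in bytes.
list_size = sys.getsizeof(as_list)
tuple_size = sys.getsizeof(as_tuple)

print(list_size, tuple_size)
print(tuple_size < list_size)   # expected to be True on CPython
```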
