Functions and Modules


A function is a block of code consisting of one or more Python statements that are executed in sequence when the function is called. A function can be defined with parameters and receives arguments for those parameters when it is called. We have already called several functions in this book, e.g. print("hello") and input("please type in anything"). These functions were already available in Python and we simply used them. Now we will create our own function. Here is an example of a function which returns the even numbers up to a given number:

# define the function with one parameter - limit
def generate_even_numbers(limit):
    # All reference documentation should go in between ''' block
    '''
    This method generates even numbers from 0 to the value
    passed in the 'limit' parameter, excluding the 'limit' value
    '''
    even_number_list = []  # create an empty list
    for i in range(limit):
        if i % 2 == 0:  # remainder 0 indicates even
            even_number_list.append(i)  # add to the list
    return even_number_list  # return statement and function definition ends

# call the defined function with one argument having a value of 10 for limit parameter
print(generate_even_numbers(10))
# help on a function will print out the text within '''.
help(generate_even_numbers)



[0, 2, 4, 6, 8]
Help on function generate_even_numbers in module __main__:

generate_even_numbers(limit)
    This method generates even numbers from 0 to the value
    passed in the 'limit' parameter, excluding the 'limit' value

You can pass a different value for the argument each time you invoke the function, and the result changes accordingly.

Function Definitions With Default Values

Sometimes you may have to define a function with one or more default argument values. In such cases all the parameters with default values must be defined after all the parameters without defaults. Here is an example:

def myfunc(state, county, country="US"):
    print(country, state, county)

myfunc('MI', 'Wayne')
myfunc('MI', 'Wayne', "United States of America")


US MI Wayne
United States of America MI Wayne

With the above function definition, you can invoke the function with or without a value for 'country'. If you pass country value then the passed in value would be considered. If it is missing then the default value defined in the function definition will be used.
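As a small sketch of the same rule, the hypothetical function describe_place below (it is not part of the original example) returns a string instead of printing, so each result is easy to inspect. It also shows that a default value can be overridden by naming the parameter.

```python
# Hypothetical helper for illustration: 'country' has a default value
def describe_place(state, county, country="US"):
    return country + " " + state + " " + county

# 'country' is not passed, so the default "US" is used
with_default = describe_place("MI", "Wayne")
# the default is overridden by position
by_position = describe_place("MI", "Wayne", "USA")
# the default is overridden by keyword
by_keyword = describe_place("MI", "Wayne", country="USA")

print(with_default)
print(by_position)
print(by_keyword)
```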

Function Definitions with Variable args and kwargs

Occasionally you may require a function that accepts a variable number of arguments. In such cases you can prefix a parameter name with * (asterisk) to receive the extra positional arguments as a tuple, and/or with ** (double asterisk) to receive the keyword arguments as a dictionary. Here is an example:

def myfunc(*args, **kwargs):
    print(kwargs)  # receives a dictionary of keyword arguments
    print(args)   # receives a tuple of all arguments.

myfunc(1, 2, 3, country='US',county='Wayne')


{'country': 'US', 'county': 'Wayne'}
(1, 2, 3)
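The sketch below shows one way the received tuple and dictionary might be used inside the function. The name total_with_labels is made up for this illustration.

```python
# Hypothetical function: adds up the positional values and lists the keyword names
def total_with_labels(*args, **kwargs):
    total = sum(args)        # args is a tuple of the positional arguments
    labels = sorted(kwargs)  # kwargs is a dictionary; sorted() returns its keys in order
    return total, labels

result = total_with_labels(1, 2, 3, country='US', county='Wayne')
empty = total_with_labels()  # calling with no arguments at all is also allowed

print(result)
print(empty)
```

When no arguments are passed, args is an empty tuple and kwargs is an empty dictionary, so the function still works.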

Referencing variable from outer scope inside a function

Suppose you want to change a variable that is defined outside the function. You can do so only by using the global keyword. Here is an example of what happens without it:

x = 10
def myfunc():
    x = x + 100

myfunc()


UnboundLocalError: local variable 'x' referenced before assignment

Assigning to x inside the function makes x a local variable, so reading it on the right-hand side fails with the error above. You can fix this by declaring x with the global keyword. Here is the code:

x = 10
def myfunc():
    global x
    x = x + 100

myfunc()
print(x)  # prints 110


Note: All variables that are assigned inside a function are considered local by default unless the 'global' keyword is used.
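The difference between a local and a global assignment can be sketched as follows. The names counter, set_local and set_global are made up for this illustration.

```python
counter = 10

def set_local():
    counter = 99  # creates a new local variable; the outer counter is untouched
    return counter

def set_global():
    global counter
    counter = counter + 100  # changes the module-level counter
    return counter

local_result = set_local()
after_local = counter    # still 10, the local assignment did not leak out
global_result = set_global()
after_global = counter   # now 110

print(local_result, after_local, global_result, after_global)
```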

However, you can read (but not assign to) an outer variable inside the function without the global keyword. Here is an example to illustrate that:

x = 10
def myfunc():
    print(x)  # reading the outer variable needs no 'global'

myfunc()



Understanding variable scope in Python

By default, all variables declared at the top level of a Python file (the .py file, which becomes a module when it is used by other programs) are visible to code blocks such as if, while and functions inside that file.

However, unlike in languages such as JavaScript, the variables in a file are not automatically global across all other modules by name alone. Variables in one module can be accessed from another module by importing that module and prefixing the variable name with the module name (its namespace).
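A minimal sketch of this, using the standard library module math: its variable pi is reached through the module name, and a variable with the same name in our own file stays separate. The variable names circle_area and same_name_is_separate are made up for the illustration.

```python
import math  # standard library module

# 'pi' lives in the math module's namespace, so it is reached as math.pi
circle_area = math.pi * 2 ** 2

# a variable named pi in this file does not touch math.pi
pi = 3
same_name_is_separate = (pi != math.pi)

print(circle_area)
print(same_name_is_separate)
```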

* for Unpacking Collections

The asterisk (*) performs argument unpacking: it spreads the items of a collection into separate positional arguments of a function call. Here is an example:

def multiply(a, b):
    return a * b

numbers = [3, 4]  # illustrative values
print(multiply(*numbers))  # same as multiply(3, 4)


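Note, however, that you cannot use the asterisk on a bare expression by itself; it has to appear inside a function call (it is also accepted inside list, tuple and set displays such as [*a, *b]). Although the example shows a list, you can replace the list with another collection object such as a tuple or a set. Keep in mind that a set has no guaranteed order.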
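As a short sketch, the same function can be called by unpacking a list or a tuple. The values are made up for the illustration.

```python
def multiply(a, b):
    return a * b

pair_list = [3, 4]
pair_tuple = (5, 6)

from_list = multiply(*pair_list)    # same as multiply(3, 4)
from_tuple = multiply(*pair_tuple)  # same as multiply(5, 6)

print(from_list)
print(from_tuple)
```

The number of items in the collection has to match the number of parameters, otherwise Python raises a TypeError.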

** for Unpacking Dictionary

You can use the double asterisk (**) to unpack a dictionary into name=value pairs. Here too it is used on an argument in a function call, and the dictionary keys must match the parameter names. Here is an example:

def myfunc(country, state, county):
    print(country, state, county)

my_place = {'state': 'Michigan',
            'county': 'Wayne', 'country': 'United States'}

myfunc(**my_place)

United States Michigan Wayne

Highlights

  • The def keyword is used to define a function.
  • A function is a block of code which can be called by the same or a different program.
  • A function can take 0 or more arguments as long as the respective parameters are defined in the function definition.
  • You can give default values to certain parameters, provided they are defined after all the parameters without defaults.
  • You can also send a variable number of arguments to a function by using an asterisk before the parameter name.
  • You can also send a variable number of keyword arguments by using a double asterisk before the parameter name.


Points to note

  • Multiple arguments are separated by commas in both the definition and the invocation. The order of positional arguments matters when you call the function: pass the values in the same order as the parameters in the definition, unless the function is defined to take arbitrary positional and/or keyword arguments.
  • A return statement, if present, returns an object from the function to the calling program.
  • A function definition must be executed before the function is called; otherwise you will get a NameError.
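The points above can be sketched as follows. The function names and values are made up for the illustration. One more fact worth knowing is shown here too: arguments passed by name may appear in any order, and a function without a return statement gives back None.

```python
def describe(name, age):
    return name + " is " + str(age)

positional = describe("Asha", 30)         # order follows the definition
keywords = describe(age=30, name="Asha")  # arguments passed by name can be reordered

def no_return():
    pass  # there is no return statement

nothing = no_return()  # a function without return gives back None

name_error_seen = False
try:
    not_defined_yet()  # hypothetical name that has not been defined
except NameError:
    name_error_seen = True

print(positional, keywords, nothing, name_error_seen)
```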


Naming convention

  • It is a recommended practice to name variables with a noun and functions with a verb. The reasoning is that a variable holds a value and does nothing else with it, so it should be a noun. A function, on the other hand, takes some action on the arguments passed to it, and hence should be named with a verb.

Anonymous functions a.k.a. Lambda functions

In Python you can create functions without the def keyword and without a name. These are called lambda functions. A lambda function can be used when the function body consists of a single expression. Let us take an example:

def add(x, y):
    return x + y

In the above function definition, there is only one expression x+y which is returned. So this is a good candidate for defining it as a Lambda function instead. Here is how it is defined as a Lambda:

(lambda x, y: x + y)(5,10)


Here the lambda function takes x and y as arguments, and the expression x + y after the colon is its body. As you can see, the lambda function has no name, and it is called when you pass the arguments (5, 10) to it in a pair of parentheses. The call evaluates to 15.
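A common place to meet a lambda function is as a short argument to another function. The sketch below passes one to the built-in sorted function through its key parameter; the list of places is made up for the illustration.

```python
# a lambda can be stored in a variable and called like any other function
add = lambda x, y: x + y
total = add(5, 10)

# made-up sample data: (county, rank) pairs
places = [("Wayne", 3), ("Oakland", 1), ("Macomb", 2)]

# sort by the second item of each pair using a lambda as the key
by_rank = sorted(places, key=lambda place: place[1])

print(total)
print(by_rank)
```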


A module is a Python program file containing one or more functions (it may also contain variables and classes). The module name used for import is the same as the file name without the .py extension.

As an entry-level Data Analyst, instead of writing your own custom modules, you will mostly use ready-made modules and their functions from open source libraries. Many open source developers have written rich libraries for Python (a library is simply a group of one or more Python program files bundled together), and you can use them through the module system.

Python also provides built-in modules which are readily available. In addition, the Anaconda distribution contains hundreds of modules already downloaded for us. You can start using the functions of any of these modules with the import keyword. For any other module which is not part of Anaconda, you first have to install (in other words, download) the module before you can import it.

In the example below we import random module to use its random function.

# this module is readily available for import
import random
random_number = random.random()
print(random_number)

A random real number between 0 and 1 is printed out.

If you want to know all the functions of the random module and how each of them is used, you can type in:

help(random)

This prints out the official documentation which is part of the module. The output is not shown as it is very verbose. Documentation text is added to a Python file between three single quotes ('''). When such text is the first statement of a module or function, it is printed out when you use the help() function. You can also get help on a specific function. For example, the random module has a function called randint; to know more about it, type in:

help(random.randint)

Help on method randint in module random:

randint(a, b) method of random.Random instance
    Return random integer in range [a, b], including both end points.

As you can see, randint gives you a random integer between two numbers a and b, which you specify. In the above example you used the dot operator (.) on random to reach the randint function of the random module. To reduce clutter and use randint directly, you can use the import below:

from random import randint

Full code:

from random import randint
random_number = randint(2, 6)
print(random_number)


A random integer between 2 and 6 (both included) is printed out.

Basic Python modules used by all

Module     Description
random     Functions for generating random numbers
pickle     Functions for data storage
tkinter    Functions for front end GUI applications
decimal    Functions for working with decimals
Advanced modules used by the Data Analytics community

Module                   Description
numpy                    Functions for efficient handling of arrays. In Data Analytics we use NumPy arrays more than native Python lists
pandas                   Functions which are built on top of NumPy and provide very efficient implementations for manipulating tabular data
matplotlib               Functions for 2D plotting and graphics
sklearn (scikit-learn)   Functions for machine learning
seaborn                  Functions for creating colorful statistical visualizations such as bar plots, violin plots, heat maps etc.

Official Reference:

Exception handling

Programs throw an error when the Python interpreter encounters a statement which is illegal and cannot be executed. This could be due to a programmer error, a user input error or a system error. Such a condition is called throwing (raising) an exception. For example, if you try to import a package cheesecake which does not exist, you receive ModuleNotFoundError on the console. We came across IndexError earlier, in the String Operations chapter, when we tried to access an element position which was out of bounds for a given string.

When an error occurs while processing a large amount of data, the program crashes and all the work it had completed up to that point is lost. This is not desirable when processing massive amounts of data that take a significant amount of time to process.

Instead of crashing the program, it is often desirable to collect all the issues, complete the execution, and afterwards look into the error situations and fix them before the next run. We use exception handling with the try and except keywords to achieve this.

In the code below, we handle the erroneous import by using a try block. Put all the code which could potentially throw an error inside the try block. When an exception is thrown, the flow of execution jumps to the except block, which handles the exception meaningfully. After handling the exception, execution continues with the statements after the except block without crashing the program. ModuleNotFoundError is a subclass of ImportError, so except ImportError catches it.

try:
    import cheesecake
except ImportError:
    print("Such a module does not exist")
print("program continues after the exception is handled")

Note, however, that it is not always a good thing to move on when an exception happens during execution. If the cheesecake module is absolutely required to process all your data, then you should let the program crash. However, suppose you are processing a million records of user input data and come across a few invalid entries in the middle of otherwise good entries. In such situations you can collect all the invalid records separately and complete the processing of the remaining records. When the program finishes, you can look into the invalid entries and take appropriate action for all of them.
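A minimal sketch of that idea is shown below. The sample values are made up, and a real program would read its records from a file or a database instead of a hard-coded list.

```python
# made-up sample inputs; two of them are not valid numbers
raw_values = ["10.5", "abc", "7", "", "3.25"]

valid_numbers = []
invalid_entries = []

for position, text in enumerate(raw_values):
    try:
        valid_numbers.append(float(text))
    except ValueError:
        # keep the position and the bad text so it can be reviewed later
        invalid_entries.append((position, text))

print(valid_numbers)
print(invalid_entries)
```

The loop finishes even though two entries are invalid, and the invalid ones are kept aside with their positions for later review.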

An optional finally block can be added to the try/except block. The finally block is executed irrespective of whether an exception is thrown or not. Typically, finally blocks contain code for releasing resources that were acquired in the try block, such as closing open file handles or releasing database connections. Whether the try block completes successfully or an exception occurs, the database connection or files should be closed, so that code goes in the finally block. If you do not release the resources in a finally block and an exception happens during execution, the program crashes without releasing the resources gracefully.
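The sketch below illustrates this with a made-up FakeConnection class that stands in for a real file handle or database connection. The class and the process function are assumptions for the illustration only.

```python
class FakeConnection:
    """Hypothetical stand-in for a file handle or database connection."""
    def __init__(self):
        self.is_open = True

    def close(self):
        self.is_open = False

def process(connection, divisor):
    try:
        return 100 / divisor
    except ZeroDivisionError:
        return None
    finally:
        connection.close()  # runs after success and after an exception

ok_connection = FakeConnection()
ok_result = process(ok_connection, 4)

bad_connection = FakeConnection()
bad_result = process(bad_connection, 0)

print(ok_result, ok_connection.is_open)
print(bad_result, bad_connection.is_open)
```

In both calls the connection ends up closed, whether or not the division succeeded.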

Note that file and database operations will be covered in the next book. Although these two concepts are very important, we rarely use pure Python to open files or databases. Instead we use the pandas module for these operations, and hence we cover them in the NumPy and Pandas book.

Below is a program which handles invalid user input gracefully: it tells the user what went wrong and asks for correct values instead of crashing.

print("This program calculates the rate of interest for your deposit. \nPlease input your deposit and the total interest earned/year\nand I will calculate the rate of interest\n")
print("To quit the program enter 'quit' in small case\n\n")

# program in an infinite loop till user enters quit
while True:
    try:  # try statement encloses statements which could throw an exception
        user_input1 = input("Enter Deposit Amount:")

        if (user_input1 == 'quit'):
            print("Good bye! Hope you had fun getting your interest rates!")
            break  # leave the loop and end the program

        user_input2 = input('Enter total interest earned:')
        deposit_amount = float(user_input1) 
        interest_earned = float(user_input2)
        interest_rate = (interest_earned / deposit_amount) * 100
        # round function rounds the result to 2 decimal places
        print("Rate of interest:"+ str(round(interest_rate, 2)))
    except: # Catch all exceptions if nothing specified.
        print("You entered an invalid number. Please type a valid value.")
    finally:
        # Typically resources are released in this block
        print("This will be executed no matter what")

The program starts by letting the user know that they have to input the deposit amount and the total interest earned. The user has to input quit to exit the program. Once the user inputs valid numerical values, the program calculates the rate of interest. If the user inputs illegal values, the exception is caught in the except block, where the user is informed of what went wrong. The except block in the above program catches all exceptions. The two types of erroneous input the user can give are:

  • alpha characters for number
  • deposit value 0 - exception thrown due to division by 0

You can refine the program to add more clarity on the exception raised with the below program:

print("""This program calculates the rate of interest for your deposit. 
Please input your deposit and the total interest earned/year
and I will calculate the rate of interest\n""")

print("To quit the program enter 'quit' in small case\n\n")

while True:

    try: # try statement encloses statements which could throw an exception
        user_input1 = input("Enter Deposit Amount:")

        if (user_input1 == 'quit'):
            print("Good bye! Hope you had fun getting your interest rates!")
            break  # leave the loop and end the program
        deposit_amount = float(user_input1) 

        user_input2 = input('Enter total interest earned:')
        interest_earned = float(user_input2)

        interest_rate = (interest_earned / deposit_amount) * 100
        # round function rounds the result to 2 decimal places
        print("Rate of interest:"+ str(round(interest_rate , 2))) 
    except ValueError:  # catch ValueError. This happens when the user inputs alpha instead of numeric value
        print("You entered an invalid number. Please type a valid value")
    except ZeroDivisionError:  # This happens when the deposit amount is 0
        print("You entered a deposit value of 0... ha ha I still did not crash! Input a valid deposit amount please!")
    except:  # User input issues are already caught so crash the program
        raise  # rethrow the caught exception
    finally:
        print("finally block will be executed no matter what") # typically resources are released in this block

In the first two except blocks we name the error that could occur. In the last except block, we catch all other exceptions. By the time execution reaches this third block, we have exhausted all the error conditions that user input can cause, so it has to be some system fault, and in that case we have to crash. For that reason we use the raise statement, which rethrows the caught exception and crashes the program. The last except block can be removed completely if all it does is rethrow the exception, because the program will crash anyway without it. It is added here only to explain the raise statement.
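The same structure can be sketched as a small function, which makes the effect of raise easier to see. The function name compute_rate and the returned messages are made up for this illustration; they are not part of the program above.

```python
def compute_rate(interest_text, deposit_text):
    try:
        return round(float(interest_text) / float(deposit_text) * 100, 2)
    except ValueError:
        return "invalid number"
    except ZeroDivisionError:
        return "deposit cannot be 0"
    except:
        raise  # anything else is rethrown to the caller

rate = compute_rate("5", "200")
bad_text = compute_rate("abc", "200")
zero_deposit = compute_rate("5", "0")

# float(None) raises TypeError, which no named except block handles,
# so the bare except rethrows it and the caller sees it here
rethrown = False
try:
    compute_rate(None, "200")
except TypeError:
    rethrown = True

print(rate, bad_text, zero_deposit, rethrown)
```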
