Modules

A module is a Python program file (a .py file) that contains definitions such as functions, classes, and variables. The module name used for import is the file name without the .py extension.
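To make the naming rule concrete, here is a minimal sketch that writes a tiny module file and then imports it by its file name without .py. The file name greetings.py, the function say_hello, and the temporary folder are all assumptions made for illustration; in practice you would simply save the file next to your script.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module: a file named greetings.py holding one function.
module_source = 'def say_hello(name):\n    return "Hello, " + name\n'

# Save the file in a temporary folder and make that folder importable.
module_dir = tempfile.mkdtemp()
Path(module_dir, "greetings.py").write_text(module_source)
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

# The module name is the file name without the .py extension.
import greetings

message = greetings.say_hello("analyst")
print(message)
```

The import statement says greetings, not greetings.py, and the function is reached with the dot operator on the module name.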

As an entry-level data analyst, you will mostly use ready-made modules and their functions from open source libraries rather than write your own custom modules. A library is simply a group of one or more Python program files bundled together. Open source developers have written many rich libraries for Python, and you access them through the module system.

Python also provides built-in modules that are readily available. In addition, the Anaconda distribution comes with hundreds of modules already installed. Any module that is neither built in nor part of Anaconda must be installed (in other words, downloaded) before you can import it. Once a module is available, you can start using its functions with the import keyword.
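If you are unsure whether a module is already available in your environment, one possible check is sketched below using importlib.util.find_spec from the standard library. The list of names is an assumption chosen for illustration, and the result for numpy and pandas depends on what is installed on your machine.

```python
import importlib.util

# Names to check: two built-in modules and two third-party ones.
module_names = ["random", "collections", "numpy", "pandas"]

availability = {}
for name in module_names:
    # find_spec returns None when a top-level module cannot be found.
    availability[name] = importlib.util.find_spec(name) is not None

for name, is_available in availability.items():
    status = "ready to import" if is_available else "needs to be installed first"
    print(name, "->", status)
```

A module reported as missing is typically installed from the command line with a tool such as pip or conda before it can be imported.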

Random Module

In the example below, we import the random module to use its random function.

# random is a built-in module, so it is readily available for import
import random
random_number = random.random()
print(random_number)

output:

A random floating-point number that is at least 0 and less than 1 is printed out.
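As a quick check of that range, the sketch below draws a few values and confirms each one is at least 0.0 and below 1.0. The sample size of 5 is an arbitrary choice, and the numbers themselves differ on every run.

```python
import random

# Draw a handful of values; each run gives different numbers.
samples = [random.random() for _ in range(5)]
print(samples)

# Every value is at least 0.0 and strictly less than 1.0.
all_in_range = all(0.0 <= value < 1.0 for value in samples)
print(all_in_range)
```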

If you want to see all the functions of the random module and how to use each of them, you can type in:

help(random)

This prints the official documentation that is part of the module. The output is not shown here because it is very verbose. The documentation comes from docstrings: string literals, usually written between triple quotes (''' or """), placed as the first statement of a module, function, or class. The help() function prints these docstrings. To get help on a specific function, invoke help on that function. For example, the random module has a function called randint; to learn more about it, type in:

help(random.randint)

output:

Help on method randint in module random:

randint(a, b) method of random.Random instance
    Return random integer in range [a, b], including both end points.
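Help text like this is stored in the function's docstring. As a rough illustration of how the same mechanism works for your own code, the sketch below defines a hypothetical function add_tax (not part of any library) with a docstring and then reads it back.

```python
def add_tax(price, rate=0.1):
    """Return the price with tax added at the given rate."""
    return price * (1 + rate)

# The docstring is stored on the function and is what help() displays.
docstring = add_tax.__doc__
print(docstring)
help(add_tax)
```

The string must be the first statement inside the function for Python to treat it as documentation.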

As the help text shows, randint gives you a random integer between two numbers a and b that you specify, with both end points included. In the help(random.randint) call, you used the dot operator (.) on random to reach the randint function of the module. To reduce clutter and use randint directly, you can use the import below:

from random import randint

Full code:

from random import randint 
random_number = randint(2, 6)
print(random_number)

output:

A random integer between 2 and 6, both end points included, is printed out.
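To see that both end points really can appear, a simple experiment is to draw many values and list the distinct results. The count of 10,000 draws is an arbitrary assumption, large enough that a missing value is practically impossible.

```python
from random import randint

# Draw many values so that every possible result shows up.
draws = [randint(2, 6) for _ in range(10_000)]

# Collect the distinct results in sorted order.
distinct_values = sorted(set(draws))
print(distinct_values)
```

With this many draws the list should contain every integer from 2 to 6, including 2 and 6 themselves.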

You can also import and use the module and its function in another way, by declaring an alias for the module name:


import random as rm
random_number = rm.randint(2, 6)
print(random_number)

In the above, you first import the module random and give it an alias 'rm'. Using the dot operator on the module alias rm, you invoke the function randint.
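The three import styles covered so far all lead to the same objects; only the name you type changes. The sketch below is one way to confirm this, using the is operator to compare identities.

```python
import random
import random as rm
from random import randint

# An alias is just a second name for the same module object.
same_module = rm is random

# The directly imported name is the same function the module exposes.
same_function = randint is random.randint

print(same_module, same_function)
```

Which style to use is therefore a matter of readability: the alias keeps the origin of randint visible, while the direct import is shorter to type.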

Collections Module

Here is another example, this time using the collections module and importing Counter.

Counter accepts a list of values, counts them, and gives you back a dictionary-like object (Counter is a subclass of dict) that maps each value to its count. Here is an example:


from collections import Counter
Counter(['apple','orange','grapes','apple','apple','grapes'])

Output:

Counter({'apple': 3, 'grapes': 2, 'orange': 1})
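Because a Counter behaves like a dictionary, individual counts can be read by key. The sketch below reuses the fruit list from the example; the variable names and the extra key 'mango' are assumptions added for illustration.

```python
from collections import Counter

fruit_counts = Counter(['apple', 'orange', 'grapes', 'apple', 'apple', 'grapes'])

# Look up a count by key, just as with a dictionary.
apple_count = fruit_counts['apple']

# Unlike a plain dict, a missing key gives 0 instead of raising KeyError.
mango_count = fruit_counts['mango']

# Counter is a subclass of dict.
is_dict = isinstance(fruit_counts, dict)

print(apple_count, mango_count, is_dict)
```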

By invoking the most_common method on the Counter, you can get the values with the highest counts. Here is an example that returns the top two:


from collections import Counter
Counter(['apple','orange','grapes','apple','apple','grapes']).most_common(2)

Output:

[('apple', 3), ('grapes', 2)]
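A common use of the same idea is counting words in text. The sketch below is a minimal version, assuming a made-up sample sentence and splitting on whitespace only; real text would usually need extra cleaning such as lowercasing and removing punctuation.

```python
from collections import Counter

# Hypothetical sample text; any string would do.
text = "the quick brown fox jumps over the lazy dog the end"
word_counts = Counter(text.split())

# Without an argument, most_common() returns every item, highest count first.
ranked = word_counts.most_common()
top_word, top_count = ranked[0]
print(top_word, top_count)
```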

Reference: https://docs.python.org/3/library/collections.html

Common basic Python modules

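Module      Description
random      Functions for generating random numbers
pickle      Functions for saving Python objects to a file and loading them back
tkinter     Functions for building front-end GUI applications
decimal     Functions for exact decimal arithmetic, avoiding binary floating-point rounding errors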
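As a small taste of two of these built-in modules, the sketch below compares ordinary floating-point addition with decimal, and then round-trips a dictionary through pickle. The sample record is made up for illustration, and pickle is used in memory here rather than with a file.

```python
import pickle
from decimal import Decimal

# decimal: exact decimal arithmetic, avoiding binary floating-point surprises.
float_sum = 0.1 + 0.2
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(float_sum == 0.3, decimal_sum == Decimal("0.3"))

# pickle: turn a Python object into bytes and back again.
record = {"name": "apple", "count": 3}
restored = pickle.loads(pickle.dumps(record))
print(restored == record)
```

Note that the Decimal values are created from strings; building them from floats would carry the float's rounding error along.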

Advanced modules used by Data Analytics community

Module                  Description
numpy                   Functions for efficient handling of arrays. In data analytics we use NumPy arrays more than native Python lists.
pandas                  Functions built on top of NumPy that provide very efficient implementations for manipulating tabular data.
matplotlib              Functions for 2D plotting and graphics.
sklearn (scikit-learn)  Functions for machine learning, such as classification, regression, and clustering.
seaborn                 Functions for creating colorful statistical visualizations such as bar plots, violin plots, and heat maps.

Official Reference: https://docs.python.org/3/tutorial/modules.html
