Data Types, Variables and Arithmetic Operators
Let us write a simple equation in math to calculate the mean of a set of numbers:
a = 15, b = 35, c = 55
mean = (a+b+c) / 3 = 35
To do this simple calculation, you might use mental math or a calculator. To write a Python program that does the same, you first need to understand how to declare the variables a, b, c and mean, which data types (integers, real numbers, text, etc.) can be assigned to your variables, and which arithmetic operators you can use. In this lesson we will learn all of these simple concepts.
Python's common primitive data types
| Data type | Name | Example | Allowed Values |
Python infers the data type from the literal value assigned to the variable, so there is no need to declare the data type explicitly as in C or Java.
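As a small illustration of this, the sketch below assigns literals to a few variables and asks Python which type it inferred. The variable names are made up for this example and are not part of the lesson's notebook.

```python
# Hypothetical variables, chosen only for illustration.
count = 15          # whole-number literal -> int
price = 35.5        # literal with a decimal point -> float
name = "joe"        # quoted literal -> str
is_valid = True     # True/False literal -> bool

# type() reports the data type Python inferred from each literal.
print(type(count))     # <class 'int'>
print(type(price))     # <class 'float'>
print(type(name))      # <class 'str'>
print(type(is_valid))  # <class 'bool'>
```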
Let us now get our hands dirty by keying in the program shown below in the Code cell, to compute the mean in Python:
a = 15
b = 35
c = 55
mean = (a+b+c) / 3
print(mean)
print(type(mean))
Key in the above statements one at a time in the Code input cell of the firstConcept.ipynb file opened in the previous lesson. Although a Copy button is given for your convenience, if you are new to programming it is recommended that you key in the values, one statement at a time, instead of using the Copy button. To run this program, ensure that the cursor is inside the Code input cell, then press control+enter on Mac or ctrl+enter on Windows and notice the output.
Notice the class 'float' printed below 35.0. Since the computed answer is a real number, Python has automatically assigned the float data type to it.
Note: While the first 4 lines of code are similar to algebraic expressions, the two new names you may notice are print and type.
These are called functions. In very simple terms, a function can be considered a black box that takes zero or more inputs, called arguments, and spits out zero or more outputs. In this case, the print function takes in the argument mean and spits out that value as the output on the screen. The type function takes in the argument mean and spits out the data type of the argument that is passed in. This output from the type function is again sent as an argument to the print function.
Function arguments are passed within a pair of parentheses.
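To make the nested call easier to follow, the sketch below performs the same two steps separately before repeating the combined form. The variable name mean_type is introduced here only for illustration.

```python
a = 15
b = 35
c = 55
mean = (a + b + c) / 3

# Step 1: type() takes mean as its argument and returns its data type.
mean_type = type(mean)

# Step 2: print() takes that returned value as its argument and displays it.
print(mean_type)     # <class 'float'>

# The nested form does both steps in one statement.
print(type(mean))    # <class 'float'>
```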
Rules for variable name declaration
Must start with a letter or underscore
Must contain only letters, digits or underscores
Must not be one of Python's reserved keywords
Keywords in Python:
False None True and as assert async await break class continue def del elif else except finally for from global if import in is lambda nonlocal not or pass raise return try while with yield
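If you are unsure whether a name is allowed, one way to check is sketched below using the standard library's keyword module and the string method isidentifier(). The candidate names and the results dictionary are made up for this example.

```python
import keyword

# Hypothetical candidate names to test against the rules above.
candidates = ["student_name", "_total", "2nd_value", "class", "my-var"]

results = {}
for name in candidates:
    # isidentifier() checks the start-character and allowed-character rules;
    # iskeyword() checks whether the name is reserved by Python.
    results[name] = name.isidentifier() and not keyword.iskeyword(name)
    print(name, "->", "valid" if results[name] else "not valid")

# keyword.kwlist holds the reserved words for the Python version in use.
print(keyword.kwlist)
```

Only student_name and _total pass: 2nd_value starts with a digit, class is a keyword, and my-var contains a hyphen.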
Recommendation for Variable Names
Give variables meaningful names instead of a, b, x, y, etc., unless the variable is declared in a loop or belongs to a mathematical equation like the example shown.
Start with a lowercase letter and use underscores to separate words.
Although camel case notation for variable names is in vogue in Object Oriented Programming (OOP), in Data Analytics we rarely create an object, so we will use underscore notation in this book.
- Example of camel case: studentName="joe", Example of underscore: student_name="joe"
Points to note
- Variable names are case sensitive: mean != Mean
- A variable must be defined before it is used. Using an undefined variable, such as d, throws an exception:
NameError: name 'd' is not defined
Single quotes ('), double quotes (") and triple quotes (''' or """) are allowed to enclose a string literal value.
Literal values for float, int, bool should not be enclosed with any type of quote.
A variable that is first assigned a value of one type can later be reassigned a value of another type. Key in the below statements in the code input cell, run the code and watch the output.
weight = 100
weight = "150 pounds"
print(weight)
You will notice that the program runs without any error and the output is 150 pounds.
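The points above can be tried together in one cell. This is a minimal sketch: the names single, double and triple are made up for illustration, and the try/except statement (not covered in this lesson) is used only so that the cell keeps running after the error.

```python
mean = 35.0

# Names are case sensitive: only mean has been defined, not Mean.
try:
    print(Mean)
except NameError as error:
    print("NameError:", error)

# Single, double and triple quotes all create the same string value.
single = 'joe'
double = "joe"
triple = """joe"""
print(single == double == triple)   # True

# A variable can be reassigned to a value of a different type.
weight = 100
print(type(weight))   # <class 'int'>
weight = "150 pounds"
print(type(weight))   # <class 'str'>
```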
A few more tips on troubleshooting
- Programming context is maintained between the code cells. Variables declared in one cell are available to any cell that is executed after the declaring cell has been executed. The order of the cells in the notebook does not matter, as long as a cell is executed after the cell containing the variable declaration. However, it is good practice to write all the code cells in the order in which they should be executed.
- Sometimes you may lose track of all the variables active in your context and see results that you did not anticipate. In such cases it is a good idea to restart your kernel and start your executions with a clean slate. To start a clean run of all the code cells, use Notebook --> Restart Kernel followed by Notebook --> Run All Cells.
- If there are persistent issues that are not resolved by the above procedure, shut down the kernel for the notebook, close the notebook file and reopen it.
You have already used addition (+) operator in the example above. The other arithmetic operators in Python are listed below:
Assume x has a value of 2 before each of the statements is executed:
| Operator name | Notation | Short Notation | Result |
|---|---|---|---|
| Addition | x = x+1 | x += 1 | 3 |
| Subtraction | x = x-1 | x -= 1 | 1 |
| Multiplication | x = x*2 | x *= 2 | 4 |
| Division | x = x/2 | x /= 2 | 1.0 |
| Integer Division | x = x//2 | x //= 2 | 1 |
| Modulo | x = x%2 | x %= 2 | 0 |
| Exponent | x = x**2 | x **= 2 | 4 |
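The table can be checked with a short sketch like the one below, which resets x to 2 before each statement. The results dictionary is a hypothetical helper used only to collect the values for printing; dictionaries are not covered in this lesson.

```python
results = {}   # hypothetical helper that collects each outcome

x = 2
x += 1                     # addition
results["x += 1"] = x      # 3

x = 2
x -= 1                     # subtraction
results["x -= 1"] = x      # 1

x = 2
x *= 2                     # multiplication
results["x *= 2"] = x      # 4

x = 2
x /= 2                     # division always produces a float
results["x /= 2"] = x      # 1.0

x = 2
x //= 2                    # integer division
results["x //= 2"] = x     # 1

x = 2
x %= 2                     # modulo: remainder after division
results["x %= 2"] = x      # 0

x = 2
x **= 2                    # exponent
results["x **= 2"] = x     # 4

# Show each statement with its result and the type of that result.
for statement, value in results.items():
    print(statement, "->", value, type(value))
```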
Points to note
- The short notation is used wherever possible instead of the full notation. Both achieve the same result, but the short form is more compact.
- All of the operations behave like their standard algebraic counterparts.
- The modulo operator returns the remainder after division (an integer when both operands are integers).
- The result of division (/) is always a floating-point number, even when both operands are integers.
- Recommended style guide for Python is PEP 8 - https://www.python.org/dev/peps/pep-0008/
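The order of operations is very similar to the algebraic rule PEMDAS, which stands for Parentheses, Exponents, Multiplication, Division, Addition, Subtraction. Multiplication and division share the same precedence and are evaluated from left to right, as are addition and subtraction.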
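A few expressions make the order of operations concrete. This is a small sketch; the variable names holding the results are made up for illustration.

```python
# Multiplication is evaluated before addition.
no_parens = 2 + 3 * 4
print(no_parens)        # 14

# Parentheses are evaluated first and change the result.
with_parens = (2 + 3) * 4
print(with_parens)      # 20

# Exponents are evaluated before multiplication.
power_first = 2 * 3 ** 2
print(power_first)      # 18

# Multiplication and division have equal precedence, left to right.
left_to_right = 8 / 4 * 2
print(left_to_right)    # 4.0

# This is why the mean calculation needs parentheses.
a, b, c = 15, 35, 55
wrong_mean = a + b + c / 3      # only c is divided by 3
right_mean = (a + b + c) / 3    # the whole sum is divided by 3
print(wrong_mean)               # roughly 68.33
print(right_mean)               # 35.0
```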