Python for Data Science – The Basics

Hi, welcome to Python for Data Science, a short series of articles that helps beginners in data science learn the fundamentals of the Python programming language, with a focus on data science use cases. There are no prerequisites: the series aims to introduce learners to Python in as simple and no-nonsense a way as possible.

This is the second article in the series. In the first article, we discussed how data science is changing the world and why most data science practitioners prefer Python. By the end of this article, you’ll have a grasp of the basics of the Python programming language. You’ll also write your very first Python program!

  • Introduction to the Jupyter Notebook Environment
  • Python Basics
    • Expressions
    • Variables
    • Data Types
    • Operators
    • Comments
    • Common Input/Output Functions
  • Write your first program
  • Recommended Reading

Jupyter Notebook is a web-based application that lets you execute code, write documentation, and display visualizations, all inside a single notebook. This versatility has made Jupyter Notebooks one of the most popular tools among data science practitioners.

A Jupyter notebook is, simply put, a series of cells. Inside each cell, you can either execute code or show some text, and executing a cell produces an output based on its input. The following image shows a sample Jupyter Notebook.

[Image: Sample Jupyter Notebook]

The introductory article in the series documents the steps to install Anaconda. Anaconda comes with tools like Jupyter and Spyder pre-installed, sparing you the trouble of installing them individually. If you have Anaconda installed, you can launch Jupyter Notebook from the Anaconda Navigator.

If you’d like a quick refresher on Jupyter Notebooks, check out our article Introduction to Jupyter Notebook. For this tutorial series, we recommend using Jupyter Notebooks as your go-to environment for executing Python code.

Python, as a programming language, is designed to have a very high level of code readability. It’s also an interpreted language, meaning the interpreter runs your code statement by statement, without the separate compile step that languages like C or Java require before a program can run.

This combination makes Python an ideal candidate for beginners who want to get their feet wet with programming without having to worry about complicated syntax. For instance, this is what a program that prints “Hello, World!” looks like in C++, Java, and Python.

C++:

#include <iostream>

int main() {
    std::cout << "Hello, World!";
    return 0;
}

Java:

class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, World!");
    }
}

Python:

print("Hello, World!")

As you can see, Python is simple and intuitive compared to C++ and Java. While Python is a powerful object-oriented language with its own syntax and standard library, in this article we’ll focus on its basic building blocks.

A Python expression is a piece of code that the Python interpreter evaluates to produce a value. Expressions consist of atoms and operators. Operators act (operate) on the atoms, the individual units in the expression, to produce a resulting value. Variables, literals (constants), and function calls are common types of atoms used in expressions.

[Image: Sample Expression]

For example, 1+2 is an expression with +, the addition operator, acting on the constant values (also called literals) 1 and 2 to give the result 3.
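
As a small illustration of atoms and operators, the sketch below evaluates a few expressions built from literals, a variable, and a function call. The variable names are made up for this example.

```python
# A literal-only expression: + acts on the atoms 1 and 2.
total = 1 + 2

# Atoms can also be variables and function calls.
width = 4
area = width * width      # the variable width is used as an atom twice
longest = max(3, 7) + 1   # the function call max(3, 7) is an atom

print(total, area, longest)
```

In a Jupyter cell you can also type an expression such as 1 + 2 on its own; the notebook displays the resulting value below the cell.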

Variables are named memory locations used to store a value or a reference. For simplicity, imagine variables as boxes where you can store items. Whenever you need to use these “items” you can simply do that by accessing them through their “boxes”.

GIF of 23 being assigned to the box Age.
Age = 23

In the above example, 23 (the item or the value) is stored in the variable Age (the box). In Python, the = operator is used for variable assignment. The syntax is variable = value, for example, Age = 23.

When a variable is assigned for the first time, it’s said to be initialized. This variable can now be used in different expressions. If you assign a new value to the variable, its older value is forgotten. For instance, if you run the command Age = 45, all further references to the variable Age would result in 45.
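
The short sketch below walks through initialization and reassignment using the Age variable from the text. The name next_year is an extra variable introduced only for this illustration.

```python
Age = 23               # initialization: the first assignment to Age
next_year = Age + 1    # the variable can now be used in expressions
print(next_year)

Age = 45               # reassignment: the older value 23 is forgotten
print(Age)             # every later reference to Age now gives 45
```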

Rules for variable names: Variable names should be descriptive, conveying what sort of value the variable contains. What you name your variables is up to you, but every name has to follow these Python naming rules:

  1. It should be a single word without any spaces.
  2. It can contain only letters, numbers, and the underscore _ special character.
  3. It cannot begin with a number.

The table below shows examples of some valid and invalid variable names in Python.

Valid Names     Invalid Names
rick            8rick
R82             82
rick_morty      rick morty
_rick           ric$
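
One way to check these names yourself is the built-in string method isidentifier(), which reports whether a string follows Python's naming rules. The sketch below uses a for loop, which this article has not covered yet, so treat it as a preview. Note that isidentifier() also returns True for reserved words such as for, which still cannot be used as variable names.

```python
candidates = ["rick", "R82", "rick_morty", "_rick",
              "8rick", "82", "rick morty", "ric$"]

valid = []
for name in candidates:
    # isidentifier() is True only when the string follows the naming rules.
    if name.isidentifier():
        valid.append(name)
    print(repr(name), name.isidentifier())

print("Valid names:", valid)
```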

Python is a dynamically typed programming language, meaning you don’t need to declare the type of a variable beforehand; Python determines the type at run time. But what does it mean for a variable to have a data type?

Simply put, the data type of a variable characterizes the kind of data the variable stores. The data type is important because it tells the Python interpreter which operations can be performed on the value. For example, it’s logical to evaluate the expression 2+2, but it’s not logical to evaluate 2+'cat'.
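
To see what happens with 2+'cat' in practice, the sketch below attempts the addition and catches the resulting error. The try/except construct is not covered in this article; it is used here only so the example keeps running after the error. The variable failed is introduced purely for illustration.

```python
print(2 + 2)   # adding two integers works as expected

failed = False
try:
    2 + 'cat'  # an int and a str cannot be added together
except TypeError as error:
    failed = True
    print("Python refused the operation:", error)
```

Without the try/except, the same line would stop the program with a TypeError.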

Python has multiple built-in data types. But, for the purpose of our introduction, we’ll be focusing on the following data types:

Numeric Types: Numeric data types are associated with numbers. Python uses int as the data type for integers and float for real numbers (numbers with a decimal point). For example, 13 would have int as its data type, but 13.2 or even 13.0 would be float.

String Type: A string is a sequence of Unicode characters. String values are enclosed in single '' or double "" quotes. For example, name = "Rick".
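
You can ask Python for the data type of any value with the built-in type() function. The sketch below applies it to the examples from the text.

```python
name = "Rick"

print(type(13))     # an integer
print(type(13.2))   # a number with a decimal point
print(type(13.0))   # still a float, even though the fractional part is zero
print(type(name))   # a string
```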

NOTE: The behavior of an operator can change based on the types of the values it’s operating on. For instance, the addition operator + when used with two strings concatenates them together.

Changing behavior of the + operator with two strings
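
A minimal sketch of this behavior, with illustrative variable names: the same + operator adds numbers but joins strings end to end.

```python
number_sum = 2 + 3                  # with numbers, + performs addition

first = "Rick"
second = "Morty"
joined = first + second             # with strings, + concatenates
spaced = first + " and " + second   # concatenation adds no spaces on its own

print(number_sum)
print(joined)
print(spaced)
```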

Operators are special symbols in Python used to perform operations on variables and values in expressions. These operations can be arithmetic, comparison, logical, assignment, and so on. Python has the following types of operators:

Arithmetic Operators: These operators are used for mathematical computations like addition, subtraction, multiplication, etc. They generally operate on numerical values and return a numerical value.

Arithmetic Operators Example
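
As a quick sketch, here are Python's arithmetic operators applied to two sample values, a and b, chosen only for illustration.

```python
a = 7
b = 2

print(a + b)    # addition: 9
print(a - b)    # subtraction: 5
print(a * b)    # multiplication: 14
print(a / b)    # division: 3.5 (the result of / is always a float)
print(a // b)   # floor division: 3
print(a % b)    # modulus, the remainder of the division: 1
print(a ** b)   # exponentiation: 49
```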

Comparison Operators: These operators are used for comparing values. They return a boolean value, either True or False.

Comparison operators example
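
The sketch below tries the comparison operators on two sample values; x and y are arbitrary names used only for this example.

```python
x = 5
y = 8

print(x == y)   # equal to: False
print(x != y)   # not equal to: True
print(x < y)    # less than: True
print(x > y)    # greater than: False
print(x <= y)   # less than or equal to: True
print(x >= y)   # greater than or equal to: False
```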

Logical Operators: These operators are used to perform logical and, or, and not operations. They operate on boolean values and return a boolean value.

Logical operators example
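
Here is a small sketch of the three logical operators. The variable names are hypothetical and chosen to make the conditions readable.

```python
is_adult = True
has_ticket = False

print(is_adult and has_ticket)   # and: True only if both sides are True
print(is_adult or has_ticket)    # or: True if at least one side is True
print(not has_ticket)            # not: flips True to False and vice versa

# Logical operators are often combined with comparisons.
age = 23
working_age = age > 18 and age < 65
print(working_age)
```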

The Assignment Operator: In Python, = is used as the assignment operator. It’s used to assign values to variables. For example, x = 2 will assign the value 2 to the variable x.
The assignment operator can be combined with different arithmetic operators. For example, x += 2 is the same as x = x + 2.

Assignment operator example with compound assignment.
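
The sketch below follows a single variable through several compound assignments; each comment shows the value of x after that line runs.

```python
x = 2     # plain assignment: x is 2
x += 2    # same as x = x + 2, so x is 4
x *= 5    # same as x = x * 5, so x is 20
x -= 8    # same as x = x - 8, so x is 12
x /= 4    # same as x = x / 4, so x is 3.0 (division gives a float)

print(x)
```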

NOTE: It’s important to keep in mind that == is a comparison operator and is used as an equality check while = is the assignment operator used to assign values to variables.

Operator Precedence: The order in which operators are evaluated in an expression is called operator precedence. As in mathematics, the exponent operator, **, is evaluated first in Python. The *, /, //, and % operators are evaluated next, from left to right, followed by the + and - operators, which are also evaluated left to right. The evaluation order can be changed with parentheses (), which have the highest precedence.
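
To make the precedence rules concrete, the sketch below evaluates a few expressions; the single-letter variable names are only there to label each result.

```python
a = 2 + 3 * 4      # * runs before +, so this is 2 + 12 = 14
b = (2 + 3) * 4    # parentheses run first, so this is 5 * 4 = 20
c = 2 * 3 ** 2     # ** runs before *, so this is 2 * 9 = 18
d = 10 - 4 - 3     # same precedence, left to right: (10 - 4) - 3 = 3
e = 20 / 5 * 2     # same precedence, left to right: (20 / 5) * 2 = 8.0

print(a, b, c, d, e)
```

When in doubt, adding parentheses makes the intended order explicit for both Python and the reader.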

There are other operators in Python as well, such as the bitwise, identity, and membership operators. For more details, check out Operators in Python.

Comments are parts of the code that the Python interpreter ignores during execution; they are meant for humans. Writing clear and informative comments is one of the best coding practices. It not only helps you remember why you did what you did but also helps other developers understand your code better.

In Python, the # symbol denotes the beginning of a comment. Anything after the # symbol on the same line is treated as a comment by the interpreter.

# This is a comment
a = 2 + 2 # + 3 Anything coming after the # symbol on the same line is ignored.
# In the above example, a evaluates to 4

Python comes with some standard functions that help us take input from the user and display some output on the screen. One of these has been used in a number of examples in this tutorial. Can you guess the function we’re referring to?

Yes, it is the print function.

These built-in functions, namely input() and print(), are widely used for standard input and standard output operations respectively.

Input: The input() function is used to obtain input from the user. Whenever a call is made to the input() function, the program execution is paused to allow the user to type in an input. After the user presses the enter key, all the characters typed are returned as a string.

Output: The print() function is used to display output on the standard output device, for example, your screen. We can also print the output to a file.

input and print function example
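
The sketch below illustrates two points from the text: input() always returns a string, and print() can send output somewhere other than the screen. To keep the example runnable without waiting for keyboard input, the variable typed stands in for what a call such as input("Your age: ") would return, and an in-memory io.StringIO object stands in for a real file. Both are assumptions made for this illustration.

```python
import io

# typed plays the role of the string returned by input("Your age: ").
typed = "23"
age_next_year = int(typed) + 1    # convert the string before doing arithmetic
print("Next year you will be", age_next_year)

# print() separates its arguments with a space by default; sep changes that.
print("2024", "01", "15", sep="-")

# The file argument redirects the output away from the screen.
buffer = io.StringIO()            # stand-in for a file opened for writing
print("Hello, file!", file=buffer)
saved = buffer.getvalue()
print(repr(saved))
```

In a notebook, replacing the first assignment with typed = input("Your age: ") pauses the cell until you type a value and press the enter key.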

Now that we’ve covered the basics, it’s time to get your hands dirty. Using the concepts covered so far, try writing a program yourself. It can be something as simple as printing “Hello, World!” or something totally different. With the input/output functions covered, you can make your program dynamic as well.

Here’s an example of a program that asks the user for their name and displays a dynamic greeting.

# Greet the user
# Read the name of the user using the input function
name = input()
# Display a custom greeting message.
# print() already puts a space between its arguments, so "Hello," needs no trailing space.
print("Hello,", name)

This is the result of running the above code in Jupyter.

Sample Program to greet a user hello in python

With the topics covered in this tutorial, we hope you now understand some of the fundamental building blocks of programming in Python: Expressions, Variables, Data Types, Operators, Comments, and Basic Input/Output. If you’d like to dive deeper, we recommend the following (openly available) resource to supplement the topics covered:

  • Chapter 1 of the book Automate the Boring Stuff with Python. This book is specially designed to have an implementation-first approach. The online version of the book is freely available.

In the next article in this Python for Data Science series, we’ll cover important flow-of-control constructs in Python, like conditionals and loops. Stay curious and keep learning!

