16  Controlling Program Flow with Conditional Statements

NoneThink About It 💭

Have you ever made a decision based on a condition, like deciding whether to buy a coffee depending on how much money is in your bank account? Programming works in much the same way. So, how do we teach a computer to “make decisions” like we do?

As your programs become more complex, they need to be able to respond to different situations — choosing different paths depending on the data. Conditional statements are how Python makes those decisions.

In this chapter, you’ll learn how to:

Let’s explore how conditional logic helps you write smarter, more dynamic code.

NoneVideo 🎥:

Get introduced to conditional statements with this intro video and then read the lesson that follows to learn more and to get some hands-on practice.

By the end of this lesson you will be able to:

  1. Apply basic if statements along with additional multi-branching statements (i.e. if...elif...else).
  2. Create Pythonic versions of switch statements.
  3. Apply vectorized conditional statements on Pandas DataFrames.
Note📓 Follow Along in Colab!

As you read through this chapter, we encourage you to follow along using the companion notebook in Google Colab (or other editor of choice). This interactive notebook lets you run code examples covered in the chapter—and experiment with your own ideas.

👉 Open the Conditional Statements Notebook in Colab.

16.1 Prerequisites

Most of the examples in this lesson use base Python code without any modules; however, we will illustrate some examples of integrating control statements within Pandas. We also illustrate a couple examples that use Numpy.

import pandas as pd
import numpy as np

Also, most of the examples use toy data; however, when illustrating concepts integrated with Pandas we will use the Complete Journey transaction data:

from completejourney_py import get_data

df = get_data()['transactions']
/opt/hostedtoolcache/Python/3.13.5/x64/lib/python3.13/site-packages/completejourney_py/get_data.py:2: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  from pkg_resources import resource_filename

16.2 if statement

The conditional if statement is used to test an expression. Below is some psuedo code illustrating what this looks like. If the test_expression is True, the statement gets executed. But if it’s False, nothing happens.

if test_expression:
    statement

The following is an example that tests if a particular object is negative. Notice that there are no braces or “begin/end” delimiters around the block of code. Python uses the indentation to determine blocks of code. Consequently, you can include multiple lines in the if statement as long as they have the same indentation.

x = -8

if x < 0:
    print('x contains a negative number')
x contains a negative number
# multiple lines in the statement are fine as long as they have the
# same indentation
if x < 0:
    new_value = abs(x)
    print(f'absolute value of x is {new_value}')
absolute value of x is 8

It is possible to write very short if statements on one line. This can be useful in limited situations but as soon as your resulting statement become more verbose it is best practice to switch to a multi-line approach.

# single line approach
if x < 0: print('x contains a negative number')
x contains a negative number

Its helpful to remember that everything in Python has some form of truthiness. In fact, any nonzero number or nonempty object is True. This allows you to evaluate the object directly:

# a conditional statement on an empty object is equivalent to False
empty_list = []
if empty_list:
    print("since empty_list is False this won't exectute")
# a conditional statement on a non-empty object is equivalent to True
non_empty_list = ['not', 'empty']
if non_empty_list:
    print("This list is not empty")
This list is not empty

Python uses and and or operators to evaluate multiple expressions. They always return a single True or False. Moreover, Python will stop evaluating multiple expressions as soon as the result is known.

x = -1
y = 4
if x < 0 or y < 0:
    print('At least one of these objects are less than zero.')
At least one of these objects are less than zero.
if x < 0 and y < 0:
    print('Both x and y or less than zero')

Knowledge check

NoneTry it yourself:

Fill in the following code chunk so that:

  • If month has value 1-9 the file name printed out will be “data/Month-0X.csv”

  • What happens if the month value is 10-12?

    month = 4
    
    if month ________ :
        print(f'data/Month-0{month}.csv')

Try the above yourself and then see my approach here:

16.3 Multiway branching

Multiway branching is when we want to have multiple return statement options based on the input conditions. The general form of multiway branch if statements is as follows.

The elif block is not always necessary. If you want only two output branches then just use if followed by else. However, if you have many branches, you can use as many elif statements as necessary.

if test_1:
  statement_1
elif test2:
  statement_2
else:
  statement_3

The following illustrates with a simple example. Python will perform this code in sequence and execute the statements nested under the first test that is True or the else if all tests are False.

x = 22.50

if 0 <= x < 10:
    print('low')
elif 10 <= x < 20:
    print('medium-low')
elif 20 <= x < 30:
    print('medium')
else:
    print('preferred')
medium

Knowledge check

NoneTry it yourself:

Fill in the following code chunk so that:

  • if month has value 1-9 the file name printed out will be “data/month-0X.csv”

  • if month has value 10-12 the file name printed out will be “data/month-1X.csv”

  • if month is an invalid month number (not 1-12), the result printed out is “Invalid month”

  • test it out for when month equals 6, 10, & 13

    month = 4
    
    if month ________:
        print(f'data/Month-0{month}.csv')
    _____:
        print(f'data/Month-{month}.csv')
    _____:
        print('________')

Try the above yourself and then see my approach here:

16.4 Switch statements

Many other languages have a switch or case statement that allows you to evaluate an expression and return the statement that aligns with the value. For example, in R, the switch statement looks like the following:

choice <- 'ham'

switch(choice,
       'spam'  = 1.25,
       'ham'   = 1.99,
       'eggs'  = 0.99,
       'bacon' = 1.10,
)
## [1] 1.99

Python does not have switch statement but has some handy alternatives. In the most basic approach, you could just use a multiway branch if statement:

choice = 'ham'

if choice == 'spam':
    print(1.25)
elif choice == 'ham':
    print(1.99)
elif choice == 'eggs':
    print(0.99)
elif choice == 'bacon':
    print(1.10)
else:
    print('Bad choice')
1.99

However, this approach is a bit verbose. An efficient alternative is to use a dictionary that provides the same key-value matching as a switch statement.

options = {'spam': 1.25, 'ham': 1.99, 'eggs': 0.99, 'bacon': 1.10}

You can either index this dictionary for the matching key:

options[choice]
1.99

Or, a more trustworthy approach is to use the get() method. This allows you to provide a default response in the case that the key you are looking for is not in the dictionary

Using the get() method allows you to supply a value to provide if there is no matching key (or as in if statements if there are no other conditions that equate to True).

options.get(choice, 'Bad choice')
1.99
choice = 'broccoli'
options.get(choice, 'Bad choice')
'Bad choice'

Dictionaries are good for associating values with keys, but what about the more complicated actions you can code in the statement blocks associated with if statements? Fortunately, dictionaries can also hold functions (both named functions and lambda functions) which can allow you to perform more sophisticated switch-like execution.

Don’t worry, you will learn more about functions in an upcoming lesson.

def produce_revenue(sqft, visits, trend):
    total = 9.91 * sqft * visits * trend
    return round(total, 2)

def frozen_revenue(sqft, visits, trend):
    prod = produce_revenue(sqft, visits, trend)
    total = 3.28 * sqft * visits * trend - prod * .005
    return round(total, 2)

expected_annual_revenue = {
    'produce':    produce_revenue,
    'frozen':     frozen_revenue,
    'pharmacy':   lambda: 16.11 * visits * trend
    }

choice = 'frozen'
expected_annual_revenue.get(choice, 'Bad choice')(sqft=937, visits=465, trend=0.98)
1379372.75

Knowledge check

NoneTry it yourself:

Convert the following multi-branch if-else statement into a dict where you get the month path file with path_files.get(month). In this case, which approach seems more reasonable?

month = 4

if month <= 9:
    print(f'data/Month-0{month}.csv')
elif month >= 10 and month <= 12:
    print(f'data/Month-{month}.csv')
else:
    print('Invalid month')

Try the above yourself and then see my approach here:

16.5 Applying in Pandas

When data mining, we often want to perform conditional statements to not only filter observations, but also to create new variables. For example, say we want to create a new variable that classifies transactions above $10 as “high value” otherwise they are “low value”. There are several methods we can use to perform this but a simple one is to use the apply method:

df['value'] = df['sales_value'].apply(lambda x: 'high value' if x > 10 else 'low value')
df.head()
household_id store_id basket_id product_id quantity sales_value retail_disc coupon_disc coupon_match_disc week transaction_timestamp value
0 900 330 31198570044 1095275 1 0.50 0.00 0.0 0.0 1 2017-01-01 11:53:26 low value
1 900 330 31198570047 9878513 1 0.99 0.10 0.0 0.0 1 2017-01-01 12:10:28 low value
2 1228 406 31198655051 1041453 1 1.43 0.15 0.0 0.0 1 2017-01-01 12:26:30 low value
3 906 319 31198705046 1020156 1 1.50 0.29 0.0 0.0 1 2017-01-01 12:30:27 low value
4 906 319 31198705046 1053875 2 2.78 0.80 0.0 0.0 1 2017-01-01 12:30:27 low value
df.groupby('value').size()
value
high value      45265
low value     1424042
dtype: int64

An alternative, and much faster approach is to use np.where(), which requires numpy to be loaded. np.where has been show to be over 2.5 times faster than apply():

df['value'] = np.where(df['sales_value'] > 10, 'high value', 'low value')
df.head()
household_id store_id basket_id product_id quantity sales_value retail_disc coupon_disc coupon_match_disc week transaction_timestamp value
0 900 330 31198570044 1095275 1 0.50 0.00 0.0 0.0 1 2017-01-01 11:53:26 low value
1 900 330 31198570047 9878513 1 0.99 0.10 0.0 0.0 1 2017-01-01 12:10:28 low value
2 1228 406 31198655051 1041453 1 1.43 0.15 0.0 0.0 1 2017-01-01 12:26:30 low value
3 906 319 31198705046 1020156 1 1.50 0.29 0.0 0.0 1 2017-01-01 12:30:27 low value
4 906 319 31198705046 1053875 2 2.78 0.80 0.0 0.0 1 2017-01-01 12:30:27 low value

As our conditions get more complex, it often becomes useful to create a separate function and use apply. This approach is probably the most legible; however, not always the fastest approach if you are working with significantly large data.

def flag(df):
    if (df['quantity'] > 20) or (df['sales_value'] > 10):
        return 'Large purchase'
    elif (df['quantity'] > 10) or (df['sales_value'] > 5):
        return 'Medium purchase'
    elif (df['quantity'] > 0) or (df['sales_value'] > 0):
        return 'Small purchase'
    else:
        return 'Alternative transaction'

df['purchase_flag'] = df.apply(flag, axis = 1)
df.head()
household_id store_id basket_id product_id quantity sales_value retail_disc coupon_disc coupon_match_disc week transaction_timestamp value purchase_flag
0 900 330 31198570044 1095275 1 0.50 0.00 0.0 0.0 1 2017-01-01 11:53:26 low value Small purchase
1 900 330 31198570047 9878513 1 0.99 0.10 0.0 0.0 1 2017-01-01 12:10:28 low value Small purchase
2 1228 406 31198655051 1041453 1 1.43 0.15 0.0 0.0 1 2017-01-01 12:26:30 low value Small purchase
3 906 319 31198705046 1020156 1 1.50 0.29 0.0 0.0 1 2017-01-01 12:30:27 low value Small purchase
4 906 319 31198705046 1053875 2 2.78 0.80 0.0 0.0 1 2017-01-01 12:30:27 low value Small purchase
df.groupby('purchase_flag').size()
purchase_flag
Alternative transaction       8820
Large purchase               46730
Medium purchase             128361
Small purchase             1285396
dtype: int64
TipVideo 🎥:

Here is a more thorough introduction to the apply method; plus, you’ll also be introduced to the map and applymap methods.

Knowledge check

NoneTry it yourself:

Fill in the blanks below to assign each transaction to a power rating of 1, 2, 3, or 4 based on the sales_value variable:

  • power_rating = 1: if sales_value < 25th percentile
  • power_rating = 2: if sales_value < 50th percentile
  • power_rating = 3: if sales_value < 75th percentile
  • power_rating = 4: if sales_value >= 75th percentile

Hint: use the .quantile(perc_value)

low = df['sales_value'].quantile(____)
med = df['sales_value'].quantile(____) 
hig = df['sales_value'].quantile(____)

def power_rater(df):
   if (df['sales_value'] < _____):
      return ___
   elif (df['sales_value'] < _____):
      return ___
   elif (df['sales_value'] < _____):
      return ___
   else:
      return ___

df['power_rating'] = df.apply(power_rater, axis = 1)

Try the above yourself and then see my approach here:

16.6 Summary

Conditional statements are the foundation for making decisions in Python programs. In this chapter, you learned how to control the flow of your code by selectively executing blocks of logic depending on whether certain conditions are met.

You began by exploring simple if statements and saw how Python uses indentation, not brackets, to define code blocks. From there, you learned how to extend decision logic using elif and else to handle multiple branching paths.

You also saw how to:

  • Use logical operators like and and or to combine conditions
  • Simulate switch-like behavior using dictionaries and the .get() method
  • Embed functions within dictionaries to build more dynamic logic maps

Importantly, you practiced applying conditional logic within Pandas DataFrames using methods like .apply() and np.where()—a key skill when working with real-world data. These tools allow you to classify, flag, or transform rows based on business rules or data thresholds.

Whether you’re categorizing transactions, building smart logic into a script, or adapting decisions based on user input, mastering conditional statements sets the stage for writing more intelligent, flexible programs.

You’re now ready to take on iteration—using loops to automate repeated tasks—covered in the next chapter.

16.7 Exercise: Classifying Transaction Discounts with Conditionals

In this exercise set, you’ll work with the Complete Journey transactions dataset and apply what you’ve learned about conditional statements to classify and analyze retail discount patterns. These tasks will help you get comfortable with conditional logic in Python and how to apply it to Pandas DataFrames.

Use the transactions dataset from the completejourney_py package, and import the necessary libraries as needed:

import pandas as pd
import numpy as np
from completejourney_py import get_data

df = get_data()['transactions']
Tip💡 Need a refresher?

Review how to use np.where(), the .apply() method, and conditional logic blocks with if-elif-else. Try using ChatGPT or Pandas documentation to explore options for vectorized logic and grouping.

Start by calculating the total discount applied to each transaction.

  • Create a new column called total_disc.
  • This should be the sum of retail_disc and coupon_disc.

Next, classify each transaction based on the level of discount received.

  • Create a new column disc_rating using the following logic:

    • 'none': if total_disc == 0
    • 'low': if total_disc > 0 but less than the 25th percentile
    • 'medium': if total_disc ≥ 25th percentile and < 75th percentile
    • 'high': if total_disc ≥ 75th percentile
    • 'other': for any row that doesn’t meet the above conditions

💡 Use .quantile(0.25) and .quantile(0.75) to calculate the thresholds.

  • Use groupby() and size() (or .value_counts()) to determine how many transactions fall into each disc_rating category.
  • Are most transactions high-discount or low-discount?

As an extension, explore whether high-discount transactions are associated with higher sales_value.

  • Use groupby('disc_rating')['sales_value'].mean() to compare average spending by discount group.
  • What patterns do you notice?