# Pandas Series Assertions
Here you'll learn how to use the most commonly used Pandas Series assertion functions. The following functions are covered:
- `assert_pd_series_variable_equals_variable(student_variable_name, expected_outcome_variable_name)`: Checks that the student's Series in `student_variable_name` matches the variable in `expected_outcome_variable_name`.
- `assert_pd_series_variable_equals_csv(student_variable_name, solution_csv_name, read_csv_kwargs=None)`: Checks that the student's Series in `student_variable_name` matches the Series contained in the solution CSV file named `solution_csv_name`.
- `assert_pd_series_variable_equals_pickle(student_variable_name, pickle_name, read_pickle_kwargs=None, series_testing_kwargs=None)`: Checks that the student's Series in `student_variable_name` matches the Series contained in the solution pickle file named `pickle_name`. This is used when we have pivoted data or multi-column data.
Load the `utils.py` file to use the assertion functions.
exec(open('utils.py').read())
import pandas as pd
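The contents of `utils.py` are not shown here. As a rough sketch of how a name-based helper such as `assert_pd_series_variable_equals_variable` could work, assume it looks both names up in a namespace and delegates to `pandas.testing.assert_series_equal`. The function name `sketch_assert_series_variables_equal` and its `namespace` parameter are hypothetical and are not part of `utils.py`.

```python
import pandas as pd
from pandas.testing import assert_series_equal


def sketch_assert_series_variables_equal(student_name, expected_name, namespace):
    """Hypothetical stand-in for the utils.py helper (not its real code).

    Looks up both variables by name in `namespace` (for example, globals())
    and raises AssertionError if the two Series differ.
    """
    assert student_name in namespace, f"Variable '{student_name}' is not defined"
    student = namespace[student_name]
    expected = namespace[expected_name]
    assert isinstance(student, pd.Series), f"'{student_name}' is not a pandas Series"
    # Compares values, index, dtype, and (by default) the Series name.
    assert_series_equal(student, expected)


student_data = pd.Series(['Albert', 'John'])
expected_output = pd.Series(['Albert', 'John'])
sketch_assert_series_variables_equal('student_data', 'expected_output', globals())
```

Passing names rather than objects lets a helper report a clear message when the student has not defined the variable at all.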
# Activities
The following example activities show how to use the assertion functions.
# 1. Create a student series
Use the data below to create a series named `student_data`.
data = ['Albert', 'John', 'Peter', 'James', 'Robert']
student_data = ...
Solution:
data = ['Albert', 'John', 'Peter', 'James', 'Robert']
student_data = pd.Series(data)
Because the expected output is a small series, we use the `assert_pd_series_variable_equals_variable()` function to compare the student's series with the solution.
Assertions:
expected_output = pd.Series(['Albert', 'John', 'Peter', 'James', 'Robert'])
assert_pd_series_variable_equals_variable('student_data', 'expected_output')
Here, we pass the name of the student variable first, then the name of the expected variable. Both are passed as strings.
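To see what a failed check might look like, the sketch below calls `pandas.testing.assert_series_equal` directly on a wrong answer. Whether `utils.py` reports failures in the same way is an assumption; the variable `wrong_answer` is made up for illustration.

```python
import pandas as pd
from pandas.testing import assert_series_equal

expected_output = pd.Series(['Albert', 'John', 'Peter', 'James', 'Robert'])
# A student answer with two names swapped.
wrong_answer = pd.Series(['Albert', 'Peter', 'John', 'James', 'Robert'])

try:
    assert_series_equal(wrong_answer, expected_output)
    matched = True
except AssertionError as err:
    matched = False
    print(err)  # pandas describes which values differ
```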
# 2. Create a prime number series
Create a series named `prime_numbers_series` which contains the first 10,000 prime numbers.
prime_numbers_series = ...
Solution:
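prime_numbers = []
num = 2
while len(prime_numbers) < 10000:
    # Trial division: look for any divisor of num between 2 and num - 1.
    for i in range(2, num):
        if num % i == 0:
            break
    else:
        # The loop finished without a break, so num is prime.
        prime_numbers.append(num)
    num += 1
prime_numbers_series = pd.Series(prime_numbers)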
In the previous example, the expected series was small, so we used the `assert_pd_series_variable_equals_variable()` function. In this example, the expected series is large, so we save it to a CSV file and then use the `assert_pd_series_variable_equals_csv()` function to compare the student's series with the contents of that file.
prime_numbers_series.to_csv('activity_solutions_files/activity_1.csv', index=False)
Assertions:
assert_pd_series_variable_equals_csv('prime_numbers_series', 'activity_1.csv')
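As a minimal sketch of what a CSV-based comparison involves, the example below writes a short Series with `index=False`, reads it back, and compares the two with `pandas.testing.assert_series_equal`. It uses an in-memory buffer instead of a file, and it is an assumption that `utils.py` follows similar steps; the `read_csv_kwargs` parameter presumably forwards options to `pd.read_csv`, which is also an assumption.

```python
import io
import pandas as pd
from pandas.testing import assert_series_equal

prime_numbers_series = pd.Series([2, 3, 5, 7, 11])

# Write the Series without its index, as in the activity.
buffer = io.StringIO()
prime_numbers_series.to_csv(buffer, index=False)
buffer.seek(0)

# read_csv returns a one-column DataFrame; squeeze it back to a Series.
loaded = pd.read_csv(buffer).squeeze("columns")

# The CSV header becomes the Series name, so names are not compared here.
assert_series_equal(prime_numbers_series, loaded, check_names=False)
```

A CSV stores only text, so index labels and dtypes are re-inferred on load. That is fine for a plain list of integers like this one.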
# 3. Create a student series with index and multi-column data
Create a series named `student_data` with the data and index below.
data = {'Name': ['Albert', 'John', 'Peter', 'James', 'Robert'],
        'Age': [25, 30, 35, 40, 45],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']}
index = ['A', 'B', 'C', 'D', 'E']
student_data = ...
Solution:
data = {'Name': ['Albert', 'John', 'Peter', 'James', 'Robert'],
        'Age': [25, 30, 35, 40, 45],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']}
index = ['A', 'B', 'C', 'D', 'E']
student_data = pd.Series(data, index=index)
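When a dict and an explicit `index` are passed to `pd.Series` together, pandas treats the dict keys as labels and keeps only the labels listed in `index`, filling any label without a matching key with NaN. It may be worth printing `student_data` to confirm it holds what the activity intends. A small illustration with made-up labels:

```python
import pandas as pd

# Dict keys act as labels; an explicit index selects from those labels.
s = pd.Series({'a': 1, 'b': 2}, index=['a', 'x'])
print(s)
# 'a' keeps its value; 'x' has no matching key and becomes NaN.
```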
Here, we save the expected series to a pickle file and then use the `assert_pd_series_variable_equals_pickle()` function to compare the student's series with the contents of that file.
student_data.to_pickle('activity_solutions_files/activity_3.pkl')
Assertions:
assert_pd_series_variable_equals_pickle('student_data', 'activity_3.pkl')
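The sketch below illustrates why a pickle file suits a Series with a custom index: the index labels, dtype, and name survive the round trip unchanged. The file name `example.pkl` and the sample values are hypothetical, and it is an assumption that `utils.py` loads the file with `pd.read_pickle` in a similar way.

```python
import os
import tempfile
import pandas as pd
from pandas.testing import assert_series_equal

student_data = pd.Series([25, 30, 35], index=['A', 'B', 'C'], name='Age')

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example.pkl')  # hypothetical file name
    student_data.to_pickle(path)
    loaded = pd.read_pickle(path)

# Index labels, dtype, and name all survive the round trip.
assert_series_equal(student_data, loaded)
```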