Demonstratring a problem solving process

In these notes I am going to demonstrate the process I use to solve problems, including outlining my work and constructing functions to help with problem decomposition.

The problem we are going to solve is comparable in difficulty to third problem on the midterm exam. Since that problem gave a number of students trouble, I thought it would be useful for me to take you through the process I use to solve problems of this type.

The problem

This problem centers around a text file containing sales data for a chain of shops. Each line in the data file takes the form

<store number> <gross sales for one week>

The sales data in the file covers multiple weeks and multiple stores. The data appears in no particular order.

The problem we would like to solve is to compute average sales for each of the stores in the data set and then find and report which of the stores had the highest level of average daily sales.

Outlining the solution

The first step I take in solving a problem is to construct an outline of the major steps needed to solve the problem. I write this outline in the form of a set of comments. These comments then also will serve as headings for the major parts of the program.

Here is the initial set of comments I wrote.

# Read the data from the text file
# Construct a list of distinct store numbers that appear in the data file
# Compute the average sales for all of the stores, and then determine which is best
# Print the results

Reading the data

We have seen many examples in this course of programs that read data from a text file and organize that data into an appropriate list. Since this process should be pretty routine for everyone by now, everyone should be able to quickly write out code to do this. The one additional thing you will want to do when reading the data is to also document in a comment how you will be organizing the data you read into a list.

# Read the data from the text file
# Organize the data in the file into a list: each item in the
# list will be a tuple.
# The first item in each tuple will be the store number, and the
# second item will record sales for one week.
data = []
f = open('sales.txt','r')
for line in f.readlines():
    parts = line.split()
    data.append((int(parts[0]),float(parts[1])))
f.close()

Getting a list of store numbers

The second step in the outline above says to extract a list of distinct store numbers from the original data set. This is a sufficiently complex task that we will need to construct a function to help us with this specific subtask.

The first thing I do is to write some code in the main part of the program that simply calls our function to do the work.

# Construct a list of distinct store numbers that appear in the data file
stores = distinct_stores(data)

I then go to work constructing the desired function. One thing I do in constructing the function is to also equip the function with a doc string that documents what the function is meant to do.

def distinct_stores(data):
    """Construct and return a list of the distinct store numbers
       that appear in the data list."""
    result = []
    for store, sales in data:
        # Does this store appear in our list of stores?
        if contains(store,result) == False:
            # If not, append this store to our result list
            result.append(store)
    return result

In the process of writing the code for this function I came across a step that required me to solve another subproblem: given a store number and list of store numbers that I have collected already, can I tell whether or not the new store number appears in the list already.

One approach to solving this problem is to construct another helper function to help me solve that subproblem. Here is the function I wrote for this purpose.

def contains(item,my_list):
    """Does this item appear in my_list?"""
    for thing in my_list:
        if item == thing:
            return True
    return False

Another possible approach is to look around for a built-in feature in the Python language to help us with the problem of determining whether or not some number already appears in a list. It turns out that there is such a language feature, the in operator.

for store, sales in data:
    # Does this store appear in our list of stores?
    if store not in result:
        # If not, append this store to our result list
     result.append(store)

Computing average sales and searching for the best

The final part of the problem involves computing average sales for each of the stores in the list of stores and then finding the store with the highest sales.

Searching a list of items to find the "best" item is a task that we have performed several times in this course. Everyone should be able to sketch out a loop to do such a search.

Here is the code I wrote to do the search:

# Compute the average sales for all of the stores, and then determine which is best
best_sales = 0
best_store = -1
for store in stores:
    average_sales = sales(store,data)
    if average_sales > best_sales:
        best_sales = average_sales
        best_store = store

Note that this code required me to solve a subproblem: given a store number, can I compute the average sales for that store? Once again, the approach I used is to break this subproblem off as work for a helper function. Here is the code for the function that does this computation.

def sales(store,data):
    """Compute and return the averages sales for the store whose
       store number is given by the store parameter."""
    sum = 0
    count = 0
    for store_number, sales in data:
        if store_number == store:
            sum += sales
            count += 1
    return sum/count

Computing an average is also something that we have done several times in this course. We know that computing an average involves adding up a set of values and also keeping a count of how many items we added up.

The final result

Here now is the code for the finished program.

def contains(item,my_list):
    """Does this item appear in my_list?"""
    for thing in my_list:
        if item == thing:
            return True
    return False

def distinct_stores(data):
    """Construct and return a list of the distinct store numbers
       that appear in the data list."""
    result = []
    for store, sales in data:
        # Does this store appear in our list of stores?
        if contains(store,result) == False:
            # If not, append this store to our result list
            result.append(store)
    return result

def sales(store,data):
    """Compute and return the averages sales for the store whose
       store number is given by the store parameter."""
    sum = 0
    count = 0
    for store_number, sales in data:
        if store_number == store:
            sum += sales
            count += 1
    return sum/count

# Read the data from the text file
# Organize the data in the file into a list: each item in the list will be a tuple.
# The first item in each tuple will be the store number, and the second item will
# record sales for one week.
data = []
f = open('sales.txt','r')
for line in f.readlines():
    parts = line.split()
    data.append((int(parts[0]),float(parts[1])))
f.close()

# Construct a list of distinct store numbers that appear in the data file
stores = distinct_stores(data)

# Compute the average sales for all of the stores, and then determine which is best
best_sales = 0
best_store = -1
for store in stores:
    average_sales = sales(store,data)
    if average_sales > best_sales:
        best_sales = average_sales
        best_store = store

# Print the results
print("The best store is store number ",best_store)
print("It had average sales of ",best_sales)