\[ [Index](index.md) | [Exercise 4.4](ex4_4.md) | [Exercise 5.2](ex5_2.md) \]

# Exercise 5.1

*Objectives:*

- Explore a few definitional aspects of functions/methods
- Make functions more flexible
- Add type hints

In [Exercise 2.6](ex2_6.md) you wrote a `reader.py` module that
had a function for reading a CSV into a list of dictionaries.  For example:

```python
>>> import reader
>>> port = reader.read_csv_as_dicts('Data/portfolio.csv', [str,int,float])
>>>
```

We later expanded that code to work with instances in
[Exercise 3.3](ex3_3.md):

```python
>>> import reader
>>> from stock import Stock
>>> port = reader.read_csv_as_instances('Data/portfolio.csv', Stock)
>>>
```

Eventually the code was refactored into a collection of classes
involving inheritance in [Exercise 3.7](ex3_7.md).  However,
the code has become rather complex and convoluted.

## (a) Back to Basics

Start by reverting the changes related to class definitions.  Rewrite
the `reader.py` file so that it contains the two basic functions that
you had before you messed it up with classes:

```python
# reader.py

import csv

def read_csv_as_dicts(filename, types):
    '''
    Read CSV data into a list of dictionaries with optional type conversion
    '''
    records = []
    with open(filename) as file:
        rows = csv.reader(file)
        headers = next(rows)
        for row in rows:
            record = { name: func(val) 
                       for name, func, val in zip(headers, types, row) }
            records.append(record)
    return records

def read_csv_as_instances(filename, cls):
    '''
    Read CSV data into a list of instances
    '''
    records = []
    with open(filename) as file:
        rows = csv.reader(file)
        headers = next(rows)       # Skip the header row (not otherwise used)
        for row in rows:
            record = cls.from_row(row)
            records.append(record)
    return records
```

Make sure the code still works as it did before:

```python
>>> import reader
>>> port = reader.read_csv_as_dicts('Data/portfolio.csv', [str, int, float])
>>> port
[{'name': 'AA', 'shares': 100, 'price': 32.2}, {'name': 'IBM', 'shares': 50, 'price': 91.1}, 
 {'name': 'CAT', 'shares': 150, 'price': 83.44}, {'name': 'MSFT', 'shares': 200, 'price': 51.23}, 
 {'name': 'GE', 'shares': 95, 'price': 40.37}, {'name': 'MSFT', 'shares': 50, 'price': 65.1}, 
 {'name': 'IBM', 'shares': 100, 'price': 70.44}]
>>> import stock
>>> port = reader.read_csv_as_instances('Data/portfolio.csv', stock.Stock)
>>> port
[Stock('AA', 100, 32.2), Stock('IBM', 50, 91.1), Stock('CAT', 150, 83.44), 
 Stock('MSFT', 200, 51.23), Stock('GE', 95, 40.37), Stock('MSFT', 50, 65.1), 
 Stock('IBM', 100, 70.44)]
>>>
```

## (b) Thinking about Flexibility

Right now, the two functions in `reader.py` are hard-wired to work
with filenames that are passed directly to `open()`.  Refactor the
code so that it works with any iterable object that produces lines.
To do this, create two new functions `csv_as_dicts(lines, types)` and
`csv_as_instances(lines, cls)` that convert any iterable sequence of
lines.  For example:

```python
>>> file = open('Data/portfolio.csv')
>>> port = reader.csv_as_dicts(file, [str, int, float])
>>> port
[{'name': 'AA', 'shares': 100, 'price': 32.2}, {'name': 'IBM', 'shares': 50, 'price': 91.1}, 
 {'name': 'CAT', 'shares': 150, 'price': 83.44}, {'name': 'MSFT', 'shares': 200, 'price': 51.23}, 
 {'name': 'GE', 'shares': 95, 'price': 40.37}, {'name': 'MSFT', 'shares': 50, 'price': 65.1}, 
 {'name': 'IBM', 'shares': 100, 'price': 70.44}]
>>>
```
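
The refactoring is possible because `csv.reader()` does not need a real file. It accepts any iterable that yields strings, one line at a time. The following short sketch (using made-up sample data, not the course data files) shows three different line sources producing identical rows:

```python
import csv
import io

# Sample data for illustration only
text = 'name,shares,price\nAA,100,32.20\nIBM,50,91.10\n'

# A file-like object, a list of strings, and a generator all work
from_stringio = list(csv.reader(io.StringIO(text)))
from_list = list(csv.reader(text.splitlines()))
from_generator = list(csv.reader(line for line in text.splitlines()))

print(from_list)
```

Any of these sources could therefore be passed as the `lines` argument of `csv_as_dicts()` or `csv_as_instances()`.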

The whole point of doing this is to make it possible to work with different
kinds of input sources.  For example:

```python
>>> import gzip
>>> import stock
>>> file = gzip.open('Data/portfolio.csv.gz', 'rt')
>>> port = reader.csv_as_instances(file, stock.Stock)
>>> port
[Stock('AA', 100, 32.2), Stock('IBM', 50, 91.1), Stock('CAT', 150, 83.44), 
 Stock('MSFT', 200, 51.23), Stock('GE', 95, 40.37), Stock('MSFT', 50, 65.1), 
 Stock('IBM', 100, 70.44)]
>>>
```
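
One detail worth keeping in mind with compressed input: `gzip.open()` defaults to binary mode, so iterating over the file yields `bytes`, whereas `csv.reader()` expects strings. Opening the file in text mode (`'rt'`) avoids the mismatch. Here is a small self-contained sketch that writes its own throwaway `.gz` file (the file name `sample.csv.gz` is just an assumption for the demonstration):

```python
import csv
import gzip
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a small compressed file so the example is self-contained
    path = os.path.join(tmp, 'sample.csv.gz')
    with gzip.open(path, 'wt', newline='') as f:
        f.write('name,shares,price\nAA,100,32.20\n')

    # Default (binary) mode: each line is a bytes object
    with gzip.open(path) as f:
        first = next(iter(f))

    # Text mode: each line is a str, which csv.reader accepts
    with gzip.open(path, 'rt') as f:
        rows = list(csv.reader(f))

print(type(first))
print(rows)
```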

To maintain backwards compatibility with older code, write functions
`read_csv_as_dicts()` and `read_csv_as_instances()` that take a
filename as before.  These functions should call `open()` on the
supplied filename and use the new `csv_as_dicts()` or
`csv_as_instances()` functions on the resulting file.
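
As a rough sketch of that layering (one possible shape, not necessarily the official solution), the filename-based function can be a thin wrapper around the line-based one. The `Holding` class below is a hypothetical stand-in for `stock.Stock`, included only so the example runs on its own:

```python
import csv
import os
import tempfile

def csv_as_instances(lines, cls):
    '''
    Convert an iterable of CSV lines into a list of instances
    '''
    rows = csv.reader(lines)
    next(rows)                       # Skip the header row
    return [cls.from_row(row) for row in rows]

def read_csv_as_instances(filename, cls):
    '''
    Read CSV data from a file into a list of instances
    '''
    with open(filename) as file:
        return csv_as_instances(file, cls)

# Hypothetical stand-in for stock.Stock, just for this demonstration
class Holding:
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

    @classmethod
    def from_row(cls, row):
        return cls(row[0], int(row[1]), float(row[2]))

# Write a small throwaway file so the example is self-contained
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False, newline='') as f:
    f.write('name,shares,price\nAA,100,32.20\nIBM,50,91.10\n')
    path = f.name

holdings = read_csv_as_instances(path, Holding)
os.remove(path)
print([(h.name, h.shares, h.price) for h in holdings])
```

With this arrangement, all of the parsing logic lives in one place and the wrapper only deals with opening and closing the file.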

## (c) Design Challenge: CSV Headers

The code assumes that the first line of CSV data always contains
column headers.  However, this isn't always the case. For example, the
file `Data/portfolio_noheader.csv` contains data, but no column
headers.

How would you refactor the code to accommodate files without column
headers, so that the caller can supply them manually instead?
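
There is more than one reasonable answer. One option, sketched here as an assumption rather than the definitive design, is an optional keyword-only `headers` argument. When it is omitted, the first row is treated as the headers as before:

```python
import csv

def csv_as_dicts(lines, types, *, headers=None):
    '''
    Convert lines of CSV data into a list of dictionaries.
    If headers is None, the first row supplies the column names.
    '''
    rows = csv.reader(lines)
    if headers is None:
        headers = next(rows)
    return [{name: func(val) for name, func, val in zip(headers, types, row)}
            for row in rows]

# Sample header-less data for illustration only
lines = ['AA,100,32.20', 'IBM,50,91.10']
port = csv_as_dicts(lines, [str, int, float],
                    headers=['name', 'shares', 'price'])
print(port)
```

Making `headers` keyword-only keeps existing two-argument calls working and makes the intent obvious at the call site.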

## (d) API Challenge: Type hints

Functions can have optional type hints attached to arguments and return values.
For example:

```python
def add(x:int, y:int) -> int:
    return x + y
```
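
Keep in mind that hints are purely informational as far as the interpreter is concerned. They are stored on the function but are not checked at runtime, as this short experiment shows:

```python
def add(x: int, y: int) -> int:
    return x + y

# The hints are recorded in the function's __annotations__ attribute
hints = add.__annotations__
print(hints)

# Nothing stops a caller from passing other types
result = add('a', 'b')
print(result)
```

Checking is left to external tools such as static type checkers or IDEs.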

The `typing` module has additional classes for expressing more complex kinds of
types including containers.  For example:

```python
from typing import List

def sum_squares(nums: List[int]) -> int:
    total = 0
    for n in nums:
        total += n*n
    return total
```

Your challenge: Modify the code in `reader.py` so that all functions
have type hints.  Try to make the type hints as accurate as possible.
To do this, you may need to consult the documentation for the
[typing module](https://docs.python.org/3/library/typing.html).
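
As a starting point, here is one way the hints for a line-based function might look. This is only a sketch under a few assumptions: the input is treated as any iterable of strings, each converter as a callable taking a string, and the dictionary values as `Any` because they depend on the converters supplied. More precise choices are possible.

```python
import csv
from typing import Any, Callable, Dict, Iterable, List, Sequence

def csv_as_dicts(lines: Iterable[str],
                 types: Sequence[Callable[[str], Any]]) -> List[Dict[str, Any]]:
    '''
    Convert lines of CSV data into a list of dictionaries
    '''
    rows = csv.reader(lines)
    headers = next(rows)
    return [{name: func(val) for name, func, val in zip(headers, types, row)}
            for row in rows]

port = csv_as_dicts(['name,shares', 'AA,100'], [str, int])
print(port)
```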

\[ [Solution](soln5_1.md) | [Index](index.md) | [Exercise 4.4](ex4_4.md) | [Exercise 5.2](ex5_2.md) \]

----
`>>>` Advanced Python Mastery  
`...` A course by [dabeaz](https://www.dabeaz.com)  
`...` Copyright 2007-2023  

![](https://i.creativecommons.org/l/by-sa/4.0/88x31.png). This work is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/)