
\[ [Index](index.md) | [Exercise 4.4](ex4_4.md) | [Exercise 5.2](ex5_2.md) \]

# Exercise 5.1

*Objectives:*

- Explore a few definitional aspects of functions/methods
- Make functions more flexible
- Add type hints

In [Exercise 2.6](ex2_6.md) you wrote a `reader.py` module that
had a function for reading a CSV into a list of dictionaries.  For example:

```python
>>> import reader
>>> port = reader.read_csv_as_dicts('Data/portfolio.csv', [str,int,float])
>>>
```

We later expanded that code to work with instances in
[Exercise 3.3](ex3_3.md):

```python
>>> import reader
>>> from stock import Stock
>>> port = reader.read_csv_as_instances('Data/portfolio.csv', Stock)
>>>
```

Eventually the code was refactored into a collection of classes
involving inheritance in [Exercise 3.7](ex3_7.md).  However,
the code has become rather complex and convoluted.

## (a) Back to Basics

Start by reverting the changes related to class definitions.  Rewrite
the `reader.py` file so that it contains the two basic functions that
you had before you messed it up with classes:

```python
# reader.py

import csv

def read_csv_as_dicts(filename, types):
    '''
    Read CSV data into a list of dictionaries with optional type conversion
    '''
    records = []
    with open(filename) as file:
        rows = csv.reader(file)
        headers = next(rows)
        for row in rows:
            record = { name: func(val) 
                       for name, func, val in zip(headers, types, row) }
            records.append(record)
    return records

def read_csv_as_instances(filename, cls):
    '''
    Read CSV data into a list of instances
    '''
    records = []
    with open(filename) as file:
        rows = csv.reader(file)
        headers = next(rows)        # Skip the header line (not needed by from_row)
        for row in rows:
            record = cls.from_row(row)
            records.append(record)
    return records
```

Make sure the code still works as it did before:

```python
>>> import reader
>>> port = reader.read_csv_as_dicts('Data/portfolio.csv', [str, int, float])
>>> port
[{'name': 'AA', 'shares': 100, 'price': 32.2}, {'name': 'IBM', 'shares': 50, 'price': 91.1}, 
 {'name': 'CAT', 'shares': 150, 'price': 83.44}, {'name': 'MSFT', 'shares': 200, 'price': 51.23}, 
 {'name': 'GE', 'shares': 95, 'price': 40.37}, {'name': 'MSFT', 'shares': 50, 'price': 65.1}, 
 {'name': 'IBM', 'shares': 100, 'price': 70.44}]
>>> import stock
>>> port = reader.read_csv_as_instances('Data/portfolio.csv', stock.Stock)
>>> port
[Stock('AA', 100, 32.2), Stock('IBM', 50, 91.1), Stock('CAT', 150, 83.44), 
 Stock('MSFT', 200, 51.23), Stock('GE', 95, 40.37), Stock('MSFT', 50, 65.1), 
 Stock('IBM', 100, 70.44)]
>>>
```

## (b) Thinking about Flexibility

Right now, the two functions in `reader.py` are hard-wired to work
with filenames that are passed directly to `open()`.  Refactor the
code so that it works with any iterable object that produces lines.
To do this, create two new functions `csv_as_dicts(lines, types)` and
`csv_as_instances(lines, cls)` that convert any iterable sequence of
lines.  For example:

```python
>>> file = open('Data/portfolio.csv')
>>> port = reader.csv_as_dicts(file, [str, int, float])
>>> port
[{'name': 'AA', 'shares': 100, 'price': 32.2}, {'name': 'IBM', 'shares': 50, 'price': 91.1}, 
 {'name': 'CAT', 'shares': 150, 'price': 83.44}, {'name': 'MSFT', 'shares': 200, 'price': 51.23}, 
 {'name': 'GE', 'shares': 95, 'price': 40.37}, {'name': 'MSFT', 'shares': 50, 'price': 65.1}, 
 {'name': 'IBM', 'shares': 100, 'price': 70.44}]
>>>
```
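This refactoring is possible because `csv.reader()` does not need a real file; it accepts any iterable that yields strings. As a small illustration (the sample lines below are made up for the example, not taken from the data file), a plain list of strings works just as well:

```python
import csv

# Any iterable of strings will do; here a plain list stands in for a file.
lines = [
    'name,shares,price',
    'AA,100,32.20',
    'IBM,50,91.10',
]

rows = csv.reader(lines)
headers = next(rows)      # First line: the column names
data = list(rows)         # Remaining lines: each row is a list of strings
```

Note that the values in `data` are still strings at this point; type conversion is a separate step.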

The whole point of doing this is to make it possible to work with different
kinds of input sources.  For example:

```python
>>> import gzip
>>> import stock
>>> file = gzip.open('Data/portfolio.csv.gz', 'rt')    # 'rt' = text mode; csv needs strings, not bytes
>>> port = reader.csv_as_instances(file, stock.Stock)
>>> port
[Stock('AA', 100, 32.2), Stock('IBM', 50, 91.1), Stock('CAT', 150, 83.44), 
 Stock('MSFT', 200, 51.23), Stock('GE', 95, 40.37), Stock('MSFT', 50, 65.1), 
 Stock('IBM', 100, 70.44)]
>>>
```

To maintain backwards compatibility with older code, write functions
`read_csv_as_dicts()` and `read_csv_as_instances()` that take a
filename as before.  These functions should call `open()` on the
supplied filename and use the new `csv_as_dicts()` or
`csv_as_instances()` functions on the resulting file.
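One possible shape for the dictionary half of this refactoring is sketched below. Treat it as an illustration rather than the official solution; the instance version would presumably follow the same pattern.

```python
import csv

def csv_as_dicts(lines, types):
    '''
    Convert an iterable of CSV lines into a list of dictionaries
    '''
    records = []
    rows = csv.reader(lines)
    headers = next(rows)
    for row in rows:
        record = { name: func(val)
                   for name, func, val in zip(headers, types, row) }
        records.append(record)
    return records

def read_csv_as_dicts(filename, types):
    '''
    Read CSV data from a file into a list of dictionaries
    '''
    # The wrapper only opens the file; all parsing is delegated.
    with open(filename) as file:
        return csv_as_dicts(file, types)
```

With this split, the file-handling code lives in one small wrapper and the parsing logic never needs to know where the lines came from.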

## (c) Design Challenge: CSV Headers

The code assumes that the first line of CSV data always contains
column headers.  However, this isn't always the case. For example, the
file `Data/portfolio_noheader.csv` contains data, but no column
headers.

How would you refactor the code to accommodate missing column headers, having
them supplied manually by the caller instead?
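There are several reasonable designs. One option, sketched here under the assumption that a keyword-only `headers` argument is acceptable (the argument name and its default are choices made for this example), is to read the header line only when the caller does not supply one:

```python
import csv

def csv_as_dicts(lines, types, *, headers=None):
    '''
    Convert lines of CSV data into a list of dictionaries.
    If headers is None, the first line is used as the column names.
    '''
    records = []
    rows = csv.reader(lines)
    if headers is None:
        headers = next(rows)
    for row in rows:
        record = { name: func(val)
                   for name, func, val in zip(headers, types, row) }
        records.append(record)
    return records

# Hypothetical sample data: one source with a header line, one without
with_header = ['name,shares,price', 'AA,100,32.20']
no_header = ['AA,100,32.20']

a = csv_as_dicts(with_header, [str, int, float])
b = csv_as_dicts(no_header, [str, int, float],
                 headers=['name', 'shares', 'price'])
```

Making `headers` keyword-only keeps existing two-argument calls working unchanged and forces callers to be explicit when they override the default.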

## (d) API Challenge: Type hints

Functions can have optional type hints attached to arguments and return values.
For example:

```python
def add(x: int, y: int) -> int:
    return x + y
```

The `typing` module has additional classes for expressing more complex kinds of
types including containers.  For example:

```python
from typing import List

def sum_squares(nums: List[int]) -> int:
    total = 0
    for n in nums:
        total += n*n
    return total
```

Your challenge: Modify the code in `reader.py` so that all functions
have type hints.  Try to make the type hints as accurate as possible.
To do this, you may need to consult the documentation for the
[typing module](https://docs.python.org/3/library/typing.html).
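As a starting point, here is one way the hints might look for a line-based dictionary converter. This is only a sketch: the choice of `Iterable[str]` for the input and `Callable[[str], Any]` for the converters is an assumption about how precise you want to be, and other annotations are equally defensible.

```python
import csv
from typing import Any, Callable, Dict, Iterable, List

def csv_as_dicts(lines: Iterable[str],
                 types: List[Callable[[str], Any]]) -> List[Dict[str, Any]]:
    '''
    Convert lines of CSV data into a list of dictionaries
    '''
    records: List[Dict[str, Any]] = []
    rows = csv.reader(lines)
    headers = next(rows)
    for row in rows:
        record = { name: func(val)
                   for name, func, val in zip(headers, types, row) }
        records.append(record)
    return records

# Hints are not enforced at runtime; they are stored on the function object.
hints = csv_as_dicts.__annotations__
```

Annotating `lines` as `Iterable[str]` rather than as a file type documents that any source of text lines is accepted.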

\[ [Solution](soln5_1.md) | [Index](index.md) | [Exercise 4.4](ex4_4.md) | [Exercise 5.2](ex5_2.md) \]

----
`>>>` Advanced Python Mastery  
`...` A course by [dabeaz](https://www.dabeaz.com)  
`...` Copyright 2007-2023  

![](https://i.creativecommons.org/l/by-sa/4.0/88x31.png). This work is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/)