\[ [Index](index.md) | [Exercise 4.4](ex4_4.md) | [Exercise 5.2](ex5_2.md) \]
# Exercise 5.1
*Objectives:*
- Explore a few definitional aspects of functions/methods
- Make functions more flexible
- Type hints
In [Exercise 2.6](ex2_6.md) you wrote a `reader.py` module that
had a function for reading a CSV into a list of dictionaries. For example:
```python
>>> import reader
>>> port = reader.read_csv_as_dicts('Data/portfolio.csv', [str,int,float])
>>>
```
We later expanded that code to work with instances in
[Exercise 3.3](ex3_3.md):
```python
>>> import reader
>>> from stock import Stock
>>> port = reader.read_csv_as_instances('Data/portfolio.csv', Stock)
>>>
```
Eventually the code was refactored into a collection of classes
involving inheritance in [Exercise 3.7](ex3_7.md). However,
the code has become rather complex and convoluted.
## (a) Back to Basics
Start by reverting the changes related to class definitions. Rewrite
the `reader.py` file so that it contains the two basic functions that
you had before you messed it up with classes:
```python
# reader.py

import csv

def read_csv_as_dicts(filename, types):
    '''
    Read CSV data into a list of dictionaries with optional type conversion
    '''
    records = []
    with open(filename) as file:
        rows = csv.reader(file)
        headers = next(rows)
        for row in rows:
            record = { name: func(val)
                       for name, func, val in zip(headers, types, row) }
            records.append(record)
    return records

def read_csv_as_instances(filename, cls):
    '''
    Read CSV data into a list of instances
    '''
    records = []
    with open(filename) as file:
        rows = csv.reader(file)
        headers = next(rows)     # Skip the header line
        for row in rows:
            record = cls.from_row(row)
            records.append(record)
    return records
```
Make sure the code still works as it did before:
```python
>>> import reader
>>> port = reader.read_csv_as_dicts('Data/portfolio.csv', [str, int, float])
>>> port
[{'name': 'AA', 'shares': 100, 'price': 32.2}, {'name': 'IBM', 'shares': 50, 'price': 91.1},
{'name': 'CAT', 'shares': 150, 'price': 83.44}, {'name': 'MSFT', 'shares': 200, 'price': 51.23},
{'name': 'GE', 'shares': 95, 'price': 40.37}, {'name': 'MSFT', 'shares': 50, 'price': 65.1},
{'name': 'IBM', 'shares': 100, 'price': 70.44}]
>>> import stock
>>> port = reader.read_csv_as_instances('Data/portfolio.csv', stock.Stock)
>>> port
[Stock('AA', 100, 32.2), Stock('IBM', 50, 91.1), Stock('CAT', 150, 83.44),
Stock('MSFT', 200, 51.23), Stock('GE', 95, 40.37), Stock('MSFT', 50, 65.1),
Stock('IBM', 100, 70.44)]
>>>
```
## (b) Thinking about Flexibility
Right now, the two functions in `reader.py` are hard-wired to work
with filenames that are passed directly to `open()`. Refactor the
code so that it works with any iterable object that produces lines.
To do this, create two new functions `csv_as_dicts(lines, types)` and
`csv_as_instances(lines, cls)` that convert any iterable sequence of
lines. For example:
```python
>>> file = open('Data/portfolio.csv')
>>> port = reader.csv_as_dicts(file, [str, int, float])
>>> port
[{'name': 'AA', 'shares': 100, 'price': 32.2}, {'name': 'IBM', 'shares': 50, 'price': 91.1},
{'name': 'CAT', 'shares': 150, 'price': 83.44}, {'name': 'MSFT', 'shares': 200, 'price': 51.23},
{'name': 'GE', 'shares': 95, 'price': 40.37}, {'name': 'MSFT', 'shares': 50, 'price': 65.1},
{'name': 'IBM', 'shares': 100, 'price': 70.44}]
>>>
```
The whole point of doing this is to make it possible to work with different
kinds of input sources. For example:
```python
>>> import gzip
>>> import stock
>>> file = gzip.open('Data/portfolio.csv.gz', 'rt')    # 'rt' = text mode; csv needs strings, not bytes
>>> port = reader.csv_as_instances(file, stock.Stock)
>>> port
[Stock('AA', 100, 32.2), Stock('IBM', 50, 91.1), Stock('CAT', 150, 83.44),
Stock('MSFT', 200, 51.23), Stock('GE', 95, 40.37), Stock('MSFT', 50, 65.1),
Stock('IBM', 100, 70.44)]
>>>
```
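This flexibility comes from `csv.reader()` itself, which accepts any iterable that yields strings. As a small sketch (the sample lines below are made up for illustration and are not the contents of `Data/portfolio.csv`), a plain list and an in-memory text stream produce the same rows:

```python
import csv
import io

# Made-up sample data, used only for illustration
lines = ['name,shares,price', 'AA,100,32.20', 'IBM,50,91.10']

# A plain list of strings is a valid input source
rows_from_list = list(csv.reader(lines))

# So is an in-memory text stream
stream = io.StringIO('\n'.join(lines))
rows_from_stream = list(csv.reader(stream))

print(rows_from_list == rows_from_stream)
```

Input sources like these are also convenient for testing, since no file on disk is needed.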
To maintain backwards compatibility with older code, write functions
`read_csv_as_dicts()` and `read_csv_as_instances()` that take a
filename as before. These functions should call `open()` on the
supplied filename and use the new `csv_as_dicts()` or
`csv_as_instances()` functions on the resulting file.
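One possible shape for this split is sketched below for the dictionary case only. It is an illustration under the assumptions stated in this section, not necessarily how the provided solution is written; the instance-based pair would follow the same pattern.

```python
import csv

def csv_as_dicts(lines, types):
    '''
    Convert an iterable of CSV lines into a list of dictionaries
    '''
    records = []
    rows = csv.reader(lines)
    headers = next(rows)
    for row in rows:
        record = {name: func(val)
                  for name, func, val in zip(headers, types, row)}
        records.append(record)
    return records

def read_csv_as_dicts(filename, types):
    '''
    Read a CSV file into a list of dictionaries (filename-based wrapper)
    '''
    with open(filename) as file:
        return csv_as_dicts(file, types)
```

With this arrangement the parsing logic lives in one place, and the filename-based function only handles opening and closing the file.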
## (c) Design Challenge: CSV Headers
The code assumes that the first line of CSV data always contains
column headers. However, this isn't always the case. For example, the
file `Data/portfolio_noheader.csv` contains data, but no column
headers.
How would you refactor the code to accommodate missing column headers, having
them supplied manually by the caller instead?
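One option among several is an optional keyword-only argument. The sketch below assumes a hypothetical `headers` parameter (the name and the keyword-only style are illustrative choices, not something this exercise prescribes). When it is omitted, the first line is consumed as the header row as before.

```python
import csv

def csv_as_dicts(lines, types, *, headers=None):
    '''
    Convert CSV lines into a list of dictionaries.
    `headers` is a hypothetical optional argument for header-less data.
    '''
    rows = csv.reader(lines)
    if headers is None:
        # No headers supplied: take them from the first line
        headers = next(rows)
    return [{name: func(val)
             for name, func, val in zip(headers, types, row)}
            for row in rows]

# Made-up sample lines without a header row
lines = ['AA,100,32.20', 'IBM,50,91.10']
port = csv_as_dicts(lines, [str, int, float],
                    headers=['name', 'shares', 'price'])
```

Making the argument keyword-only keeps existing two-argument calls working unchanged.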
## (d) API Challenge: Type hints
Functions can have optional type hints attached to arguments and return values.
For example:
```python
def add(x: int, y: int) -> int:
    return x + y
```
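It helps to keep in mind that hints are stored as metadata on the function and are not checked by Python when the function is called. A short sketch using the same `add()` function:

```python
def add(x: int, y: int) -> int:
    return x + y

# Hints are recorded in the function's __annotations__ dictionary
hints = add.__annotations__
print(hints)

# Nothing is enforced at call time: strings are accepted here,
# and + simply concatenates them
result = add('a', 'b')
print(result)
```

Checking the hints is the job of separate tools such as static type checkers, not the interpreter.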
The `typing` module has additional classes for expressing more complex kinds of
types including containers. For example:
```python
from typing import List

def sum_squares(nums: List[int]) -> int:
    total = 0
    for n in nums:
        total += n*n
    return total
```
Your challenge: Modify the code in `reader.py` so that all functions
have type hints. Try to make the type hints as accurate as possible.
To do this, you may need to consult the documentation for the
[typing module](https://docs.python.org/3/library/typing.html).
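As a starting point, here is one way a line-based `csv_as_dicts()` function might be annotated. This is a sketch rather than the definitive answer: it assumes each converter is a callable that takes a string, and it uses `Any` for the converted values, which is loose but honest, since the values depend on the converters supplied.

```python
import csv
from typing import Any, Callable, Dict, Iterable, List

def csv_as_dicts(lines: Iterable[str],
                 types: List[Callable[[str], Any]]) -> List[Dict[str, Any]]:
    '''
    Convert an iterable of CSV lines into a list of dictionaries
    '''
    records: List[Dict[str, Any]] = []
    rows = csv.reader(lines)
    headers = next(rows)
    for row in rows:
        record = {name: func(val)
                  for name, func, val in zip(headers, types, row)}
        records.append(record)
    return records
```

Annotating `lines` as `Iterable[str]` documents that files, lists, and other line sources opened in text mode are all acceptable inputs.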
\[ [Solution](soln5_1.md) | [Index](index.md) | [Exercise 4.4](ex4_4.md) | [Exercise 5.2](ex5_2.md) \]
----
`>>>` Advanced Python Mastery
`...` A course by [dabeaz](https://www.dabeaz.com)
`...` Copyright 2007-2023
![](https://i.creativecommons.org/l/by-sa/4.0/88x31.png). This work is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/)