\[ [Index](index.md) | [Exercise 4.4](ex4_4.md) | [Exercise 5.2](ex5_2.md) \] # Exercise 5.1 *Objectives:* - Explore a few definitional aspects of functions/methods - Making functions more flexible - Type hints In [Exercise 2.6](ex2_6.md) you wrote a `reader.py` module that had a function for reading a CSV into a list of dictionaries. For example: ```python >>> import reader >>> port = reader.read_csv_as_dicts('Data/portfolio.csv', [str,int,float]) >>> ``` We later expanded to that code to work with instances in [Exercise 3.3](ex3_3.md): ```python >>> import reader >>> from stock import Stock >>> port = reader.read_csv_as_instances('Data/portfolio.csv', Stock) >>> ``` Eventually the code was refactored into a collection of classes involving inheritance in [Exercise 3.7](ex3_7.md). However, the code has become rather complex and convoluted. ## (a) Back to Basics Start by reverting the changes related to class definitions. Rewrite the `reader.py` file so that it contains the two basic functions that you had before you messed it up with classes: ```python # reader.py import csv def read_csv_as_dicts(filename, types): ''' Read CSV data into a list of dictionaries with optional type conversion ''' records = [] with open(filename) as file: rows = csv.reader(file) headers = next(rows) for row in rows: record = { name: func(val) for name, func, val in zip(headers, types, row) } records.append(record) return records def read_csv_as_instances(filename, cls): ''' Read CSV data into a list of instances ''' records = [] with open(filename) as file: rows = csv.reader(file) headers = next(rows) for row in rows: record = cls.from_row(row) records.append(record) return records ``` Make sure the code still works as it did before: ```python >>> import reader >>> port = reader.read_csv_as_dicts('Data/portfolio.csv', [str, int, float]) >>> port [{'name': 'AA', 'shares': 100, 'price': 32.2}, {'name': 'IBM', 'shares': 50, 'price': 91.1}, {'name': 'CAT', 'shares': 150, 'price': 83.44}, {'name': 'MSFT', 'shares': 200, 'price': 51.23}, {'name': 'GE', 'shares': 95, 'price': 40.37}, {'name': 'MSFT', 'shares': 50, 'price': 65.1}, {'name': 'IBM', 'shares': 100, 'price': 70.44}] >>> import stock >>> port = reader.read_csv_as_instances('Data/portfolio.csv', stock.Stock) >>> port [Stock('AA', 100, 32.2), Stock('IBM', 50, 91.1), Stock('CAT', 150, 83.44), Stock('MSFT', 200, 51.23), Stock('GE', 95, 40.37), Stock('MSFT', 50, 65.1), Stock('IBM', 100, 70.44)] >>> ``` ## (b) Thinking about Flexibility Right now, the two functions in `reader.py` are hard-wired to work with filenames that are passed directly to `open()`. Refactor the code so that it works with any iterable object that produces lines. To do this, create two new functions `csv_as_dicts(lines, types)` and `csv_as_instances(lines, cls)` that convert any iterable sequence of lines. For example: ```python >>> file = open('Data/portfolio.csv') >>> port = reader.csv_as_dicts(file, [str, int, float]) >>> port [{'name': 'AA', 'shares': 100, 'price': 32.2}, {'name': 'IBM', 'shares': 50, 'price': 91.1}, {'name': 'CAT', 'shares': 150, 'price': 83.44}, {'name': 'MSFT', 'shares': 200, 'price': 51.23}, {'name': 'GE', 'shares': 95, 'price': 40.37}, {'name': 'MSFT', 'shares': 50, 'price': 65.1}, {'name': 'IBM', 'shares': 100, 'price': 70.44}] >>> ``` The whole point of doing this is to make it possible to work with different kinds of input sources. For example: ```python >>> import gzip >>> import stock >>> file = gzip.open('Data/portfolio.csv.gz') >>> port = reader.csv_as_instances(file, stock.Stock) >>> port [Stock('AA', 100, 32.2), Stock('IBM', 50, 91.1), Stock('CAT', 150, 83.44), Stock('MSFT', 200, 51.23), Stock('GE', 95, 40.37), Stock('MSFT', 50, 65.1), Stock('IBM', 100, 70.44)] >>> ``` To maintain backwards compatibility with older code, write functions `read_csv_as_dicts()` and `read_csv_as_instances()` that take a filename as before. These functions should call `open()` on the supplied filename and use the new `csv_as_dicts()` or `csv_as_instances()` functions on the resulting file. ## (c) Design Challenge: CSV Headers The code assumes that the first line of CSV data always contains column headers. However, this isn't always the case. For example, the file `Data/portfolio_noheader.csv` contains data, but no column headers. How would you refactor the code to accommodate missing column headers, having them supplied manually by the caller instead? ## (d) API Challenge: Type hints Functions can have optional type-hints attached to arguments and return values. For example: ```python def add(x:int, y:int) -> int: return x + y ``` The `typing` module has additional classes for expressing more complex kinds of types including containers. For example: ```python from typing import List def sum_squares(nums: List[int]) -> int: total = 0 for n in nums: total += n*n return total ``` Your challenge: Modify the code in `reader.py` so that all functions have type hints. Try to make the type-hints as accurate as possible. To do this, you may need to consult the documentation for the [typing module](https://docs.python.org/3/library/typing.html). \[ [Solution](soln5_1.md) | [Index](index.md) | [Exercise 4.4](ex4_4.md) | [Exercise 5.2](ex5_2.md) \] ---- `>>>` Advanced Python Mastery `...` A course by [dabeaz](https://www.dabeaz.com) `...` Copyright 2007-2023 ![](https://i.creativecommons.org/l/by-sa/4.0/88x31.png). This work is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/)