# Julia for Data Analysis
## Bogumił Kamiński, Daniel Kaszyński
# Chapter 7
# Problems
### Exercise 1
Random.org provides a service that returns random numbers. One way to use it
is to send HTTP GET requests. Here is an example request:
> https://www.random.org/integers/?num=10&min=1&max=6&col=1&base=10&format=plain&rnd=new
If you want to understand all the parameters, see their description
[here](https://www.random.org/clients/http/).
For us it is enough that this request generates 10 random integers in the range
from 1 to 6. Run this query in Julia and parse the result.
<details>
<summary>Solution</summary>
Example run:
```
julia> using HTTP
julia> response = HTTP.get("https://www.random.org/integers/?\
num=10&min=1&max=6&col=1&base=10&format=plain&rnd=new");
julia> parse.(Int, split(String(response.body)))
10-element Vector{Int64}:
6
2
6
3
4
2
5
2
3
6
```
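
The parsing step can also be tried without a network call. The sketch below assumes a hypothetical `sample_body` string that mimics the plain-text response (one integer per line); it is not actual output from Random.org.

```julia
# `sample_body` is a made-up stand-in for `String(response.body)`
sample_body = "6\n2\n6\n3\n"

# `split` without a delimiter splits on any whitespace and drops empty pieces,
# so the trailing newline does not produce an empty string
rolls = parse.(Int, split(sample_body))
```

Because `split` discards the empty trailing piece, no extra cleanup is needed before calling `parse`.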
</details>
### Exercise 2
Write a function that tries to parse a string as an integer.
If it succeeds, it should return the integer; otherwise it should print
an error message and return `0`.
<details>
<summary>Solution</summary>
Example function:
```
function str2int(s::AbstractString)
    try
        return parse(Int, s)
    catch e
        println(e)
    end
    return 0
end
```
Let us check it:
```
julia> str2int("10")
10
julia> str2int(" -1 ")
-1
julia> str2int("12345678901234567890")
OverflowError("overflow parsing \"12345678901234567890\"")
0
julia> str2int("1.3")
ArgumentError("invalid base 10 digit '.' in \"1.3\"")
0
julia> str2int("a")
ArgumentError("invalid base 10 digit 'a' in \"a\"")
0
```
An alternative solution would use `tryparse` (not covered in the book):
```
function str2int(s::AbstractString)
    v = tryparse(Int, s)
    if isnothing(v)
        println("error while parsing")
        return 0
    end
    return v
end
```
But this time we do not see the cause of the error.
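
If a more readable message is wanted, one option is to format the caught exception with `showerror`. This is only a sketch: the name `str2int_verbose` and the wording of the message are illustrative choices, not part of the exercise.

```julia
# Hypothetical variant of `str2int` that prints a human-readable message
function str2int_verbose(s::AbstractString)
    try
        return parse(Int, s)
    catch e
        # `sprint(showerror, e)` renders the exception as text,
        # similar to how the REPL would display it
        println(stderr, "could not parse ", repr(s), ": ", sprint(showerror, e))
        return 0
    end
end

ok = str2int_verbose("10")
bad = str2int_verbose("1.3")
```

Writing to `stderr` keeps the diagnostic separate from regular program output, which may or may not be what you want in a given script.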
</details>
### Exercise 3
Create a matrix containing the truth table for the `&&` operation, including
`missing`. If an operation throws an error, store `"error"` in the table. As an
extra feature (this is harder, so you can skip it), store both the inputs and
the output in each cell to make the table easier to read.
<details>
<summary>Solution</summary>
```
julia> function apply_and(x, y)
           try
               return "$x && $y = $(x && y)"
           catch e
               return "$x && $y = error"
           end
       end
apply_and (generic function with 1 method)
julia> apply_and.([true, false, missing], [true false missing])
3×3 Matrix{String}:
"true && true = true" "true && false = false" "true && missing = missing"
"false && true = false" "false && false = false" "false && missing = false"
"missing && true = error" "missing && false = error" "missing && missing = error"
```
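
For comparison, the non-short-circuiting `&` operator follows three-valued logic and accepts `missing` on either side, so the analogous table can be built without `try`/`catch`. A minimal sketch (the names `vals` and `and_table` are only for illustration):

```julia
vals = [true, false, missing]

# Broadcasting a column vector against a row gives the full 3×3 table;
# rows are the left operand, columns are the right operand
and_table = vals .& permutedims(vals)
```

Here `missing & false` evaluates to `false` and `missing & true` to `missing`, whereas `&&` with `missing` as the left operand throws an error because it requires a `Bool` condition.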
</details>
### Exercise 4
Take a vector `v = [1.5, 2.5, missing, 4.5, 5.5, missing]` and replace all
missing values in it by the mean of the non-missing values.
<details>
<summary>Solution</summary>
```
julia> using Statistics
julia> coalesce.(v, mean(skipmissing(v)))
6-element Vector{Float64}:
1.5
2.5
3.5
4.5
5.5
3.5
```
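
Another way to express the same idea is `replace` with a `missing => value` pair. The sketch below is self-contained and repeats the definition of `v` from the exercise; `v_mean` and `filled` are names chosen here for illustration.

```julia
using Statistics

v = [1.5, 2.5, missing, 4.5, 5.5, missing]

# Mean of the observed values only
v_mean = mean(skipmissing(v))

# Return a new vector in which every `missing` is swapped for the mean
filled = replace(v, missing => v_mean)
```

Both approaches leave `v` itself unchanged; use `replace!` only if the element type of the vector can hold the replacement value and in-place modification is intended.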
</details>
### Exercise 5
Take a vector `s = ["1.5", "2.5", missing, "4.5", "5.5", missing]` and parse
strings stored in it as `Float64`, while keeping `missing` values unchanged.
<details>
<summary>Solution</summary>
```
julia> using Missings
julia> passmissing(parse).(Float64, s)
6-element Vector{Union{Missing, Float64}}:
1.5
2.5
missing
4.5
5.5
missing
```
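
If you prefer not to depend on Missings.jl, roughly the same result can be obtained with a comprehension that checks for `missing` explicitly. This is a sketch of an alternative, with `parsed` as an illustrative name:

```julia
s = ["1.5", "2.5", missing, "4.5", "5.5", missing]

# Pass `missing` through untouched and parse everything else
parsed = [ismissing(x) ? missing : parse(Float64, x) for x in s]
```

The `passmissing` wrapper is more convenient when the same function has to be applied in several places, since it avoids repeating the `ismissing` check.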
</details>
### Exercise 6
Print to the terminal all days in January 2023 that are Mondays.
<details>
<summary>Solution</summary>
Example:
```
julia> using Dates
julia> for day in Date.(2023, 01, 1:31)
           dayofweek(day) == 1 && println(day)
       end
2023-01-02
2023-01-09
2023-01-16
2023-01-23
2023-01-30
```
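
The same selection can be written with a date range and `filter`, which also gives you the Mondays as a vector for later use. A sketch, assuming the illustrative names `january` and `mondays`:

```julia
using Dates

# A range of dates with a one-day step covering January 2023
january = Date(2023, 1, 1):Day(1):Date(2023, 1, 31)

# `Dates.Monday` is the constant 1, matching what `dayofweek` returns for Mondays
mondays = filter(d -> dayofweek(d) == Dates.Monday, january)

foreach(println, mondays)
```

Using the named constant instead of the literal `1` makes the intent of the comparison easier to read.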
</details>
### Exercise 7
Compute the dates that are one month later than January 15, 2020, February 15,
2020, March 15, 2020, and April 15, 2020. How many days pass during each of
these one-month periods? Print the results to the screen.
<details>
<summary>Solution</summary>
Example:
```
julia> using Dates
julia> for day in Date.(2020, 1:4, 15)
           day_next = day + Month(1)
           println("$day + 1 month = $day_next (difference: $(day_next - day))")
       end
2020-01-15 + 1 month = 2020-02-15 (difference: 31 days)
2020-02-15 + 1 month = 2020-03-15 (difference: 29 days)
2020-03-15 + 1 month = 2020-04-15 (difference: 31 days)
2020-04-15 + 1 month = 2020-05-15 (difference: 30 days)
```
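
Adding `Month(1)` keeps the day of the month when possible, but if the target month is shorter the result is clamped to its last day. The sketch below illustrates this with January 31, 2020, a date that is not part of the exercise and is used here only as an example.

```julia
using Dates

d = Date(2020, 1, 31)

# February 2020 has no 31st, so the result is the last day of February
d_next = d + Month(1)

# `Dates.value` extracts the plain integer from the `Day` period
elapsed = Dates.value(d_next - d)

println("$d + 1 month = $d_next (difference: $elapsed days)")
```

As a consequence, "one month later" is not a fixed number of days, which is what the exercise is meant to show.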
</details>
### Exercise 8
Parse the following string as JSON:
```
str = """
[{"x":1,"y":1},
{"x":2,"y":4},
{"x":3,"y":9},
{"x":4,"y":16},
{"x":5,"y":25}]
"""
```
into a `json` variable.
<details>
<summary>Solution</summary>
```
julia> using JSON3
julia> json = JSON3.read(str)
5-element JSON3.Array{JSON3.Object, Base.CodeUnits{UInt8, String}, Vector{UInt64}}:
{
"x": 1,
"y": 1
}
{
"x": 2,
"y": 4
}
{
"x": 3,
"y": 9
}
{
"x": 4,
"y": 16
}
{
"x": 5,
"y": 25
}
```
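
Individual values can then be read by indexing the array and accessing fields of each object. The following is a minimal sketch on a shortened input; `short_str` and the other names are illustrative, and it assumes JSON3.jl is installed.

```julia
using JSON3

# Shortened version of the string from the exercise
short_str = """[{"x":1,"y":1},{"x":2,"y":4}]"""
short_json = JSON3.read(short_str)

n = length(short_json)          # number of objects in the array
first_x = short_json[1].x       # property-style access
second_y = short_json[2][:y]    # access by `Symbol` key
```

JSON object keys are exposed as `Symbol`s, which is why `:y` is used for indexing here.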
</details>
### Exercise 9
Extract from the `json` variable from exercise 8 two vectors `x` and `y`
that correspond to the fields stored in the JSON structure.
Plot `y` as a function of `x`.
<details>
<summary>Solution</summary>
```
using Plots
x = [el.x for el in json]
y = [el.y for el in json]
plot(x, y, xlabel="x", ylabel="y", legend=false)
```
</details>
### Exercise 10
Given a vector `m = [missing, 1, missing, 3, missing, missing, 6, missing]`,
use linear interpolation to fill the missing values. For the extreme values,
use the nearest available observation (you will need to consult the Impute.jl
documentation to find all required functions).
<details>
<summary>Solution</summary>
```
julia> using Impute
julia> Impute.nocb!(Impute.locf!(Impute.interp(m)))
8-element Vector{Union{Missing, Int64}}:
1
1
2
3
4
5
6
6
```
Note that we use the `locf!` and `nocb!` functions (with `!`) to perform the
operations in place (a new vector was already allocated by `Impute.interp`).
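
To see what these three steps do, here is a dependency-free sketch of the same idea. The function `fill_linear` is a hypothetical helper written for this illustration, not part of Impute.jl; it assumes a standard 1-based vector with at least one observed value, and it returns `Float64` values rather than integers.

```julia
# Hypothetical helper: linear interpolation between observed values,
# nearest observed value at both ends
function fill_linear(m::AbstractVector)
    known = findall(!ismissing, m)   # sorted indices of observed values
    isempty(known) && throw(ArgumentError("no observed values to interpolate from"))
    out = Vector{Float64}(undef, length(m))
    for i in eachindex(m)
        if i <= first(known)
            out[i] = m[first(known)]         # leading gap: next observation
        elseif i >= last(known)
            out[i] = m[last(known)]          # trailing gap: last observation
        else
            lo = known[searchsortedlast(known, i)]    # nearest observed index <= i
            hi = known[searchsortedfirst(known, i)]   # nearest observed index >= i
            out[i] = lo == hi ? m[i] :
                     m[lo] + (m[hi] - m[lo]) * (i - lo) / (hi - lo)
        end
    end
    return out
end

m = [missing, 1, missing, 3, missing, missing, 6, missing]
filled = fill_linear(m)
```

For real work the Impute.jl functions are preferable, since they handle more input types and options than this sketch.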
</details>