Julia for Data Analysis
Bogumił Kamiński, Daniel Kaszyński
Chapter 7
Problems
Exercise 1
Random.org provides a service that returns random numbers. One way to use it is to send HTTP GET requests. Here is an example request:
https://www.random.org/integers/?num=10&min=1&max=6&col=1&base=10&format=plain&rnd=new
If you want to understand all the parameters, please check their meaning here.
For us it is enough to know that this request generates 10 random integers in the range from 1 to 6. Run this query in Julia and parse the result.
Exercise 2
Write a function that tries to parse a string as an integer. If it
succeeds it should return the integer, otherwise it should return
0 and print an error message.
Exercise 3
Create a matrix containing the truth table for the &&
operation, including missing. If an operation errors, store
"error" in the table. As an extra feature (this is harder,
so you can skip it), store both the inputs and the output in each
cell to make reading the table easier.
Exercise 4
Take a vector v = [1.5, 2.5, missing, 4.5, 5.5, missing]
and replace all missing values in it by the mean of the non-missing
values.
Exercise 5
Take a vector
s = ["1.5", "2.5", missing, "4.5", "5.5", missing] and
parse strings stored in it as Float64, while keeping
missing values unchanged.
Exercise 6
Print to the terminal all days in January 2023 that are Mondays.
Exercise 7
Compute the dates that are one month later than January 15, 2020, February 15, 2020, March 15, 2020, and April 15, 2020. How many days pass during each of these one-month periods? Print the results to the screen.
Exercise 8
Parse the following string as JSON:
str = """
[{"x":1,"y":1},
{"x":2,"y":4},
{"x":3,"y":9},
{"x":4,"y":16},
{"x":5,"y":25}]
"""
into a json variable.
Exercise 9
Extract from the json variable from exercise 8 two
vectors x and y that correspond to the fields
stored in the JSON structure. Plot y as a function of
x.
Exercise 10
Given a vector
m = [missing, 1, missing, 3, missing, missing, 6, missing],
use linear interpolation to fill the missing values. For the extreme
values use the nearest available observation (you will need to consult
the Impute.jl documentation to find all required functions).
Solutions
Exercise 1
Solution (example run):
julia> using HTTP
julia> response = HTTP.get("https://www.random.org/integers/?\
num=10&min=1&max=6&col=1&base=10&format=plain&rnd=new");
julia> parse.(Int, split(String(response.body)))
10-element Vector{Int64}:
6
2
6
3
4
2
5
2
3
6
Exercise 2
Example function:
function str2int(s::AbstractString)
    try
        return parse(Int, s)
    catch e
        println(e)
    end
    return 0
end
Let us check it:
julia> str2int("10")
10
julia> str2int(" -1 ")
-1
julia> str2int("12345678901234567890")
OverflowError("overflow parsing \"12345678901234567890\"")
0
julia> str2int("1.3")
ArgumentError("invalid base 10 digit '.' in \"1.3\"")
0
julia> str2int("a")
ArgumentError("invalid base 10 digit 'a' in \"a\"")
0
An alternative solution would use tryparse (not covered
in the book):
function str2int(s::AbstractString)
    v = tryparse(Int, s)
    if isnothing(v)
        println("error while parsing")
        return 0
    end
    return v
end
But this time we do not see the cause of the error.
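A possible middle ground, sketched here under the assumption that a short hand-written message is sufficient, is to keep tryparse and include the offending input in the message. The name str2int_verbose is hypothetical and used only for illustration.

```julia
# Hypothetical variant of str2int: no exception handling, but the printed
# message still shows which input could not be parsed.
function str2int_verbose(s::AbstractString)
    v = tryparse(Int, s)
    if isnothing(v)
        println("error while parsing: ", repr(s))
        return 0
    end
    return v
end

str2int_verbose("10")    # parses successfully
str2int_verbose("1.3")   # prints a message and falls back to 0
```

This still does not distinguish an overflow from an invalid digit; for that, the try-catch version remains the more informative one.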
Exercise 3
Solution:
julia> function apply_and(x, y)
           try
               return "$x && $y = $(x && y)"
           catch e
               return "$x && $y = error"
           end
       end
apply_and (generic function with 1 method)
julia> apply_and.([true, false, missing], [true false missing])
3×3 Matrix{String}:
 "true && true = true"      "true && false = false"     "true && missing = missing"
 "false && true = false"    "false && false = false"    "false && missing = false"
 "missing && true = error"  "missing && false = error"  "missing && missing = error"
Exercise 4
Solution:
julia> using Statistics
julia> coalesce.(v, mean(skipmissing(v)))
6-element Vector{Float64}:
1.5
2.5
3.5
4.5
5.5
3.5
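The skipmissing call is essential here because mean propagates missing values. The following self-contained sketch shows this and uses replace as one possible alternative to coalesce; the variable name filled is only illustrative.

```julia
using Statistics

v = [1.5, 2.5, missing, 4.5, 5.5, missing]

# mean(v) propagates missing, so the missing entries must be skipped first.
println(mean(v))
m = mean(skipmissing(v))

# replace returns a new vector; v itself is left unchanged.
filled = replace(v, missing => m)
println(filled)
```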
Exercise 5
Solution:
julia> using Missings
julia> passmissing(parse).(Float64, s)
6-element Vector{Union{Missing, Float64}}:
1.5
2.5
missing
4.5
5.5
missing
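If you prefer not to load Missings.jl, the same result can be obtained with a comprehension that checks for missing explicitly. This is only a sketch of one alternative; the variable name parsed is illustrative.

```julia
s = ["1.5", "2.5", missing, "4.5", "5.5", missing]

# Parse only the entries that are not missing; pass missing through as-is.
parsed = [ismissing(x) ? missing : parse(Float64, x) for x in s]
println(parsed)
```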
Exercise 6
Example solution:
julia> using Dates
julia> for day in Date.(2023, 01, 1:31)
           dayofweek(day) == 1 && println(day)
       end
2023-01-02
2023-01-09
2023-01-16
2023-01-23
2023-01-30
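The same selection can also be written without an explicit loop. The sketch below builds a date range and filters it, using the constant Dates.Monday instead of the literal 1; the names january and mondays are illustrative.

```julia
using Dates

# All days of January 2023 as a range with a one-day step.
january = Date(2023, 1, 1):Day(1):Date(2023, 1, 31)

# Keep only the days whose day of week is Monday.
mondays = filter(d -> dayofweek(d) == Dates.Monday, january)
foreach(println, mondays)
```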
Exercise 7
Example solution:
julia> for day in Date.(2020, 1:4, 15)
           day_next = day + Month(1)
           println("$day + 1 month = $day_next (difference: $(day_next - day))")
       end
2020-01-15 + 1 month = 2020-02-15 (difference: 31 days)
2020-02-15 + 1 month = 2020-03-15 (difference: 29 days)
2020-03-15 + 1 month = 2020-04-15 (difference: 31 days)
2020-04-15 + 1 month = 2020-05-15 (difference: 30 days)
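The differences vary because Month(1) is a calendar period rather than a fixed number of days. The short sketch below illustrates this with dates from 2020, including the case where the target month is shorter than the starting day of month; the variable names d1 and d2 are illustrative.

```julia
using Dates

# Adding Month(1) keeps the day of month when the target month has that day.
d1 = Date(2020, 2, 15) + Month(1)

# Otherwise the result falls back to the last day of the target month
# (February 2020 has 29 days because 2020 is a leap year).
d2 = Date(2020, 1, 31) + Month(1)

println(d1, " ", d2)
println(d2 - Date(2020, 1, 31))   # elapsed time as a Day period
```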
Exercise 8
Solution:
julia> using JSON3
julia> json = JSON3.read(str)
5-element JSON3.Array{JSON3.Object, Base.CodeUnits{UInt8, String}, Vector{UInt64}}:
{
"x": 1,
"y": 1
}
{
"x": 2,
"y": 4
}
{
"x": 3,
"y": 9
}
{
"x": 4,
"y": 16
}
{
"x": 5,
"y": 25
}
Exercise 9
Solution:
using Plots
x = [el.x for el in json]
y = [el.y for el in json]
plot(x, y, xlabel="x", ylabel="y", legend=false)
Exercise 10
Solution:
julia> using Impute
julia> Impute.nocb!(Impute.locf!(Impute.interp(m)))
8-element Vector{Union{Missing, Int64}}:
1
1
2
3
4
5
6
6
Note that we use the locf! and nocb!
functions (with !) to perform the operations in place (a new
vector was already allocated by Impute.interp).
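To make the three steps concrete, here is a base-Julia sketch of the same idea that does not use Impute.jl. The function fill_linear is hypothetical, assumes at least one non-missing entry, and returns Float64 values rather than the integers shown above.

```julia
# Hypothetical helper: linear interpolation between observed values and
# nearest observation at both ends of the vector.
function fill_linear(m::Vector)
    idx = findall(!ismissing, m)              # positions of observed values
    isempty(idx) && throw(ArgumentError("m has no observed values"))
    out = Vector{Float64}(undef, length(m))
    out[1:first(idx)] .= m[first(idx)]        # leading gap: next observation
    out[last(idx):end] .= m[last(idx)]        # trailing gap: last observation
    # Interpolate on each segment between two neighboring observations.
    for (lo, hi) in zip(idx[1:end-1], idx[2:end])
        for i in lo:hi
            out[i] = m[lo] + (m[hi] - m[lo]) * (i - lo) / (hi - lo)
        end
    end
    return out
end

m = [missing, 1, missing, 3, missing, missing, 6, missing]
println(fill_linear(m))
```

The leading and trailing assignments play the role of nocb! and locf!, while the inner loop corresponds to the interpolation step.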