behinger (s-ccs 001) 2023-10-09 14:14:56 +00:00
commit 20c4a1e58b
3 changed files with 380 additions and 456 deletions

View File

@@ -34,11 +34,11 @@ slideOptions:
# Learning Goals # Learning Goals
- Justify the effort of developing tests to some extent
- Get to know a few common terms of testing - Get to know a few common terms of testing
- Work with the Julia unit testing package `Test.jl` - Work with the Julia unit testing package `Test.jl`
Material is taken and modified, on the one hand, from the [SSE lecture](https://github.com/Simulation-Software-Engineering/Lecture-Material), which builds partly on the [py-rse book](https://merely-useful.tech/py-rse), and, on the other hand, from the [Test.jl docs](https://docs.julialang.org/en/v1/stdlib/Test/). Material is taken and modified from the [SSE lecture](https://github.com/Simulation-Software-Engineering/Lecture-Material), which builds partly on the [py-rse book](https://merely-useful.tech/py-rse), and from the [Test.jl docs](https://docs.julialang.org/en/v1/stdlib/Test/).
--- ---
@@ -48,30 +48,21 @@ Material is taken and modified, on the one hand, from the [SSE lecture](https://
## What is Testing? ## What is Testing?
- Smelling old milk before using it! - Smelling old milk before using it
- A way to determine if a software is not producing reliable results and if so, what is the reason. - A way to determine if a software is not producing reliable results and if so, what is the reason
- Manual testing vs. automated testing. - Manual testing vs. automated testing
--- ---
## Why Should you Test your Software? ## Why Should you Test your Software?
- Improve software reliability and reproducibility. - Improve software reliability and reproducibility
- Make sure that changes (bugfixes, new features) do not affect other parts of software. - Make sure that changes (bugfixes, new features) do not affect other parts of software
- Generally all software is better off being tested regularly. Possible exceptions are very small codes with single users. - Generally all software is better off being tested regularly. Possible exceptions are very small codes with single users.
- Ensure that a released version of a software actually works. - Ensure that a released version of a software actually works.
--- ---
## Nomenclature in Software Testing
- **Fixture**: preparatory set for testing.
- **Actual result**: what the code produces when given the fixture.
- **Expected result**: what the actual result is compared to.
- **Test coverage**: how much of the code do tests touch in one run.
---
## Some Ways to Test Software ## Some Ways to Test Software
- Assertions - Assertions
@@ -83,29 +74,27 @@ Material is taken and modified, on the one hand, from the [SSE lecture](https://
## Assertions ## Assertions
- Principle of *defensive programming*. ```julia
@assert condition "message"
```
- Principle of *defensive programming*
- Nothing happens when an assertion is true; throws error when false. - Nothing happens when an assertion is true; throws error when false.
- Types of assertion statements: - Types of assertion statements:
- Precondition - Precondition
- Postcondition - Postcondition
- Invariant - Invariant
- A basic but powerful tool to test a software on-the-go. - A basic but powerful tool to test a software on-the-go
- Assertion statement syntax in Python
```julia
@assert condition "message"
```
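The three assertion types listed above can be illustrated with a small sketch. The function `largest` is hypothetical and not part of the lecture material; it only shows where a precondition, an invariant, and a postcondition would typically sit.

```julia
# Hypothetical helper, used only to illustrate the three assertion types.
function largest(v::AbstractVector{<:Real})
    # Precondition: checked before any work is done.
    @assert !isempty(v) "input must not be empty"

    best = first(v)
    for x in v
        best = max(best, x)
        # Invariant: must hold in every iteration of the loop.
        @assert best >= x "running maximum fell below an element"
    end

    # Postcondition: checked on the result before returning it.
    @assert best in v "result is not an element of the input"
    return best
end

largest([1.0, 4.0, 2.5])    # passes all assertions
# largest(Float64[])        # would throw an AssertionError (precondition)
```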
--- ---
## Unit Testing ## Unit Testing
- Catching errors with assertions is good but preventing them is better! - Catching errors with assertions is good but preventing them is better.
- A *unit* is a single function in one situation. - A *unit* is a single function in one situation.
- A situation is one amongst many possible variations of input parameters. - A situation is one amongst many possible variations of input parameters.
- User creates the expected result manually. - User creates the **expected result** manually.
- Fixture is the set of inputs used to generate an actual result. - **Actual result** is compared to the expected result by `@test`.
- Actual result is compared to the expected result by `@test`.
--- ---
@@ -113,82 +102,26 @@ Material is taken and modified, on the one hand, from the [SSE lecture](https://
- Test whether several units work in conjunction. - Test whether several units work in conjunction.
- *Integrate* units and test them together in an *integration* test. - *Integrate* units and test them together in an *integration* test.
- Often more complicated than a unit test and has more test coverage. - Often more complicated than a unit test and gives higher test coverage.
- A fixture is used to generate an actual result.
- Actual result is compared to the expected result by `@test`.
--- ---
## Regression Testing ## Regression Testing
- Generating an expected result is not possible in some situations. - Generating an expected result is not possible in some situations.
- Compare the current actual result with a previous actual result. - Compare the *current* actual result with a *previous* actual result.
- No guarantee that the current actual result is correct. - No guarantee that the current actual result is correct.
- Risk of a bug being carried over indefinitely. - Risk of a bug being carried over indefinitely.
- Main purpose is to identify changes in the current state of the code with respect to a past state. - Main purpose is to identify changes in the current state of the code with respect to a past state.
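A minimal sketch of the regression-testing idea, under stated assumptions: `simulate` is a hypothetical stand-in for code whose expected result cannot be derived by hand, and the reference file is written to a temporary directory here only to keep the example self-contained. In practice the reference would come from an earlier, trusted run and be stored alongside the tests.

```julia
using Test

# Hypothetical computation without a hand-derived expected result.
simulate(n) = sum(sin(k)^2 for k in 1:n)

# Assumed location of the stored reference; a temporary directory for this sketch.
reference_file = joinpath(mktempdir(), "reference.txt")

# "Previous" run: record the actual result as the reference.
write(reference_file, string(simulate(100)))

# "Current" run: compare the current actual result with the stored one.
previous = parse(Float64, read(reference_file, String))
@testset "regression" begin
    @test simulate(100) ≈ previous rtol=1e-12
end
```

The test passing only shows that the result has not changed; if the stored reference was already wrong, the bug is carried along.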
--- ---
## Test Coverage
- Coverage is the amount of code a test touches in one run.
- Aim for high test coverage.
- There is a trade-off: high test coverage vs. effort in test development
---
## Comparing Floating-point Variables
- Very often quantities in math software are `float` / `double`.
- Such quantities cannot be compared to exact values, an approximation is necessary.
- Comparison of floating point variables needs to be done to a certain tolerance.
```julia
@test 1 ≈ 0.999999 rtol=1e-5
```
- Get `≈` by Latex `\approx` + TAB
---
## Test-driven Development (TDD)
- Principle is to write a test and then write a code to fulfill the test.
- Advantages:
- In the end user ends up with a test alongside the code.
- Eliminates confirmation bias of the user.
- Writing tests gives clarity on what the code is supposed to do.
- Disadvantage: known to not improve productivity.
---
## Checking-driven Development (CDD)
- Developer performs spot checks; sanity checks at intermediate stages
- Math software often has heuristics which are easy to determine.
- Keep performing same checks at different stages of development to ensure the code works.
---
## Verifying a Test
- Test written as part of a bug-fix:
- Reproduce the bug in the test by ensuring that the test fails.
- Fix the bug.
- Rerun the test to ensure that it passes.
- Test written to increase code coverage:
- Make sure that the first iteration of the test passes.
- Try introducing a small fixable bug in the code to verify if the test fails.
---
# 2. Unit Testing in Julia with Test.jl # 2. Unit Testing in Julia with Test.jl
--- ---
## Setup of Tests.jl ## Setup of Test.jl
- Standard library to write and manage tests, `using Test`
- Standardized folder structure: - Standardized folder structure:
``` ```
@@ -204,18 +137,9 @@ Material is taken and modified, on the one hand, from the [SSE lecture](https://
- Singular `test` vs plural `runtests.jl` - Singular `test` vs plural `runtests.jl`
- `setup.jl` for all `using XYZ` statements, included in `runtests.jl` - `setup.jl` for all `using XYZ` statements, included in `runtests.jl`
- Additional packages either in `[extra] section` of `./Project.toml` or in a new `./test/Project.toml` environment - Additional packages in `[extra] section` of `./Project.toml` or in new `./test/Project.toml`
- In case of the latter: Do not add the package itself to the `./test/Project.toml` - In case of the latter: Do not add the package itself to the `./test/Project.toml`
- Run: `]test` when root project is activated
---
## Run Tests
Various options:
- Directly call `runtests.jl` TODO?
- From Pkg-Manager `]test` when root project is activated
--- ---
@@ -234,7 +158,7 @@ Various options:
- `@testset`: Structure tests - `@testset`: Structure tests
```julia ```julia
julia> @testset "trigonometric identities" begin @testset "trigonometric identities" begin
θ = 2/3*π θ = 2/3*π
@test sin(-θ) ≈ -sin(θ) @test sin(-θ) ≈ -sin(θ)
@test cos(-θ) ≈ cos(θ) @test cos(-θ) ≈ cos(θ)
@@ -251,6 +175,8 @@ Various options:
- [HiRSE-Summer of Testing Part 2b: "Testing with Julia" by Nils Niggemann](https://www.youtube.com/watch?v=gSMKNbZOpZU) - [HiRSE-Summer of Testing Part 2b: "Testing with Julia" by Nils Niggemann](https://www.youtube.com/watch?v=gSMKNbZOpZU)
- [Official documentation of Test.jl](https://docs.julialang.org/en/v1/stdlib/Test/) - [Official documentation of Test.jl](https://docs.julialang.org/en/v1/stdlib/Test/)
---
# 3. Test.jl Demo # 3. Test.jl Demo
We use [`MyTestPackage`](https://github.com/s-ccs/summerschool_simtech_2023/tree/main/material/2_tue/testing/MyTestPackage), which looks as follows: We use [`MyTestPackage`](https://github.com/s-ccs/summerschool_simtech_2023/tree/main/material/2_tue/testing/MyTestPackage), which looks as follows:
@@ -274,7 +200,7 @@ We use [`MyTestPackage`](https://github.com/s-ccs/summerschool_simtech_2023/tree
- Look at `MyTestPackage.jl` and `find.jl`: We have two functions `find_max` and `find_mean`, which calculate the maximum and mean of all elements of a `::AbstractVector`. - Look at `MyTestPackage.jl` and `find.jl`: We have two functions `find_max` and `find_mean`, which calculate the maximum and mean of all elements of a `::AbstractVector`.
- Assertions were added to check for `NaN` values - Assertions were added to check for `NaN` values
- Look at `runtests.jl`: - Look at `runtests.jl`:
- TODO: Why do we need `using MyTestPackage`? - Why do we need `using MyTestPackage`?
- We include dependencies via `setup.jl`: `Test` and `StableRNG`. - We include dependencies via `setup.jl`: `Test` and `StableRNG`.
- Testset "find" - Testset "find"
- Look at `find.jl` - Look at `find.jl`
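
A rough sketch of what a "find" testset for this demo might look like. The bodies of `find_max` and `find_mean` below are hypothetical stand-ins written for this example; the real `MyTestPackage` implementations and tests may differ.

```julia
using Test

# Hypothetical stand-ins for the package functions; the real ones may differ.
function find_max(x::AbstractVector)
    @assert !any(isnan, x) "input must not contain NaN"
    return maximum(x)
end

function find_mean(x::AbstractVector)
    @assert !any(isnan, x) "input must not contain NaN"
    return sum(x) / length(x)
end

@testset "find" begin
    x = [1.0, 4.0, 2.5]
    @test find_max(x) == 4.0
    @test find_mean(x) ≈ 2.5                               # floating-point comparison
    @test_throws AssertionError find_max([1.0, NaN])       # NaN check fires
end
```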

View File

@@ -1,34 +1,30 @@
############################################################################
#### Execute code chunks separately in VSCODE by pressing 'Alt + Enter' ####
############################################################################
using Statistics using Statistics
using Plots using Plots
using RDatasets using RDatasets
using GLM using GLM
## #---
trees = dataset("datasets", "trees") trees = dataset("datasets", "trees")
scatter(trees.Girth, trees.Volume, scatter(trees.Girth, trees.Volume,
legend=false, xlabel="Girth", ylabel="Volume") legend=false, xlabel="Girth", ylabel="Volume")
## #---
scatter(trees.Girth, trees.Volume, scatter(trees.Girth, trees.Volume,
legend=false, xlabel="Girth", ylabel="Volume") legend=false, xlabel="Girth", ylabel="Volume")
plot!(x -> -37 + 5*x) plot!(x -> -37 + 5*x)
## #---
linmod1 = lm(@formula(Volume ~ Girth), trees) linmod1 = lm(@formula(Volume ~ Girth), trees)
## #---
linmod2 = lm(@formula(Volume ~ Girth + Height), trees) linmod2 = lm(@formula(Volume ~ Girth + Height), trees)
## #---
r2(linmod1) r2(linmod1)
r2(linmod2) r2(linmod2)
@@ -37,7 +33,7 @@ linmod3 = lm(@formula(Volume ~ Girth + Height + Girth*Height), trees)
r2(linmod3) r2(linmod3)
## #---
using CSV using CSV
using HTTP using HTTP
@@ -47,6 +43,6 @@ SwissLabor = DataFrame(CSV.File(http_response.body))
SwissLabor[!,"participation"] .= (SwissLabor.participation .== "yes") SwissLabor[!,"participation"] .= (SwissLabor.participation .== "yes")
## #---
model = glm(@formula(participation ~ age), SwissLabor, Binomial(), ProbitLink()) model = glm(@formula(participation ~ age), SwissLabor, Binomial(), ProbitLink())

View File

@@ -10,7 +10,7 @@ editor:
### Introductory Example: tree dataset from R ### Introductory Example: tree dataset from R
```{julia} ``` julia
using Statistics using Statistics
using Plots using Plots
using RDatasets using RDatasets
@@ -25,7 +25,7 @@ scatter(trees.Volume, trees.Girth,
the *explanatory variable/covariate* `girth`? Can we predict the volume the *explanatory variable/covariate* `girth`? Can we predict the volume
of a tree given its girth? of a tree given its girth?
```{julia} ``` julia
scatter(trees.Girth, trees.Volume, scatter(trees.Girth, trees.Volume,
legend=false, xlabel="Girth", ylabel="Volume") legend=false, xlabel="Girth", ylabel="Volume")
plot!(x -> -37 + 5*x) plot!(x -> -37 + 5*x)
@@ -68,7 +68,7 @@ rather use Julia to solve the problem.
\[use Julia code (existing package) to perform linear regression for \[use Julia code (existing package) to perform linear regression for
`volume ~ girth`\] `volume ~ girth`\]
```{julia} ``` julia
lm(@formula(Volume ~ Girth), trees) lm(@formula(Volume ~ Girth), trees)
``` ```
@@ -183,7 +183,7 @@ the corresponding standard errors and the $t$-statistics. Test your
functions with the \`\`\`tree''' data set and try to reproduce the functions with the \`\`\`tree''' data set and try to reproduce the
output above. output above.
```{julia} ``` julia
r2(linmod1) r2(linmod1)
r2(linmod2) r2(linmod2)
@@ -232,21 +232,22 @@ $$
For the models above, these are: For the models above, these are:
+--------------+---------------------+--------------------+ +---------------+-------------------+------------------+
| Type of Data | Distribution Family | Link Function | | Type of Data | Distribution | Link Function |
+==============+=====================+====================+ | | Family | |
+===============+===================+==================+
| continuous | Normal | identity: | | continuous | Normal | identity: |
| | | | | | | |
| | | $$ | | | | $$ |
| | | g(x)=x | | | | g(x)=x |
| | | $$ | | | | $$ |
+--------------+---------------------+--------------------+ +---------------+-------------------+------------------+
| count | Poisson | log: | | count | Poisson | log: |
| | | | | | | |
| | | $$ | | | | $$ |
| | | g(x) = \log(x) | | | | g(x) = \log(x) |
| | | $$ | | | | $$ |
+--------------+---------------------+--------------------+ +---------------+-------------------+------------------+
| binary | Bernoulli | logit: | | binary | Bernoulli | logit: |
| | | | | | | |
| | | $$ | | | | $$ |
@@ -254,9 +255,10 @@ For the models above, these are:
| | | ( | | | | ( |
| | | \ | | | | \ |
| | | f | | | | f |
| | | rac{x}{1-x}\right) | | | | ra |
| | | c{x}{1-x}\right) |
| | | $$ | | | | $$ |
+--------------+---------------------+--------------------+ +---------------+-------------------+------------------+
In general, the parameter vector $\beta$ is estimated via maximizing the In general, the parameter vector $\beta$ is estimated via maximizing the
likelihood, i.e., likelihood, i.e.,
@@ -274,7 +276,7 @@ $$
In the Gaussian case, the maximum likelihood estimator is identical to In the Gaussian case, the maximum likelihood estimator is identical to
the least squares estimator considered above. the least squares estimator considered above.
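
To see why, here is a short sketch under assumed notation (observations $y_i$, covariate vectors $x_i$, and a common variance $\sigma^2$; these symbols are illustrative and may differ from those used elsewhere in the document). For independent Gaussian responses with mean $x_i^\top\beta$, the log-likelihood is

$$
\ell(\beta) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i - x_i^\top\beta\right)^2 .
$$

The first term does not depend on $\beta$, so maximizing $\ell(\beta)$ is the same as minimizing the sum of squared residuals $\sum_{i=1}^{n}(y_i - x_i^\top\beta)^2$.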
```{julia} ``` julia
using CSV using CSV
using HTTP using HTTP