Merge branch 'main' of https://github.com/s-ccs/summerschool_simtech_2023

2023-10-09 14:14:56 +00:00
commit 20c4a1e58b
parent 773c15150d 39892ad1c1
3 changed files with 380 additions and 456 deletions
--- a/material/2_tue/testing/slides.md
+++ b/material/2_tue/testing/slides.md
@@ -34,11 +34,11 @@ slideOptions:
 # Learning Goals
 - Justify the effort of developing tests to some extent
 - Get to know a few common terms of testing
 - Work with the Julia unit testing package `Test.jl`
-Material is taken and modified, on the one hand, from the [SSE lecture](https://github.com/Simulation-Software-Engineering/Lecture-Material), which builds partly on the [py-rse book](https://merely-useful.tech/py-rse), and, on the other hand, from the [Test.jl docs](https://docs.julialang.org/en/v1/stdlib/Test/).
+Material is taken and modified from the [SSE lecture](https://github.com/Simulation-Software-Engineering/Lecture-Material), which builds partly on the [py-rse book](https://merely-useful.tech/py-rse), and from the [Test.jl docs](https://docs.julialang.org/en/v1/stdlib/Test/).
 ---
@@ -48,30 +48,21 @@ Material is taken and modified, on the one hand, from the [SSE lecture](https://
 ## What is Testing?
-- Smelling old milk before using it!
+- Smelling old milk before using it
-- A way to determine if a software is not producing reliable results and if so, what is the reason.
+- A way to determine if a software is not producing reliable results and if so, what is the reason
-- Manual testing vs. automated testing.
+- Manual testing vs. automated testing
 ---
 ## Why Should you Test your Software?
-- Improve software reliability and reproducibility.
+- Improve software reliability and reproducibility
-- Make sure that changes (bugfixes, new features) do not affect other parts of software.
+- Make sure that changes (bugfixes, new features) do not affect other parts of software
 - Generally all software is better off being tested regularly. Possible exceptions are very small codes with single users.
 - Ensure that a released version of a software actually works.
 ---
 ## Nomenclature in Software Testing
 - **Fixture**: preparatory set for testing.
 - **Actual result**: what the code produces when given the fixture.
 - **Expected result**: what the actual result is compared to.
 - **Test coverage**: how much of the code do tests touch in one run.
 ---
 ## Some Ways to Test Software
 - Assertions
@@ -83,29 +74,27 @@ Material is taken and modified, on the one hand, from the [SSE lecture](https://
 ## Assertions
-- Principle of *defensive programming*.
+```julia
@assert condition "message"
 ```
 - Principle of *defensive programming*
 - Nothing happens when an assertion is true; throws error when false.
 - Types of assertion statements:
    - Precondition
    - Postcondition
    - Invariant
-- A basic but powerful tool to test a software on-the-go.
+- A basic but powerful tool to test a software on-the-go
 - Assertion statement syntax in Python
 ```julia
 @assert condition "message"
 ```
 ---
 ## Unit Testing
-- Catching errors with assertions is good but preventing them is better!
+- Catching errors with assertions is good but preventing them is better.
 - A *unit* is a single function in one situation.
    - A situation is one amongst many possible variations of input parameters.
-- User creates the expected result manually.
+- User creates the **expected result** manually.
-- Fixture is the set of inputs used to generate an actual result.
+- **Actual result** is compared to the expected result by `@test`.
 - Actual result is compared to the expected result by `@test`.
 ---
@@ -113,82 +102,26 @@ Material is taken and modified, on the one hand, from the [SSE lecture](https://
 - Test whether several units work in conjunction.
 - *Integrate* units and test them together in an *integration* test.
-- Often more complicated than a unit test and has more test coverage.
+- Often more complicated than a unit test and gives higher test coverage.
 - A fixture is used to generate an actual result.
 - Actual result is compared to the expected result by `@test`.
 ---
 ## Regression Testing
 - Generating an expected result is not possible in some situations.
-- Compare the current actual result with a previous actual result.
+- Compare the *current* actual result with a *previous* actual result.
 - No guarantee that the current actual result is correct.
 - Risk of a bug being carried over indefinitely.
 - Main purpose is to identify changes in the current state of the code with respect to a past state.
 ---
 ## Test Coverage
 - Coverage is the amount of code a test touches in one run.
 - Aim for high test coverage.
 - There is a trade-off: high test coverage vs. effort in test development
 ---
 ## Comparing Floating-point Variables
 - Very often quantities in math software are `float` / `double`.
 - Such quantities cannot be compared to exact values, an approximation is necessary.
 - Comparison of floating point variables needs to be done to a certain tolerance.
 ```julia
 @test 1 ≈ 0.999999  rtol=1e-5
 ```
 - Get `≈` by Latex `\approx` + TAB
 ---
 ## Test-driven Development (TDD)
 - Principle is to write a test and then write a code to fulfill the test.
 - Advantages:
    - In the end user ends up with a test alongside the code.
    - Eliminates confirmation bias of the user.
    - Writing tests gives clarity on what the code is supposed to do.
 - Disadvantage: known to not improve productivity.
 ---
 ## Checking-driven Development (CDD)
 - Developer performs spot checks; sanity checks at intermediate stages
 - Math software often has heuristics which are easy to determine.
 - Keep performing same checks at different stages of development to ensure the code works.
 ---
 ## Verifying a Test
 - Test written as part of a bug-fix:
    - Reproduce the bug in the test by ensuring that the test fails.
    - Fix the bug.
    - Rerun the test to ensure that it passes.
 - Test written to increase code coverage:
    - Make sure that the first iteration of the test passes.
    - Try introducing a small fixable bug in the code to verify if the test fails.
 ---
 # 2. Unit Testing in Julia with Test.jl
 ---
-## Setup of Tests.jl
+## Setup of Test.jl
 - Standard library to write and manage tests, `using Test`
 - Standardized folder structure:
  ```
@@ -204,18 +137,9 @@ Material is taken and modified, on the one hand, from the [SSE lecture](https://
 - Singular `test` vs plural `runtests.jl`
 - `setup.jl` for all `using XYZ` statements, included in `runtests.jl`
-- Additional packages either in `[extra] section` of `./Project.toml` or in a new `./test/Project.toml` environment
+- Additional packages in `[extra] section` of `./Project.toml` or in new `./test/Project.toml`
  - In case of the latter: Do not add the package itself to the `./test/Project.toml`
-
+- Run: `]test` when root project is activated
 ---
 ## Run Tests
 Various options:
 - Directly call `runtests.jl` TODO?
 - From Pkg-Manager `]test` when root project is activated
 ---
@@ -234,7 +158,7 @@ Various options:
 - `@testset`: Structure tests
   ```julia
-   julia> @testset "trigonometric identities" begin
+   @testset "trigonometric identities" begin
       θ = 2/3*π
       @test sin(-θ) ≈ -sin(θ)
       @test cos(-θ) ≈ cos(θ)
@@ -251,6 +175,8 @@ Various options:
 - [HiRSE-Summer of Testing Part 2b: "Testing with Julia" by Nils Niggemann](https://www.youtube.com/watch?v=gSMKNbZOpZU)
 - [Official documentation of Test.jl](https://docs.julialang.org/en/v1/stdlib/Test/)
 ---
 # 3. Test.jl Demo
 We use [`MyTestPackage`](https://github.com/s-ccs/summerschool_simtech_2023/tree/main/material/2_tue/testing/MyTestPackage), which looks as follows:
@@ -274,7 +200,7 @@ We use [`MyTestPackage`](https://github.com/s-ccs/summerschool_simtech_2023/tree
 - Look at `MyTestPackage.jl` and `find.jl`: We have two functions `find_max` and `find_mean`, which calculate the maximum and mean of all elements of a `::AbstractVector`.
  - Assertions were added to check for `NaN` values
 - Look at `runtests.jl`:
-  - TODO: Why do we need `using MyTestPackage`?
+  - Why do we need `using MyTestPackage`?
  - We include dependencies via `setup.jl`: `Test` and `StableRNG`.
  - Testset "find"
 - Look at `find.jl`
--- a/material/3_wed/regression/Code_Snippets.jl
+++ b/material/3_wed/regression/Code_Snippets.jl
@@ -1,34 +1,30 @@
 ############################################################################
 #### Execute code chunks separately in VSCODE by pressing 'Alt + Enter' ####
 ############################################################################
 using Statistics
 using Plots
 using RDatasets
 using GLM
-##
+#---
 trees = dataset("datasets", "trees")
 scatter(trees.Girth, trees.Volume,
        legend=false, xlabel="Girth", ylabel="Volume")
-##
+#---
 scatter(trees.Girth, trees.Volume,
        legend=false, xlabel="Girth", ylabel="Volume")
 plot!(x -> -37 + 5*x)
-##
+#---
 linmod1 = lm(@formula(Volume ~ Girth), trees)
-##
+#---
 linmod2 = lm(@formula(Volume ~ Girth + Height), trees)
-##
+#---
 r2(linmod1)
 r2(linmod2)
@@ -37,7 +33,7 @@ linmod3 = lm(@formula(Volume ~ Girth + Height + Girth*Height), trees)
 r2(linmod3)
-##
+#---
 using CSV
 using HTTP
@@ -47,6 +43,6 @@ SwissLabor = DataFrame(CSV.File(http_response.body))
 SwissLabor[!,"participation"] .= (SwissLabor.participation .== "yes")
-##
+#---
 model = glm(@formula(participation ~ age), SwissLabor, Binomial(), ProbitLink())
--- a/material/3_wed/regression/MultipleRegressionBasics.qmd
+++ b/material/3_wed/regression/MultipleRegressionBasics.qmd
@@ -10,7 +10,7 @@ editor:
 ### Introductory Example: tree dataset from R
-```{julia}
+``` julia
 using Statistics
 using Plots
 using RDatasets
@@ -25,7 +25,7 @@ scatter(trees.Volume, trees.Girth,
 the *explanatory variable/covariate* `girth`? Can we predict the volume
 of a tree given its girth?
-```{julia}
+``` julia
 scatter(trees.Girth, trees.Volume,
        legend=false, xlabel="Girth", ylabel="Volume")
 plot!(x -> -37 + 5*x)
@@ -68,7 +68,7 @@ rather use Julia to solve the problem.
 \[use Julia code (existing package) to perform linear regression for
 `volume ~ girth`\]
-```{julia}
+``` julia
 lm(@formula(Volume ~ Girth), trees)
 ```
@@ -183,7 +183,7 @@ the corresponding standard errors and the $t$-statistics. Test your
 functions with the \`\`\`tree''' data set and try to reproduce the
 output above.
-```{julia}
+``` julia
 r2(linmod1)
 r2(linmod2)
@@ -232,21 +232,22 @@ $$
 For the models above, these are:
-+--------------+---------------------+--------------------+
++---------------+-------------------+------------------+
-| Type of Data | Distribution Family | Link Function      |
+| Type of Data  | Distribution      | Link Function    |
-+==============+=====================+====================+
+|               | Family            |                  |
 +===============+===================+==================+
 | continuous    | Normal            | identity:        |
 |               |                   |                  |
 |               |                   | $$               |
 |               |                   | g(x)=x           |
 |               |                   | $$               |
-+--------------+---------------------+--------------------+
++---------------+-------------------+------------------+
 | count         | Poisson           | log:             |
 |               |                   |                  |
 |               |                   | $$               |
 |               |                   |  g(x) = \log(x)  |
 |               |                   | $$               |
-+--------------+---------------------+--------------------+
++---------------+-------------------+------------------+
 | binary        | Bernoulli         | logit:           |
 |               |                   |                  |
 |               |                   | $$               |
@@ -254,9 +255,10 @@ For the models above, these are:
 |               |                   | (                |
 |               |                   | \                |
 |               |                   | f                |
-|              |                     | rac{x}{1-x}\right) |
+|               |                   | ra               |
 |               |                   | c{x}{1-x}\right) |
 |               |                   | $$               |
-+--------------+---------------------+--------------------+
++---------------+-------------------+------------------+
 In general, the parameter vector $\beta$ is estimated via maximizing the
 likelihood, i.e.,
@@ -274,7 +276,7 @@ $$
 In the Gaussian case, the maximum likelihood estimator is identical to
 the least squares estimator considered above.
-```{julia}
+``` julia
 using CSV
 using HTTP