Some Jupyter notebooks to help beginners start using a selected set of ML methods.
Danilo Ferreira de Lima 45c44e4534 Updated the presentation. 5 months ago
Gaussian Processes.ipynb Updated all notebooks after re-testing the READMe instructions. 5 months ago
Mixture Models.ipynb More information 5 months ago
README.md Added note on optimizations for virtualenv. 6 months ago
Representation Learning.ipynb Updated all notebooks after re-testing the READMe instructions. 5 months ago
Supervised classification.ipynb Updated all notebooks after re-testing the READMe instructions. 5 months ago
Supervised regression.ipynb Updated all notebooks after re-testing the READMe instructions. 5 months ago
Support Vector Machines.ipynb Updated all notebooks after re-testing the READMe instructions. 5 months ago
user_meeting_ml_intro_jan_2022.pdf Updated the presentation. 5 months ago

README.md

ML Tutorial

These are some hands-on Jupyter notebooks to help beginners start using a selected set of ML methods. Where the presentation had no time to go into detail, the notebooks also give some extra details on the mathematical derivations.

Most of the notebooks require installing additional software. The following is a general setup that should work with most of them. The packages needed by each notebook are listed at its beginning in a !pip install ... command. It is highly recommended to use Anaconda/Miniconda for the setup, as shown below. If you prefer a simpler configuration, details on the Virtualenv setup are also given at the end (but this tends to use a less optimized setup, which may affect the performance and some results).

All the data used in the examples are produced on-the-fly for demonstration purposes, or are taken from public and open resources. None of the data comes from the EuXFEL, as the purpose of the examples is to show the idea behind the methods.

Anaconda configuration

If you already know how to use Anaconda (or Miniconda), the following setup is recommended. If you are not used to Anaconda, see the following references on how to install and a quick start guide:

Here is a cut-and-paste installation guide for the impatient on Linux:

wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.10.3-Linux-x86_64.sh

# or download it from: https://docs.conda.io/en/latest/miniconda.html#linux-installers
bash Miniconda3-py37_4.10.3-Linux-x86_64.sh

# follow the instructions

The commands below create a conda environment named ml (use whichever name you prefer instead) and install the packages needed. If you are using a Mac, please see the section below for a special Anaconda configuration (needed due to an incompatibility between some Mac libraries and Anaconda's default MKL installation).

# create the environment
conda create --name ml python=3.6

# activate it
conda activate ml

# install basic libraries
conda install mkl

# PyTorch (without the GPU acceleration):
conda install pytorch torchvision cpuonly -c pytorch

# Or, instead, if you have an NVIDIA GPU (for AMD GPUs, check https://pytorch.org/ and look for the ROCm platform):
#conda install pytorch torchvision cudatoolkit=10.2 -c pytorch

conda install numpy scipy pandas scikit-learn matplotlib jupyter

# for the Bayesian Neural Network notebook:
pip install torchbnn

# play with the notebooks:
jupyter notebook

# when you are done:

conda deactivate
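After the installation, it can be useful to confirm that the packages are visible from the Python interpreter of the active environment. The following is a minimal sketch, not part of the original setup: the helper name missing_modules and the list REQUIRED_MODULES are illustrative, and the list assumes the usual import names of the packages installed above (for example, scikit-learn is imported as sklearn). Run it with the ml environment activated.

```python
import importlib.util

# Import names (not the pip/conda package names) assumed for the packages
# installed above; for example, scikit-learn is imported as "sklearn".
REQUIRED_MODULES = [
    "numpy", "scipy", "pandas", "sklearn", "matplotlib",
    "torch", "torchvision", "torchbnn",
]


def missing_modules(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All modules found.")
```

The check only looks for the modules without importing them, so it is quick and does not trigger any library initialization. A module that is found may still fail to import if one of its own dependencies is broken.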

Special Anaconda setup on macOS

On macOS, the standard optimization done in Anaconda for fast matrix multiplications using the MKL library clashes with a Mac-specific implementation. To avoid this issue, use the setup below.

# create the environment
conda create --name ml python=3.6

# activate it
conda activate ml

# remove MKL to avoid a conflict on macOS
conda install nomkl

# PyTorch (without the GPU acceleration):
conda install pytorch torchvision cpuonly -c pytorch

# Or, instead, if you have an NVIDIA GPU (will not work with an AMD or Intel GPU):
#conda install pytorch torchvision cudatoolkit=10.2 -c pytorch

conda install numpy scipy pandas scikit-learn matplotlib jupyter

# for the Bayesian Neural Network notebook:
pip install torchbnn

# remove any MKL libraries installed as a dependency from the packages above:
conda remove mkl mkl-service

# play with the notebooks:
jupyter notebook

# when you are done:

conda deactivate

More details on this can be found at:

https://stackoverflow.com/questions/53014306/error-15-initializing-libiomp5-dylib-but-found-libiomp5-dylib-already-initial
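The error discussed in the link above mentions libiomp5.dylib being initialized more than once. As a rough diagnostic, one can list the OpenMP and MKL library files present in the active environment. This is only a sketch under assumptions: the function name find_libraries and the file-name patterns libiomp5* and libmkl* are illustrative choices, and finding such files does not by itself prove that a clash will happen.

```python
import sys
from pathlib import Path

# File-name patterns assumed to identify the Intel OpenMP runtime and MKL.
PATTERNS = ("libiomp5*", "libmkl*")


def find_libraries(prefix, patterns=PATTERNS):
    """Return the sorted paths under `prefix` whose file name matches a pattern."""
    root = Path(prefix)
    found = set()
    for pattern in patterns:
        # rglob searches `root` and all of its subdirectories
        found.update(p for p in root.rglob(pattern) if p.is_file())
    return sorted(found)


if __name__ == "__main__":
    # sys.prefix is the root directory of the active environment
    for path in find_libraries(sys.prefix):
        print(path)
```

Run it with the ml environment activated. If the setup above worked as intended, one would expect few or no MKL files to be listed; several copies of libiomp5 in different directories may be a hint that the conflict is still present.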

Virtualenv setup

It is preferable to use Anaconda, as several optimizations are included in the Anaconda packages. Nevertheless, if this is not possible, or if you would rather use a fast and simple installation, the following instructions should allow for a quick setup while still preventing clashes between these packages and your default setup. Using a virtual environment, the setup would be the following (for an environment called ml, but use the name you prefer):

# create the environment ml
virtualenv -p python3 ml

# load it
source ml/bin/activate

# install some packages
pip install numpy scipy pandas torch scikit-learn matplotlib torchvision torchbnn jupyter

# play with the notebooks ...
jupyter notebook

# when you are done:
deactivate
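To make sure the packages are installed into the virtual environment and not into your default setup, you can check from Python whether the environment is active. The following is a minimal sketch: the function name in_virtual_environment is illustrative, and the check relies on the standard sys.prefix and sys.base_prefix attributes (plus sys.real_prefix, which older virtualenv releases set).

```python
import sys


def in_virtual_environment():
    """Return True if the interpreter runs inside a virtualenv or venv."""
    # Older virtualenv releases set sys.real_prefix; newer ones (and venv)
    # make sys.prefix differ from sys.base_prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtual_environment())
```

Run it after source ml/bin/activate. Note that this check does not detect conda environments, for which sys.prefix and sys.base_prefix are normally the same.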

Contact us!

Contact us at the EuXFEL Data Analysis group at any time if you need help analysing your data!

Email: da@xfel.eu