Revised with 1.6.1

David Doblas Jiménez 2021-06-26 18:57:09 +02:00
parent 6acd882163
commit f855c1c2ea

@@ -0,0 +1,170 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Environment setup for data frames tutorial\n",
"\n",
"## Bogumił Kamiński"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Welcome to this introduction to DataFrames.jl!\n",
"\n",
"This set of Jupyter notebooks is intended to give you an overview of the functionality of DataFrames.jl, based on practical examples.\n",
"\n",
"You can find overviews of DataFrames.jl functionality (organized by type of task, rather than as exercises like this tutorial) in the following locations:\n",
"* the official manual at https://juliadata.github.io/DataFrames.jl/stable/\n",
"* a tutorial covering all the functionality of DataFrames.jl at https://github.com/bkamins/Julia-DataFrames-Tutorial\n",
"\n",
"We assume that you have a basic knowledge of the Julia language and the Julia ecosystem. There are great tutorials on this topic at [JuliaAcademy](https://juliaacademy.com/), so I encourage you to check them out.\n",
"\n",
"As this is a hands-on tutorial, you can expect the examples to be implemented the way I would write them in an actual project."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The notebooks were prepared under Julia 1.5.3 and tested under Julia 1.6.1. If you have a different version of Julia installed, change the kernel using the *Kernel/Change kernel* option in the menu (as long as you are on Julia 1.x, all examples should work without problems)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"v\"1.6.1\""
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"VERSION"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Jupyter Notebook automatically activates the project environment if one is found in the working directory.\n",
"\n",
"So first let us check that the Project.toml and Manifest.toml files are present (they should be if you cloned the repository of this tutorial)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2-element BitVector:\n",
" 1\n",
" 1"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"isfile.([\"Project.toml\", \"Manifest.toml\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You should see `1` (meaning `true`) printed for both entries of the vector.\n",
"\n",
"Now we are sure that you are going to use exactly the same versions of the packages that I use when running this tutorial.\n",
"\n",
"Let us check which packages (and which versions) we will use."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"] status"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"These notebooks should work with DataFrames versions 0.22 and 1.1."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If the `status` command above gives a warning that some of the packages are not downloaded, run the `instantiate` instruction in the following cell."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"] instantiate"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As you can see, we will use the following packages:\n",
"\n",
"Package | Description\n",
":-|:-\n",
"DataFrames.jl | the core package that is the subject of this tutorial; it is used for data manipulation; the exact version in use is listed in the `] status` output above\n",
"CSV.jl | a package for reading and writing CSV files\n",
"FreqTables.jl | a very useful package for creating frequency tables\n",
"GLM.jl | a package for fitting Generalized Linear Models (as no data science tutorial would be complete without building some predictive model)\n",
"PyPlot.jl | a package for plotting; there are many options in the Julia ecosystem to choose from; in this tutorial we use PyPlot.jl as it is based on Matplotlib, so if you have experience with the Python data science technology stack it should be familiar\n",
"Pipe.jl | a package that makes chaining of operations super powerful (something you probably know from `%>%` in R)\n",
"Arrow.jl | a package for working with data in Apache Arrow format\n",
"Unitful.jl | a package for working with physical units (like kg, cm, ...)"
]
}
],
"metadata": {
"@webio": {
"lastCommId": null,
"lastKernelId": null
},
"kernelspec": {
"display_name": "Julia 1.6.1",
"language": "julia",
"name": "julia-1.6"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 4
}