Revised with 1.6.1

2021-06-26 18:57:09 +02:00 · f855c1c2ea
commit f855c1c2ea
parent 6acd882163
1 changed files with 170 additions and 0 deletions
--- a/DataFrames/01__Environment_setup.ipynb
+++ b/DataFrames/01__Environment_setup.ipynb
@@ -0,0 +1,170 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Environment setup for data frames tutorial\n",
+    "\n",
+    "## Bogumił Kamiński"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Welcome to the DataFrames.jl introduction!\n",
+    "\n",
+    "This set of Jupyter notebooks is intended to give you an overview of the functionality DataFrames.jl offers, based on practical examples.\n",
+    "\n",
+    "You can find reviews of the functionality of DataFrames.jl (organized by task type rather than as exercises like this tutorial) in the following locations:\n",
+    "* the official manual at https://juliadata.github.io/DataFrames.jl/stable/\n",
+    "* a tutorial going through all the functionality of DataFrames.jl at https://github.com/bkamins/Julia-DataFrames-Tutorial\n",
+    "\n",
+    "We also assume that you have a basic knowledge of the Julia language and the Julia ecosystem. There are great tutorials on this topic at [JuliaAcademy](https://juliaacademy.com/), so I encourage you to check them out.\n",
+    "\n",
+    "As this is a hands-on tutorial, you can expect the examples to be implemented the way I would write them in an actual project."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The notebooks were prepared under Julia 1.5.3 and tested under Julia 1.6.1. If you have a different version of Julia installed, change the kernel using the *Kernel/Change kernel* menu option (as long as you are on Julia 1.x, all examples should work without problems)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "v\"1.6.1\""
+      ]
+     },
+     "execution_count": 1,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "VERSION"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Jupyter Notebook automatically activates the project environment if one is found in the working directory.\n",
+    "\n",
+    "So first let us check that the Project.toml and Manifest.toml files are present (they should be if you cloned the repository of this tutorial)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "2-element BitVector:\n",
+       " 1\n",
+       " 1"
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "isfile.([\"Project.toml\", \"Manifest.toml\"])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You should see `1` (meaning `true`) printed for both entries of the vector.\n",
+    "\n",
+    "If so, you will be using exactly the same versions of the packages that I used when running this tutorial.\n",
+    "\n",
+    "Let us check which packages (and which versions) we will use."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "] status"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "These notebooks should work with DataFrames versions 0.22 and 1.1."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "If the command above warns that some of the packages are not downloaded, run the `instantiate` command in the following cell."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {
+    "scrolled": true
+   },
+   "outputs": [],
+   "source": [
+    "] instantiate"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As you can see, we will use the following packages:\n",
+    "\n",
+    "Package | Description\n",
+    ":-|:-\n",
+    "DataFrames.jl | the core package that is the subject of this tutorial; it is used for data manipulation; the exact version in use is listed by the `status` command above\n",
+    "CSV.jl | a package for reading/writing of CSV files\n",
+    "FreqTables.jl | a very useful package for creating frequency tables\n",
+    "GLM.jl | a package for fitting Generalized Linear Models (as no data science tutorial would be complete without building some predictive model)\n",
+    "PyPlot.jl | a package for plotting; there are many options in the Julia ecosystem to choose from; in this tutorial we use PyPlot.jl because it is based on Matplotlib, so it should feel familiar if you have experience with the Python data science stack\n",
+    "Pipe.jl | a package that makes chaining of operations super powerful (which is something you probably know from `%>%` in R)\n",
+    "Arrow.jl | a package for working with data in Apache Arrow format\n",
+    "Unitful.jl | a package for working with physical units (like kg, cm, ...)"
+   ]
+  }
+ ],
+ "metadata": {
+  "@webio": {
+   "lastCommId": null,
+   "lastKernelId": null
+  },
+  "kernelspec": {
+   "display_name": "Julia 1.6.1",
+   "language": "julia",
+   "name": "julia-1.6"
+  },
+  "language_info": {
+   "file_extension": ".jl",
+   "mimetype": "application/julia",
+   "name": "julia",
+   "version": "1.6.1"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}