diff --git a/Ch14-surv-lab.ipynb b/Ch14-surv-lab.ipynb
deleted file mode 100644
index 01921a7..0000000
--- a/Ch14-surv-lab.ipynb
+++ /dev/null
@@ -1,1046 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "id": "c1af8696",
-   "metadata": {},
-   "source": [
-    "# Dummy\n",
-    "\n",
-    "## Lab: Survival Analysis\n",
-    "\n",
-    "### Some preliminary details\n",
-    "\n",
-    "We will want to be able to see plots in the notebook. We\n",
-    "will therefore include the following as introduced in\n",
-    "Chapter 2:\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "0bce7ad7",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "%matplotlib inline"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "98e95e67",
-   "metadata": {},
-   "source": [
-    "#### New imports\n",
-    "\n",
-    "We again collect the new imports\n",
-    "needed for this lab.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "3741ca31",
-   "metadata": {
-    "lines_to_next_cell": 2
-   },
-   "outputs": [],
-   "source": [
-    "from lifelines import \\\n",
-    "     (KaplanMeierFitter,\n",
-    "      CoxPHFitter)\n",
-    "from lifelines.statistics import \\\n",
-    "     (logrank_test,\n",
-    "      multivariate_logrank_test)\n",
-    "from ISLP.survival import sim_time"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "c1dff6d8",
-   "metadata": {},
-   "source": [
-    "In this lab, we perform survival analyses on three separate data\n",
-    "sets. In the first section we analyze the `BrainCancer`\n",
-    "data .<br />\n",
-    "Next, we examine the `Publication`\n",
-    "data . Finally, we explore explores\n",
-    "a simulated call center data set.\n",
-    "\n",
-    "We begin by importing some of our libraries at this top\n",
-    "level. This makes the code more readable, as scanning the first few\n",
-    "lines of the notebook tell us what libraries are used in this\n",
-    "notebook.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "6402fd76",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import matplotlib.pyplot as plt\n",
-    "import numpy as np\n",
-    "import pandas as pd\n",
-    "from ISLP.models import ModelSpec as MS\n",
-    "from ISLP import load_data"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "90eb295f",
-   "metadata": {},
-   "source": [
-    "### Brain Cancer Data"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "26338f8e",
-   "metadata": {},
-   "source": [
-    "We begin with the `BrainCancer` data set, contained in the `ISLP` package.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "b43f8c6f",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "BrainCancer = load_data('BrainCancer')\n",
-    "BrainCancer.columns"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "61eed2f0",
-   "metadata": {},
-   "source": [
-    "The rows index the 88 patients, while the columns contain the 8 predictors.\n",
-    "We first briefly examine the data.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "4307ef61",
-   "metadata": {
-    "lines_to_next_cell": 0
-   },
-   "outputs": [],
-   "source": [
-    "BrainCancer['sex'].value_counts()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "1a83e4f7",
-   "metadata": {},
-   "source": [
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "19c25475",
-   "metadata": {
-    "lines_to_next_cell": 0
-   },
-   "outputs": [],
-   "source": [
-    "BrainCancer['diagnosis'].value_counts()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "7fcd779d",
-   "metadata": {},
-   "source": [
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "d9fa6491",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "BrainCancer['status'].value_counts()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "bd1e89cb",
-   "metadata": {},
-   "source": [
-    "Before beginning an analysis, it is important to know how the\n",
-    "`status` variable has been coded.  Most software\n",
-    "uses the convention that a `status` of 1 indicates an\n",
-    "uncensored observation, and a `status` of 0 indicates a censored\n",
-    "observation. But some scientists might use the opposite coding. For\n",
-    "the `BrainCancer` data set 35 patients died before the end of\n",
-    "the study.\n",
-    "\n",
-    "To begin the analysis, we re-create a Kaplan-Meier survival curve shown in the book. The main\n",
-    "package we will use for survival analysis\n",
-    "is `lifelines`.\n",
-    "The variable `time` corresponds to $y_i$, the time to the $i$th event (either censoring or\n",
-    "death). The first argument to `km.fit` are the event times with the\n",
-    "second argument being the censoring variable with a 1 indicating an observed\n",
-    "failure time. The `plot` method plots a curve with pointwise confidence\n",
-    "intervals. By default, they produce 90% confidence intervals. This can be changed\n",
-    "by setting the `alpha` argument to 1 minus the desired\n",
-    "confidence level.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "16b708ff",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "km = KaplanMeierFitter()\n",
-    "km_brain = km.fit(BrainCancer['time'],\n",
-    "                  BrainCancer['status'])\n",
-    "km_brain.plot(label='Kaplan Meier estimate')"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "71d6ef47",
-   "metadata": {},
-   "source": [
-    "Next we create Kaplan-Meier survival curves that are stratified by\n",
-    "`sex`, in order to reproduce a second figure from the book.\n",
-    "We  do this by using the `groupby` method of  a  DataFrame.\n",
-    "This  method returns a generator that can\n",
-    "be  iterated over in the `for` loop. In this case,\n",
-    "the  items  in the `for` loop are  2-tuples  representing\n",
-    "the groups: the first entry is the  value\n",
-    "of the grouping column  `sex` while the  2nd  value\n",
-    "is the DataFrame consisting of all  rows in the\n",
-    "DataFrame matching that value of `sex`.\n",
-    "We will want to use this data below\n",
-    "in the log-rank test,  hence we store this\n",
-    "information in the dictionary `by_sex`. Finally,\n",
-    "we have also used the notion of\n",
-    " to automatically\n",
-    "label the  different lines in the  plot. String\n",
-    "interpolation is a powerful technique to format strings –\n",
-    "`python` has many ways to facilitate such operations.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "cd07dfd6",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "by_sex = {}\n",
-    "for sex, df in BrainCancer.groupby('sex'):\n",
-    "    by_sex[sex] = df\n",
-    "    km_sex = km.fit(df['time'],\n",
-    "                    df['status'])\n",
-    "    km_sex.plot(label='Sex=%s' % sex)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "8d40bd2f",
-   "metadata": {},
-   "source": [
-    "As discussed in the book, we can perform a\n",
-    "log-rank test to compare the survival of males to females. We use\n",
-    "the `logrank_test` function from the `lifelines.statistics` module.\n",
-    "The first two arguments are the event times, with the second\n",
-    "denoting the corresponding (optional) censoring indicators.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "f4e80bc9",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "logrank_test(by_sex['Male']['time'],\n",
-    "             by_sex['Female']['time'],\n",
-    "             by_sex['Male']['status'],\n",
-    "             by_sex['Female']['status'])"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "3e025210",
-   "metadata": {},
-   "source": [
-    "The resulting $p$-value is $0.23$, indicating no evidence of a\n",
-    "difference in survival between the two sexes.\n",
-    "\n",
-    "Next, we fit  Cox proportional hazards models using the `CoxPHFitter` estimator\n",
-    "from `lifelines`.\n",
-    "To begin, we consider a model that uses `sex` as the only predictor.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "b57ac5fa",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "coxph = CoxPHFitter # shorthand\n",
-    "sex_df = BrainCancer[['time', 'status', 'sex']]\n",
-    "model_df = MS(['time', 'status', 'sex'],\n",
-    "              intercept=False).fit_transform(sex_df)\n",
-    "cox_fit = coxph().fit(model_df,\n",
-    "                      'time',\n",
-    "                      'status')\n",
-    "cox_fit.summary[['coef', 'se(coef)', 'p']]"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "da8b9eff",
-   "metadata": {},
-   "source": [
-    "The first argument to `fit` should be a data frame containing\n",
-    "at least the event time (the second argument `time` in this case)\n",
-    "as well as an optional censoring variable (the argument `status` in this case).\n",
-    "Note also that the Cox model does not include an intercept, which is why\n",
-    "we used the `intercept=False` argument to `ModelSpec` above.\n",
-    "It is possible to obtain the likelihood ratio test comparing this model to the one\n",
-    "with no features as follows:\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "9991b523",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "cox_fit.log_likelihood_ratio_test()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "fc76fd07",
-   "metadata": {},
-   "source": [
-    "Regardless of which test we use, we see that there is no clear\n",
-    "evidence for a difference in survival between males and females.  As\n",
-    "we learned in this chapter, the score test from the Cox model is\n",
-    "exactly equal to the log rank test statistic!\n",
-    "\n",
-    "Now we fit a  model that makes use of additional predictors. We first note\n",
-    "that one of our `diagnosis` values is missing, hence\n",
-    "we drop that observation before continuing.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "391157a9",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "cleaned = BrainCancer.dropna()\n",
-    "all_MS = MS(cleaned.columns,\n",
-    "            intercept=False)\n",
-    "all_df = all_MS.fit_transform(cleaned)\n",
-    "fit_all = coxph().fit(all_df, \n",
-    "                      'time', \n",
-    "                      'status')\n",
-    "fit_all.summary[['coef', 'se(coef)', 'p']]"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "18659bcd",
-   "metadata": {},
-   "source": [
-    "The `diagnosis` variable has been coded so that the baseline\n",
-    "corresponds to HG glioma. The results indicate that the risk associated with HG glioma\n",
-    "is more than eight times (i.e. $e^{2.15}=8.62$) the risk associated\n",
-    "with meningioma. In other words, after adjusting for the other\n",
-    "predictors, patients with HG glioma have much worse survival compared\n",
-    "to those with meningioma.  In addition, larger values of the Karnofsky\n",
-    "index, `ki`, are associated with lower risk, i.e. longer survival.\n",
-    "\n",
-    "Finally, we plot estimated survival curves for each diagnosis category,\n",
-    "adjusting for the other predictors.  To make these plots, we set the\n",
-    "values of the other predictors equal to the mean for quantitative variables\n",
-    "and equal to the mode for categorical. To do this, we use the\n",
-    "`apply()` method across rows (i.e. `axis=0`) with a function\n",
-    "`representative` that checks if a column is categorical\n",
-    "or not.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "c2c32bdf",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "levels = cleaned['diagnosis'].unique()\n",
-    "def representative(series):\n",
-    "    if hasattr(series.dtype, 'categories'):\n",
-    "        return pd.Series.mode(series)\n",
-    "    else:\n",
-    "        return series.mean()\n",
-    "modal_data = cleaned.apply(representative, \n",
-    "                           axis=0)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "2f5e9767",
-   "metadata": {},
-   "source": [
-    "We make four\n",
-    "copies of the column means and assign the `diagnosis` column to be the four different\n",
-    "diagnoses.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "7a6fc4b9",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "modal_df = pd.DataFrame([modal_data.iloc[0] for _ in range(len(levels))])\n",
-    "modal_df['diagnosis'] = levels\n",
-    "modal_df"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "dd013465",
-   "metadata": {},
-   "source": [
-    "We then construct the design matrix based on the model specification `all_MS` used to fit\n",
-    "the model, dropping the `time` and `status` variables as these are not\n",
-    "used to predict the survival functioin.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "f1e6cb80",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "modal_X = all_MS.transform(modal_df)#.drop(columns=['time', 'status'])\n",
-    "modal_X.index = levels\n",
-    "modal_X"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "d98548ee",
-   "metadata": {},
-   "source": [
-    "The estimated survival\n",
-    "function can be found using the `predict_survival_function()` method.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "8753eaa2",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "predicted_survival = fit_all.predict_survival_function(modal_X)\n",
-    "predicted_survival"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "a814ddee",
-   "metadata": {},
-   "source": [
-    "This returns a data frame,\n",
-    "whose plot methods yields the different survival curves. To avoid clutter in\n",
-    "the plots, we do not display confidence intervals.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "ddccc3ac",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "fig, ax = plt.subplots(1, 1, figsize=(10, 10))\n",
-    "predicted_survival.plot(ax=ax);"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "53997de5",
-   "metadata": {},
-   "source": [
-    "### Publication Data"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "47103217",
-   "metadata": {},
-   "source": [
-    "The `Publication` data   can be\n",
-    "found in the `ISLP` package.\n",
-    "We begin by plotting the Kaplan-Meier curves\n",
-    "stratified on the `posres` variable, which records whether the\n",
-    "study had a positive or negative result.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "01217472",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "Publication = load_data('Publication')\n",
-    "by_result = {}\n",
-    "for result, df in Publication.groupby('posres'):\n",
-    "    by_result[result] = df\n",
-    "    km_result = km.fit(df['time'], df['status'])\n",
-    "    km_result.plot(label='Result=%d' % result)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "7929b6e9",
-   "metadata": {},
-   "source": [
-    "As discussed previously, the $p$-values from fitting Cox’s\n",
-    "proportional hazards model to the `posres` variable are quite\n",
-    "large, providing no evidence of a difference in time-to-publication\n",
-    "between studies with positive versus negative results.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "2e966bd6",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "posres_df = MS(['posres',\n",
-    "                'time',\n",
-    "                'status'], intercept=False).fit_transform(Publication)\n",
-    "posres_fit = coxph().fit(posres_df,\n",
-    "                         'time',\n",
-    "                         'status')\n",
-    "posres_fit.summary[['coef', 'se(coef)', 'p']]"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "1020fe19",
-   "metadata": {},
-   "source": [
-    "As expected, the log-rank test provides an identical conclusion.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "42572496",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "logrank_test(by_result[0]['time'],\n",
-    "             by_result[1]['time'],\n",
-    "             by_result[0]['status'],\n",
-    "             by_result[1]['status'])"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "c06a3106",
-   "metadata": {},
-   "source": [
-    "However, the results change dramatically when we include other\n",
-    "predictors in the model. Here we have excluded the funding mechanism\n",
-    "variable.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "7c22a546",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "model = MS(Publication.columns.drop('mech'),\n",
-    "           intercept=False)\n",
-    "coxph().fit(model.fit_transform(Publication),\n",
-    "            'time',\n",
-    "            'status').summary[['coef', 'se(coef)', 'p']]"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "2efd14bf",
-   "metadata": {},
-   "source": [
-    "We see that there are a number of statistically significant variables,\n",
-    "including whether the trial focused on a clinical endpoint, the impact\n",
-    "of the study, and whether the study had positive or negative results.\n",
-    "\n",
-    "### Call Center Data"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "9dbf298f",
-   "metadata": {},
-   "source": [
-    "In this section, we will simulate survival data using the relationship\n",
-    "between cumulative hazard and\n",
-    "the survival function explored in an exercise.\n",
-    "Our simulated data will represent the observed\n",
-    "wait times (in seconds) for 2,000 customers who have phoned a call\n",
-    "center.  In this context, censoring occurs if a customer hangs up\n",
-    "before his or her call is answered.\n",
-    "\n",
-    "There are three covariates: `Operators` (the number of call\n",
-    "center operators available at the time of the call, which can range\n",
-    "from $5$ to $15$), `Center` (either A, B, or C), and\n",
-    "`Time` of day (Morning, Afternoon, or Evening). We generate data\n",
-    "for these covariates so that all possibilities are equally likely: for\n",
-    "instance, morning, afternoon and evening calls are equally likely, and\n",
-    "any number of operators from $5$ to $15$ is equally likely. We use the\n",
-    "`ModelSpec` function from the `ISLP.models` package introduced\n",
-    "in Chapter 3.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "72e4c0c1",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "np.random.seed(10)\n",
-    "N = 2000\n",
-    "Operators = np.random.choice(np.arange(5, 16),\n",
-    "                             N,\n",
-    "                             replace=True)\n",
-    "Center = np.random.choice(['A', 'B', 'C'], \n",
-    "                          N,\n",
-    "                          replace=True)\n",
-    "Time = np.random.choice(['Morn.', 'After.', 'Even.'], \n",
-    "                        N,\n",
-    "                        replace=True)\n",
-    "D = pd.DataFrame({'Operators': Operators,\n",
-    "                  'Center': pd.Categorical(Center),\n",
-    "                  'Time': pd.Categorical(Time)})      \n",
-    "model = MS(['Operators',\n",
-    "            'Center',\n",
-    "            'Time'],\n",
-    "           intercept=False)\n",
-    "X = model.fit_transform(D)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "a3f7c728",
-   "metadata": {},
-   "source": [
-    "It is worthwhile to take a peek at the design matrix `X`, so\n",
-    "that we can be sure that we understand how the variables have been coded. By default,\n",
-    "the levels of categorical variables are sorted and the first column of the usual one-hot encoding\n",
-    "of the variable is dropped.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "4b8f4b9f",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "X[:5]"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "06cd05ca",
-   "metadata": {},
-   "source": [
-    "Next,  we specify the coefficients and the hazard function.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "982ae944",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "true_beta = np.array([0.04, -0.3, 0, 0.2, -0.2])\n",
-    "true_linpred = X.dot(true_beta)\n",
-    "hazard = lambda t: 1e-5 * t"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "5054aa25",
-   "metadata": {},
-   "source": [
-    "We use the function\n",
-    "`sim_time` from the `ISLP.survival` package. This function\n",
-    "uses the relationship between the survival function\n",
-    "and cumulative hazard $S(t) = \\exp(-H(t))$ and the specific\n",
-    "form of the cumulative hazard function in the Cox model\n",
-    "to simulate data based on values of the linear predictor\n",
-    "`true_linpred` and the cumulative hazard. We can generate\n",
-    "if we know the inverse of the cumulative hazard function.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "e8939727",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "cum_hazard = lambda t: 1e-5 * t**2 / 2"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "c0ed581d",
-   "metadata": {},
-   "source": [
-    "Here, we have set the coefficient associated with `Operators` to\n",
-    "equal $0.04$; in other words, each additional operator leads to a\n",
-    "$e^{0.04}=1.041$-fold increase in the “risk” that the call will be\n",
-    "answered, given the `Center` and `Time` covariates. This\n",
-    "makes sense: the greater the number of operators at hand, the shorter\n",
-    "the wait time! The coefficient associated with `Center == B` is\n",
-    "$-0.3$, and `Center == A` is treated as the baseline. This means\n",
-    "that the risk of a call being answered at Center B is 0.74 times the\n",
-    "risk that it will be answered at Center A; in other words, the wait\n",
-    "times are a bit longer at Center B.\n",
-    "\n",
-    "We are now ready to generate data under the Cox proportional hazards\n",
-    "model. We’ll truncate the maximum time to 1000 seconds to keep\n",
-    "simulated wait times reasonable.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "a850809e",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "W = np.array([sim_time(l, cum_hazard)\n",
-    "              for l in true_linpred])\n",
-    "D['Wait time'] = np.clip(W, 0, 1000)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "3f481113",
-   "metadata": {},
-   "source": [
-    "We now simulate our censoring variable, for which we assume\n",
-    "90% of calls were answered (`Failed==1`) before the\n",
-    "customer hungup (`Failed==0`).\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "c93cd8c3",
-   "metadata": {
-    "lines_to_next_cell": 0
-   },
-   "outputs": [],
-   "source": [
-    "D['Failed'] = np.random.choice([1, 0],\n",
-    "                                N,\n",
-    "                                p=[0.9, 0.1])\n",
-    "D[:5]"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "aa55f285",
-   "metadata": {},
-   "source": [
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "bac8ff69",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "D['Failed'].mean()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "75699d8c",
-   "metadata": {},
-   "source": [
-    "We now plot  Kaplan-Meier survival curves. First, we stratify by `Center`\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "cafb21e9",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "by_center = {}\n",
-    "for center, df in D.groupby('Center'):\n",
-    "    by_center[center] = df\n",
-    "    km_center = km.fit(df['Wait time'], df['Failed'])\n",
-    "    km_center.plot(label='Center=%s' % center)\n",
-    "ax = plt.gca()\n",
-    "ax.set_title(\"Probability of Still Being on Hold\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "6c477316",
-   "metadata": {},
-   "source": [
-    "Next, we stratify by `Time`.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "d404c129",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "by_time = {}\n",
-    "for time, df in D.groupby('Time'):\n",
-    "    by_time[time] = df\n",
-    "    km_time = km.fit(df['Wait time'], df['Failed'])\n",
-    "    km_time.plot(label='Time=%s' % time)\n",
-    "ax = plt.gca()\n",
-    "ax.set_title(\"Probability of Still Being on Hold\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "dcca1dbd",
-   "metadata": {},
-   "source": [
-    "It seems that calls at Call Center B take longer to be answered than\n",
-    "calls at Centers A and C. Similarly, it appears that wait times are\n",
-    "longest in the morning and shortest in the evening hours. We can use a\n",
-    "log-rank test to determine whether these differences are statistically\n",
-    "significant using `multivariate_logrank_test`.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "b928758d",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "multivariate_logrank_test(D['Wait time'],\n",
-    "                          D['Center'],\n",
-    "                          D['Failed'])"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "285f5644",
-   "metadata": {},
-   "source": [
-    "Next, we consider the  effect of `Time`.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "976558e5",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "multivariate_logrank_test(D['Wait time'],\n",
-    "                          D['Time'],\n",
-    "                          D['Failed'])"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "e56b7df7",
-   "metadata": {},
-   "source": [
-    "As in the case of a categorical variable with 2 outcomes, these\n",
-    "results are similar to the likelihood ratio test\n",
-    "from the Cox proportional hazards model. First, let’s\n",
-    "look at the results for  `Center`.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "36f7fa13",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "X = MS(['Wait time',\n",
-    "        'Failed',\n",
-    "        'Center'],\n",
-    "        intercept=False).fit_transform(D)\n",
-    "F = coxph().fit(X, 'Wait time', 'Failed')\n",
-    "F.log_likelihood_ratio_test()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "68ab32c8",
-   "metadata": {},
-   "source": [
-    "Next, the results for `Time`.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "e03b31e6",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "X = MS(['Wait time',\n",
-    "        'Failed',\n",
-    "        'Time'],\n",
-    "       intercept=False).fit_transform(D)\n",
-    "F = coxph().fit(X, 'Wait time', 'Failed')\n",
-    "F.log_likelihood_ratio_test()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "fda76031",
-   "metadata": {},
-   "source": [
-    "We find that differences between centers are highly significant, as\n",
-    "are differences between times of day.\n",
-    "\n",
-    "Finally, we fit a complete Cox’s proportional hazards model to the data.\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "e5022c12",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "X = MS(D.columns,\n",
-    "       intercept=False).fit_transform(D)\n",
-    "fit_queuing = coxph().fit(\n",
-    "                  X,\n",
-    "                 'Wait time',\n",
-    "                 'Failed')\n",
-    "fit_queuing.summary[['coef', 'se(coef)', 'p']]"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "847ebaa7",
-   "metadata": {},
-   "source": [
-    "The $p$-values for Center B and evening time\n",
-    "are very small. It is also clear that the\n",
-    "hazard — that is, the instantaneous risk that a call will be\n",
-    "answered — increases with the number of operators. Since we\n",
-    "generated the data ourselves, we know that the true coefficients for\n",
-    "Center B, Center C,  Operators\n",
-    "evening time, and morning time   are $-0.3$,\n",
-    "$0$, $0.04$, and $0.2$, $-0.2$, respectively. The coefficient estimates\n",
-    "resulting from the Cox model are fairly accurate."
-   ]
-  }
- ],
- "metadata": {
-  "jupytext": {
-   "cell_metadata_filter": "all",
-   "formats": "ipynb,Rmd",
-   "notebook_metadata_filter": "all"
-  },
-  "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.6.2"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}