diff --git a/ipynb/Goldberg.ipynb b/ipynb/Goldberg.ipynb
index 328fde3..1d2e4b5 100644
--- a/ipynb/Goldberg.ipynb
+++ b/ipynb/Goldberg.ipynb
@@ -13,12 +13,12 @@
"\n",
"
\n",
"\n",
- "RNNs, LSTMs and Deep Learning are all the rage, and a recent [blog post](http://karpathy.github.io/2015/05/21/rnn-effectiveness/) by Andrej Karpathy is doing a great job explaining what these models are and how to train them.\n",
+ "[RNNs](https://en.wikipedia.org/wiki/Recurrent_neural_network), [LSTMs](https://en.wikipedia.org/wiki/Long_short-term_memory) and [Deep Learning](https://en.wikipedia.org/wiki/Deep_learning) are all the rage, and a recent [blog post](http://karpathy.github.io/2015/05/21/rnn-effectiveness/) by Andrej Karpathy is doing a great job explaining what these models are and how to train them.\n",
"It also provides some very impressive results of what they are capable of. This is a great post, and if you are interested in natural language, machine learning or neural networks you should definitely read it. \n",
"\n",
"Go [**read it now**](http://karpathy.github.io/2015/05/21/rnn-effectiveness/), then come back here. \n",
"\n",
- "You're back? good. Impressive stuff, huh? How could the network learn to imitate the input like that?\n",
+ "You're back? Good. Impressive stuff, huh? How could the network learn to imitate the input like that?\n",
"Indeed. I was quite impressed as well.\n",
"\n",
"However, it feels to me that most readers of the post are impressed by the wrong reasons.\n",
@@ -41,7 +41,7 @@
"\n",
"So, we are seeing *n* letters, and need to guess the *n+1*th one. We are also given a large-ish amount of text (say, all of Shakespear works) that we can use. How would we go about solving this task?\n",
"\n",
- "Mathematiacally, we would like to learn a function *P(c* | *h)*. Here, *c* is a character, *h* is a *n*-letters history, and *P(c* | *h)* stands for how likely is it to see *c* after we've seen *h*.\n",
+ "Mathematically, we would like to learn a function *P(c* | *h)*. Here, *c* is a character, *h* is an *n*-letter history, and *P(c* | *h)* stands for how likely it is to see *c* after we've seen *h*.\n",
"\n",
"Perhaps the simplest approach would be to just count and divide (a.k.a **maximum likelihood estimates**). We will count the number of times each letter *c* appeared after *h*, and divide by the total numbers of letters appearing after *h*. The **unsmoothed** part means that if we did not see a given letter following *h*, we will just give it a probability of zero.\n",
"\n",
@@ -54,7 +54,7 @@
"source": [
"## Training Code\n",
"\n",
- "Here is the code for training the model. `fname` is a file to read the characters from. `order` is the history size to consult. Note that we pad the data with `order` leading characters so that we also learn how to start.\n"
+ "Here is the code for training a language model, which implements *P(c* | *h)* by keeping, for each history, a counter of how many times each character has followed it. The function `train_char_lm` takes `fname`, the name of a file to read the characters from, and `order`, the history size to consult. Note that we pad the data with `order` leading characters so that we also learn how to start.\n"
]
},
{
@@ -64,32 +64,33 @@
"outputs": [],
"source": [
"import random\n",
- "from collections import Counter, defaultdict\n",
- "from typing import List, Tuple\n",
- "\n",
- "class LanguageModel(defaultdict):\n",
- "    \"\"\"A mapping from `order` history characters to a list of ('c', probability) pairs,\n",
- "    e.g., for order=4, {'spea': [('k', 0.99), ('r', 0.01)])}.\"\"\"\n",
- "    def __init__(self, order): self.order = order\n",
+ "import collections"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "class LanguageModel(collections.defaultdict):\n",
+ "    \"\"\"A mapping from `order` history characters to a Counter of {'c': count} pairs,\n",
+ "    e.g., for order=4, {'spea': Counter({'k': 9, 'r': 1})}.\"\"\"\n",
+ "    def __init__(self, order):\n",
+ "        self.order = order\n",
+ "        self.default_factory = collections.Counter\n",
"\n",
"def train_char_lm(fname, order=4) -> LanguageModel:\n",
"    \"\"\"Train an `order`-gram character-level language model on all the text in `fname`.\"\"\"\n",
"    lm = LanguageModel(order)\n",
- "    data = (PAD * order) + open(fname).read()\n",
- "    # First read data into Counters of characters; then normalize\n",
- "    lm.default_factory = Counter \n",
+ "    data = (order * PAD) + open(fname).read()\n",
"    for i in range(order, len(data)):\n",
"        history, char = data[i - order:i], data[i]\n",
"        lm[history][char] += 1\n",
- "    for history in lm:\n",
- "        lm[history] = normalize(lm[history])\n",
+ "    for counter in lm.values():\n",
+ "        counter.total = sum(counter.values()) # Cache total counts (for sample_character)\n",
"    return lm\n",
"\n",
- "def normalize(counter) -> List[Tuple[str, float]]:\n",
- "    \"\"\"Return (key, val) pairs, normalized so values sum to 1.0, largest first.\"\"\"\n",
- "    total = float(sum(counter.values()))\n",
- "    return [(k, v / total) for k, v in counter.most_common()]\n",
- "\n",
"PAD = '`' # Character to pad the beginning of a text"
]
},
@@ -102,7 +103,7 @@
},
{
"cell_type": "code",
- "execution_count": 2,
+ "execution_count": 3,
"metadata": {
"collapsed": false,
"jupyter": {
@@ -125,7 +126,7 @@
},
{
"cell_type": "code",
- "execution_count": 3,
+ "execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
@@ -139,41 +140,6 @@
"Ok. Now let's do some queries:"
]
},
- {
- "cell_type": "code",
- "execution_count": 4,
- "metadata": {
- "collapsed": false,
- "jupyter": {
- "outputs_hidden": false
- }
- },
- "outputs": [
- {
- "data": {
- "text/plain": [
- "[('w', 0.817717206132879),\n",
- " ('r', 0.059625212947189095),\n",
- " ('u', 0.03747870528109029),\n",
- " (',', 0.027257240204429302),\n",
- " (\"'\", 0.017035775127768313),\n",
- " (' ', 0.013628620102214651),\n",
- " ('.', 0.0068143100511073255),\n",
- " ('?', 0.0068143100511073255),\n",
- " ('!', 0.0068143100511073255),\n",
- " (':', 0.005110732538330494),\n",
- " ('n', 0.0017035775127768314)]"
- ]
- },
- "execution_count": 4,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "lm['ello']"
- ]
- },
{
"cell_type": "code",
"execution_count": 5,
@@ -187,7 +153,17 @@
{
"data": {
"text/plain": [
- "[('t', 1.0)]"
+ "Counter({'r': 35,\n",
+ " 'w': 480,\n",
+ " 'u': 22,\n",
+ " ',': 16,\n",
+ " ' ': 8,\n",
+ " '.': 4,\n",
+ " '?': 4,\n",
+ " ':': 3,\n",
+ " 'n': 1,\n",
+ " \"'\": 10,\n",
+ " '!': 4})"
]
},
"execution_count": 5,
@@ -196,7 +172,7 @@
}
],
"source": [
- "lm['Firs']"
+ "lm['ello']"
]
},
{
@@ -212,49 +188,7 @@
{
"data": {
"text/plain": [
- "[('S', 0.16292134831460675),\n",
- " ('L', 0.10674157303370786),\n",
- " ('C', 0.09550561797752809),\n",
- " ('G', 0.0898876404494382),\n",
- " ('M', 0.0593900481540931),\n",
- " ('t', 0.05377207062600321),\n",
- " ('W', 0.033707865168539325),\n",
- " ('s', 0.03290529695024077),\n",
- " ('o', 0.030497592295345103),\n",
- " ('b', 0.024879614767255216),\n",
- " ('w', 0.024077046548956663),\n",
- " ('a', 0.02247191011235955),\n",
- " ('m', 0.02247191011235955),\n",
- " ('n', 0.020064205457463884),\n",
- " ('h', 0.019261637239165328),\n",
- " ('O', 0.018459069020866775),\n",
- " ('i', 0.016853932584269662),\n",
- " ('d', 0.015248796147672551),\n",
- " ('P', 0.014446227929373997),\n",
- " ('c', 0.012841091492776886),\n",
- " ('F', 0.012038523274478331),\n",
- " ('f', 0.011235955056179775),\n",
- " ('g', 0.011235955056179775),\n",
- " ('l', 0.01043338683788122),\n",
- " ('I', 0.009630818619582664),\n",
- " ('B', 0.009630818619582664),\n",
- " ('p', 0.00882825040128411),\n",
- " ('K', 0.008025682182985553),\n",
- " ('r', 0.0072231139646869984),\n",
- " ('A', 0.0056179775280898875),\n",
- " ('H', 0.0040128410914927765),\n",
- " ('k', 0.0040128410914927765),\n",
- " ('e', 0.0032102728731942215),\n",
- " ('T', 0.0032102728731942215),\n",
- " ('D', 0.0032102728731942215),\n",
- " ('y', 0.002407704654895666),\n",
- " ('v', 0.002407704654895666),\n",
- " ('u', 0.0016051364365971107),\n",
- " ('q', 0.0016051364365971107),\n",
- " ('E', 0.0016051364365971107),\n",
- " ('R', 0.0008025682182985554),\n",
- " ('N', 0.0008025682182985554),\n",
- " (\"'\", 0.0008025682182985554)]"
+ "Counter({'t': 864})"
]
},
"execution_count": 6,
@@ -262,6 +196,73 @@
"output_type": "execute_result"
}
],
+ "source": [
+ "lm['Firs']"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "Counter({'C': 119,\n",
+ " 'f': 14,\n",
+ " 'i': 21,\n",
+ " 't': 67,\n",
+ " 'u': 2,\n",
+ " 'S': 203,\n",
+ " 'h': 24,\n",
+ " 's': 41,\n",
+ " 'R': 1,\n",
+ " 'b': 31,\n",
+ " 'c': 16,\n",
+ " 'O': 23,\n",
+ " 'w': 30,\n",
+ " 'a': 28,\n",
+ " 'm': 28,\n",
+ " 'n': 25,\n",
+ " 'I': 12,\n",
+ " 'L': 133,\n",
+ " 'M': 74,\n",
+ " 'l': 13,\n",
+ " 'o': 38,\n",
+ " 'H': 5,\n",
+ " 'd': 19,\n",
+ " 'W': 42,\n",
+ " 'K': 10,\n",
+ " 'q': 2,\n",
+ " 'G': 112,\n",
+ " 'g': 14,\n",
+ " 'k': 5,\n",
+ " 'e': 4,\n",
+ " 'y': 3,\n",
+ " 'r': 9,\n",
+ " 'p': 11,\n",
+ " 'A': 7,\n",
+ " 'P': 18,\n",
+ " 'F': 15,\n",
+ " 'v': 3,\n",
+ " 'T': 4,\n",
+ " 'D': 4,\n",
+ " 'B': 12,\n",
+ " 'N': 1,\n",
+ " \"'\": 1,\n",
+ " 'E': 2})"
+ ]
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
"source": [
"lm['rst ']"
]
@@ -279,29 +280,7 @@
"source": [
"## Generating from the model\n",
"\n",
- "Generating is also very simple. To generate a letter, we will take the history, look at the last *order* characters, and then sample a random letter based on the corresponding distribution:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 7,
- "metadata": {},
- "outputs": [],
- "source": [
- "def generate_character(lm, history) -> str:\n",
- "    \"\"\"Given a history of characters, sample a random next character from `lm`.\"\"\"\n",
- "    p = random.random()\n",
- "    for c, v in lm[history]:\n",
- "        if p <= v: \n",
- "            return c\n",
- "        p -= v"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "To generate a passage of *k* characters, we just seed it with the initial history and run letter generation in a loop, updating the history at each turn."
+ "To generate a random text from a model, we maintain a history of *order* characters, which starts with all pad characters. We then enter a loop that randomly samples a character from the history's counter, then updates the history by dropping its first character and adding the randomly-sampled character to the end. "
]
},
{
@@ -312,13 +291,36 @@
"source": [
"def generate_text(lm, length=1000) -> str:\n",
"    \"\"\"Sample a random `length`-long passage from `lm`.\"\"\"\n",
- "    history = PAD * lm.order\n",
- "    out = []\n",
- "    for i in range(length):\n",
- "        c = generate_character(lm, history)\n",
+ "    history = lm.order * PAD\n",
+ "    text = []\n",
+ "    for _ in range(length):\n",
+ "        c = sample_character(lm[history])\n",
"        history = history[1:] + c\n",
- "        out.append(c)\n",
- "    return ''.join(out)"
+ "        text.append(c)\n",
+ "    return ''.join(text)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "To sample a single character from a counter, randomly choose an integer *n* from 1 to the total count of characters in the counter, then iterate through (character, count) items until the cumulative total of the counts meets or exceeds *n*."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def sample_character(counter) -> str:\n",
+ "    \"\"\"Sample a character from `counter` with probability proportional to its count.\"\"\"\n",
+ "    n = random.randint(1, counter.total)\n",
+ "    cumulative = 0\n",
+ "    for ch in counter:\n",
+ "        cumulative += counter[ch]\n",
+ "        if cumulative >= n:\n",
+ "            return ch"
]
},
{
@@ -334,7 +336,7 @@
},
{
"cell_type": "code",
- "execution_count": 9,
+ "execution_count": 10,
"metadata": {
"collapsed": false,
"jupyter": {
@@ -346,43 +348,43 @@
"name": "stdout",
"output_type": "stream",
"text": [
- "Fiestis the so, ing'd? was hathy now by ollood re hichaved.\n",
+ "Fireve notenink by in and my hillan foll harseceare.\n",
"\n",
- "MANCE:\n",
- "So oul and carried bear wilese com now ifeck.\n",
+ "Secers faing tom th, dre met speat's ano weauld derve,\n",
+ "OPHOPANG Roy, so grive us thishat me hall\n",
+ "That liall hen a\n",
+ "goin souren;' ther hou hust peavio.\n",
+ "BET:\n",
+ "Forat th ofend now bainch sed cur hou mat fatied shour men of\n",
+ "And ne hinese so may; somminto scall for taking iftlee the been siome this and naby thure dont.\n",
+ "RO:\n",
+ "Ther to-ny no ithe of I me drit, car! wer talk army deris to not ity\n",
+ "shearmonerstlet he gove proultre dink agiver cand swe ithe a glus,\n",
+ "SIDUKENE:\n",
+ "Yor.\n",
"\n",
- "Vere ame\n",
- "A Cad prover, toods\n",
- "Whiseto wousave.\n",
+ "Marrows anters, stemusere, bourects smennown here me thave ap asleand my forter tel\n",
+ "So liket wit.\n",
"\n",
- "But to prath troureak do thque.\n",
+ "Runt?\n",
"\n",
- "SUS:\n",
- "Up to light sonste cat ing ing.\n",
+ "Nor me for pose\n",
+ "Def ing thempood I coame thoppestimp.\n",
+ "Tell ittle willy\n",
+ "As con\n",
+ "To thy soner\n",
+ "bless'd hasarcults what of thers nour\n",
+ "Why laillord!\n",
"\n",
- "EXASTHASTARVENRAY:\n",
- "In Clee rah, th sher gollooke herponefted? hin;\n",
- "Yout the cat this nothaved; tord belf.\n",
- "Gody hat threit es faw'd you wer:\n",
- "If ailt how at be it:\n",
- "A to miden lanign day himemalo; withat thy shat ans, th al at's th for tent! EDGAN:\n",
- "As livill your frows;\n",
- "Ay grour Romer, ant cou, dign,\n",
- "Behould a fieved theek\n",
- "I wood by ne untle, blove;\n",
- "O' womit hat I lass an.\n",
+ "MIL:\n",
+ "Forent itill down, ficke ord younsucks, lie dook:\n",
+ "Hear, a dill sty 't of togue; the I daus swast con now's anch lown re.\n",
+ "MENRY VIRANTO:\n",
+ "The ty, gonfectis me,\n",
+ "Tak Ridst\n",
+ "But all the of thothienday wilead him.\n",
"\n",
- "Tword nothee, plasou? To fer'd thisichou some me,\n",
- "Mill rol.\n",
- "\n",
- "Ha! I deast ind the many ouldou not: seep amend at's of to deat hent: whis dick.\n",
- "\n",
- "KINERMISATEUS Let a reempat ung.\n",
- "Mus he maltie. WAR Con ound our wilivem I;\n",
- "As rom de she a weat pey prews say arn Cithe ming paseas duch and such chat your you go my Duke ens;\n",
- "EDWARUTUS:\n",
- "I lontassues is se day Ladand dre lin wass re becy, ton\n",
- "Theas \n"
+ "By to yess war\n"
]
}
],
@@ -403,7 +405,7 @@
},
{
"cell_type": "code",
- "execution_count": 10,
+ "execution_count": 11,
"metadata": {
"collapsed": false,
"jupyter": {
@@ -415,41 +417,44 @@
"name": "stdout",
"output_type": "stream",
"text": [
- "First Outland? Look upon it is his better offence, which I at out:\n",
- "Sir John.\n",
+ "First Servantagem mayst can shall tends, the been of whether that senses of thou not her, sure that sure remain, thronet an expense.\n",
"\n",
- "KING PHILIP:\n",
- "Why yawn companiel?\n",
+ "Fool, her?\n",
+ "'Twas be told me?\n",
"\n",
- "CAMILLO:\n",
+ "SHALLOW:\n",
+ "De siege.\n",
"\n",
- "COSTARD:\n",
- "No, safe to my utter?\n",
+ "VALENTIO:\n",
+ "Never thy Montentend of rever my good quited, with laid best be? Tellingerous beseech were month.\n",
"\n",
- "APEMANTUS:\n",
- "Nor I worship, give thee: I'll draws eart sociable, noble greaten this lease are could hide use meet, it is trued Brutus me against would burn some to him the grace no more thy play.\n",
+ "KING CLAUDIUS:\n",
+ "He is faces.\n",
"\n",
- "VIOLA:\n",
- "You affection,\n",
- "Therein men\n",
- "such on from that I my grind homel, call, were beseech you evenger.\n",
+ "CLAUDIUS:\n",
+ "In sure, and I hit foolish\n",
+ "As well teach yours, Here,\n",
+ "How does into thing more upon moves mother funerald it my very thank\n",
+ "That here often'd me.\n",
+ "Her hath broth\n",
+ "What wingeneral rob: if them here is jestion the baby you and thee.\n",
"\n",
- "PISTOL:\n",
- "Let us should blow\n",
- "Will be to remembrancis!\n",
+ "HELENA:\n",
+ "O, their is,\n",
+ "And fig orders of fixed in these edges I would up, you this, for the good Service wilt seems to Caesar sing, smoking; and die, you may come wish these bed.\n",
"\n",
- "MACBETH:\n",
+ "PETRUCHIO:\n",
+ "Falstaff ore to fast would I have show naked not what let us wrathe, and promio, thouse: and Mortime:\n",
+ "Both about of his that's hence how dukes inder you.\n",
"\n",
- "MARGARET:\n",
- "The to unworthwith treath,\n",
- "Drop of tune's my bloody. What will me thy Fame is attaining her bed, and sir; he us all give some young Rome,\n",
- "Hostess Shortly come any rascal, o'er them off this Englands this be an assius, aris. Pray you saw them she day opinion for at is my he dange the fight, sir, her, and so long bound thee.\n",
+ "TRANIO:\n",
+ "At Melun.\n",
"\n",
- "DESDEMONA:\n",
- "Can I have foot the violets! where's amore you these\n",
- "him in hold fairs speak no pays and father?\n",
- "'Tis burn after the air dought amissive,\n",
- "\n"
+ "Fourtier and take my some frust and my known\n",
+ "Ever her too.\n",
+ "\n",
+ "TALBOT:\n",
+ "La man h\n"
]
}
],
@@ -471,7 +476,7 @@
},
{
"cell_type": "code",
- "execution_count": 11,
+ "execution_count": 12,
"metadata": {
"collapsed": false,
"jupyter": {
@@ -484,39 +489,42 @@
"output_type": "stream",
"text": [
"First Citizen:\n",
- "Ay, but I therefore let him from shore, to Mercury, set: then, some comfort.\n",
+ "Soft! take my castle call a new-devised impeach young one,\n",
+ "Were not\n",
+ "worth ten time to speak out,\n",
+ "Which Salique land:\n",
+ "Your mark his charge and sack great earnest of his practise, Gloucester, thou mean, he's hunted with this here stands,\n",
+ "'Tis one with such a question: his disturbed spirit, set a-work;\n",
+ "And so should I have in the liveries, all encertain money.\n",
"\n",
- "COUNTESS:\n",
- "Tell her, indeed, too, good show\n",
- "Can heavens, how I came hither art the next.\n",
+ "PERICLES:\n",
+ "How can these our favours\n",
+ "Have sat too curious, three thou wert as wise and right royal 'twas mine;\n",
+ "It is some taste of hers,\n",
+ "Hath seen such a place the event.\n",
+ "'Tis politic; he crosses to it, boy.\n",
"\n",
- "BARDOLPH:\n",
- "By these men.\n",
- "But direction: and, for secrecy:\n",
- "We shall I be,\n",
- "That so with your uncle to couples with her beauty is bound as thought! an 'twere the occasion this to-morrow night;\n",
- "Unless your mouths, if the realm.\n",
- "I never enter'd Pucelle jointure.\n",
+ "QUEEN MARGARET:\n",
+ "Give him,\n",
+ "for my battle's lost:\n",
+ "That well;\n",
+ "But bear my life.\n",
+ "I beg thy particular,--you had known well I shaked,\n",
+ "Which I feel.\n",
+ "I fight a question. Hubert, keep aloof at bay:\n",
+ "Sell ever soft hours,\n",
+ "Unless than thine own gladness he passage 'tis!--whose eyes can volley.\n",
"\n",
- "SIMPCOX:\n",
- "God keep from whom I, indeed too, 'mong other both at our fair pillow for a\n",
- "man against the prince, there!\n",
+ "HAMLET:\n",
+ "How angerly.\n",
"\n",
- "MARIA:\n",
- "You shall deliver us from their own grace, pardon you: yet Count Comfect; an the year groan at it.\n",
+ "PETRUCHIO:\n",
+ "Why, then it is\n",
+ "There were in our loves you,\n",
+ "A sea of glory, Gallia wars.\n",
"\n",
- "SIR ANDREW:\n",
- "'Slight, than you can be!\n",
- "Through thou scurvy railing, may surrender; so must be, love, kill myself, thou wert born i' the people,\n",
- "You are\n",
- "going to my father's show thyself?\n",
- "\n",
- "MONTAGUE:\n",
- "O, when nobles should you in.\n",
- "\n",
- "TRINCULO:\n",
- "Excellent this sleep,\n",
- "And say 'God save you to-morrow morning, but what you fear? myself in such \n"
+ "BASTARD:\n",
+ "Though he be n\n"
]
}
],
@@ -534,7 +542,7 @@
},
{
"cell_type": "code",
- "execution_count": 12,
+ "execution_count": 13,
"metadata": {
"collapsed": false,
"jupyter": {
@@ -547,39 +555,43 @@
"output_type": "stream",
"text": [
"First Citizen:\n",
- "O royal Caesar!\n",
+ "Doth this news; yet what I have been a breakfast, washes his\n",
+ "hands, and save yourself!\n",
+ "My master, my dear Moth?\n",
"\n",
- "First Tribune:\n",
- "We will make amends now: get you gone.\n",
+ "MOTH:\n",
+ "No, nor a man cannot make\n",
+ "him eat it that spake\n",
+ "that word?\n",
"\n",
- "MARTIUS:\n",
- "What else, fellow? Pray you, let me go.\n",
- "But what, is he arrested? Tell me at whose burthen\n",
- "The anger'd any heart alive\n",
- "To hear meekly, sir, they shall know it, would much better used\n",
- "On Navarre and his brethren come?\n",
+ "QUINTUS:\n",
+ "Not so, an't please your highness, give us any thing for my labour by his own attaint?\n",
+ "'Tis doubt he will utter?\n",
"\n",
- "BUCKINGHAM:\n",
- "Give me any gage of this seeming.\n",
+ "BRUTUS:\n",
+ "Hence! I will follow it! Come, and get to Naples? Keep in Tunis,\n",
+ "And Ferdinand, her brothers both from death,\n",
+ "But lusty, young, and of antiquity too; bawd-born.\n",
+ "Farewell, sweet Jack\n",
+ "Falstaff, where hath been prophesied France and Ireland\n",
+ "Bear that play;\n",
+ "For never yet a breaker of\n",
+ "proverbs: he will answer it. Some pigeons, Davy, a couple of Ford's\n",
+ "knaves, his hinds, bars me the poor\n",
+ "duke's office should do the duke of this place and great\n",
+ "ones I dare not: Sir Pierce of Exton, who\n",
+ "lately came from valiant Oxford?\n",
+ "How far hence\n",
+ "In mine own away;\n",
+ "But you gave in charge,\n",
+ "Is now dishonour me.\n",
"\n",
- "HORATIO:\n",
- "Ay, my good lord: our time too brief: I will thither: gracious offers from the humble-bees,\n",
- "And let another general, thou shouldst not bear my standard of the wheat must needs\n",
- "Appear unkinglike.\n",
+ "SUFFOLK:\n",
+ "Who waits there?\n",
"\n",
- "CAIUS LUCIUS:\n",
- "I have, my lord,\n",
- "I should not; for he this\n",
- "very day receive it friendly; but from this time.\n",
- "\n",
- "VALENTINE:\n",
- "How use doth breed a habit in a man!\n",
- "This shadow\n",
- "Doth limp behind that doth warrant.\n",
- "Hark, how our steeds for present business, nor my power\n",
- "To o'erthrown Antony,\n",
- "And very weak and melancholy upon your stubborn ancient skill to fear and cold hand of death hath snatch'd that it us befitted\n",
- "To bear themselves made, \n"
+ "RODERIGO:\n",
+ "I know not; except, in that as ever\n",
+ "knapped\n"
]
}
],
@@ -594,7 +606,37 @@
"source": [
"## This works pretty well\n",
"\n",
- "With an order of 4, we already get quite reasonable results. Increasing the order to 7 (about a word and a half of history) or 10 (about two short words of history) already gets us quite passable Shakepearan text. I'd say it is on par with the examples in Andrej's post. And how simple and un-mystical the model is!"
+ "With an order of 4, we already get quite reasonable results. Increasing the order to 7 (about a word and a half of history) or 10 (about two short words of history) already gets us quite passable Shakespearean text. I'd say it is on par with the examples in Andrej's post. And how simple and un-mystical the model is!\n",
+ "\n",
+ "## Aside: First words\n",
+ "\n",
+ "One thing you may have noticed: all the generated passages start with \"Fi\". Why is that? Because the training data starts with the word \"First\" (preceded by padding), and so when we go to randomly `generate_text`, the only thing that follows the initial padding is the word \"First\". We could get more variety in the generated text by breaking the training text up into sections and inserting padding at the start of each section. But that would require some knowledge of the structure of the training text; right now the only assumption is that it is a sequence of characters."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "First Citizen:\n",
+ "Before we proceed any further, hear me speak.\n",
+ "\n",
+ "All:\n",
+ "Speak, speak.\n",
+ "\n",
+ "First Citizen:\n",
+ "You are all resolved rather to die than to famish?\n",
+ "\n",
+ "All:\n"
+ ]
+ }
+ ],
+ "source": [
+ "! head shakespeare_input.txt"
]
},
{
@@ -616,7 +658,7 @@
},
{
"cell_type": "code",
- "execution_count": 13,
+ "execution_count": 15,
"metadata": {
"collapsed": false,
"jupyter": {
@@ -637,132 +679,6 @@
"! wc linux_input.txt"
]
},
- {
- "cell_type": "code",
- "execution_count": 14,
- "metadata": {
- "collapsed": false,
- "jupyter": {
- "outputs_hidden": false
- }
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "/*\n",
- " * linux/kernel/printk.c\n",
- " *\n",
- " * Copyright (C) 1992, 1998-2004 Linus Torvalds, Ingo Molnar \n",
- " * Copyright (c) 2009 Rafael J. Wysocki , Novell Inc.\n",
- " *\n",
- " * This overall must be zero */\n",
- "\ttxc->ppsfreq, &utp->freq) ||\n",
- "\t\t\t__get_user(handler, &act->sa_handler;\n",
- "\t\tnext_event.tv64 != KTIME_MAX;\n",
- "\tnext_event = parent_freezer(freezer))) {\n",
- "\t\tif (!access_ok(VERIFY_READ, u_event, sizeof(*src);\n",
- "\n",
- "\t/* Convert (if necessary to check that the target CPU.\n",
- " */\n",
- "void gcov_info *info)\n",
- "{\n",
- "\treturn (copied == sizeof(debug_alloc_header *)\n",
- "\t\t\t\t(debug_alloc_header {\n",
- "\tchar reserved fields\\n\");\n",
- "\t\treturn;\n",
- "\n",
- "\tperf_output_begin(&handle, &snapshot_data *data)\n",
- "{\n",
- "\tstruct mcs_spinlock */\n",
- "void gcov_info *get_accumulated_info(node, info);\n",
- "\tif (i > len)\n",
- "\t\tcnt = TRACE_SIGNAL_DELIVERED;\n",
- "out:\n",
- "\ttracing_stop_tr(tr);\n",
- "\n",
- "\t__trace_function_single(int cpu);\n",
- "\n",
- "extern void kdb_dumpregs(regs);\n",
- "\t\tdbg_activate_work(work);\n",
- "\n",
- "\t/*\n",
- "\t * Can't set/change the\"\n",
- "\t\t\t\t\t \"1\", enable);\n",
- "extern struct ftrace_graph_entry_leaf(struc\n"
- ]
- }
- ],
- "source": [
- "lm = train_char_lm(\"linux_input.txt\", order=10)\n",
- "print(generate_text(lm))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 15,
- "metadata": {
- "collapsed": false,
- "jupyter": {
- "outputs_hidden": false
- }
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "/*\n",
- " * linux/kernel/time/tick-broadcast-hrtimer.c\n",
- " * This file emulates a local clock event device cannot go away as\n",
- "\t * long as we hold\n",
- "\t * lock->wait_lock held.\n",
- " */\n",
- "void __ptrace_unlink - unlink/remove profiling data set with an existing node. Needs to be called with lock->wait_lock);\n",
- "\t *\t\t\t\t\tacquire(lock);\n",
- "\t * or:\n",
- "\t *\n",
- "\t * unlock(wait_lock);\n",
- "\t *\t\t\t\t\tacquire(lock);\n",
- "\t */\n",
- "\treturn rc;\n",
- "}\n",
- "\n",
- "static int get_clock_desc(id, &cd);\n",
- "\tif (err)\n",
- "\t\treturn err;\n",
- "\n",
- "\tif (cd.clk->ops.clock_adjtime(clockid_t id, struct timex __user *) &txc);\n",
- "\tset_fs(oldfs);\n",
- "\tif (!err && compat_put_timespec(&out, rmtp))\n",
- "\t\treturn -EFAULT;\n",
- "\t}\n",
- "\tforce_successful_syscall_return();\n",
- "\treturn compat_jiffies_to_clock_t);\n",
- "\n",
- "u64 nsec_to_clock_t(tsk->delays->blkio_start = ktime_get();\n",
- "\tif (!ret) {\n",
- "\t\tprintk(KERN_CONT \".. corrupted trace buffer .. \");\n",
- "\treturn -1;\n",
- "}\n",
- "\n",
- "/* Will lock the rq it finds */\n",
- "static struct cgroup_subsys *ss;\n",
- "\tchar *tok;\n",
- "\tint ssid, ret;\n",
- "\n",
- "\t/* Do not accept '\\n' to prevent making /proc//cgroup.\n",
- " */\n",
- "int zap_other_thread\n"
- ]
- }
- ],
- "source": [
- "lm = train_char_lm(\"linux_input.txt\", order=15)\n",
- "print(generate_text(lm))"
- ]
- },
{
"cell_type": "code",
"execution_count": 16,
@@ -778,52 +694,49 @@
"output_type": "stream",
"text": [
"/*\n",
- " * linux/kernel/irq/handle.c\n",
- " *\n",
- " * Copyright (C) 2003-2004 Amit S. Kale \n",
- " * Copyright (C) 2008 Steven Rostedt \n",
- " *\n",
- " */\n",
- "#include \n",
- "#include \n",
- "#include \n",
- "#include \n",
- "#include \n",
+ " * linux/kernel_stat.h>\n",
"\n",
- "#include \n",
- "#include \n",
- "#include \n",
- "#include \"module-internal.h\"\n",
- "\n",
- "struct key *system_trusted_keyring, 1),\n",
- "\t\t\t\t\t \"asymmetric\",\n",
- "\t\t\t\t\t NULL,\n",
- "\t\t\t\t\t p,\n",
- "\t\t\t\t\t plen,\n",
- "\t\t\t\t\t ((KEY_POS_ALL & ~KEY_POS_SETATTR) |\n",
- "\t\t\t KEY_USR_VIEW | KEY_USR_READ),\n",
- "\t\t\t\t\t KEY_ALLOC_NOT_IN_QUOTA |\n",
- "\t\t\t\t\t KEY_ALLOC_TRUSTED);\n",
- "\t\tif (IS_ERR(key)) {\n",
- "\t\tswitch (PTR_ERR(key)) {\n",
- "\t\t\t/* Hide some search errors */\n",
- "\t\tcase -EACCES:\n",
- "\t\tcase -ENOTDIR:\n",
- "\t\tcase -EAGAIN:\n",
- "\t\t\treturn ERR_PTR(-EACCES);\n",
+ "int rcu_cpu_notify(CPU_CLUSTER_PM_EXIT, -1, NULL);\n",
+ "}\n",
"\n",
+ "bool __weak is_swbp_insn\n",
+ " * Returns:\n",
+ " *\tZero for success, 0 (invalid alignment with below mechanisms in the kprobe gone and remove\n",
+ " * @action: action to SIG_DFL for a signal frame,\n",
+ "\t and will be woken\n",
+ " * up from TASK_TRACED);\n",
+ "\tif (mod->state == MODULE_STATE_GOING:\n",
+ "\t\tstate = possible;\n",
+ "\t\t\tbreak;\n",
+ "\t\t\trest++;\n",
+ "\t\t}\n",
"\t\t/*\n",
- "\t\t * We could be clever and allow to attach a event to an\n",
- "\t\t * offline CPU and activate it when the CPU comes up, but\n",
- "\t\t * that's for later.\n",
- "\t\t */\n",
- "\t\tif (!cpu_online(cpu))\n",
- "\t\tc\n"
+ "\t\t * Another cpu said 'go' */\n",
+ "\t\t/* Still using kdb, this problematic.\n",
+ " */\n",
+ "void init_sched_dl_class(void)\n",
+ "{\n",
+ "\tfield_cachep = KMEM_CACHE(vm_area_struct *vma, struct autogroup *autogroup_kref_put(ag);\n",
+ "\n",
+ "\treturn 0;\n",
+ "}\n",
+ "\n",
+ "void do_raw_write_can_lock(l)\twrite_can_lock(l)\n",
+ "\n",
+ "/*\n",
+ " * Software Foundation, and allows traversing */\n",
+ "static int perf_output_sample_rate __read_mostly futex_hash_bucket(futex));\n",
+ " * smp_mb() anyway for documentation of its parent\n",
+ "\t * using getname.\n",
+ " *\n",
+ " * default: 5 msec, units: microseconds, all such timers will fire\n",
+ " * at the end of this function implementation\n",
+ " * @num: number of tot\n"
]
}
],
"source": [
- "lm = train_char_lm(\"linux_input.txt\", order=20)\n",
+ "lm = train_char_lm(\"linux_input.txt\", order=10)\n",
"print(generate_text(lm))"
]
},
@@ -842,44 +755,54 @@
"output_type": "stream",
"text": [
"/*\n",
- " * linux/kernel/itimer.c\n",
+ " * linux/kernel/irq/handle.c\n",
" *\n",
- " * Copyright (C) 2010\t\tSUSE Linux Products GmbH\n",
- " * Copyright (C) 2002 2003 by MontaVista Software.\n",
- " *\n",
- " * 2004-06-01 Fix CLOCK_REALTIME clock/timer TIMER_ABSTIME bug.\n",
- " *\t\t\t Copyright (C) 2004-2006 Ingo Molnar\n",
- " * Copyright (C) 2004 Nadia Yvette Chambers, IBM\n",
- " * (C) 2004 Nadia Yvette Chambers\n",
- " */\n",
- "#include \n",
- "#include \n",
- "#include \n",
- "#include \n",
- "#include \n",
- "#include \n",
- "#include \n",
+ " * Copyright (C) 2012 Dario Faggioli ,\n",
+ " * | |\\n\");\n",
+ "}\n",
"\n",
- "void timecounter_init(struct timecounter *tc)\n",
+ "static int __read_mostly =\n",
"{\n",
- "\tcycle_t cycle_now, cycle_delta;\n",
+ "\t.name\t\t= \"irqsoff\",\n",
+ "\t.init\t\t= preemptoff_tracer __read_mostly sysctl_hung_task_timeout_secs(struct ctl_table *table,\n",
+ "\t\t int write, void *data),\n",
+ "\t\t struct lock_list *unsafe_entry,\n",
+ "\t\t\tstruct lock_class_key irq_desc_lock_class;\n",
"\n",
- "\tsleeptime_injected = true;\n",
- "\t} else if (timespec64_compare(&ts_new, &timekeeping_suspend_time) > 0) {\n",
- "\t\tts_delta = timespec64_sub(tk_xtime(tk), timekeeping_suspend_time, delta_delta);\n",
- "\t\t}\n",
+ "#if defined(CONFIG_RCU_TRACE */\n",
+ "\n",
+ "static __init int user_namespace *ns = task_active_pid_ns(parent));\n",
+ "\tinfo.si_uid = from_kuid_munged(current_user_ns(), task_uid(p));\n",
+ "\tstruct siginfo info;\n",
+ "\n",
+ "\tif (argc != 1)\n",
+ "\t\treturn -EINVAL;\n",
"\t}\n",
+ "#endif\n",
"\n",
- "\ttimekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET);\n",
+ "#ifdef CONFIG_SMP\n",
"\n",
- "\tif (action & TK_MIRROR)\n",
- "\t\tmemcpy(&shadow_timekeeper, &tk_core.timekeeper;\n",
- "\tstruct clocksource *clock = tk->tkr_mono.clock;\n",
- "\ttk->tkr_mono.read = \n"
+ "static void\n",
+ "irq_thread_check_affinity(struct rcu_node *rnp)\n",
+ "{\n",
+ "\treturn rnp->gp_tasks != NULL;\n",
+ "}\n",
+ "\n",
+ "/*\n",
+ " * return non-zero if there is a SIGKILL that should be deferred, ETT_NONE if nothing to defer.\n",
+ " */\n",
+ "enum event_trigger_free() (see\n",
+ " *\ttrace_event_trigger_enable_disable(file, 0);\n",
+ "\t\t\tbreak;\n",
+ "\t\t}\n",
+ "\n",
+ "\t\t/*\n",
+ "\t\t * Wait for all preempt-disabled section so we can as wel\n"
]
}
],
"source": [
+ "lm = train_char_lm(\"linux_input.txt\", order=15)\n",
"print(generate_text(lm))"
]
},
@@ -898,198 +821,273 @@
"output_type": "stream",
"text": [
"/*\n",
- " * linux/kernel/irq/handle.c\n",
+ " * linux/kernel/itimer.c\n",
" *\n",
- " * Copyright (C) 2000-2001 VERITAS Software Corporation.\n",
- " * Copyright (C) 2011 Peter Zijlstra \n",
- " * Copyright (C) 2004-2006 Tom Rini \n",
- " * Copyright (C) 2005-2006, Thomas Gleixner, Russell King\n",
+ " * Copyright (C) 2008 Red Hat, Inc., Peter Zijlstra \n",
+ " * Copyright (C) 2004, 2005, 2006 Red Hat, Inc., Ingo Molnar \n",
+ " * Copyright (C) 2004 IBM Corporation\n",
" *\n",
- " * This file contains the /proc/irq/ handling code.\n",
+ " * Author: Serge Hallyn \n",
+ " *\n",
+ " * This program is distributed in the hope that it will be useful,\n",
+ " * but WITHOUT ANY WARRANTY; without even the implied warranty of\n",
+ " * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n",
+ " * GNU General Public License as\n",
+ " * published by the Free Software Foundation; version 2\n",
+ " * of the License, or (at\n",
+ " * your option) any later version.\n",
+ " *\n",
+ " * This program is free software; you can redistribute it and/or\n",
+ " * modify it under the terms of the GNU GPL, version 2\n",
+ " *\n",
+ " * This file implements counting semaphores.\n",
+ " * A counting semaphore may be acquired 'n' times before sleeping.\n",
+ " * See mutex.c for single-acquisition sleeping locks which enforce\n",
+ " * rules which allow code to be debugged more easily.\n",
" */\n",
"\n",
- "#include \n",
- "#include \n",
- "#include \n",
- "#include \n",
- "#include \n",
- "#include \n",
- "#include \n",
- "#include \n",
- "#include \n",
- "#include \n",
- "\n",
- "#include \n",
- "\n",
"/*\n",
- " * Per cpu nohz control structure\n",
+ " * Some \n"
+ ]
+ }
+ ],
+ "source": [
+ "lm = train_char_lm(\"linux_input.txt\", order=20)\n",
+ "print(generate_text(lm))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "/*\n",
+ " * linux/kernel/irq/manage.c\n",
+ " *\n",
+ " * Copyright (C) 2008-2014 Mathieu Desnoyers\n",
+ " *\n",
+ " * This program is distributed in the hope that it will be useful, but\n",
+ " * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or\n",
+ " * FITNESS FOR A PARTICULAR PURPOSE. See the\n",
+ " * GNU General Public Licence\n",
+ " * as published by the Free Software Foundation, version 2 of the\n",
+ " * License.\n",
+ " *\n",
+ " * Jun 2006 - namespaces support\n",
+ " * OpenVZ, SWsoft Inc.\n",
+ " * (C) 2007 Sukadev Bhattiprolu , IBM\n",
+ " * Many thanks to Oleg Nesterov for comments and help\n",
+ " *\n",
" */\n",
- "static DEFINE_PER_CPU(struct task_struct *p;\n",
+ "\n",
+ "#include \n",
+ "#include \n",
+ "#include \n",
+ "#include \n",
+ "#include \n",
+ "#include \n",
+ "#include \t/* for cond_resched */\n",
+ "#include \n",
+ "#include \n",
+ "#include \n",
+ "#include \n",
+ "#include \n",
+ "#include \n",
+ "#include \n",
+ "#includ\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(generate_text(lm))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 20,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "/*\n",
+ " * linux/kernel/irq/manage.c\n",
+ " *\n",
+ " * Copyright (C) 2010 Red Hat, Inc.\n",
+ " *\n",
+ " * Note: Most of this code is borrowed heavily from the original softlockup\n",
+ " * detector, so thanks to Ingo for the initial implementation (includes suggestions from\n",
+ " *\t\tRusty Russell).\n",
+ " * 2004-Aug\tUpdated by Prasanna S Panchamukhi Changed Kprobes\n",
+ " *\t\texceptions notifier as suggested by Andi Kleen.\n",
+ " * 2004-July\tSuparna Bhattacharya added jumper probes\n",
+ " *\t\tinterface to access function arguments.\n",
+ " * 2004-Sep\tPrasanna S Panchamukhi\n",
+ " *\t\t with\n",
+ " *\t\thlists and exceptions notifier to be first on the priority list.\n",
+ " * 2005-May\tHien Nguyen , Jim Keniston\n",
+ " *\t\t and Prasanna S Panchamukhi Changed Kprobes\n",
+ " *\t\texceptions notifier as suggested by Andi Kleen.\n",
+ " * 2004-July\tSuparna Bhattacharya added function-return probe instances associated with the event\n",
+ " *\n",
+ " * If an event has triggers and any of those triggers has a filter or\n",
+ " * a post_trigger, trigger invocation is split\n",
+ " *\tin two - the first part checks the filter using the current\n",
+ " *\ttrace record; if a command has the @post_trigger flag set, it\n",
+ " *\tsets a bit for itself in the return value, otherwise it\n",
+ " *\tdirectly invokes the trigger. Once all commands have been\n",
+ " *\teither invoked or set their return flag, the current record function, skip it.\n",
+ " * If the ops ignores the function via notrace filter, skip it.\n",
+ " */\n",
+ "static inline void\n",
+ "__sched_info_switch(struct rq *rq, struct task_struct *p;\n",
"\tint retval;\n",
"\n",
- "\trcu_read_lock();\n",
- "\tfor_each_domain(cpu, sd)\n",
- "\t\tdomain_num++;\n",
- "\tentry = table = sd_alloc_ctl_entry(domain_num + 1);\n",
- "\tif (table == NULL)\n",
- "\t\treturn NULL;\n",
- "\n",
- "\t/*\n",
- "\t * We repeat when a time extend is encountered or we hit\n",
- "\t * the end of the page to save the\n",
- "\t\t * missed events, then record it there.\n",
- "\t\t */\n",
- "\t\tif (BUF_PAGE_SIZE - commit);\n",
- "\n",
- " out_unlock:\n",
- "\traw_spin_unlock_irqrestore(&nh->lock, flags);\n",
- "\treturn ret;\n",
- "}\n",
- "EXPORT_SYMBOL_GPL(tracepoint_probe_unregister(call->tp,\n",
- "\t\t\t\t\t\t call->class->perf_probe,\n",
- "\t\t\t\t\t call);\n",
- "\t\treturn 0;\n",
- "\tcase '$':\n",
- "\t\tstrcpy(remcom_in_buffer, cmd);\n",
- "\t\treturn 0;\n",
- "\tcase TRACE_REG_PERF_REGISTER:\n",
- "\t\treturn reg_event_syscall_enter(file, event);\n",
- "\t\treturn;\n",
- "\t}\n",
- "\n",
- "retry:\n",
- "\tif (!task_function_call(task, __perf_cgroup_move, task);\n",
- "}\n",
- "\n",
- "static void perf_cgroup_exit(struct cgroup_subsys_state *pos)\n",
- "{\n",
- "\tstruct cgroup_subsys_state *parent_css)\n",
- "{\n",
- "\tstruct freezer *freezer;\n",
- "\n",
- "\tfreezer = kzalloc(sizeof(*data), gfpflags);\n",
- "\tif (!data->cpu_data)\n",
- "\t\tgoto out_err_free;\n",
- "\n",
- "\tfor_each_possible_cpu(cpu)\n",
- "\t\tset_bit(0, &per_cpu(tick_cpu_sched, cpu);\n",
- "\n",
- "# ifdef CONFIG_HIGH_RES_TIMERS\n",
- "\n",
- "/*\n",
- " * High resolution timer enabled ?\n",
- " */\n",
- "static int tick_nohz_init_all(void)\n",
- "{\n",
- "\tint err = device_register(&tick_bc_dev);\n",
- "\n",
- "\tif (!err)\n",
- "\t\terr = register_module_notifier(&ftrace_module_exit_nb);\n",
- "\tif (ret)\n",
- "\t\tpr_warning(\"Failed to register tracepoint module enter notifier\\n\");\n",
- "\n",
- "\treturn ret;\n",
- "}\n",
- "\n",
- "/*\n",
- " * Avoid consuming memory with our now useless rbtree.\n",
- " */\n",
- "static int enqueue_hrtimer(struct hrtimer *timer) { }\n",
- "static inline void clocksource_dequeue_watchdog(struct clocksource *cs) { }\n",
- "static inline void sched_rt_rq_dequeue(rt_rq);\n",
- "\t\t\treturn 1;\n",
- "\t\t}\n",
- "\t}\n",
- "\n",
- "\thlock = curr->held_locks + i;\n",
+ "\tif (!desc || !irqd_can_balance(&desc->irq_data)) {\n",
"\t\t/*\n",
- "\t\t * We must not cross into another context:\n",
+ "\t\t * Already running: If it is shared get the other\n",
+ "\t\t * CPU to go looking for our mystery interrupt too\n",
"\t\t */\n",
- "\t\tif (move_group) {\n",
- "\t\t/*\n",
- "\t\t * Wait for all pre-existing t->on_rq and t->nvcsw\n",
- "\t\t * transitions to complete. Invoking synchronize_sched_expedited();\n",
+ "\t\tdesc->istate |= IRQS_SUSPENDED;\n",
+ "\t__disable_irq(desc, irq);\n",
+ "out:\n",
+ "\tirq_put_desc_busunlock(desc, flags);\n",
+ "\treturn err;\n",
"}\n",
- "EXPORT_SYMBOL_GPL(srcu_init_notifier_head(struct srcu_notifier_head *nh,\n",
- "\t\t\t\t unsigned long flags)\n",
- "{\n",
- "\t__irq_put_desc_unlock(desc, flags);\n",
- "\treturn ret;\n",
- "}\n",
- "\n",
- "#define DEFINE_OUTPUT_COPY(func_name, memcpy_func)\t\t\t\\\n",
- "static inline unsigned int count_highmem_pages(void) { return 0; }\n",
- "static inline int trace_branch_enable(struct trace_uprobe, consumer);\n",
- "\n",
- "\tudd.tu = tu;\n",
- "\tudd.bp_addr = instruction_pointer(regs);\n",
- "\t\tdata = DATAOF_TRACE_ENTRY(entry, false);\n",
- "\t}\n",
- "\n",
- "\tmemcpy(data, ucb->buf, tu->tp.size + dsize) {\n",
- "\t\tint len = tu->tp.size + dsize;\n",
- "\tsize = ALIGN(size + sizeof(u32), sizeof(u64));\n",
- "\tsize -= sizeof(u32);\n",
- "\n",
- "\trec = (struct syscall_trace_enter {\n",
- "\tstruct trace_entry\tent;\n",
- "\tunsigned long\t\t\tip;\n",
- "\tunsigned long\t\t\tcache_read;\n",
- "\tu64\t\t\t\tread_stamp;\n",
- "\t/* ring buffer pages to update, > 0 to add, < 0 to remove */\n",
- "\tint\t\t\t\tnr_pages_to_update;\n",
- "\tstruct list_head list;\n",
- "};\n",
- "\n",
- "struct postfix_elt {\n",
- "\tint op;\n",
- "\tchar *operand;\n",
- "\tstruct list_head *vec;\n",
- "\n",
- "\tif (idx < TVR_SIZE) {\n",
- "\t\tint i = expires & TVR_MASK;\n",
- "\t\tvec = base->tv4.vec + i;\n",
- "\t} else if (idx < 1 << (TVR_BITS + 2 * TVN_BITS)) {\n",
- "\t\tint i = (expires >> (TVR_BITS + 2 * TVN_BITS)) {\n",
- "\t\tint i = (expires >> (TVR_BITS + 2 * TVN_BITS)) {\n",
- "\t\tint i = (expires >> (TVR_BITS + 3 * TVN_BITS)) & TVN_MASK;\n",
- "\t\tvec = base->tv4.vec + i;\n",
- "\t} else if (idx < 1 << (TVR_BITS + TVN_BITS)) & TVN_MASK)\n",
- "\n",
- "/**\n",
- " * __run_timers - run all expired timers (if any) on this CPU.\n",
- " * @base: the timer vector to be processed.\n",
+ "/*\n",
+ " * RT-Mutexes: blocking mutual exclusion locks with PI support\n",
" *\n",
- " * This function is exported for use by the signal deliver code. It is\n",
- " * called just prior to the info block being released and passes that\n",
- " * block to us. It's function is to update the overrun entry AND to\n",
- " * restart the timer. It should only be called by rtc_resume(), and allows\n",
- " * a suspend offset to be injected into the timekeeping values.\n",
+ " * started by Ingo Molnar and Thomas Gleixner:\n",
+ " *\n",
+ " * Copyright (C) 1991, 1992 Linus Torvalds\n",
+ " *\n",
+ " * Modified to make sys_syslog() more flexible: added commands to\n",
+ " * return the last 4k of kernel messages, regardless of whether\n",
+ " * they've been read or not. Added option to suppress kernel printk's\n",
+ " * to the console. Added hook for sending the console messages\n",
+ " * elsewhere, in preparation for detecting the next grace period.\n",
+ "\t * But we can only be sure that RCU is idle if we are looking\n",
+ "\t * at the root rcu_node structure's lock in order to\n",
+ "\t * start one (if needed).\n",
+ "\t */\n",
+ "\tif (rnp != rnp_root) {\n",
+ "\t\traw_spin_lock(&ctx->lock);\n",
+ "}\n",
+ "\n",
+ "static inline void debug_init(struct timer_list *timer, void (*fn)(unsigned long),\n",
+ "\t\t\t\ti == entry->nb_args - 1 ? \"\" : \", \");\n",
+ "\t}\n",
+ "\tpos += snprintf(buf + pos, LEN_OR_ZERO, \" %s=%s\",\n",
+ "\t\t\t\ttp->args[i].name);\n",
+ "\t}\n",
+ "\n",
+ "#undef LEN_OR_ZERO\n",
+ "\n",
+ "\t/* return the length of the given event. Will return\n",
+ " * the length of the time extend if the event is a\n",
+ " * time extend.\n",
" */\n",
- "void timekeeping_inject_offset(struct timespec *ts);\n",
- "extern s32 timekeeping_get_tai_offset(void);\n",
- "extern void tick_clock_notify(void)\n",
+ "static inline void register_handler_proc(irq, new);\n",
+ "\tfree_cpumask_var(mask);\n",
+ "\n",
+ "\treturn 0;\n",
+ "}\n",
+ "\n",
+ "#ifdef CONFIG_COMPAT\n",
+ "static long posix_cpu_nsleep_restart(struct restart_block *restart)\n",
"{\n",
- "\tint cpu;\n",
+ "\tenum alarmtimer_type type,\n",
+ "\t\t\tstruct pid *new)\n",
+ "{\n",
+ "\tstruct pid_link *links)\n",
+ "{\n",
+ "\tenum pid_type type;\n",
"\n",
- "\tif (!watchdog_user_enabled)\n",
- "\t\treturn;\n",
+ "\tfor (type = PIDTYPE_PID; type < PIDTYPE_MAX; ++type) {\n",
+ "\t\tINIT_HLIST_NODE(&ri->hlist);\n",
+ "\t\tkretprobe_table_unlock(hash, &flags);\n",
+ "\t\thlist_add_head(&inst->hlist, &rp->free_instances);\n",
+ "\t}\n",
"\n",
- "\tif (cpumask_equal(&p->cpus_allowed, new_mask))\n",
- "\t\tgoto out;\n",
- "\n",
- "\tcpu = smp_processor_id();\n",
+ "\trp->nmissed = 0;\n",
+ "\t/* Establish function entry probe to avoid possible hanging */\n",
+ "static int trace_test_buffer_cpu(buf, cpu);\n",
+ "\t\tif (ret)\n",
+ "\t\t\tbreak;\n",
+ "\t}\n",
"\n",
"\t/*\n",
- "\t * Unthrottle events, since we scheduled we might have missed several\n",
- "\t * ticks already, also for a heavily scheduling task there is little\n",
- "\t * guarantee it'll get a tick in a timely manner.\n",
- " * Because an uncertain amount of memory will be freed in some uncertain\n",
- " * timeframe, we do not claim to have freed anything.\n",
- " */\n",
- "static int cpu_hotplug_disabled;\n",
+ "\t * We can't hold ctx->lock when iterating the ->flexible_group list due\n",
+ "\t * to allocations, but we need to prevent that we loop forever in the hrtimer\n",
+ "\t * interrupt routine. We give it 3 attempts to avoid\n",
+ "\t * overreacting on some spurious event.\n",
+ "\t *\n",
+ "\t * Acquire base lock for updating the offsets and retrieving\n",
+ "\t * the current rq->clock timestamp, except that would require using\n",
+ "\t * atomic ops.\n",
+ "\t */\n",
+ "\tif (irq_delta > delta)\n",
+ "\t\tirq_delta = delta;\n",
"\n",
- "#ifdef C\n"
+ "\trq->prev_irq_time += irq_delta;\n",
+ "\tdelta -= irq_delta;\n",
+ "#endif\n",
+ "#ifdef CONFIG_NO_HZ_FULL\n",
+ "static void nohz_kick_work_fn(struct work_struct *work)\n",
+ "{\n",
+ "\tsmp_wmb();\t/* see set_work_pool_and_clear_pending(work, pool->id);\n",
+ "\n",
+ "\t\tspin_unlock(&pool->lock);\n",
+ "\t/* see the comment above the definition of WQ_POWER_EFFICIENT */\n",
+ "#ifdef CONFIG_WQ_POWER_EFFICIENT_DEFAULT\n",
+ "static bool wq_power_efficient;\n",
+ "#endif\n",
+ "\n",
+ "module_param_named(cmd_enable, kdb_cmd_enabled, int, 0600);\n",
+ "\n",
+ "char kdb_grep_string[];\n",
+ "#define KDB_GREP_STRLEN 256\n",
+ "extern int kdb_grep_trailing;\n",
+ "extern char *kdb_cmds[];\n",
+ "extern unsigned int max_bfs_queue_depth;\n",
+ "\n",
+ "static unsigned int relay_file_poll(struct file *file, const char __user *ubuf,\n",
+ "\t\t size_t cnt, loff_t *ppos)\n",
+ "{\n",
+ "\treturn -ENOSYS;\n",
+ "}\n",
+ "\n",
+ "int proc_dointvec_jiffies,\n",
+ "\t},\n",
+ "\t{\n",
+ "\t\t.procname\t= \"sched_cfs_bandwidth_slice(void)\n",
+ "{\n",
+ "\treturn ((trace_type & TRACER_IRQS_OFF) &&\n",
+ "\t\tirqs_disabled());\n",
+ "}\n",
+ "#else\n",
+ "# define perf_compat_ioctl NULL\n",
+ "#endif\n",
+ "\n",
+ "int perf_event_task_disable(void)\n",
+ "{\n",
+ "\tstruct trace_buffer\ttrace_buffer;\n",
+ "#ifdef CONFIG_TR\n"
]
}
],
@@ -1104,12 +1102,11 @@
"## Analysis\n",
"\n",
"Order 10 is pretty much junk. In order 15 things sort-of make sense, but we jump abruptly between the `[sic]`\n",
- "and by order 20 we are doing quite nicely -- but are far from keeping good indentation and brackets. \n",
+ "and by order 20 we are doing quite nicely — but are far from keeping good indentation and brackets. \n",
"\n",
"How could we? We do not have the memory, and these things are not modeled at all. While we could quite easily enrich our model to also keep track of brackets and indentation (by adding information such as \"have I seen ( but not )\" to the conditioning history), this requires extra work and non-trivial human reasoning, and will make the model significantly more complex. \n",
"\n",
- "Karpathy's LSTM, on the other hand, seemed to have just learn it on its own. And that's impressive.\n",
- "\n"
+ "Karpathy's LSTM, on the other hand, seemed to have just learned it on its own. And that's impressive."
]
}
],