Add files via upload

This commit is contained in:
Peter Norvig
2022-10-29 13:10:10 -07:00
committed by GitHub
parent 256c74204b
commit c4856ab03f

@@ -455,7 +455,7 @@
"- They can hallucinate incorrect statements. This is a big issue in tasks like mathematics, where the small difference between the statements \"*x* < 4\" and \"*x* > 4\" makes a big difference to the outcome. In normal natural language, there is more redundancy and less chance for a single character difference to cause such big problems.\n",
"- They need to be trained to provide trust. The Minerva model generates code, but does not generate documentation or tests that would build trust in the code.\n",
"- The majority voting method is quick and easy, but incomplete. A better architecture would be to force consensus: if different runs produce different final answers, the system should have a way to reconcile the differences, figuring how and why the minority answers were generated, and making sure that mistakes in reasoning are not repeated in the majority answer.\n",
"- The models should learn from interactions. Currently they are trained on a large corpus, then fine-tuned on a specific subject matter, and then run with appropriate prompts. If the prompt asks for step-by-step reasoning, the model can generate that, but then it doesn't learn anything from the process of solving the problem (whether it gets it right or wrong); every new problem posed to it is the same as the first problem. In the article [*Learning by Distilling Contxt*](https://arxiv.org/abs/2209.15189), the authors suggest an approach where a model is conditioned to predict the final answer and the step-by-step reasoning, given the problem description and the prompting instructions (such as \"show your reasoning step by step\"). The system is then fine-tuned to predict the final answer from the problem desscription, without seeing any prompting instructions or step-by-step reasoning. This approach has an interesting parallel to the [Dreyfus model of skill acquisition](https://www.bumc.bu.edu/facdev-medicine/files/2012/03/Dreyfus-skill-level.pdf), in which novices work by the rote application of rules. This works in routine situations, but the novice does not have a complete understanding of the contexts in which the rules will not apply. An expert uses their situational experience to arrive at a solution without the explicit application of rules. So the fine-tuning in this architecture can be seen as a process of building contextual arrangement and compiling step-by-step rules into immediate action.\n",
"- The models should learn from interactions. Currently they are trained on a large corpus, then fine-tuned on a specific subject matter, and then run with appropriate prompts. If the prompt asks for step-by-step reasoning, the model can generate that, but then it doesn't learn anything from the process of solving the problem (whether it gets it right or wrong); every new problem posed to it is the same as the first problem. In the article [*Learning by Distilling Context*](https://arxiv.org/abs/2209.15189), the authors suggest an approach where a model is conditioned to predict the final answer and the step-by-step reasoning, given the problem description and the prompting instructions (such as \"show your reasoning step by step\"). The system is then fine-tuned to predict the final answer from the problem desscription, without seeing any prompting instructions or step-by-step reasoning. This approach has an interesting parallel to the [Dreyfus model of skill acquisition](https://www.bumc.bu.edu/facdev-medicine/files/2012/03/Dreyfus-skill-level.pdf), in which novices work by the rote application of rules. This works in routine situations, but the novice does not have a complete understanding of the contexts in which the rules will not apply. An expert uses their situational experience to arrive at a solution without the explicit application of rules. So the fine-tuning in this architecture can be seen as a process of building contextual arrangement and compiling step-by-step rules into immediate action.\n",
"- The eminent computer scientist Edsger Dijkstra predicted that machine learning, especially with gradient descent, could never be applied to programming, writing \"*In the discrete world of computing, there is no meaningful metric in which 'small' changes and 'small' effects go hand in hand, and there never will be.*\" Systems like AlphaCode have proven him partially wrong, but further progress would be easier if our programming languages were designed in such a way that the space of programs could be more easily explored by making small changes, and if it was faster to evaluate the quality of a program. Perhaps we'd be better off with functional languages that facilitate caching of intermediate results, so that when a small change is suggested, recomputing the program mostly uses precomputed results.\n",
"- In modern software development many artifacts are produced. There's the code, but also documentation, test suites, design documents, performance timing results, user experience experiments and results, traces of user interactions, and so on. And then there's a machine learning model. We can optimize the machine learning model by feeding it inputs, examining the outputs, and modifying the model to minimize the loss between the expected and observed outputs. This is possible because the model is differentiable. But the machine learning model is just a small part of the overall software development process. If all the other parts could be incorporated into an end-to-end differntiable model, the process of evolving the system would be easier. Consider the scenario where the user experience researchers do an experiment comparring ten different user interfaces, and determine which one is best. The engineers then go implement that UI. Sometime later, the world changes: maybe the blend of users is different, maybe users migrate to devices with a different screen size. What would trigger an update to the UI? today, we rely on institutional memory: someone says, \"Hey, I remember that UX study a few years back; maybe we should look at it again and see if a different UI would be better.\" But if the experiment documents and everything else were all in an end-to-end model, then the model itself could detect when a change is warranted. Building languages that allow for the incorporation of all these different kinds of documents is a challenge for the future.\n"
]