.ipynb (pretty JSON)

{
    "metadata": {
        "kernelspec": {
            "name": "python",
            "display_name": "Python (Pyodide)",
            "language": "python"
        },
        "language_info": {
            "codemirror_mode": {
                "name": "python",
                "version": 3
            },
            "file_extension": ".py",
            "mimetype": "text\/x-python",
            "name": "python",
            "nbconvert_exporter": "python",
            "pygments_lexer": "ipython3",
            "version": "3.8"
        }
    },
    "nbformat_minor": 5,
    "nbformat": 4,
    "cells": [
        {
            "id": "e01da20e-e944-4535-bd80-879e62175f32",
            "cell_type": "markdown",
            "source": "Die qualitative Sozialforschung hat den Kognitivismus verschlafen.\nSo wurde verpasst, die Rekonstruktion latenter Sinnstrukturen\ndurch die Konstruktion generativer Regeln im Sinne von Algorithmen abzusichern. \nFür valide erhobene Kategoriensysteme (vg. Mayring) lassen sich algorithmische Regeln \neines endlichen Automaten angeben \n(vg. Koop, Paul.: ARS, Grammar-Induction, Parser, Grammar-Transduction).\n\nJetzt parasitieren Posthumanismus, Poststrukturalismus und Transhumanismus die Opake KI.\nUnd parasitieren sie diese nicht, so sind sie wechselseitige Symbionten.\n\nKarl Popper wird dann durch Harry Potter ersetzt und \nqualitative Sozialforschung und Grosse Sprachmodelle werden zu wenig erklärenden,\naber beeindruckendem Cargo-Kult einer nichts erklärenden und alles \nverschleiernden Postmoderne.\n\nFür die Algorithmisch rekursive Sequenzanalyse wurde gezeigt,\ndass für das Protokoll einer Handlungssequenz\nmindestens eine Grammatik angegeben werden kann\n(Induktor in Scheme, Parser in Pascal, Transduktor in Lisp, vgl Koop, P.).\n\nARS ist ein qualitatives Verfahren, \ndas latente Regeln protokollierter Handlungssequenzen\nwiderlegbar rekonstruieren kann.\n\nEin Großes Sprachmodell lässt sich so nachprogrammieren, dass es die \nermittelten Kategorien einer qualitativen Inhaltsanalyse (vgl. Mayring) \nrekonstruieren kann.\n\nDer Erklärungswert eines solchen Modells ist aber vernachlässigbar, \nweil gerade eben nichts erklärt wird.\n\nUm das zu zeigen, wird im Folgenden \ndie Nachprogrammierung eines Großen Sprachmodells beschrieben.\n",
            "metadata": []
        },
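        {
            "id": "added-fsa-sketch-md",
            "cell_type": "markdown",
            "source": "To make the claim about the algorithmic rules of a finite automaton concrete, the next cell gives a minimal sketch of a finite-state acceptor over category codes. It is an illustration only: the state names and the transition table are assumptions made for this notebook, and the category codes merely echo those visible in the generated sample further below; none of this is the grammar induced by the ARS inductor, parser or transducer.",
            "metadata": {}
        },
        {
            "id": "added-fsa-sketch-code",
            "cell_type": "code",
            "source": "# Minimal sketch of a finite-state acceptor over category codes.\n# States, codes and transitions are illustrative assumptions, not the grammar\n# actually induced from the sales-conversation corpus used later in this notebook.\n\ntransitions = {\n    'START':   {'KBG': 'GREETED'},\n    'GREETED': {'KBBD': 'NEED', 'KAV': 'ACCEPT'},\n    'NEED':    {'KBA': 'OFFER'},\n    'OFFER':   {'KAE': 'OFFER', 'KAA': 'GREETED'},\n    'ACCEPT':  {},\n}\naccepting_states = {'ACCEPT'}\n\ndef accepts(sequence, start='START'):\n    # True if the sequence of category codes is generated by the automaton\n    state = start\n    for code in sequence:\n        if code not in transitions.get(state, {}):\n            return False\n        state = transitions[state][code]\n    return state in accepting_states\n\nprint(accepts(['KBG', 'KBBD', 'KBA', 'KAE', 'KAA', 'KAV']))\nprint(accepts(['KBA', 'KBG']))",
            "metadata": {},
            "outputs": [],
            "execution_count": null
        },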
        {
            "id": "769da2cf-f3a4-4744-9a6d-27bfb0fe8b5c",
            "cell_type": "markdown",
            "source": "Aus dem Korpus der Kodierungen eines transkribierten Protokolls kann mit einem tiefen Sprachmodell\neine Simulation eines Verkaufsgespräches gefahren werden. \nDer Algorithmus des tiefen Sprachmodell steht dann für die generative Struktur.\nGute Einführungen bieten:\n    \nSteinwender, J., Schwaiger, R.:\nNeuronale Netze programmieren mit Python\n2. Auflage 2020\nISBN 978-3-8362-7452-4\n\nTrask, A. W.:\nNeuronale Netze und Deep Learning kapieren\nDer einfache Praxiseinstieg mit Beispielen in Python\n1. Auflage 2020\nISBN 978-3-7475-0017-0\n\nHirschle, J.:\nDeep Natural Language Processing\n1. Auflage 2022\nISBN 978-3-446-47363-8\n\nDie Datenstrukturen in diesem Text sind aus dem oben genannten Titel von A. W. Trask nachprogrammiert. \nDaraus ist dann das tiefe Sprachmodell für Verkaufsgespäche abgeleitet.\n    \n    \n",
            "metadata": []
        },
        {
            "id": "d3578aba-f3a8-493a-9e28-31c60cb52ffd",
            "cell_type": "markdown",
            "source": "Neuronale Netze sind mehrdimensionale, meist zweidimensionale Datenfelder rationaler Zahlen. \nEine verborgene Schicht aus voraussagenden Gewichten gewichtet die Daten der Eingabeschicht, propagiert die Ergebnisse zur nächsten Schicht und so fort, \nbis eine offene Ausgabeschicht sie dann ausgibt.\n\nIn der Trainingsphase werden die Gewichte zurückpropagiert, bei Grossen Sprachmodellen mit rekurrenten Netzwerken mit Aufmerksamkeit auf dem protokollierten Kontext.\n\nIn den dem Verständnis dienenden Beispielen wird versucht, die Spielergebnisse einer Mannschaft durch Gewichtung der Zehenzahl,\nder bisher gewonnenen Spiele und der Anzahl an Fans, die zukünftigen Geweinnchancen zu ermitteln.",
            "metadata": []
        },
        {
            "id": "b4746ec9-6747-44b8-b5d3-ccad6f1e8fbd",
            "cell_type": "markdown",
            "source": "Nur ein Eingabedatum, hier die Zehenzahl:",
            "metadata": []
        },
        {
            "id": "f3c53651-23c2-4dda-9e9a-d6e5219b29ea",
            "cell_type": "code",
            "source": "# Das Netzwerk\ngewicht = 0.1\ndef neurales_netzwerk (eingabe, gewicht):\n    ausgabe = eingabe * gewicht\n    return ausgabe\n\n# Anwendung des Netzwerkes\nanzahl_der_zehen = [8.5, 9.5, 10, 9]\neingabe = anzahl_der_zehen[0]\nausgabe = neurales_netzwerk (eingabe, gewicht)\nprint(ausgabe)\n\n",
            "metadata": [],
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": "0.8500000000000001\n"
                }
            ],
            "execution_count": 3
        },
        {
            "id": "7bc67fcb-573e-4fbb-8cfa-9abe517c5ca5",
            "cell_type": "markdown",
            "source": "Jetzt mit drei Eingabedaten (Zehenzahl, bisherige Gewinne, Anzahl Fans):",
            "metadata": []
        },
        {
            "id": "34833e44-f410-40ae-9f28-cf0d16ac4853",
            "cell_type": "code",
            "source": "def propagierungsfunktion(a,b):\n    assert(len(a) == len(b))\n    ausgabe = 0\n    for i in range(len(a)):\n        ausgabe += (a[i] * b[i])\n    return ausgabe\n\ngewicht = [0.1, 0.2, 0] \n    \ndef neurales_netzwerk(eingabe, gewicht):\n    ausgabe = propagierungsfunktion(eingabe,gewicht)\n    return ausgabe\n\n\nzehen =  [8.5, 9.5, 9.9, 9.0]\ngewinnrate = [0.65, 0.8, 0.8, 0.9]\nfans = [1.2, 1.3, 0.5, 1.0]\n\neingabe = [zehen[0],gewinnrate[0],fans[0]]\nausgabe = neurales_netzwerk(eingabe,gewicht)\n\nprint(ausgabe)",
            "metadata": [],
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": "0.9800000000000001\n"
                }
            ],
            "execution_count": 5
        },
        {
            "id": "fe585346-ccde-4f6d-8a57-eff7da278164",
            "cell_type": "markdown",
            "source": "Jetzt mit der Bibliothek numy (Datenfelder, Vektoren, Matrizen):",
            "metadata": []
        },
        {
            "id": "98d43668-8d49-479a-a5e6-ea25fb3f3e53",
            "cell_type": "code",
            "source": "import numpy as ny\ngewicht = ny.array([0.1, 0.2, 0])\ndef neurales_netzwerk(eingabe, gewicht):\n    ausgabe = eingabe.dot(gewicht)\n    return ausgabe\n    \nzehen      = ny.array([8.5, 9.5, 9.9, 9.0])\ngewinnrate = ny.array([0.65, 0.8, 0.8, 0.9])\nfans       = ny.array([1.2, 1.3, 0.5, 1.0])\n\n\neingabe = ny.array([zehen[0],gewinnrate[0],fans[0]])\nausgabe = neurales_netzwerk(eingabe,gewicht)\n\nprint(ausgabe)",
            "metadata": [],
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": "0.9800000000000001\n"
                }
            ],
            "execution_count": 2
        },
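        {
            "id": "added-hidden-layer-md",
            "cell_type": "markdown",
            "source": "The examples so far use a single layer of weights. As a sketch of the layer-by-layer propagation described above, the next cell adds one hidden layer; the weight matrices hold arbitrary illustrative values, not trained weights.",
            "metadata": {}
        },
        {
            "id": "added-hidden-layer-code",
            "cell_type": "code",
            "source": "# Sketch: one hidden layer between input and output layer.\n# The weight matrices hold arbitrary illustrative values, not trained weights.\nimport numpy as ny\n\n# 3 inputs -> 2 hidden neurons\ngewichte_versteckt = ny.array([[0.1,  0.2],\n                               [0.2, -0.1],\n                               [0.0,  0.3]])\n# 2 hidden neurons -> 1 output\ngewichte_ausgabe = ny.array([[0.5],\n                             [0.5]])\n\ndef neurales_netzwerk(eingabe, w_versteckt, w_ausgabe):\n    versteckt = eingabe.dot(w_versteckt)  # propagate input layer to hidden layer\n    ausgabe = versteckt.dot(w_ausgabe)    # propagate hidden layer to output layer\n    return ausgabe\n\neingabe = ny.array([8.5, 0.65, 1.2])      # toes, win rate, fans for the first game\nprint(neurales_netzwerk(eingabe, gewichte_versteckt, gewichte_ausgabe))",
            "metadata": {},
            "outputs": [],
            "execution_count": null
        },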
        {
            "id": "56fbcd3e-b4fb-47e0-97e2-2dd472cb9fd4",
            "cell_type": "markdown",
            "source": "Die Gewichte lassen sich so lange anpassen, \nbis der Fehler minimiert ist.",
            "metadata": []
        },
        {
            "id": "5a2c5086-7d91-4565-ab52-19e1db0b6c5d",
            "cell_type": "code",
            "source": "# Prinzipielles Beispiel\ngewicht = 0.5\neingabe = 0.5\nerwuenschte_vorhersage = 0.8\n\nschrittweite = 0.001\n\nfor iteration in range(1101):\n\n    vorhersage = eingabe * gewicht\n    fehler = (vorhersage - erwuenschte_vorhersage) ** 2\n\n    print(\"Fehler:\" + str(fehler) + \" Vorhersage:\" + str(vorhersage))\n    \n    hoehere_vorhersage = eingabe * (gewicht + schrittweite)\n    tieferer_fehler = (gewünschte_vorhersage - hoehere_orhersage) ** 2\n\n    hoehere_vorhersage = eingabe * (gewicht - schrittweite)\n    tiefere_fehler = (erwuenschte_vorhersage - tiefere_vorhersage) ** 2\n\n    if(tieferer_fehler <  hoeherer_fehler):\n        gewicht = gewicht - schrittweite\n        \n    if(tieferer_fehler >  hoeherer_fehler):\n        gewicht = gewicht + schrittweite",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
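        {
            "id": "added-gradient-descent-md",
            "cell_type": "markdown",
            "source": "The hot and cold search above only probes the error in two directions. The next cell is a minimal sketch of the gradient-based update used when the error is propagated backwards: the weight is moved against the derivative of the squared error. The learning rate and the number of iterations are arbitrary illustrative choices.",
            "metadata": {}
        },
        {
            "id": "added-gradient-descent-code",
            "cell_type": "code",
            "source": "# Sketch: the same single-weight example, updated with the gradient\n# of the squared error instead of hot and cold trial steps.\n# Learning rate (alpha) and iteration count are arbitrary illustrative choices.\ngewicht = 0.5\neingabe = 0.5\nerwuenschte_vorhersage = 0.8\nalpha = 0.5\n\nfor iteration in range(40):\n    vorhersage = eingabe * gewicht\n    fehler = (vorhersage - erwuenschte_vorhersage) ** 2\n\n    # derivative of the squared error with respect to the weight\n    gradient = 2 * (vorhersage - erwuenschte_vorhersage) * eingabe\n\n    # step against the gradient\n    gewicht = gewicht - alpha * gradient\n\n    print('Fehler:' + str(fehler) + ' Vorhersage:' + str(vorhersage))",
            "metadata": {},
            "outputs": [],
            "execution_count": null
        },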
        {
            "id": "04f4d530-6133-45d3-ade3-56d179da55b8",
            "cell_type": "code",
            "source": "# Trask, A. W.:\n# Neuronale Netze und Deep Learning kapieren\n# Der einfache Praxiseinstieg mit Beispielen in Python\n# 1. Auflage 2020\n# ISBN 978-3-7475-0017-0\n\nimport numpy as np\n\n# Objektklasse Datenfeld\nclass Tensor (object):\n    \n    def __init__(self,data,\n                 autograd=False,\n                 creators=None,\n                 creation_op=None,\n                 id=None):\n        \n        self.data = np.array(data)\n        self.autograd = autograd\n        self.grad = None\n\n        if(id is None):\n            self.id = np.random.randint(0,1000000000)\n        else:\n            self.id = id\n        \n        self.creators = creators\n        self.creation_op = creation_op\n        self.children = {}\n        \n        if(creators is not None):\n            for c in creators:\n                if(self.id not in c.children):\n                    c.children[self.id] = 1\n                else:\n                    c.children[self.id] += 1\n\n    def all_children_grads_accounted_for(self):\n        for id,cnt in self.children.items():\n            if(cnt != 0):\n                return False\n        return True \n        \n    def backward(self,grad=None, grad_origin=None):\n        if(self.autograd):\n \n            if(grad is None):\n                grad = Tensor(np.ones_like(self.data))\n\n            if(grad_origin is not None):\n                if(self.children[grad_origin.id] == 0):\n                    return\n                    print(self.id)\n                    print(self.creation_op)\n                    print(len(self.creators))\n                    for c in self.creators:\n                        print(c.creation_op)\n                    raise Exception(\"cannot backprop more than once\")\n                else:\n                    self.children[grad_origin.id] -= 1\n\n            if(self.grad is None):\n                self.grad = grad\n            else:\n                self.grad += grad\n            \n\n            assert grad.autograd == False\n            \n\n            if(self.creators is not None and \n               (self.all_children_grads_accounted_for() or \n                grad_origin is None)):\n\n                if(self.creation_op == \"add\"):\n                    self.creators[0].backward(self.grad, self)\n                    self.creators[1].backward(self.grad, self)\n                    \n                if(self.creation_op == \"sub\"):\n                    self.creators[0].backward(Tensor(self.grad.data), self)\n                    self.creators[1].backward(Tensor(self.grad.__neg__().data), self)\n\n                if(self.creation_op == \"mul\"):\n                    new = self.grad * self.creators[1]\n                    self.creators[0].backward(new , self)\n                    new = self.grad * self.creators[0]\n                    self.creators[1].backward(new, self)                    \n                    \n                if(self.creation_op == \"mm\"):\n                    c0 = self.creators[0]\n                    c1 = self.creators[1]\n                    new = self.grad.mm(c1.transpose())\n                    c0.backward(new)\n                    new = self.grad.transpose().mm(c0).transpose()\n                    c1.backward(new)\n                    \n                if(self.creation_op == \"transpose\"):\n                    self.creators[0].backward(self.grad.transpose())\n\n                if(\"sum\" in self.creation_op):\n                    dim = int(self.creation_op.split(\"_\")[1])\n   
                 self.creators[0].backward(self.grad.expand(dim,\n                                                               self.creators[0].data.shape[dim]))\n\n                if(\"expand\" in self.creation_op):\n                    dim = int(self.creation_op.split(\"_\")[1])\n                    self.creators[0].backward(self.grad.sum(dim))\n                    \n                if(self.creation_op == \"neg\"):\n                    self.creators[0].backward(self.grad.__neg__())\n                    \n                if(self.creation_op == \"sigmoid\"):\n                    ones = Tensor(np.ones_like(self.grad.data))\n                    self.creators[0].backward(self.grad * (self * (ones - self)))\n                \n                if(self.creation_op == \"tanh\"):\n                    ones = Tensor(np.ones_like(self.grad.data))\n                    self.creators[0].backward(self.grad * (ones - (self * self)))\n                \n                if(self.creation_op == \"index_select\"):\n                    new_grad = np.zeros_like(self.creators[0].data)\n                    indices_ = self.index_select_indices.data.flatten()\n                    grad_ = grad.data.reshape(len(indices_), -1)\n                    for i in range(len(indices_)):\n                        new_grad[indices_[i]] += grad_[i]\n                    self.creators[0].backward(Tensor(new_grad))\n                    \n                if(self.creation_op == \"cross_entropy\"):\n                    dx = self.softmax_output - self.target_dist\n                    self.creators[0].backward(Tensor(dx))\n                    \n    def __add__(self, other):\n        if(self.autograd and other.autograd):\n            return Tensor(self.data + other.data,\n                          autograd=True,\n                          creators=[self,other],\n                          creation_op=\"add\")\n        return Tensor(self.data + other.data)\n\n    def __neg__(self):\n        if(self.autograd):\n            return Tensor(self.data * -1,\n                          autograd=True,\n                          creators=[self],\n                          creation_op=\"neg\")\n        return Tensor(self.data * -1)\n    \n    def __sub__(self, other):\n        if(self.autograd and other.autograd):\n            return Tensor(self.data - other.data,\n                          autograd=True,\n                          creators=[self,other],\n                          creation_op=\"sub\")\n        return Tensor(self.data - other.data)\n    \n    def __mul__(self, other):\n        if(self.autograd and other.autograd):\n            return Tensor(self.data * other.data,\n                          autograd=True,\n                          creators=[self,other],\n                          creation_op=\"mul\")\n        return Tensor(self.data * other.data)    \n\n    def sum(self, dim):\n        if(self.autograd):\n            return Tensor(self.data.sum(dim),\n                          autograd=True,\n                          creators=[self],\n                          creation_op=\"sum_\"+str(dim))\n        return Tensor(self.data.sum(dim))\n    \n    def expand(self, dim,copies):\n\n        trans_cmd = list(range(0,len(self.data.shape)))\n        trans_cmd.insert(dim,len(self.data.shape))\n        new_data = self.data.repeat(copies).reshape(list(self.data.shape) + [copies]).transpose(trans_cmd)\n        \n        if(self.autograd):\n            return Tensor(new_data,\n                          autograd=True,\n                          
creators=[self],\n                          creation_op=\"expand_\"+str(dim))\n        return Tensor(new_data)\n    \n    def transpose(self):\n        if(self.autograd):\n            return Tensor(self.data.transpose(),\n                          autograd=True,\n                          creators=[self],\n                          creation_op=\"transpose\")\n        \n        return Tensor(self.data.transpose())\n    \n    def mm(self, x):\n        if(self.autograd):\n            return Tensor(self.data.dot(x.data),\n                          autograd=True,\n                          creators=[self,x],\n                          creation_op=\"mm\")\n        return Tensor(self.data.dot(x.data))\n    \n    def sigmoid(self):\n        if(self.autograd):\n            return Tensor(1 \/ (1 + np.exp(-self.data)),\n                          autograd=True,\n                          creators=[self],\n                          creation_op=\"sigmoid\")\n        return Tensor(1 \/ (1 + np.exp(-self.data)))\n\n    def tanh(self):\n        if(self.autograd):\n            return Tensor(np.tanh(self.data),\n                          autograd=True,\n                          creators=[self],\n                          creation_op=\"tanh\")\n        return Tensor(np.tanh(self.data))\n    \n    def index_select(self, indices):\n\n        if(self.autograd):\n            new = Tensor(self.data[indices.data],\n                         autograd=True,\n                         creators=[self],\n                         creation_op=\"index_select\")\n            new.index_select_indices = indices\n            return new\n        return Tensor(self.data[indices.data])\n    \n    def softmax(self):\n        temp = np.exp(self.data)\n        softmax_output = temp \/ np.sum(temp,\n                                       axis=len(self.data.shape)-1,\n                                       keepdims=True)\n        return softmax_output\n    \n    def cross_entropy(self, target_indices):\n\n        temp = np.exp(self.data)\n        softmax_output = temp \/ np.sum(temp,\n                                       axis=len(self.data.shape)-1,\n                                       keepdims=True)\n        \n        t = target_indices.data.flatten()\n        p = softmax_output.reshape(len(t),-1)\n        target_dist = np.eye(p.shape[1])[t]\n        loss = -(np.log(p) * (target_dist)).sum(1).mean()\n    \n        if(self.autograd):\n            out = Tensor(loss,\n                         autograd=True,\n                         creators=[self],\n                         creation_op=\"cross_entropy\")\n            out.softmax_output = softmax_output\n            out.target_dist = target_dist\n            return out\n\n        return Tensor(loss)\n        \n    \n    def __repr__(self):\n        return str(self.data.__repr__())\n    \n    def __str__(self):\n        return str(self.data.__str__())  \n\nclass Layer(object):\n    \n    def __init__(self):\n        self.parameters = list()\n        \n    def get_parameters(self):\n        return self.parameters\n\n    \nclass SGD(object):\n    \n    def __init__(self, parameters, alpha=0.1):\n        self.parameters = parameters\n        self.alpha = alpha\n    \n    def zero(self):\n        for p in self.parameters:\n            p.grad.data *= 0\n        \n    def step(self, zero=True):\n        \n        for p in self.parameters:\n            \n            p.data -= p.grad.data * self.alpha\n            \n            if(zero):\n                p.grad.data *= 0\n\n\nclass 
Linear(Layer):\n\n    def __init__(self, n_inputs, n_outputs, bias=True):\n        super().__init__()\n        \n        self.use_bias = bias\n        \n        W = np.random.randn(n_inputs, n_outputs) * np.sqrt(2.0\/(n_inputs))\n        self.weight = Tensor(W, autograd=True)\n        if(self.use_bias):\n            self.bias = Tensor(np.zeros(n_outputs), autograd=True)\n        \n        self.parameters.append(self.weight)\n        \n        if(self.use_bias):        \n            self.parameters.append(self.bias)\n\n    def forward(self, input):\n        if(self.use_bias):\n            return input.mm(self.weight)+self.bias.expand(0,len(input.data))\n        return input.mm(self.weight)\n\n\nclass Sequential(Layer):\n    \n    def __init__(self, layers=list()):\n        super().__init__()\n        \n        self.layers = layers\n    \n    def add(self, layer):\n        self.layers.append(layer)\n        \n    def forward(self, input):\n        for layer in self.layers:\n            input = layer.forward(input)\n        return input\n    \n    def get_parameters(self):\n        params = list()\n        for l in self.layers:\n            params += l.get_parameters()\n        return params\n\n\nclass Embedding(Layer):\n    \n    def __init__(self, vocab_size, dim):\n        super().__init__()\n        \n        self.vocab_size = vocab_size\n        self.dim = dim\n        \n        # this random initialiation style is just a convention from word2vec\n        self.weight = Tensor((np.random.rand(vocab_size, dim) - 0.5) \/ dim, autograd=True)\n        \n        self.parameters.append(self.weight)\n    \n    def forward(self, input):\n        return self.weight.index_select(input)\n\n\nclass Tanh(Layer):\n    def __init__(self):\n        super().__init__()\n    \n    def forward(self, input):\n        return input.tanh()\n\n\nclass Sigmoid(Layer):\n    def __init__(self):\n        super().__init__()\n    \n    def forward(self, input):\n        return input.sigmoid()\n    \n\nclass CrossEntropyLoss(object):\n    \n    def __init__(self):\n        super().__init__()\n    \n    def forward(self, input, target):\n        return input.cross_entropy(target)\n\n    \n# Sprachmodell Long Short Term Memory\nclass LSTMCell(Layer):\n    \n    def __init__(self, n_inputs, n_hidden, n_output):\n        super().__init__()\n\n        self.n_inputs = n_inputs\n        self.n_hidden = n_hidden\n        self.n_output = n_output\n\n        self.xf = Linear(n_inputs, n_hidden)\n        self.xi = Linear(n_inputs, n_hidden)\n        self.xo = Linear(n_inputs, n_hidden)        \n        self.xc = Linear(n_inputs, n_hidden)        \n        \n        self.hf = Linear(n_hidden, n_hidden, bias=False)\n        self.hi = Linear(n_hidden, n_hidden, bias=False)\n        self.ho = Linear(n_hidden, n_hidden, bias=False)\n        self.hc = Linear(n_hidden, n_hidden, bias=False)        \n        \n        self.w_ho = Linear(n_hidden, n_output, bias=False)\n        \n        self.parameters += self.xf.get_parameters()\n        self.parameters += self.xi.get_parameters()\n        self.parameters += self.xo.get_parameters()\n        self.parameters += self.xc.get_parameters()\n\n        self.parameters += self.hf.get_parameters()\n        self.parameters += self.hi.get_parameters()        \n        self.parameters += self.ho.get_parameters()        \n        self.parameters += self.hc.get_parameters()                \n        \n        self.parameters += self.w_ho.get_parameters()        \n    \n    def forward(self, input, 
hidden):\n        \n        prev_hidden = hidden[0]        \n        prev_cell = hidden[1]\n        \n        f = (self.xf.forward(input) + self.hf.forward(prev_hidden)).sigmoid()\n        i = (self.xi.forward(input) + self.hi.forward(prev_hidden)).sigmoid()\n        o = (self.xo.forward(input) + self.ho.forward(prev_hidden)).sigmoid()        \n        g = (self.xc.forward(input) + self.hc.forward(prev_hidden)).tanh()        \n        c = (f * prev_cell) + (i * g)\n\n        h = o * c.tanh()\n        \n        output = self.w_ho.forward(h)\n        return output, (h, c)\n    \n    def init_hidden(self, batch_size=1):\n        init_hidden = Tensor(np.zeros((batch_size,self.n_hidden)), autograd=True)\n        init_cell = Tensor(np.zeros((batch_size,self.n_hidden)), autograd=True)\n        init_hidden.data[:,0] += 1\n        init_cell.data[:,0] += 1\n        return (init_hidden, init_cell)\n\nimport sys,random,math\nfrom collections import Counter\nimport numpy as np\nimport sys\n\nnp.random.seed(0)\n\n# Einlesen des VKG KORPUS\nf = open('VKGKORPUS.TXT','r')\nraw = f.read()\nf.close()\n\n\n\nvocab = list(set(raw))\nword2index = {}\nfor i,word in enumerate(vocab):\n    word2index[word]=i\nindices = np.array(list(map(lambda x:word2index[x], raw)))\n\nembed = Embedding(vocab_size=len(vocab),dim=512)\nmodel = LSTMCell(n_inputs=512, n_hidden=512, n_output=len(vocab))\nmodel.w_ho.weight.data *= 0\n\ncriterion = CrossEntropyLoss()\noptim = SGD(parameters=model.get_parameters() + embed.get_parameters(), alpha=0.05)\n\ndef generate_sample(n=30, init_char=' '):\n    s = \"\"\n    hidden = model.init_hidden(batch_size=1)\n    input = Tensor(np.array([word2index[init_char]]))\n    for i in range(n):\n        rnn_input = embed.forward(input)\n        output, hidden = model.forward(input=rnn_input, hidden=hidden)\n#         output.data *= 25\n#         temp_dist = output.softmax()\n#         temp_dist \/= temp_dist.sum()\n\n#         m = (temp_dist > np.random.rand()).argmax()\n        m = output.data.argmax()\n        c = vocab[m]\n        input = Tensor(np.array([m]))\n        s += c\n    return s\n\nbatch_size = 16\nbptt = 25\nn_batches = int((indices.shape[0] \/ (batch_size)))\n\ntrimmed_indices = indices[:n_batches*batch_size]\nbatched_indices = trimmed_indices.reshape(batch_size, n_batches).transpose()\n\ninput_batched_indices = batched_indices[0:-1]\ntarget_batched_indices = batched_indices[1:]\n\nn_bptt = int(((n_batches-1) \/ bptt))\ninput_batches = input_batched_indices[:n_bptt*bptt].reshape(n_bptt,bptt,batch_size)\ntarget_batches = target_batched_indices[:n_bptt*bptt].reshape(n_bptt, bptt, batch_size)\nmin_loss = 1000\n\n# Training des neuronalen Netztes\ndef train(iterations=400):\n    for iter in range(iterations):\n        total_loss = 0\n        n_loss = 0\n\n        hidden = model.init_hidden(batch_size=batch_size)\n        batches_to_train = len(input_batches)\n    #     batches_to_train = 32\n        for batch_i in range(batches_to_train):\n\n            hidden = (Tensor(hidden[0].data, autograd=True), Tensor(hidden[1].data, autograd=True))\n\n            losses = list()\n            for t in range(bptt):\n                input = Tensor(input_batches[batch_i][t], autograd=True)\n                rnn_input = embed.forward(input=input)\n                output, hidden = model.forward(input=rnn_input, hidden=hidden)\n\n                target = Tensor(target_batches[batch_i][t], autograd=True)    \n                batch_loss = criterion.forward(output, target)\n\n                if(t == 0):\n   
                 losses.append(batch_loss)\n                else:\n                    losses.append(batch_loss + losses[-1])\n\n            loss = losses[-1]\n\n            loss.backward()\n            optim.step()\n            total_loss += loss.data \/ bptt\n\n            epoch_loss = np.exp(total_loss \/ (batch_i+1))\n            min_loss =1000\n            if(epoch_loss < min_loss):\n                min_loss = epoch_loss\n                print()\n\n            log = \"\\r Iter:\" + str(iter)\n            log += \" - Alpha:\" + str(optim.alpha)[0:5]\n            log += \" - Batch \"+str(batch_i+1)+\"\/\"+str(len(input_batches))\n            log += \" - Min Loss:\" + str(min_loss)[0:5]\n            log += \" - Loss:\" + str(epoch_loss)\n            if(batch_i == 0):\n                log += \" - \" + generate_sample(n=70, init_char='T').replace(\"\\n\",\" \")\n            if(batch_i % 1 == 0):\n                sys.stdout.write(log)\n\n        optim.alpha *= 0.99\n\n\n\n\ntrain(100)\n\ndef generate_sample(n=30, init_char=' '):\n    s = \"\"\n    hidden = model.init_hidden(batch_size=1)\n    input = Tensor(np.array([word2index[init_char]]))\n    for i in range(n):\n        rnn_input = embed.forward(input)\n        output, hidden = model.forward(input=rnn_input, hidden=hidden)\n        output.data *= 15\n        temp_dist = output.softmax()\n        temp_dist \/= temp_dist.sum()\n\n#         m = (temp_dist > np.random.rand()).argmax() # sample from predictions\n        m = output.data.argmax() # take the max prediction\n        c = vocab[m]\n        input = Tensor(np.array([m]))\n        s += c\n    return s\nprint(generate_sample(n=500, init_char='\\n'))\n\n",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "2da20a86-0024-458d-9aae-1a5a1dc588d8",
            "cell_type": "code",
            "source": "print(generate_sample(n=500, init_char='\\n'))",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "187d2f92-b099-4525-a01f-fdd361e4dcee",
            "cell_type": "markdown",
            "source": "Ausgabe eines generierten Beispiels:",
            "metadata": []
        },
        {
            "id": "adb6271a-8bcf-42d7-8da4-6249fe1a731e",
            "cell_type": "markdown",
            "source": "\nKBG VBG \nKBBD VBBD KBA VBA KAE VAE KAA VAA \nKBBD VBBD KBA VBA KBBD VBBD KBA VBA KBBD VBBD KBA VBA KAE VAE \nKAA VAA \nKAV VAV \nKBG VBG \nKBBD VBBD KBA VBA KAE VAE KAE VAE KAE VAE KAE VAE KAA VAA \nKBBD VBBD KBA VBA KAE VAE KAE VAE KAA VAA \nKBBD VBBD KBA VBA KAE VAE KAA VAA \nKBBD VBBD KBA VBA KBBD VBBD KBA VBA KAE VAE KAA VAA \nKAV VAV\nKBG VBG\nKBBD KBA VBA KBBD VBBD KBA VBA KAE VAE KAE VAE KAA VAA\nKBBD VBBD KBA VBA KBBD KBA VBA KBBD VBBD KBA VBA KBBD VBBD KBA \nKAE VAE KAA VAA\nKBBD VBBD KBA VBA KAE VAE KAE VAE VAE KAA VAA\nKAV VAV",
            "metadata": []
        },
        {
            "id": "41e8de0f-c027-4d31-abc2-9a5047159ce2",
            "cell_type": "code",
            "source": "print(generate_sample(n=500, init_char=' '))",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
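        {
            "id": "added-sampling-variant-md",
            "cell_type": "markdown",
            "source": "The second generate_sample definition above scales the output and computes a softmax but still takes the argmax. A sampling variant, following the lines commented out there, could look like the next cell. It assumes the training cell above has been run; the scaling factor 15 is the arbitrary value already used there.",
            "metadata": {}
        },
        {
            "id": "added-sampling-variant-code",
            "cell_type": "code",
            "source": "# Sketch: sample the next character from the softmax distribution instead of\n# taking the argmax. Requires model, embed, vocab, word2index and Tensor from\n# the training cell above; the factor 15 is the arbitrary value used there.\ndef generate_sample_sampling(n=30, init_char=' '):\n    s = ''\n    hidden = model.init_hidden(batch_size=1)\n    input = Tensor(np.array([word2index[init_char]]))\n    for i in range(n):\n        rnn_input = embed.forward(input)\n        output, hidden = model.forward(input=rnn_input, hidden=hidden)\n        output.data *= 15                       # sharpen the distribution\n        temp_dist = output.softmax().flatten()  # probabilities over the vocabulary\n        m = np.random.choice(len(vocab), p=temp_dist \/ temp_dist.sum())\n        c = vocab[m]\n        input = Tensor(np.array([m]))\n        s += c\n    return s\n\nprint(generate_sample_sampling(n=500, init_char='\\n'))",
            "metadata": {},
            "outputs": [],
            "execution_count": null
        },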
        {
            "id": "c4e2ba35-d32b-42fb-b1a6-f48b8c1b3b42",
            "cell_type": "markdown",
            "source": "Im Gegensatz zu kognitivistischen Modellen \n(ARS, Koop,P. Grammar Induction, Parser, Grammar Transduction)\nerklärt ein solches Großes Sprachmodell nichts unddeshalb werden \nGroße Sprachmodell von Postmodernismus, Posthumanismus und Transhumanismus \nmit parasitärer Intention gefeiert.\n\nWenn man ein Lehrbuch über die Regeln von Verkaufsgesprächen schreiben will, \naber einen Softwareagenten erhält, der gerne Verkaufsgespräche führt,\nhat man auf sehr hohem Niveau schlechte Arbeit gemacht.\n\n",
            "metadata": []
        }
    ]
}