flreader – Dateibrowser

Inhalt

Aktueller Ordner: /
.ipynb (pretty JSON)

{
    "metadata": {
        "kernelspec": {
            "name": "python",
            "display_name": "Python (Pyodide)",
            "language": "python"
        },
        "language_info": {
            "codemirror_mode": {
                "name": "python",
                "version": 3
            },
            "file_extension": ".py",
            "mimetype": "text\/x-python",
            "name": "python",
            "nbconvert_exporter": "python",
            "pygments_lexer": "ipython3",
            "version": "3.8"
        }
    },
    "nbformat_minor": 5,
    "nbformat": 4,
    "cells": [
        {
            "id": "d9fca15d-ede2-461a-a93b-b93ed994e074",
            "cell_type": "markdown",
            "source": "Paul Koop 2023",
            "metadata": []
        },
        {
            "id": "c966e10a-8adf-4c57-9b56-cac3795f97ac",
            "cell_type": "markdown",
            "source": "Qualitative social research and large language models",
            "metadata": []
        },
        {
            "id": "10c05849-a444-4a2c-a9b6-b5c2242b19c8",
            "cell_type": "markdown",
            "source": "Using a recorded and categorized sales conversation, I will demonstrate that large language models can simulate interactions but cannot provide explanations. First, I will develop a generative transformer and train it with the categorized transcripts. It will become clear that while the model is capable of convincingly mimicking a dialogue, it is unable to provide sound explanations.Subsequently, I will create an inducer, a transductor, and a parser for the same data. This will illustrate that the dialogue grammars generated by these methods possess explanatory power, in contrast to the large language model and the generative pre-trained transformer.Finally, I will integrate the developed grammar into a multi-agent system that is capable of simulating sales conversations and explaining them accordingly.",
            "metadata": []
        },
        {
            "id": "645149f1-69f4-4cd2-b04f-03afddbeb802",
            "cell_type": "markdown",
            "source": "Qualitative social research has missed the boat on cognitivism.\nAs a result, it failed to secure the reconstruction of latent structures of meaning through the construction of generative rules in the sense of algorithms.\nFor validly collected category systems (cf. Mayring), algorithmic rules of a finite automaton can be specified\n(cf. Koop, Paul: *ARS, Grammar-Induction, Parser, Grammar-Transduction*).\n\nNow, posthumanism, poststructuralism, and transhumanism are parasitizing the opaque AI.\nAnd if they are not parasitizing it, then they are mutual symbionts.\n\nKarl Popper is then replaced by Harry Potter,\nand qualitative social research and large language models become an awe-inspiring cargo cult of a postmodernism that explains nothing and obscures everything.\n\nIt has been demonstrated for **Algorithmic Recursive Sequence Analysis**\nthat for the protocol of an action sequence,\nat least one grammar can be specified\n(inductor in Scheme, parser in Pascal, transducer in Lisp, cf. Koop, P.).\n\n**ARS is a qualitative method**\ncapable of reconstructing falsifiable latent rules of recorded action sequences.\n\nA large language model can be reprogrammed to reconstruct\nthe categories identified in a qualitative content analysis (cf. Mayring).\n\nHowever, the explanatory value of such a model is negligible,\nprecisely because nothing is actually explained.\n\nTo demonstrate this, the following describes\nthe reprogramming of a large language model.",
            "metadata": []
        },
        {
            "id": "1b4120e8-002a-46c1-b64d-a3a641453117",
            "cell_type": "markdown",
            "source": "From the corpus of codings of a transcribed protocol, a simulation of a sales conversation can be conducted using a deep language model.\nThe algorithm of the deep language model then represents the generative structure.\n\nGood introductions are provided by:\n    \nSteinwender, J., Schwaiger, R.:\nNeuronale Netze programmieren mit Python\n2. Auflage 2020\nISBN 978-3-8362-7452-4\n\nTrask, A. W.:\nNeuronale Netze und Deep Learning kapieren\nDer einfache Praxiseinstieg mit Beispielen in Python\n1. Auflage 2020\nISBN 978-3-7475-0017-0\n\nHirschle, J.:\nDeep Natural Language Processing\n1. Auflage 2022\nISBN 978-3-446-47363-8\n\nThe data structures in this text are reimplemented based on the above-mentioned book by A. W. Trask.\nFrom this, the deep language model for sales conversations was derived.\n    \n    \n",
            "metadata": []
        },
        {
            "id": "20589e4f-b08c-4c67-9285-505d4f70ac71",
            "cell_type": "markdown",
            "source": "Neural networks are multidimensional, mostly two-dimensional data fields of rational numbers. A hidden layer of predictive weights weights the data from the input layer, propagates the results to the next layer, and so on, until an open output layer then outputs them.In the training phase, the weights are backpropagated, with large language models using recurrent networks with attention on the logged context.In the examples serving understanding, an attempt is made to determine a team's game results by weighting the number of toes, the games won so far, and the number of fans to assess future winning chances.",
            "metadata": []
        },
        {
            "id": "fe3ddd08-51f7-4ccd-ae86-d8b7a724bfcb",
            "cell_type": "markdown",
            "source": "Only one input date, here the number of toes:",
            "metadata": []
        },
        {
            "id": "f3c53651-23c2-4dda-9e9a-d6e5219b29ea",
            "cell_type": "code",
            "source": "# Das Netzwerk\ngewicht = 0.1\ndef neurales_netzwerk (eingabe, gewicht):\n    ausgabe = eingabe * gewicht\n    return ausgabe\n\n# Anwendung des Netzwerkes\nanzahl_der_zehen = [8.5, 9.5, 10, 9]\neingabe = anzahl_der_zehen[0]\nausgabe = neurales_netzwerk (eingabe, gewicht)\nprint(ausgabe)\n\n",
            "metadata": [],
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": "0.8500000000000001\n"
                }
            ],
            "execution_count": 3
        },
        {
            "id": "133b8848-a758-402a-90e0-ddb9bffff251",
            "cell_type": "markdown",
            "source": "Now with three input data (number of toes, previous winnings, number of fans):",
            "metadata": []
        },
        {
            "id": "34833e44-f410-40ae-9f28-cf0d16ac4853",
            "cell_type": "code",
            "source": "def propagierungsfunktion(a,b):\n    assert(len(a) == len(b))\n    ausgabe = 0\n    for i in range(len(a)):\n        ausgabe += (a[i] * b[i])\n    return ausgabe\n\ngewicht = [0.1, 0.2, 0] \n    \ndef neurales_netzwerk(eingabe, gewicht):\n    ausgabe = propagierungsfunktion(eingabe,gewicht)\n    return ausgabe\n\n\nzehen =  [8.5, 9.5, 9.9, 9.0]\ngewinnrate = [0.65, 0.8, 0.8, 0.9]\nfans = [1.2, 1.3, 0.5, 1.0]\n\neingabe = [zehen[0],gewinnrate[0],fans[0]]\nausgabe = neurales_netzwerk(eingabe,gewicht)\n\nprint(ausgabe)",
            "metadata": [],
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": "0.9800000000000001\n"
                }
            ],
            "execution_count": 5
        },
        {
            "id": "5bf21181-d513-4125-9bd5-28ed03c0d7da",
            "cell_type": "markdown",
            "source": "Jetzt mit der Bibliothek numy (Datenfelder, Vektoren, Matrizen):",
            "metadata": []
        },
        {
            "id": "98d43668-8d49-479a-a5e6-ea25fb3f3e53",
            "cell_type": "code",
            "source": "import numpy as ny\ngewicht = ny.array([0.1, 0.2, 0])\ndef neurales_netzwerk(eingabe, gewicht):\n    ausgabe = eingabe.dot(gewicht)\n    return ausgabe\n    \nzehen      = ny.array([8.5, 9.5, 9.9, 9.0])\ngewinnrate = ny.array([0.65, 0.8, 0.8, 0.9])\nfans       = ny.array([1.2, 1.3, 0.5, 1.0])\n\n\neingabe = ny.array([zehen[0],gewinnrate[0],fans[0]])\nausgabe = neurales_netzwerk(eingabe,gewicht)\n\nprint(ausgabe)",
            "metadata": [],
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": "0.9800000000000001\n"
                }
            ],
            "execution_count": 2
        },
        {
            "id": "feb32504-9cb5-4cd1-96a7-5ba34016dc7e",
            "cell_type": "markdown",
            "source": "The weights can be adjusted until the error is minimized.",
            "metadata": []
        },
        {
            "id": "5a2c5086-7d91-4565-ab52-19e1db0b6c5d",
            "cell_type": "code",
            "source": "# Prinzipielles Beispiel\ngewicht = 0.5\neingabe = 0.5\nerwuenschte_vorhersage = 0.8\n\nschrittweite = 0.001\n\nfor iteration in range(1101):\n\n    vorhersage = eingabe * gewicht\n    fehler = (vorhersage - erwuenschte_vorhersage) ** 2\n\n    print(\"Fehler:\" + str(fehler) + \" Vorhersage:\" + str(vorhersage))\n    \n    hoehere_vorhersage = eingabe * (gewicht + schrittweite)\n    tieferer_fehler = (gewünschte_vorhersage - hoehere_orhersage) ** 2\n\n    hoehere_vorhersage = eingabe * (gewicht - schrittweite)\n    tiefere_fehler = (erwuenschte_vorhersage - tiefere_vorhersage) ** 2\n\n    if(tieferer_fehler <  hoeherer_fehler):\n        gewicht = gewicht - schrittweite\n        \n    if(tieferer_fehler >  hoeherer_fehler):\n        gewicht = gewicht + schrittweite",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "04f4d530-6133-45d3-ade3-56d179da55b8",
            "cell_type": "code",
            "source": "# Trask, A. W.:\n# Neuronale Netze und Deep Learning kapieren\n# Der einfache Praxiseinstieg mit Beispielen in Python\n# 1. Auflage 2020\n# ISBN 978-3-7475-0017-0\n\nimport numpy as np\n\n# Objektklasse Datenfeld\nclass Tensor (object):\n    \n    def __init__(self,data,\n                 autograd=False,\n                 creators=None,\n                 creation_op=None,\n                 id=None):\n        \n        self.data = np.array(data)\n        self.autograd = autograd\n        self.grad = None\n\n        if(id is None):\n            self.id = np.random.randint(0,1000000000)\n        else:\n            self.id = id\n        \n        self.creators = creators\n        self.creation_op = creation_op\n        self.children = {}\n        \n        if(creators is not None):\n            for c in creators:\n                if(self.id not in c.children):\n                    c.children[self.id] = 1\n                else:\n                    c.children[self.id] += 1\n\n    def all_children_grads_accounted_for(self):\n        for id,cnt in self.children.items():\n            if(cnt != 0):\n                return False\n        return True \n        \n    def backward(self,grad=None, grad_origin=None):\n        if(self.autograd):\n \n            if(grad is None):\n                grad = Tensor(np.ones_like(self.data))\n\n            if(grad_origin is not None):\n                if(self.children[grad_origin.id] == 0):\n                    return\n                    print(self.id)\n                    print(self.creation_op)\n                    print(len(self.creators))\n                    for c in self.creators:\n                        print(c.creation_op)\n                    raise Exception(\"cannot backprop more than once\")\n                else:\n                    self.children[grad_origin.id] -= 1\n\n            if(self.grad is None):\n                self.grad = grad\n            else:\n                self.grad += grad\n            \n\n            assert grad.autograd == False\n            \n\n            if(self.creators is not None and \n               (self.all_children_grads_accounted_for() or \n                grad_origin is None)):\n\n                if(self.creation_op == \"add\"):\n                    self.creators[0].backward(self.grad, self)\n                    self.creators[1].backward(self.grad, self)\n                    \n                if(self.creation_op == \"sub\"):\n                    self.creators[0].backward(Tensor(self.grad.data), self)\n                    self.creators[1].backward(Tensor(self.grad.__neg__().data), self)\n\n                if(self.creation_op == \"mul\"):\n                    new = self.grad * self.creators[1]\n                    self.creators[0].backward(new , self)\n                    new = self.grad * self.creators[0]\n                    self.creators[1].backward(new, self)                    \n                    \n                if(self.creation_op == \"mm\"):\n                    c0 = self.creators[0]\n                    c1 = self.creators[1]\n                    new = self.grad.mm(c1.transpose())\n                    c0.backward(new)\n                    new = self.grad.transpose().mm(c0).transpose()\n                    c1.backward(new)\n                    \n                if(self.creation_op == \"transpose\"):\n                    self.creators[0].backward(self.grad.transpose())\n\n                if(\"sum\" in self.creation_op):\n                    dim = int(self.creation_op.split(\"_\")[1])\n                    self.creators[0].backward(self.grad.expand(dim,\n                                                               self.creators[0].data.shape[dim]))\n\n                if(\"expand\" in self.creation_op):\n                    dim = int(self.creation_op.split(\"_\")[1])\n                    self.creators[0].backward(self.grad.sum(dim))\n                    \n                if(self.creation_op == \"neg\"):\n                    self.creators[0].backward(self.grad.__neg__())\n                    \n                if(self.creation_op == \"sigmoid\"):\n                    ones = Tensor(np.ones_like(self.grad.data))\n                    self.creators[0].backward(self.grad * (self * (ones - self)))\n                \n                if(self.creation_op == \"tanh\"):\n                    ones = Tensor(np.ones_like(self.grad.data))\n                    self.creators[0].backward(self.grad * (ones - (self * self)))\n                \n                if(self.creation_op == \"index_select\"):\n                    new_grad = np.zeros_like(self.creators[0].data)\n                    indices_ = self.index_select_indices.data.flatten()\n                    grad_ = grad.data.reshape(len(indices_), -1)\n                    for i in range(len(indices_)):\n                        new_grad[indices_[i]] += grad_[i]\n                    self.creators[0].backward(Tensor(new_grad))\n                    \n                if(self.creation_op == \"cross_entropy\"):\n                    dx = self.softmax_output - self.target_dist\n                    self.creators[0].backward(Tensor(dx))\n                    \n    def __add__(self, other):\n        if(self.autograd and other.autograd):\n            return Tensor(self.data + other.data,\n                          autograd=True,\n                          creators=[self,other],\n                          creation_op=\"add\")\n        return Tensor(self.data + other.data)\n\n    def __neg__(self):\n        if(self.autograd):\n            return Tensor(self.data * -1,\n                          autograd=True,\n                          creators=[self],\n                          creation_op=\"neg\")\n        return Tensor(self.data * -1)\n    \n    def __sub__(self, other):\n        if(self.autograd and other.autograd):\n            return Tensor(self.data - other.data,\n                          autograd=True,\n                          creators=[self,other],\n                          creation_op=\"sub\")\n        return Tensor(self.data - other.data)\n    \n    def __mul__(self, other):\n        if(self.autograd and other.autograd):\n            return Tensor(self.data * other.data,\n                          autograd=True,\n                          creators=[self,other],\n                          creation_op=\"mul\")\n        return Tensor(self.data * other.data)    \n\n    def sum(self, dim):\n        if(self.autograd):\n            return Tensor(self.data.sum(dim),\n                          autograd=True,\n                          creators=[self],\n                          creation_op=\"sum_\"+str(dim))\n        return Tensor(self.data.sum(dim))\n    \n    def expand(self, dim,copies):\n\n        trans_cmd = list(range(0,len(self.data.shape)))\n        trans_cmd.insert(dim,len(self.data.shape))\n        new_data = self.data.repeat(copies).reshape(list(self.data.shape) + [copies]).transpose(trans_cmd)\n        \n        if(self.autograd):\n            return Tensor(new_data,\n                          autograd=True,\n                          creators=[self],\n                          creation_op=\"expand_\"+str(dim))\n        return Tensor(new_data)\n    \n    def transpose(self):\n        if(self.autograd):\n            return Tensor(self.data.transpose(),\n                          autograd=True,\n                          creators=[self],\n                          creation_op=\"transpose\")\n        \n        return Tensor(self.data.transpose())\n    \n    def mm(self, x):\n        if(self.autograd):\n            return Tensor(self.data.dot(x.data),\n                          autograd=True,\n                          creators=[self,x],\n                          creation_op=\"mm\")\n        return Tensor(self.data.dot(x.data))\n    \n    def sigmoid(self):\n        if(self.autograd):\n            return Tensor(1 \/ (1 + np.exp(-self.data)),\n                          autograd=True,\n                          creators=[self],\n                          creation_op=\"sigmoid\")\n        return Tensor(1 \/ (1 + np.exp(-self.data)))\n\n    def tanh(self):\n        if(self.autograd):\n            return Tensor(np.tanh(self.data),\n                          autograd=True,\n                          creators=[self],\n                          creation_op=\"tanh\")\n        return Tensor(np.tanh(self.data))\n    \n    def index_select(self, indices):\n\n        if(self.autograd):\n            new = Tensor(self.data[indices.data],\n                         autograd=True,\n                         creators=[self],\n                         creation_op=\"index_select\")\n            new.index_select_indices = indices\n            return new\n        return Tensor(self.data[indices.data])\n    \n    def softmax(self):\n        temp = np.exp(self.data)\n        softmax_output = temp \/ np.sum(temp,\n                                       axis=len(self.data.shape)-1,\n                                       keepdims=True)\n        return softmax_output\n    \n    def cross_entropy(self, target_indices):\n\n        temp = np.exp(self.data)\n        softmax_output = temp \/ np.sum(temp,\n                                       axis=len(self.data.shape)-1,\n                                       keepdims=True)\n        \n        t = target_indices.data.flatten()\n        p = softmax_output.reshape(len(t),-1)\n        target_dist = np.eye(p.shape[1])[t]\n        loss = -(np.log(p) * (target_dist)).sum(1).mean()\n    \n        if(self.autograd):\n            out = Tensor(loss,\n                         autograd=True,\n                         creators=[self],\n                         creation_op=\"cross_entropy\")\n            out.softmax_output = softmax_output\n            out.target_dist = target_dist\n            return out\n\n        return Tensor(loss)\n        \n    \n    def __repr__(self):\n        return str(self.data.__repr__())\n    \n    def __str__(self):\n        return str(self.data.__str__())  \n\nclass Layer(object):\n    \n    def __init__(self):\n        self.parameters = list()\n        \n    def get_parameters(self):\n        return self.parameters\n\n    \nclass SGD(object):\n    \n    def __init__(self, parameters, alpha=0.1):\n        self.parameters = parameters\n        self.alpha = alpha\n    \n    def zero(self):\n        for p in self.parameters:\n            p.grad.data *= 0\n        \n    def step(self, zero=True):\n        \n        for p in self.parameters:\n            \n            p.data -= p.grad.data * self.alpha\n            \n            if(zero):\n                p.grad.data *= 0\n\n\nclass Linear(Layer):\n\n    def __init__(self, n_inputs, n_outputs, bias=True):\n        super().__init__()\n        \n        self.use_bias = bias\n        \n        W = np.random.randn(n_inputs, n_outputs) * np.sqrt(2.0\/(n_inputs))\n        self.weight = Tensor(W, autograd=True)\n        if(self.use_bias):\n            self.bias = Tensor(np.zeros(n_outputs), autograd=True)\n        \n        self.parameters.append(self.weight)\n        \n        if(self.use_bias):        \n            self.parameters.append(self.bias)\n\n    def forward(self, input):\n        if(self.use_bias):\n            return input.mm(self.weight)+self.bias.expand(0,len(input.data))\n        return input.mm(self.weight)\n\n\nclass Sequential(Layer):\n    \n    def __init__(self, layers=list()):\n        super().__init__()\n        \n        self.layers = layers\n    \n    def add(self, layer):\n        self.layers.append(layer)\n        \n    def forward(self, input):\n        for layer in self.layers:\n            input = layer.forward(input)\n        return input\n    \n    def get_parameters(self):\n        params = list()\n        for l in self.layers:\n            params += l.get_parameters()\n        return params\n\n\nclass Embedding(Layer):\n    \n    def __init__(self, vocab_size, dim):\n        super().__init__()\n        \n        self.vocab_size = vocab_size\n        self.dim = dim\n        \n        # this random initialiation style is just a convention from word2vec\n        self.weight = Tensor((np.random.rand(vocab_size, dim) - 0.5) \/ dim, autograd=True)\n        \n        self.parameters.append(self.weight)\n    \n    def forward(self, input):\n        return self.weight.index_select(input)\n\n\nclass Tanh(Layer):\n    def __init__(self):\n        super().__init__()\n    \n    def forward(self, input):\n        return input.tanh()\n\n\nclass Sigmoid(Layer):\n    def __init__(self):\n        super().__init__()\n    \n    def forward(self, input):\n        return input.sigmoid()\n    \n\nclass CrossEntropyLoss(object):\n    \n    def __init__(self):\n        super().__init__()\n    \n    def forward(self, input, target):\n        return input.cross_entropy(target)\n\n    \n# Sprachmodell Long Short Term Memory\nclass LSTMCell(Layer):\n    \n    def __init__(self, n_inputs, n_hidden, n_output):\n        super().__init__()\n\n        self.n_inputs = n_inputs\n        self.n_hidden = n_hidden\n        self.n_output = n_output\n\n        self.xf = Linear(n_inputs, n_hidden)\n        self.xi = Linear(n_inputs, n_hidden)\n        self.xo = Linear(n_inputs, n_hidden)        \n        self.xc = Linear(n_inputs, n_hidden)        \n        \n        self.hf = Linear(n_hidden, n_hidden, bias=False)\n        self.hi = Linear(n_hidden, n_hidden, bias=False)\n        self.ho = Linear(n_hidden, n_hidden, bias=False)\n        self.hc = Linear(n_hidden, n_hidden, bias=False)        \n        \n        self.w_ho = Linear(n_hidden, n_output, bias=False)\n        \n        self.parameters += self.xf.get_parameters()\n        self.parameters += self.xi.get_parameters()\n        self.parameters += self.xo.get_parameters()\n        self.parameters += self.xc.get_parameters()\n\n        self.parameters += self.hf.get_parameters()\n        self.parameters += self.hi.get_parameters()        \n        self.parameters += self.ho.get_parameters()        \n        self.parameters += self.hc.get_parameters()                \n        \n        self.parameters += self.w_ho.get_parameters()        \n    \n    def forward(self, input, hidden):\n        \n        prev_hidden = hidden[0]        \n        prev_cell = hidden[1]\n        \n        f = (self.xf.forward(input) + self.hf.forward(prev_hidden)).sigmoid()\n        i = (self.xi.forward(input) + self.hi.forward(prev_hidden)).sigmoid()\n        o = (self.xo.forward(input) + self.ho.forward(prev_hidden)).sigmoid()        \n        g = (self.xc.forward(input) + self.hc.forward(prev_hidden)).tanh()        \n        c = (f * prev_cell) + (i * g)\n\n        h = o * c.tanh()\n        \n        output = self.w_ho.forward(h)\n        return output, (h, c)\n    \n    def init_hidden(self, batch_size=1):\n        init_hidden = Tensor(np.zeros((batch_size,self.n_hidden)), autograd=True)\n        init_cell = Tensor(np.zeros((batch_size,self.n_hidden)), autograd=True)\n        init_hidden.data[:,0] += 1\n        init_cell.data[:,0] += 1\n        return (init_hidden, init_cell)\n\nimport sys,random,math\nfrom collections import Counter\nimport numpy as np\nimport sys\n\nnp.random.seed(0)\n\n# Einlesen des VKG KORPUS\nf = open('VKGKORPUS.TXT','r')\nraw = f.read()\nf.close()\n\n\n\nvocab = list(set(raw))\nword2index = {}\nfor i,word in enumerate(vocab):\n    word2index[word]=i\nindices = np.array(list(map(lambda x:word2index[x], raw)))\n\nembed = Embedding(vocab_size=len(vocab),dim=512)\nmodel = LSTMCell(n_inputs=512, n_hidden=512, n_output=len(vocab))\nmodel.w_ho.weight.data *= 0\n\ncriterion = CrossEntropyLoss()\noptim = SGD(parameters=model.get_parameters() + embed.get_parameters(), alpha=0.05)\n\ndef generate_sample(n=30, init_char=' '):\n    s = \"\"\n    hidden = model.init_hidden(batch_size=1)\n    input = Tensor(np.array([word2index[init_char]]))\n    for i in range(n):\n        rnn_input = embed.forward(input)\n        output, hidden = model.forward(input=rnn_input, hidden=hidden)\n#         output.data *= 25\n#         temp_dist = output.softmax()\n#         temp_dist \/= temp_dist.sum()\n\n#         m = (temp_dist > np.random.rand()).argmax()\n        m = output.data.argmax()\n        c = vocab[m]\n        input = Tensor(np.array([m]))\n        s += c\n    return s\n\nbatch_size = 16\nbptt = 25\nn_batches = int((indices.shape[0] \/ (batch_size)))\n\ntrimmed_indices = indices[:n_batches*batch_size]\nbatched_indices = trimmed_indices.reshape(batch_size, n_batches).transpose()\n\ninput_batched_indices = batched_indices[0:-1]\ntarget_batched_indices = batched_indices[1:]\n\nn_bptt = int(((n_batches-1) \/ bptt))\ninput_batches = input_batched_indices[:n_bptt*bptt].reshape(n_bptt,bptt,batch_size)\ntarget_batches = target_batched_indices[:n_bptt*bptt].reshape(n_bptt, bptt, batch_size)\nmin_loss = 1000\n\n# Training des neuronalen Netztes\ndef train(iterations=400):\n    for iter in range(iterations):\n        total_loss = 0\n        n_loss = 0\n\n        hidden = model.init_hidden(batch_size=batch_size)\n        batches_to_train = len(input_batches)\n    #     batches_to_train = 32\n        for batch_i in range(batches_to_train):\n\n            hidden = (Tensor(hidden[0].data, autograd=True), Tensor(hidden[1].data, autograd=True))\n\n            losses = list()\n            for t in range(bptt):\n                input = Tensor(input_batches[batch_i][t], autograd=True)\n                rnn_input = embed.forward(input=input)\n                output, hidden = model.forward(input=rnn_input, hidden=hidden)\n\n                target = Tensor(target_batches[batch_i][t], autograd=True)    \n                batch_loss = criterion.forward(output, target)\n\n                if(t == 0):\n                    losses.append(batch_loss)\n                else:\n                    losses.append(batch_loss + losses[-1])\n\n            loss = losses[-1]\n\n            loss.backward()\n            optim.step()\n            total_loss += loss.data \/ bptt\n\n            epoch_loss = np.exp(total_loss \/ (batch_i+1))\n            min_loss =1000\n            if(epoch_loss < min_loss):\n                min_loss = epoch_loss\n                print()\n\n            log = \"\\r Iter:\" + str(iter)\n            log += \" - Alpha:\" + str(optim.alpha)[0:5]\n            log += \" - Batch \"+str(batch_i+1)+\"\/\"+str(len(input_batches))\n            log += \" - Min Loss:\" + str(min_loss)[0:5]\n            log += \" - Loss:\" + str(epoch_loss)\n            if(batch_i == 0):\n                log += \" - \" + generate_sample(n=70, init_char='T').replace(\"\\n\",\" \")\n            if(batch_i % 1 == 0):\n                sys.stdout.write(log)\n\n        optim.alpha *= 0.99\n\n\n\n\ntrain(100)\n\ndef generate_sample(n=30, init_char=' '):\n    s = \"\"\n    hidden = model.init_hidden(batch_size=1)\n    input = Tensor(np.array([word2index[init_char]]))\n    for i in range(n):\n        rnn_input = embed.forward(input)\n        output, hidden = model.forward(input=rnn_input, hidden=hidden)\n        output.data *= 15\n        temp_dist = output.softmax()\n        temp_dist \/= temp_dist.sum()\n\n#         m = (temp_dist > np.random.rand()).argmax() # sample from predictions\n        m = output.data.argmax() # take the max prediction\n        c = vocab[m]\n        input = Tensor(np.array([m]))\n        s += c\n    return s\nprint(generate_sample(n=500, init_char='\\n'))\n\n",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "2da20a86-0024-458d-9aae-1a5a1dc588d8",
            "cell_type": "code",
            "source": "print(generate_sample(n=500, init_char='\\n'))",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "554f14b2-a1f1-4dea-93c5-00847f3d6717",
            "cell_type": "markdown",
            "source": "Issue of a generated example:",
            "metadata": []
        },
        {
            "id": "aa88cb25-5495-42b1-9074-8b58f143fc6e",
            "cell_type": "markdown",
            "source": "\nKBG VBG \nKBBD VBBD KBA VBA KAE VAE KAA VAA \nKBBD VBBD KBA VBA KBBD VBBD KBA VBA KBBD VBBD KBA VBA KAE VAE \nKAA VAA \nKAV VAV \nKBG VBG \nKBBD VBBD KBA VBA KAE VAE KAE VAE KAE VAE KAE VAE KAA VAA \nKBBD VBBD KBA VBA KAE VAE KAE VAE KAA VAA \nKBBD VBBD KBA VBA KAE VAE KAA VAA \nKBBD VBBD KBA VBA KBBD VBBD KBA VBA KAE VAE KAA VAA \nKAV VAV\nKBG VBG\nKBBD KBA VBA KBBD VBBD KBA VBA KAE VAE KAE VAE KAA VAA\nKBBD VBBD KBA VBA KBBD KBA VBA KBBD VBBD KBA VBA KBBD VBBD KBA \nKAE VAE KAA VAA\nKBBD VBBD KBA VBA KAE VAE KAE VAE VAE KAA VAA\nKAV VAV",
            "metadata": []
        },
        {
            "id": "41e8de0f-c027-4d31-abc2-9a5047159ce2",
            "cell_type": "code",
            "source": "print(generate_sample(n=500, init_char=' '))",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "78fb1940-eeec-4c24-bf1e-1fab10bffdeb",
            "cell_type": "markdown",
            "source": "In contrast to cognitive models (ARS, Koop, P. Grammar Induction, Parser, Grammar Transduction), such a large language model explains nothing and therefore large language models are celebrated by postmodernism, posthumanism, and transhumanism with parasitic intent.If one wants to write a textbook on the rules of sales conversations but ends up with a software agent that enjoys conducting sales conversations, one has done poor work at a very high level.\n\n",
            "metadata": []
        },
        {
            "id": "5c6212ea-3f94-4c1d-b18f-221131d26aa0",
            "cell_type": "code",
            "source": "",
            "metadata": {
                "trusted": true
            },
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "ce86d174-223c-452f-85a4-eb69b5d9e18e",
            "cell_type": "code",
            "source": "",
            "metadata": {
                "trusted": true
            },
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "f1238e6d-2c16-49a9-8348-adc9cee4ecc5",
            "cell_type": "markdown",
            "source": "Social structures and processesCausal inference with probabilistic context-free grammars and Bayesian networksWith multi-agent systems and decision trees for role agents in dialogue according to empirically validated action grammarAlgorithmically recursive sequence analysis",
            "metadata": []
        },
        {
            "id": "2869eba5-58a9-44d4-a198-b4a0818f5a4a",
            "cell_type": "markdown",
            "source": "Social structures and processes leave behind traces that are physically and semantically nonspecific, which can be read as protocols of their reproduction and transformation. When read in this way, the protocols are texts—discrete, finite sequences of symbols. The rules governing reproduction and transformation can be reconstructed as probabilistic, context-free grammars or as Bayesian networks. This reconstruction then represents a causal inference of the transformation rules of social structures and processes. In the example presented here, the protocol is an audio recording of a sales conversation at a weekly market (https:\/\/github.com\/pkoopongithub\/algorithmisch-rekursive-sequenzanalyse\/blob\/main\/Aachen\\_280694\\_11Uhr.mp3). The sequence analysis of the transcribed protocol (https:\/\/github.com\/pkoopongithub\/algorithmisch-rekursive-sequenzanalyse\/blob\/main\/oechsle.pdf) and the coding with the generated categories (https:\/\/github.com\/pkoopongithub\/algorithmisch-rekursive-sequenzanalyse\/blob\/main\/fallstruktur.pdf) are also provided there.\n",
            "metadata": []
        },
        {
            "id": "2941e596-683c-47dc-bc0e-74bf5cab7dae",
            "cell_type": "code",
            "source": ";;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;; Paul Koop M.A. GRAMMATIKINDUKTION empirisch                    ;;\n;; gesicherter Verkaufsgespraeche                                 ;;\n;;                                                                ;;\n;; Die Simulation wurde ursprunglich entwickelt,                  ;;\n;; um die Verwendbarkeit von kontextfreien Grammatiken            ;;\n;; fuer die Algorithmisch Rekursive Sequanzanalyse                ;;\n;; zu ueberpruefen\t\t\t\t\t          ;;\n;; Modellcharakter hat allein der Quelltext.                      ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n;;  __|__   ___|__    __|__   __|__   __|__   __|__               ;;\n;;  |   |   |    |    |   |   |   |   |   |   |   |               ;;\n;; KBG->VBGKBBd->VBBdKBA->VBAKAE->VAEKAA->VAAKAV-> VAV            ;;\n;;                                                                ;;\n;; Die Produktionen --> sind entsprechend ihrer                   ;;\n;; emp. Auftrittswahrscheinlichkeit gewichtet                     ;;\n;; DIE GRAMMATIK WIRD AUS DEM KORPUS INDUZIERT                    ;;\n;; ein Left-to-the-Right-Modell                                   ;;                                                               \n;;                                                                ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;; Transformationsmatrix                                      ;;;;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;a\tb c\td e\tf c\td\te\tf\tg\th\ti\tj\tg\th\ti\tj\tk\tl ;;\n;;0\t1 2\t3 4\t5 2\t3\t4\t5\t6\t7\t8\t9\t6\t7\t8\t9\t10\t11;;\n;;                                                                ;;\n;;\t0\t1\t2\t3\t4\t5\t6\t7\t8\t9\t10\t11\t12\t13\t      ;;\n;;0\t-\t1\t\t\t\t\t\t\t\t\t\t\t\t\t\t  ;;\n;;1\t-\t\t2\t\t\t\t\t\t\t\t\t\t\t\t\t  ;;\n;;2\t-\t\t-\t2\t\t\t\t\t\t\t\t\t\t\t\t  ;;\n;;3\t-\t\t-\t\t2\t\t\t\t\t\t\t\t\t\t\t  ;;\n;;4\t-\t-\t-\t-\t-\t2\t\t\t\t\t\t\t\t\t\t  ;;\n;;5\t-\t\t1\t\t\t\t2\t\t\t\t\t\t\t\t\t  ;;\n;;6\t-\t\t-\t\t\t\t-\t2\t\t\t\t\t\t\t\t  ;;\n;;7\t-\t\t-\t\t\t\t-\t\t2\t\t\t\t\t\t\t  ;;\n;;8\t-\t\t-\t-\t-\t-\t-\t-\t-\t2\t\t\t\t\t\t  ;;\n;;9\t-\t\t\t\t\t\t1\t\t\t\t1\t\t\t\t\t  ;;\n;;10\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t1\t\t\t  ;;\n;;11\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t  ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;; Begruessung           := BG                                    ;;\n;; Bedarf                := Bd                                    ;;\n;; Bedarfsargumentation  := BA                                    ;;\n;; Abschlusseinwaende    := AE                                    ;;\n;; Verkaufsabschluss     := AA                                    ;;\n;; Verabscheidung        := AV                                    ;;\n;; Kunde                 := vorangestelltes K                     ;;\n;; Verkaeufer            := vorangestelltes V                     ;;\n;;                                                                ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n;; Korpus \n   (define korpus (list 'KBG 'VBG 'KBBd 'VBBd 'KBA 'VBA 'KBBd 'VBBd 'KBA 'VBA 'KAE 'VAE 'KAE 'VAE 'KAA 'VAA 'KAV 'VAV));; 0 - 17\n   \n   \n\n   ;; Korpus durchlaufen  \n    (define (lesen korpus)\n     ;; car ausgeben\n    (display (car korpus))\n     ;; mit cdr weitermachen\n     (if(not(null? (cdr korpus)))\n       (lesen (cdr korpus))\n       ;;(else)\n     )\n   )\n   \n;; Lexikon \n   (define lexikon (vector 'KBG 'VBG 'KBBd 'VBBd 'KBA 'VBA 'KAE 'VAE 'KAA 'VAA 'KAV 'VAV)) ;; 0 - 12 \n   \n\n\n   ;; Index fuer Zeichen ausgeben \n    (define (izeichen zeichen)\n     (define wertizeichen 0)\n     (do ((i 0 (+ i 1)))\n      ( (equal? (vector-ref lexikon i) zeichen)) \n      (set! wertizeichen (+ 1 i))\n     )\n     ;;index zurueckgeben\n     wertizeichen\n   )\n   \n;; transformationsmatrix \n   (define zeile0 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile1 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile2 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile3 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile4 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile5 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile6 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile7 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile8 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile9 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile10 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile11 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile12 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile13 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile14 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile15 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile16 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   (define zeile17 (vector 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))\n   \n   (define matrix (vector zeile0 zeile1 zeile2 zeile3 zeile4 zeile5 zeile6 zeile7 zeile8 zeile9 zeile10 zeile11 zeile12 zeile13 zeile14 zeile15 zeile16 zeile17))\n   \n \n   ;; Transformationen zaehlen \n      ;; Korpus durchlaufen  \n   (define (transformationenZaehlen korpus)\n     ;; car zaehlen\n      (vector-set! (vector-ref matrix (izeichen (car korpus))) (izeichen (car(cdr korpus))) (+ 1 (vector-ref  (vector-ref matrix (izeichen (car korpus))) (izeichen (car(cdr korpus))))))\n     ;; mit cdr weitermachen\n      (if(not(null? (cdr (cdr korpus))))\n       (transformationenZaehlen (cdr korpus))\n       ;;(else)\n      )\n   )\n\n   \n   ;; Transformation aufaddieren\n   \n   ;; Zeilensummen bilden und Prozentwerte bilden\n  \n\n;; Grammatik\n   (define grammatik (list '- ))\n   \n   ;; aus matrix regeln bilden und regeln in grammatik einfügene \n   (define (grammatikerstellen matrix)\n    (do ((a 0 (+ a 1)))\n        ((= a 12) )(newline)\n      (do ((b 0 (+ b 1)))\n          ((= b 12))\n        (if (< 0 (vector-ref  (vector-ref matrix a) b) )  \n         (display (cons (vector-ref lexikon a) (cons '-> (vector-ref lexikon b))))\n         )\n      )\n    )\n   )",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "8a998033-e0a8-4585-b368-c1cdcde79160",
            "cell_type": "markdown",
            "source": "To create the grammar, the transformation table is created and from this the grammar is derived.",
            "metadata": []
        },
        {
            "id": "b5126980-38fb-480c-8b2c-e530c303470d",
            "cell_type": "code",
            "source": " (transformationenZaehlen korpus)\n (grammatikerstellen matrix)",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "9c578f33-d92f-424f-9a39-f3f2efb38902",
            "cell_type": "markdown",
            "source": "Die Grammatik wird dann erstellt",
            "metadata": []
        },
        {
            "id": "858ded95-2bd3-4e45-a845-68052994612e",
            "cell_type": "code",
            "source": "(KBG -> . VBG)\n(VBG -> . KBBd)\n(KBBd -> . VBBd)\n(VBBd -> . KBA)\n(KBA -> . VBA)\n(VBA -> . KBBd)(VBA -> . KAE)\n(KAE -> . VAE)\n(VAE -> . KAE)(VAE -> . KAA)\n(KAA -> . VAA)\n(VAA -> . KAV)\n(KAV -> . VAV)\n",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "1afb30a3-fb8e-4bb8-95ef-5e8fc145c65b",
            "cell_type": "markdown",
            "source": "With this grammar and the empirical occurrence probabilities, a transducer can be created that simulates protocols.",
            "metadata": []
        },
        {
            "id": "6ed91914-ad90-4ec6-8401-051e3ed86c93",
            "cell_type": "code",
            "source": ";;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;; Paul Koop M.A. 1994 Sequenzanalyse empirisch                   ;;\n;; gesicherter Verkaufsgespraeche                                 ;;\n;;                                                                ;;\n;; Die Simulation wurde ursprunglich entwickelt,                  ;;\n;; um die Verwendbarkeit von kontextfreien Grammatiken            ;;\n;; fuer die Algorithmisch Rekursive Sequanzanalyse                ;;\n;; zu ueberpruefen\t\t\t\t\t          ;;\n;; Modellcharakter hat allein der Quelltext.                      ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;;                        VKG                                     ;;\n;;    _____________________|_____________________                 ;;\n;;    |                    |                    |                 ;;\n;;    BG------------------>VT------------------>AV                ;;\n;;    |             _______|________            |                 ;;\n;;    |             |              |            |                 ;;\n;;    |             B------------->A            |                 ;;\n;;    |        _____|____       ___|_____       |                 ;;\n;;    |        |        |       |       |       |                 ;;\n;;    |       BBd----->BA      AE----->AA       |                 ;;\n;;  __|__   ___|__    __|__   __|__   __|__   __|__               ;;\n;;  |   |   |    |    |   |   |   |   |   |   |   |               ;;\n;; KBG->VBGKBBd->VBBdKBA->VBAKAE->VAEKAA->VAAKAV-> VAV            ;;\n;;                                                                ;;\n;; Die Produktionen --> sind entsprechend ihrer                   ;;\n;; emp. Auftrittswahrscheinlichkeit gewichtet                     ;;\n;; Waehrend die Kanten des Strukturbaumes ein Top-down-Modell     ;;\n;; wiedergeben, bilden die Produktionen                           ;;\n;; des Kategoriensystem-Systems (K-System)                        ;;\n;; ein Left-to-the-Right-Modell                                   ;;                                                               \n;;                                                                ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;; Verkaufsgespraech     := VKG                                   ;;\n;; Verkaufstaetigkeit    := VT                                    ;;\n;; Bedarfsteil           := B                                     ;;\n;; Abschlussteil         := A                                     ;;\n;; Begruessung           := BG                                    ;;\n;; Bedarf                := Bd                                    ;;\n;; Bedarfsargumentation  := BA                                    ;;\n;; Abschlusseinwaende    := AE                                    ;;\n;; Verkaufsabschluss     := AA                                    ;;\n;; Verabscheidung        := AV                                    ;;\n;; Kunde                 := vorangestelltes K                     ;;\n;; Verkaeufer            := vorangestelltes V                     ;;\n;;                                                                ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;; - Die Fallstruktur wird rein physikalisch protokolliert        ;;\n;;   mechanisch, magnetisch, optisch oder digital D\/A-Wandler     ;;\n;;   (interpretationsfreies physikalisches Protokoll)             ;;\n;;   z.B. Mikrophonierung, Kinematographie,                       ;;\n;;   Optik, Akustik, mechanische, analoge, digitale Technik       ;;\n;; - Das Protokoll wird transkribiert                             ;;\n;;   (Vertextung, diskrete Ereigniskette,                         ;;\n;;    Plausibilitaet, Augenscheinvalidität)                       ;;\n;;   Searle, Austin: Sprechakte, Paraphrase, moegl.               ;;\n;;   Intentionen, konstitutive, konventionelle Regeln             ;;\n;; - Durch Lesartenproduktion und Lesartenfalsifikation           ;;                      \n;;   wird Sequenzstelle fuer Sequenzstelle informell              ;;\n;;   das Regelsystem erzeugt                                      ;;\n;;   Searle, Austin: Sprechakte, Paraphrase, moegl.               ;;\n;;   Intentionen, konstitutive, konventionelle Regeln             ;;\n;;   (bei jeder Sequenzstelle werden extensiv Lesarten erzeugt,   ;;\n;;    die Lesarten jeder nachfolgenden Sequenzstelle              ;;\n;;    falsifizieren die Lesarten der vorausgehenden Sequenzstelle,;;\n;;    Oevermann: Sequenzanalyse                                   ;;\n;;    das Regelsystem bildet ein kontextfreies Chomskysystem,     ;;\n;;    die Ersetzungsregeln sind nach Auftrittswahrscheinlichkeit  ;;\n;;    gewichtet, die Interkodierreliabilitaet wird bestimmt,      ;;\n;;    z.B. Mayring R, Signifikanz z.B. Chi-Quadrat)               ;;\n;; - Die Regeln werden in ein K-System uebersetzt                 ;;\n;;   dabei werden die Auftrittshaeufigkeiten kumuliert            ;;\n;;   um den Rechenaufwand zur Laufzeit zu minimieren              ;;\n;;   Chomsky: formale Sprachen                                    ;;               \n;; - Auf einem Computer wird unter LISP eine Simulation gefahren  ;;\n;;   McCarthy, Papert, Solomon, Bobrow, Feuerzeig               \n;; - Das Resultat der Simulation, eine terminale Zeichenkette,    ;;\n;;   wird in ein Protokoll uebersetzt                             ;;\n;; - Das künstlich erzeugte Protokoll wird auf seine Korrelation  ;;               \n;;   mit empirischen Protokollen ueberprueft                      ;;                 \n;; - Bei Bedarf werden Korrekturen am K-System vorgenommen        ;;\n;;   und die Simulation wird wiederholt                           ;;\n;;                                                                ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;; Welt 3 Popper                                                  ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n\n(setq w3\n'(\n (anfang 100 (s vkg)) ;; hier nur Fallstruktur Verkaufsgespraeche\n ((s vkg) 100 ende) \n )\n)\n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;; Kunde teilt Bedarf mit, Verkaeufer spiegelt Bedarf Kunde       ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n\n(setq bbd\n'(\n (kbbd 100 vbbd)\n )\n)\n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;; wechselseitige Bedarfsargumentation nach Bedarfsmitteilung     ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n\n(setq ba\n'(\n (kba 100 vba)\n )\n)\n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;; wechselseitige Einwandsabklaerung                              ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n\n(setq ae\n'(\n(kae 100 vae)\n )\n)\n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;; Verkaufsabschluss                                              ;;\n;; des Abschlussteils nach den Abschlusseinwaenden                ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n\n(setq aa\n'(\n (kaa 100 vaa)\n )\n)\n\n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;; Bedarfsteils                                                   ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n(setq b\n'(\n ((s bbd) 100 (s ba))\n )\n)\n\n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;; Abschlussteil                                                  ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n\n(setq a\n'(\n ((s ae)50(s ae))\n ((s ae)100(s aa))\n )\n)\n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;; Verkaufsteil                                                   ;;\n;; im Anschluss an Begruessung                                    ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n(setq vt\n'(\n ((s b)50(s b))\n ((s b)100(s a))\n )\n)\n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;; Begruessung                                                    ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n(setq bg\n'(\n (kbg 100 vbg)\n )\n)\n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;; Verabschiedung                                                 ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n\n(setq av\n'(\n (kav 100 vav)\n )\n)\n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;; Verkaufsgespraech                                              ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n\n(setq vkg\n'(\n ((s bg)100(s vt))\n ((s vt)50(s vt))\n ((s vt)100(s av))\n )\n)\n\n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;; Algorithmus ueber generativer Struktur                         ;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n\n;; Generiert die Sequenz\n(defun gs (st r);; Uebergabe Sequenzstelle und Regelliste \n(cond\n\n  ;; gibt nil zurück, wenn das Sequenzende ereicht ist\n  ((equal st nil) nil)\n\n  ;; gibt terminale Sequenzstelle mit Nachfolgern zurueck \n  ((atom st)(cons st(gs(next st r(random 101))r)))\n      \n  ;; gibt expand. nichtterm. Sequenzstelle mit Nachfolger zurueck\n  (t (cons(eval st)(gs(next st r(random 101))r)))       \n)\n)\n\n;; Generiert nachfolgende Sequenzstelle\n(defun next (st r z);; Sequenzstelle, Regeln und Haeufigkeitsmass \n(cond\n\n  ;; gibt nil zurueck, wenn das Sequenzende erreicht ist\n  ((equal r nil)nil)\n\n  ;; waehlt Nachfolger mit Auftrittsmass h                                 \n  (\n    (\n       and(<= z(car(cdr(car r))))\n       (equal st(car(car r)))\n    )\n   (car(reverse(car r)))\n  )\n\n  ;; in jedem anderen Fall wird Regelliste weiter durchsucht\n  (t(next st (cdr r)z))                \n)\n)\n\n;; waehlt erste Sequenzstelle aus Regelliste\n;;vordefinierte funktion first wird ueberschrieben, alternative umbenennen\n(defun first (list)\n(car(car list))\n)\n\n;; startet Simulation fuer eine Fallstruktur\n(defun s (list) ;; die Liste mit dem K-System wird uebergeben     \n(gs(first list)list)\n) \n\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n;;                                                                ;;\n;; Ruft den Algorithmus auf \/ Welt 3 Popper \/alt. jew. Fallstrukt.;;\n;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n;; alternativ (s vkg) \/ von der Konsole aus (s w3) oder (s vkg)\n(s w3) ",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "218b266d-bcde-4ee2-af25-e4bb0ec1779e",
            "cell_type": "markdown",
            "source": "CL-USER 20 > (s w3)\n(ANFANG ((KBG VBG) (((KBBD VBBD) (KBA VBA)) ((KAE VAE) (KAA VAA))) (((KBBD VBBD) (KBA VBA)) ((KAE VAE) (KAA VAA))) (((KBBD VBBD) (KBA VBA)) ((KBBD VBBD) (KBA VBA)) ((KAE VAE) (KAA VAA))) (((KBBD VBBD) (KBA VBA)) ((KBBD VBBD) (KBA VBA)) ((KBBD VBBD) (KBA VBA)) ((KAE VAE) (KAA VAA))) (KAV VAV)) ENDE)",
            "metadata": []
        },
        {
            "id": "5ad680cd-b1f6-4e41-8997-f16a205bf2d6",
            "cell_type": "markdown",
            "source": "A more comprehensive example without the brackets:",
            "metadata": []
        },
        {
            "id": "03c4c087-97d1-4eeb-b001-36358f367abb",
            "cell_type": "markdown",
            "source": "KBG VBG \nKBBD VBBD KBA VBA KAE VAE KAA VAA \nKBBD VBBD KBA VBA KBBD VBBD KBA VBA KBBD VBBD KBA VBA KAE VAE \nKAA VAA \nKAV VAV \nKBG VBG \nKBBD VBBD KBA VBA KAE VAE KAE VAE KAE VAE KAE VAE KAA VAA \nKBBD VBBD KBA VBA KAE VAE KAE VAE KAA VAA \nKBBD VBBD KBA VBA KAE VAE KAA VAA \nKBBD VBBD KBA VBA KBBD VBBD KBA VBA KAE VAE KAA VAA \nKAV VAV\nKBG VBG\nKBBD KBA VBA KBBD VBBD KBA VBA KAE VAE KAE VAE KAA VAA\nKBBD VBBD KBA VBA KBBD KBA VBA KBBD VBBD KBA VBA KBBD VBBD KBA \nKAE VAE KAA VAA\nKBBD VBBD KBA VBA KAE VAE KAE VAE VAE KAA VAA\nKAV VAV",
            "metadata": []
        },
        {
            "id": "f29b8486-f00a-4d5f-aace-00b96fa9ce51",
            "cell_type": "code",
            "source": "",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "0344631d-246f-4f4b-960e-6e422dc756cb",
            "cell_type": "markdown",
            "source": "The linguistic corpus in this example: The words of the corpus are separated by spaces. The words of the corpus are categories that are assigned to the changing interactions of the buyer and seller in a qualitative interpretation of the transcript of a sales conversation carried out in 1993 and 1994. The audio files, the transcripts, the interpretations, and the generated source codes (Inductor Scheme, Parser Pascal, Transductor Lisp) are freely available for download at the location where this Jupyter Notebook file is located.",
            "metadata": []
        },
        {
            "id": "40855371-260a-4813-a9e2-ccc2762641a7",
            "cell_type": "markdown",
            "source": "The program reads the corpus from a file and extracts the terminal symbols by searching for all substrings that start with 'K' or 'V' and consist of at least one uppercase letter. The leading 'K' or 'V' are removed from the terminal symbols to obtain the nonterminal symbols. Then, the rule productions are created by collecting all terminal symbols that correspond to each nonterminal symbol. Finally, the program outputs the grammar rules and the start symbol.",
            "metadata": []
        },
        {
            "id": "a85968d0-7ab6-430b-8ebc-fb732ab972fa",
            "cell_type": "code",
            "source": "PROGRAM parser (INPUT,OUTPUT);\nUSES CRT;\n(***************************************************************************)\n(* Paul Koop Chart Parser VKG                                              *)\n(*                                                                         *)\n(***************************************************************************)\n\n  (*-----------------------------------------------------------------------*)\n  (* Vereinbarungsteil                                                     *)\n  (*-----------------------------------------------------------------------*)\n\n  CONST\n    c0               =     0;\n    c1               =     1;\n    c2               =     2;\n    c3               =     3;\n    c4               =     4;\n    c5               =     5;\n   c10               =    10;\n   c11               =    11;\n   cmax              =    80;\n   cwort             =    20;\n   CText             :    STRING(.cmax.) = '';\n   datei             =    'LEXIKONVKG.ASC';\n   blank             =    ' ';\n\n   CopyRight\n   =    'Demo-Parser Chart-Parser Version 1.0(c)1992 by Paul Koop';\n\n  TYPE\n   TKategorien       = ( Leer, VKG, BG, VT, AV, B, A, BBD, BA, AE, AA,\n                          KBG, VBG, KBBD, VBBD, KBA, VBA, KAE, VAE,\n                          KAA, VAA, KAV, VAV);\n\n\n   PTKategorienListe = ^TKategorienListe;\n   TKategorienListe  = RECORD\n                        Kategorie :TKategorien;\n                        weiter    :PTKategorienListe;\n                       END;\n\n   PTKante           = ^TKante;\n   PTKantenListe     = ^TKantenListe;\n\n   TKantenListe      = RECORD\n                        kante:PTKante;\n                        next :PTKantenListe;\n                       END;\n\n   TKante            = RECORD\n                        Kategorie :TKategorien;\n                        vor,\n                        nach,\n                        zeigt     :PTKante;\n                        gefunden  :PTKantenListe;\n                        aktiv     :BOOLEAN;\n                        nummer    :INTEGER;\n                        nachkomme :BOOLEAN;\n                        CASE  Wort:BOOLEAN OF\n                         TRUE :\n                             (inhalt:STRING(.cwort.););\n                         FALSE:\n                             (gesucht :PTKategorienListe;);\n                        END;\n\n\n   TWurzel    = RECORD\n                  spalte,\n                  zeigt     :PTKante;\n                 END;\n\n   TEintrag    = RECORD\n                 A,I   :PTKante;\n                 END;\n\n   PTAgenda    = ^TAgenda;\n   TAgenda     = RECORD\n                  A,I  :PTKante;\n                  next,\n                  back : PTAgenda;\n                 END;\n\n   PTLexElem   = ^TLexElem;\n   TLexElem    = RECORD\n                  Kategorie: TKategorien;\n                  Terminal : STRING(.cwort.);\n                  naechstes: PTLexElem;\n                 END;\n\n   TGrammatik  = ARRAY (.c1..c10.)\n                 OF\n                 ARRAY (.c1..c4.)\n                 OF TKategorien;\n  CONST\n   Grammatik :      TGrammatik =\n               (\n                (VKG, BG,      VT,    AV),\n                (BG,  KBG,     VBG,   Leer),\n                (VT,  B,       A,     Leer),\n                (AV,  KAV,     VAV,   Leer),\n                (B,   BBd,     BA,    Leer),\n                (A,   AE,      AA,    Leer),\n                (BBd, KBBd,    VBBd,  Leer),\n                (BA,  KBA,     VBA,   Leer),\n                (AE,  KAE,     VAE,   Leer),\n                (AA,  KAA,     VAA,   Leer)\n               );\n\n  nummer :INTEGER = c0;\n\n  (*-----------------------------------------------------------------------*)\n  (* Variablen                                                             *)\n  (*-----------------------------------------------------------------------*)\n\n\n  VAR\n   Wurzel,\n   Pziel       : TWurzel;\n   Pneu        : PTKante;\n\n   Agenda,\n   PAgenda,\n   Paar        : PTAgenda;\n\n   LexWurzel,\n   LexAktuell,\n   LexEintrag  : PTLexElem;\n   Lexikon     : Text;\n\n\n(***************************************************************************)\n(* FUNKTIONEN                                                              *)\n(***************************************************************************)\n\n\n  (*-----------------------------------------------------------------------*)\n  (* KantenZaehler                                                         *)\n  (*-----------------------------------------------------------------------*)\n\n  FUNCTION NimmNummer:INTEGER;\n   BEGIN\n    Nummer := Nummer + c1;\n    NimmNummer := Nummer\n   END;\n\n\n\n(***************************************************************************)\n(* PROZEDUREN                                                              *)\n(***************************************************************************)\n\n\n\n\n  (*-----------------------------------------------------------------------*)\n  (* LexikonLesen                                                          *)\n  (*-----------------------------------------------------------------------*)\n\n  PROCEDURE LiesDasLexikon (VAR f:Text;\n                             G:TGrammatik;\n                             l:PTLexElem);\n    VAR\n     zaehler :INTEGER;\n     z11     : 1..c11;\n     z4      : 1.. c4;\n     ch      :   CHAR;\n     st5     : STRING(.c5.);\n\n   BEGIN\n    ASSIGN(f,datei);\n    LexWurzel := NIL;\n    RESET(f);\n    WHILE NOT EOF(f)\n     DO\n      BEGIN\n       NEW(LexEintrag);\n       IF LexWurzel = NIL\n        THEN\n         BEGIN\n          LexWurzel := LexEintrag;\n          LexAktuell:= LexWurzel;\n          LexEintrag^.naechstes := NIL;\n         END\n        ELSE\n         BEGIN\n          LexAktuell^.naechstes := LexEintrag;\n          LexEIntrag^.naechstes := NIL;\n          LexAktuell            := LexAktuell^.naechstes;\n         END;\n       LexEintrag^.Terminal := '';\n       st5 := '';\n       FOR Zaehler := c1 to c5\n        DO\n         BEGIN\n          READ(f,ch);\n          st5 := st5 + UPCASE(ch)\n         END;\n       REPEAT\n        READ(f,ch);\n        LexEintrag^.terminal := LexEintrag^.Terminal + UPCASE(ch);\n       UNTIL EOLN(f);\n       READLN(f);\n       IF st5 = 'KBG**' THEN  LexEintrag^.Kategorie := KBG    ELSE\n       IF st5 = 'VBG**' THEN  LexEintrag^.Kategorie := VBG    ELSE\n       IF st5 = 'KBBD*' THEN  LexEintrag^.Kategorie := KBBD   ELSE\n       IF st5 = 'VBBD*' THEN  LexEintrag^.Kategorie := VBBD   ELSE\n       IF st5 = 'KBA**' THEN  LexEintrag^.Kategorie := KBA    ELSE\n       IF st5 = 'VBA**' THEN  LexEintrag^.Kategorie := VBA    ELSE\n       IF st5 = 'KAE**' THEN  LexEintrag^.Kategorie := KAE    ELSE\n       IF st5 = 'VAE**' THEN  LexEintrag^.Kategorie := VAE    ELSE\n       IF st5 = 'KAA**' THEN  LexEintrag^.Kategorie := KAA    ELSE\n       IF st5 = 'VAA**' THEN  LexEintrag^.Kategorie := VAA    ELSE\n       IF st5 = 'KAV**' THEN  LexEintrag^.Kategorie := KAV    ELSE\n       IF st5 = 'VAV**' THEN  LexEintrag^.Kategorie := VAV\n      END;\n   END;\n\n\n  (*-----------------------------------------------------------------------*)\n  (* SatzLesen                                                             *)\n  (*-----------------------------------------------------------------------*)\n\n  PROCEDURE LiesDenSatz;\n   VAR\n    satz:        STRING(.cmax.);\n    zaehler:     INTEGER;\n   BEGIN\n    CLRSCR;\n    WRITELN(CopyRight);\n    WRITE('-----> ');\n    Wurzel.spalte := NIL;\n    Wurzel.zeigt  := NIL;\n    READLN(satz);\n    FOR zaehler := c1 to LENGTH(satz)\n     DO satz(.zaehler.) := UPCASE(satz(.zaehler.));\n    Satz := Satz + blank;\n    Writeln('-----> ',satz);\n    WHILE satz <> ''\n    DO\n    BEGIN\n       NEW(Pneu);\n       Pneu^.nummer   :=NimmNummer;\n       Pneu^.wort     := TRUE;\n       NEW(Pneu^.gefunden);\n       Pneu^.gefunden^.kante := Pneu;\n       pneu^.gefunden^.next  := NIL;\n       Pneu^.gesucht         := NIL;\n       Pneu^.nachkomme       :=FALSE;\n       IF Wurzel.zeigt = NIL\n        THEN\n         BEGIN\n           Wurzel.zeigt := pneu;\n           Wurzel.spalte:= pneu;\n           PZiel.spalte := pneu;\n           PZiel.zeigt  := Pneu;\n           pneu^.vor    := NIL;\n           Pneu^.zeigt  := NIL;\n           Pneu^.nach   := NIL;\n         END\n        ELSE\n         BEGIN\n          Wurzel.zeigt^.zeigt := Pneu;\n          Pneu^.vor           := Wurzel.zeigt;\n          Pneu^.nach          := NIL;\n          Pneu^.zeigt         := NIL;\n          Wurzel.zeigt        := Wurzel.zeigt^.zeigt;\n         END;\n       pneu^.aktiv   := false;\n       pneu^.inhalt  := COPY(satz,c1,POS(blank,satz)-c1);\n       LexAktuell    := LexWurzel;\n       WHILE LexAktuell <> NIL\n        DO\n         BEGIN\n          IF LexAktuell^.Terminal = pneu^.inhalt\n           Then\n            BEGIN\n             pneu^.Kategorie := LexAktuell^.Kategorie;\n            END;\n          LexAktuell := LexAktuell^.naechstes;\n         END;\n       DELETE(satz,c1,POS(blank,satz));\n      END;\n   END;\n\n\n\n\n  (*-----------------------------------------------------------------------*)\n  (* Regel3KanteInAgendaEintragen                                          *)\n  (*-----------------------------------------------------------------------*)\n\n  PROCEDURE Regel3KanteInAgendaEintragen (Kante:PTKante);\n   VAR\n    Wurzel,\n    PZiel  :TWurzel;\n   PROCEDURE NeuesAgendaPaarAnlegen;\n    BEGIN\n     NEW(paar);\n     IF Agenda = NIL\n      THEN\n       BEGIN\n        Agenda := Paar;\n        Pagenda:= Paar;\n        Paar^.next := NIL;\n        Paar^.back := NIL;\n       END\n      ELSE\n       BEGIN\n        PAgenda^.next := Paar;\n        Paar^.next    := NIL;\n        Paar^.back    := Pagenda;\n        Pagenda       := Pagenda^.next;\n      END;\n    END;\n\n   BEGIN\n    IF Kante^.aktiv\n     THEN\n      BEGIN\n       Wurzel.zeigt := Kante^.zeigt;\n       WHILE wurzel.zeigt <> NIL\n        DO\n        BEGIN\n         IF NOT(wurzel.zeigt^.aktiv)\n          THEN\n           BEGIN\n            NeuesAgendaPaarAnlegen;\n            paar^.A := kante;\n            paar^.I := wurzel.zeigt;\n           END;\n        Wurzel.zeigt  := Wurzel.zeigt^.nach\n        END\n      END\n     ELSE\n     BEGIN\n       PZiel.zeigt  := Kante;\n       WHILE NOT(PZiel.zeigt^.Wort)\n        DO PZiel.Zeigt := PZiel.Zeigt^.Vor;\n       Wurzel.Zeigt    := PZiel.Zeigt;\n       Wurzel.Spalte   := PZiel.Zeigt;\n       PZiel.Spalte    := Pziel.zeigt;\n       WHILE wurzel.spalte <> NIL\n        DO\n        BEGIN\n         WHILE wurzel.zeigt <> NIL\n         DO\n         BEGIN\n          IF wurzel.zeigt^.aktiv\n           AND (Wurzel.zeigt^.zeigt = PZiel.spalte)\n           THEN\n            BEGIN\n             NeuesAGendaPaarAnlegen;\n             paar^.I := kante;\n             paar^.A := wurzel.zeigt;\n            END;\n          Wurzel.zeigt  := Wurzel.zeigt^.nach\n         END;\n         wurzel.spalte  := wurzel.spalte^.vor;\n         wurzel.zeigt   := wurzel.spalte;\n        END\n       END\n      END;\n\n  (*-----------------------------------------------------------------------*)\n  (* AgendaAusgabe                                                         *)\n  (*-----------------------------------------------------------------------*)\n\n  PROCEDURE NimmAgendaEintrag(VAR PEintrag:PTAgenda);\n   BEGIN\n      IF PAgenda = Agenda\n      THEN\n       BEGIN\n        PEintrag := Agenda;\n        PAgenda  := NIL;\n        Agenda   := NIL;\n       END\n      ELSE\n       BEGIN\n        PAGENDA       := PAGENDA^.back;\n        PEintrag      := PAgenda^.next;\n        PAGENDA^.next := NIL;\n       END;\n   END;\n\n\n\n\n  (*-----------------------------------------------------------------------*)\n  (* Regel2EineNeueKanteAnlegen                                            *)\n  (*-----------------------------------------------------------------------*)\n\n  PROCEDURE Regel2EineNeueKanteAnlegen( Kante     :PTKante;\n                                        Kategorie :TKategorien;\n                                        Gram      :TGrammatik );\n   VAR\n     Wurzel             :TWurzel;\n     PHilfe,\n     PGesuchteKategorie :PTKategorienListe;\n     zaehler,\n     zaehler2           :INTEGER;\n\n   BEGIN\n   Wurzel.zeigt := Kante;\n   Wurzel.spalte:= Kante;\n   WHILE Wurzel.zeigt^.nach <> NIL\n    DO Wurzel.zeigt := Wurzel.zeigt^.nach;\n    FOR zaehler := c1 To c11\n     DO\n      IF  (kategorie = Gram(.zaehler,c1.))\n      AND (kategorie <> Leer)\n       THEN\n       BEGIN\n        Gram(.zaehler,c1.) := Leer;\n        NEW(pneu);\n        Wurzel.zeigt^.nach := pneu;\n        pneu^.nummer       := NimmNummer;\n        pneu^.vor          := Wurzel.zeigt;\n        Pneu^.nach         := NIL;\n        Pneu^.zeigt        := wurzel.spalte;\n        Wurzel.zeigt       := Wurzel.zeigt^.nach;\n        pneu^.aktiv        := true;\n        pneu^.kategorie    := kategorie;\n        Pneu^.Wort         := false;\n        Pneu^.gesucht      := NIL;\n        Pneu^.gefunden     := NIL;\n        Pneu^.nachkomme    := FALSE;\n        FOR zaehler2 := c2 TO c4\n         DO\n         BEGIN\n          IF Gram(.zaehler,zaehler2.) <> Leer\n           THEN\n            BEGIN\n             NEW(PGesuchteKategorie);\n             PGesuchteKategorie^.weiter:= NIL;\n             PGesuchteKategorie^.Kategorie := Gram(.zaehler,zaehler2.);\n             IF Pneu^.gesucht = NIL\n              THEN\n               BEGIN\n                PHilfe        := PGesuchteKategorie;\n                Pneu^.gesucht := PHilfe;\n               END\n              ELSE\n               BEGIN\n                PHilfe^.weiter := PGesuchteKategorie;\n                PHilfe         := PHilfe^.weiter;\n               END\n            END\n         END;\n        Regel3KanteInAgendaEintragen (pneu);\n        Regel2EineNeueKanteAnlegen(Wurzel.spalte,\n                                   pneu^.gesucht^.kategorie,gram);\n      END;\n   END;\n\n\n\n  (*-----------------------------------------------------------------------*)\n  (* Regel1EineKanteErweiternen                                            *)\n  (*-----------------------------------------------------------------------*)\n\n  PROCEDURE Regel1EineKanteErweitern(paar:PTAgenda);\n   VAR\n    PneuHilf,Pneugefneu,AHilf :PTKantenListe;\n   BEGIN\n\n   IF paar^.I^.kategorie = paar^.A^.gesucht^.kategorie\n    THEN\n     BEGIN\n      NEW(pneu);\n      pneu^.nummer      := NimmNummer;\n      pneu^.kategorie   := Paar^.A^.kategorie;\n(*---------------------------------------------------*)\n      Pneu^.gefunden := NIL;\n      AHilf := Paar^.A^.gefunden;\n\n      WHILE AHilf <> NIL\n       DO\n       BEGIN\n        NEW(Pneugefneu);\n        IF Pneu^.gefunden = NIL\n         THEN\n          BEGIN\n           Pneu^.gefunden := Pneugefneu;\n           PneuHilf       := Pneu^.gefunden;\n           PneuHilf^.next := NIL;\n          END\n         ELSE\n          BEGIN\n           PneuHilf^.next   := Pneugefneu;\n           PneuHilf         := PneuHilf^.next;\n           PneuHilf^.next   := NIL;\n          END;\n\n        Pneugefneu^.kante     := AHilf^.kante;\n        AHilf                 := AHilf^.next;\n       END;\n\n       NEW(Pneugefneu);\n       IF Pneu^.gefunden = NIL\n        THEN\n         BEGIN\n          Pneu^.gefunden := Pneugefneu;\n          Pneugefneu^.next := NIL;\n         END\n        ELSE\n         BEGIN\n           PneuHilf^.next   := Pneugefneu;\n           PneuHilf         := PneuHilf^.next;\n           PneuHilf^.next   := NIL;\n         END;\n       Pneugefneu^.kante    := Paar^.I;\n    (*--------------------------------------------*)\n       Pneu^.wort             := FALSE;\n       IF Paar^.A^.gesucht^.weiter = NIL\n        THEN Pneu^.gesucht   := NIL\n        ELSE Pneu^.gesucht   := Paar^.A^.gesucht^.weiter;\n       Pneu^.nachkomme := TRUE;\n\n      IF pneu^.gesucht   = NIL\n       THEN Pneu^.aktiv := false\n       ELSE Pneu^.aktiv := true;\n\n      WHILE Paar^.A^.nach <> NIL\n       DO Paar^.A       := Paar^.A^.nach;\n\n      Paar^.A^.nach     := pneu;\n      pneu^.vor         := Paar^.A;\n      pneu^.zeigt       := Paar^.I^.zeigt;\n      pneu^.nach        := NIL;\n\n      Regel3KanteInAgendaEintragen (pneu);\n      IF Pneu^.aktiv\n       THEN Regel2EineNeueKanteAnlegen(Pneu^.zeigt,\n                                     pneu^.gesucht^.kategorie,Grammatik);\n     END;\n\n\n   END;\n  (*-----------------------------------------------------------------------*)\n  (* SatzAnalyse                                                           *)\n  (*-----------------------------------------------------------------------*)\n\n   PROCEDURE SatzAnalyse;\n    BEGIN\n    WHILE Agenda <> NIL\n    DO\n     BEGIN\n      NimmAgendaEintrag(Paar);\n      Regel1EineKanteErweitern(Paar);\n     END;\n\n    END;\n  (*-----------------------------------------------------------------------*)\n  (* SatzAusgabe                                                           *)\n  (*-----------------------------------------------------------------------*)\n\n   PROCEDURE GibAlleSatzalternativenAus;\n    CONST\n     BlankAnz:INTEGER = c2;\n    VAR\n     PHilf   :PTkantenListe;\n\n    PROCEDURE SatzAusgabe(Kante:PTKante;BlankAnz:INTEGER);\n     VAR\n\n     Zaehler:INTEGER;\n     PHilf  :PTKantenListe;\n     BEGIN\n      FOR Zaehler := c1 TO BlankAnz DO WRITE(blank);\n\n      IF Kante^.kategorie = VKG     THEN WRITELN ('VKG ') ELSE\n      IF Kante^.kategorie = BG      THEN WRITELN ('BG  ') ELSE\n      IF Kante^.kategorie = VT      THEN WRITELN ('VT  ') ELSE\n      IF Kante^.kategorie = AV      THEN WRITE   ('AV  ') ELSE\n      IF Kante^.kategorie = B       THEN WRITELN ('B   ') ELSE\n      IF Kante^.kategorie = A       THEN WRITE   ('A   ') ELSE\n      IF Kante^.kategorie = BBD     THEN WRITE   ('BBD ') ELSE\n      IF Kante^.kategorie = BA      THEN WRITELN ('BA  ') ELSE\n      IF Kante^.kategorie = AE      THEN WRITE   ('AE  ') ELSE\n      IF Kante^.kategorie = AA      THEN WRITE   ('AA  ') ELSE\n      IF Kante^.kategorie = KBG     THEN WRITELN ('KBG ') ELSE\n      IF Kante^.kategorie = VBG     THEN WRITELN ('VBG ') ELSE\n      IF Kante^.kategorie = KBBD    THEN WRITELN ('KBBD') ELSE\n      IF Kante^.kategorie = VBBD    THEN WRITE   ('VBBD') ELSE\n      IF Kante^.kategorie = KBA     THEN WRITELN ('KBA ') ELSE\n      IF Kante^.kategorie = VBA     THEN WRITE   ('VBA ') ELSE\n      IF Kante^.kategorie = KAE     THEN WRITE   ('KAE ') ELSE\n      IF Kante^.kategorie = VAE     THEN WRITELN ('VAE ') ELSE\n      IF Kante^.kategorie = KAA     THEN WRITE   ('KAA ') ELSE\n      IF Kante^.kategorie = VAA     THEN WRITE   ('VAA ') ELSE\n      IF Kante^.kategorie = KAV     THEN WRITE   ('KAV ') ELSE\n      IF Kante^.kategorie = VAV     THEN WRITE   ('VAV ');\n\n      IF Kante^.wort\n       THEN\n        WRITELN('----> ',Kante^.inhalt)\n       ELSE\n        BEGIN\n        PHilf := Kante^.gefunden;\n        WHILE PHilf <> NIL\n         DO\n          BEGIN\n           Satzausgabe(PHilf^.kante,Blankanz+c1);\n           PHilf := Philf^.next;\n          END\n        END\n    END;\n\n    BEGIN\n      WHILE Wurzel.zeigt^.vor <> NIL\n       DO Wurzel.zeigt := Wurzel.zeigt^.vor;\n\n      WHILE Wurzel.zeigt <> NIL\n      DO\n      BEGIN\n       IF (Wurzel.zeigt^.kategorie = VKG)\n         AND ((NOT(Wurzel.zeigt^.aktiv))\n         AND (wurzel.zeigt^.zeigt = NIL))\n         THEN\n          BEGIN\n           WRITELN('VKG');\n           PHilf := Wurzel.zeigt^.gefunden;\n           WHILE PHilf <> NIL\n            DO\n             BEGIN\n              Satzausgabe(PHilf^.kante,Blankanz+c1);\n              PHilf := Philf^.next;\n             END\n          END;\n      Wurzel.zeigt := Wurzel.zeigt^.nach;\n      END;\n\n    END;\n\n  (*-----------------------------------------------------------------------*)\n  (* FreigabeDesBenutztenSpeicherplatzes                                   *)\n  (*-----------------------------------------------------------------------*)\n\n  PROCEDURE LoescheDieListe;\n   PROCEDURE LoescheWort(kante :PTKante);\n    PROCEDURE LoescheSpalte(kante:PTKante);\n     VAR\n      Pgefunden :PTKantenListe;\n      Pgesucht  :PTKategorienListe;\n     PROCEDURE LoescheGesucht(p:PTKategorienListe);\n      BEGIN\n       IF p^.weiter <> NIL\n        THEN LoescheGesucht(p^.weiter);\n       IF P <> NIL THEN DISPOSE(P);\n      END;\n     PROCEDURE LoescheGefunden(Kante:PTKante;p:PTKantenListe);\n      BEGIN\n       IF p^.next <> NIL\n        THEN LoescheGefunden(Kante,p^.next);\n       DISPOSE(P);\n      END;\n     BEGIN(*LoescheSpalte*)\n      IF Kante^.nach <> NIL\n       THEN LoescheSpalte(kante^.nach);\n      IF (NOT Kante^.nachkomme) AND ((Kante^.gesucht <> NIL)\n       AND (NOT Kante^.wort))\n       THEN LoescheGesucht(Kante^.gesucht);\n      IF Kante^.gefunden <> NIL\n       THEN LoescheGefunden(Kante,Kante^.gefunden);\n      DISPOSE(Kante)\n     END;(*LoescheSpalte*)\n    BEGIN(*LoescheWort*)\n     IF Kante^.zeigt <> NIL\n      THEN LoescheWort(Kante^.zeigt);\n    LoescheSpalte(Kante);\n    END;(*LoescheWort*)\n   BEGIN(*LoescheDieListe*)\n    WHILE Wurzel.spalte^.vor <> NIL\n     DO Wurzel.spalte := Wurzel.spalte^.vor;\n    LoescheWort(Wurzel.spalte);\n   END;(*LoescheDieListe*)\n(***************************************************************************)\n(* HAUPTPROGRAMM DES CHART PARSERS                                         *)\n(***************************************************************************)\n\n  BEGIN\n   Agenda := NIL;\n   PAgenda := Agenda;\n   LiesDasLexikon(Lexikon,Grammatik,LexWurzel);\n   LiesDenSatz;\n   WHILE Wurzel.spalte^.vor <> NIL\n    DO Wurzel.spalte := Wurzel.spalte^.vor;\n   Regel2EineNeueKanteAnlegen(Wurzel.spalte,VKG,Grammatik);\n   SatzAnalyse;\n   GibAlleSatzalternativenAus;\n   LoescheDieListe;\n(***************************************************************************)\n(* ENDE DES HAUPTPROGRAMMS DES CHART PARSERS                               *)\n(***************************************************************************)\n\n  END.\n",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "94e9c447-3746-41e8-af52-12526cb41068",
            "cell_type": "markdown",
            "source": "Demo-Parser Chart-Parser Version 1.0(c)1992 by Paul Koop\n- - - - - > KBG VBG KBBD KBA VBA KAE VAE KAA VAA KAV VAV\n- - - - - > KBG VBG KBBD KBA VBA KAE VAE KAA VAA KAV VAV\nVKG\n       BG\n         KBG\n- - - - >   KBG\n         VBG\n- - - - >  VBG\n      VT\n         B\n            BBD                KBBD\n - - - - >  KBBD\n              VBBD - - - - > VBBD\n            BA\n              KBA\n- - - - >.  KBA\n              VBA    - - - - >  VBA\n         A                   AE             KAE  - - - - >  KAE\n              VAE\n- - - - >   VAE\n            AA                              KAA  - - - - >  KAA\n               VAA  - - - -  >  VAA\n       AV               KAV - - - - > KAV\n         VAV  - - - - >   VAV",
            "metadata": []
        },
        {
            "id": "614d8b37-eb96-4274-b931-9e842a996905",
            "cell_type": "code",
            "source": "import re\n\n# Lesen des Korpus aus einer Datei\n#with open(\"VKGKORPUS.TXT\", \"r\") as f:\n#    korpus = f.read()\nkorpus = \"KBG VBG KBBD VBBD KBA VBA KAE VAE KAA VAA KBBD VBBD KBA VBA KBBD VBBD KBA VBA KBBD VBBD KBA VBA KAE VAE KAA VAA KAV VAV\"\n# Extrahieren der Terminalsymbole aus dem Korpus\nterminals = re.findall(r\"[KV][A-Z]+\", korpus)\n\n# Entfernen der vorangestellten K- oder V-Zeichen aus den Terminalsymbolen\nnon_terminals = list(set([t[1:] for t in terminals]))\n\n# Erzeugen der Regelproduktionen\nproductions = []\nfor nt in non_terminals:\n    rhs = [t for t in terminals if t[1:] == nt]\n    productions.append((nt, rhs))\n\n# Ausgabe der Grammatikregeln\nprint(\"Regeln:\")\nfor nt, rhs in productions:\n    print(nt + \" -> \" + \" | \".join(rhs))\n\n# Ausgabe der Startsymbol\nprint(\"Startsymbol: VKG\")",
            "metadata": {
                "trusted": true
            },
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": "Regeln:\nAV -> KAV | VAV\nBG -> KBG | VBG\nAA -> KAA | VAA | KAA | VAA\nAE -> KAE | VAE | KAE | VAE\nBA -> KBA | VBA | KBA | VBA | KBA | VBA | KBA | VBA\nBBD -> KBBD | VBBD | KBBD | VBBD | KBBD | VBBD | KBBD | VBBD\nStartsymbol: VKG\n"
                }
            ],
            "execution_count": 5
        },
        {
            "id": "ba6f9fcf-15a4-4bbb-9fbb-f9f63fcceb01",
            "cell_type": "code",
            "source": "",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "bb35d468-1591-4795-a215-b9a72be478d1",
            "cell_type": "markdown",
            "source": "The nonterminal symbols are here the first letters of the terminal symbols without the preceding \"K\" or \"V\". The starting rule is 'VK', which means that the seller (V) initiates the conversation and the buyer (K) responds. Note that the production rules work in both directions, as the conversation between the seller and the buyer is reciprocal.",
            "metadata": []
        },
        {
            "id": "12dee38e-2feb-471c-b739-0100a81c036b",
            "cell_type": "code",
            "source": "import re\nfrom collections import defaultdict\n\ncorpus = \"KBG VBG KBBD VBBD KBA VBA KAE VAE KAA VAA KBBD VBBD KBA VBA KBBD VBBD KBA VBA KBBD VBBD KBA VBA KAE VAE KAA VAA KAV VAV\"\n\n# Erstellen eines Wörterbuchs, um die Anzahl der Vorkommen von Terminalsymbolden zu zählen.\nvocab = defaultdict(int)\nfor word in corpus.split():\n    vocab[word] += 1\n\n# Entfernen von Präfixen K und V von Terminalsymbolen.\nterminals = list(set([re.sub(r'^[KV]', '', w) for w in vocab.keys()]))\n\n# Erstellen der Produktionen für die Grammatik.\nproductions = []\nfor w in vocab.keys():\n    if re.match(r'^K', w):\n        lhs = 'K'\n    elif re.match(r'^V', w):\n        lhs = 'V'\n    else:\n        lhs = re.sub(r'^[KV]', '', w)\n    rhs = w\n    productions.append((lhs, [rhs]))\n\n# Erstellen der Startregel der Grammatik.\nstart = 'VK'\n\n# Ausgabe der Grammatik.\nprint(f'Start: {start}')\nfor lhs, rhs in productions:\n    print(f'{lhs} -> {\" \".join(rhs)}')\n",
            "metadata": {
                "trusted": true
            },
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": "Start: VK\nK -> KBG\nV -> VBG\nK -> KBBD\nV -> VBBD\nK -> KBA\nV -> VBA\nK -> KAE\nV -> VAE\nK -> KAA\nV -> VAA\nK -> KAV\nV -> VAV\n"
                }
            ],
            "execution_count": 6
        },
        {
            "id": "1e48196c-dabe-45cd-ac23-6506b2358106",
            "cell_type": "markdown",
            "source": "The program reads the given corpus and extracts the non-terminal symbols by removing all symbols that start with 'K' or 'V'. It then iterates over the corpus and counts the production rules by counting the subsequent terminal symbol for each occurrence of a non-terminal symbol. Finally, it calculates the probabilities of the production rules by dividing the frequency of each right-hand side of a non-terminal symbol by the total number of occurrences of the left symbol.The program then outputs the induced grammar, displaying the probabilities of the production rules.",
            "metadata": []
        },
        {
            "id": "4f0db5a0-06b1-4041-b55a-60dbad21f0d0",
            "cell_type": "code",
            "source": "from collections import defaultdict\nimport random\n\n# define the grammar production rules\ngrammar = defaultdict(list)\n\n# read in the corpus\ncorpus = \"KBG VBG KBBD VBBD KBA VBA KAE VAE KAA VAA KBBD VBBD KBA VBA KBBD VBBD KBA VBA KBBD VBBD KBA VBA KAE VAE KAA VAA KAV VAV\".split()\n\n# get the non-terminal symbols\nnonterminals = set([symbol[1:] for symbol in corpus if symbol.startswith(\"K\") or symbol.startswith(\"V\")])\n\n# iterate over the corpus and count the production rules\nfor i in range(1, len(corpus)):\n    curr_symbol = corpus[i]\n    prev_symbol = corpus[i-1]\n    if prev_symbol.startswith(\"K\") or prev_symbol.startswith(\"V\"):\n        grammar[prev_symbol[1:]].append(curr_symbol)\n\n# calculate the probabilities for the production rules\nfor lhs in grammar.keys():\n    productions = grammar[lhs]\n    total_count = len(productions)\n    probabilities = defaultdict(float)\n    for rhs in productions:\n        probabilities[rhs] += 1.0\n    for rhs in probabilities.keys():\n        probabilities[rhs] \/= total_count\n    grammar[lhs] = probabilities\n\n# print the grammar\nprint(\"Grammar:\")\nfor lhs in grammar.keys():\n    print(lhs + \" ->\")\n    for rhs in grammar[lhs].keys():\n        print(\"  \" + rhs + \" : \" + str(grammar[lhs][rhs]))\n",
            "metadata": {
                "trusted": true
            },
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": "Grammar:\nBG ->\n  VBG : 0.5\n  KBBD : 0.5\nBBD ->\n  VBBD : 0.5\n  KBA : 0.5\nBA ->\n  VBA : 0.5\n  KAE : 0.25\n  KBBD : 0.25\nAE ->\n  VAE : 0.5\n  KAA : 0.5\nAA ->\n  VAA : 0.5\n  KBBD : 0.25\n  KAV : 0.25\nAV ->\n  VAV : 1.0\n"
                }
            ],
            "execution_count": 7
        },
        {
            "id": "c5fff79b-99b4-4fe1-9a5a-a8706a41a362",
            "cell_type": "markdown",
            "source": "Here is an example program in Python that uses the given probabilistic grammar to parse the provided corpus:",
            "metadata": []
        },
        {
            "id": "5fdfe647-666f-4657-b1d1-f1cc22ca3051",
            "cell_type": "code",
            "source": "import random\n\n# Die gegebene probabilistische Grammatik\ngrammar = {\n    'BG': {'VBG': 0.5, 'KBBD': 0.5},\n    'BBD': {'VBBD': 0.5, 'KBA': 0.5},\n    'BA': {'VBA': 0.5, 'KAE': 0.25, 'KBBD': 0.25},\n    'AE': {'VAE': 0.5, 'KAA': 0.5},\n    'AA': {'VAA': 0.5, 'KAV': 0.25, 'KBBD': 0.25},\n    'AV': {'VAV': 1.0},\n}\n\n# Das zu parsende Korpus\ncorpus = ['KBG', 'VBG', 'KBBG', 'VBBD', 'KAE', 'VBA', 'KAE', 'VAA', 'KBBG', 'VBBD', 'KBA', 'VBA', 'KBBG', 'VBBD', 'KBA', 'VBA', 'KAE', 'VAE', 'KAA', 'VAA', 'KAV', 'VAV']\n\n# Initialisiere die Tabelle mit leeren Einträgen\nchart = [[{} for i in range(len(corpus) + 1)] for j in range(len(corpus) + 1)]\n\n# Fülle die Tabelle mit den Terminalsymbolen und den Wahrscheinlichkeiten\nfor i in range(len(corpus)):\n    for lhs, rhs_probs in grammar.items():\n        for rhs, prob in rhs_probs.items():\n            if rhs == corpus[i]:\n                chart[i][i+1][lhs] = {'prob': prob, 'prev': None}\n\n# Fülle die Tabelle mit den Nichtterminalsymbolen und den Wahrscheinlichkeiten\nfor span in range(2, len(corpus) + 1):\n    for start in range(len(corpus) - span + 1):\n        end = start + span\n        for split in range(start + 1, end):\n            for lhs, rhs_probs in grammar.items():\n                for rhs, prob in rhs_probs.items():\n                    if len(rhs) == 2:\n                        left, right = rhs\n                        if left in chart[start][split] and right in chart[split][end]:\n                            prod_prob = prob * chart[start][split][left]['prob'] * chart[split][end][right]['prob']\n                            if lhs not in chart[start][end] or prod_prob > chart[start][end][lhs]['prob']:\n                                chart[start][end][lhs] = {'prob': prod_prob, 'prev': (split, left, right)}\n\n# Ausgabe des Parsing-Baums\ndef print_tree(start, end, symbol):\n    if symbol in chart[start][end]:\n        if chart[start][end][symbol]['prev'] is None:\n            return [symbol]\n        split, left, right = chart[start][end][symbol]['prev']\n        return [symbol, print_tree(start, split, left), print_tree(split, end, right)]\n    else:\n        return []\n\n# Parse den Satz und gib den resultierenden Parse-Baum aus\nparse_tree = print_tree(0, len(corpus), 'BG')\nprint(parse_tree)\n",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "0b57b39d-0314-4a2a-9dd1-2db27e74406d",
            "cell_type": "markdown",
            "source": "A probabilistic grammar can be interpreted as a Bayesian network. In a Bayesian network, the dependencies between variables are modeled using directed edges, while the probabilities of individual variables and edges are represented by probability distributions.\n\nIn a probabilistic grammar, the production rules are modeled as variables, and the terminals and non-terminals as states. Each production has a certain probability, which can be represented by a probability distribution. The probability of generating a particular sentence can then be calculated based on the production rules and their associated probabilities.\n\nThe states in the probabilistic grammar can be interpreted as nodes in the Bayesian network, while the production rules can be represented as directed edges. The probabilities of the production rules can then be modeled as edge conditions. By calculating the posterior probability, a probabilistic prediction can be made about which sentence is most likely, given the observations.\n\n\n\n",
            "metadata": []
        },
        {
            "id": "931a4b0f-f27d-4d1b-bc6f-bd575c9c7bf1",
            "cell_type": "markdown",
            "source": "The corpus can be understood as a record of the mutual interaction between two software agents of a multi-agent system. The agents of this multi-agent system have access to the last generated terminal symbol and the probabilistic grammar, which can be interpreted as a Bayesian network. They use this knowledge to generate the next terminal symbol. Agent K generates the buyer terminal symbols. Agent V generates the seller terminal symbols.",
            "metadata": []
        },
        {
            "id": "b400dd09-7464-4b06-8d15-f958b21c4a4c",
            "cell_type": "markdown",
            "source": "Here is an example program that starts agent K and sets the terminal symbol \"KBG\". Agent V then generates the next terminal symbol based on the provided grammar and the last terminal symbol \"KBG\". This continues in a loop until a maximum number of terminal symbols is reached.",
            "metadata": []
        },
        {
            "id": "507942c5-8a4b-4635-9f6d-cdefd69d15de",
            "cell_type": "code",
            "source": "import random\n\n# Grammatik als probabilistisches Bayessches Netz definieren\ngrammar = {\n    \"BG\": {\"VBG\": 0.5, \"KBBD\": 0.5},\n    \"BBD\": {\"VBBD\": 0.5, \"KBA\": 0.5},\n    \"BA\": {\"VBA\": 0.5, \"KAE\": 0.25, \"KBBD\": 0.25},\n    \"AE\": {\"VAE\": 0.5, \"KAA\": 0.5},\n    \"AA\": {\"VAA\": 0.5, \"KBBD\": 0.25, \"KAV\": 0.25},\n    \"AV\": {\"VAV\": 1.0}\n}\n\n# Funktion zur Generierung des nächsten Terminalzeichens\ndef generate_next_terminal(grammar, last_terminal):\n    # Wähle die Produktion basierend auf dem letzten Terminalzeichen und der Grammatik aus\n    productions = grammar[last_terminal]\n    production = random.choices(list(productions.keys()), list(productions.values()))[0]\n    return production\n\n# Maximale Anzahl von Terminalzeichen\nmax_length = 10\n\n# Startzeichen für Agent K\nlast_terminal = \"KBG\"\n\n# Schleife für Interaktion zwischen Agent K und Agent V\nfor i in range(max_length):\n    # Agent K generiert das nächste Terminalzeichen basierend auf der Grammatik und dem letzten gesetzten Zeichen\n    next_terminal = generate_next_terminal(grammar, last_terminal)\n    print(\"Agent K: \", last_terminal)\n\n    # Agent V generiert das nächste Terminalzeichen basierend auf der Grammatik und dem letzten gesetzten Zeichen\n    last_terminal = generate_next_terminal(grammar, next_terminal)\n    print(\"Agent V: \", next_terminal)\n\n# Letztes Terminalzeichen ausgeben, das von Agent K generiert wurde\nprint(\"Letztes Terminalzeichen: \", last_terminal)\n",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "df5f17d5-dbfb-4640-beec-132552d3524c",
            "cell_type": "markdown",
            "source": "Agent K:  KBG\nAgent V:  KBBD\nAgent K:  KBBD\nAgent V:  KAE\nAgent K:  KAE\nAgent V:  VAE\nAgent K:  VAE\nAgent V:  KAA\nAgent K:  KAA\nAgent V:  VAA\nAgent K:  VAA\nLetztes Terminalzeichen:  VAA\n",
            "metadata": []
        },
        {
            "id": "5977a198-3788-48d9-874f-d426e406c4e2",
            "cell_type": "markdown",
            "source": "Es ist möglich, das Beispielprogramm entsprechend zu erweitern, um die genannten Eigenschaften der Agenten und die Rollenverteilung zu berücksichtigen. Ein Entscheidungsbaum legt erst die Rollen der Agenten fest. Dann Handeln die Agenten nach der Handlungsgrammatik. Hier ist eine einfaches erweiterte Version des Programms:",
            "metadata": []
        },
        {
            "id": "845922a1-22eb-44d1-a1ac-dd3a3d61684d",
            "cell_type": "code",
            "source": "import random\n\n# Die gegebene probabilistische Grammatik\ngrammar = {\n    'BG': {'VBG': 0.5, 'KBBD': 0.5},\n    'BBD': {'VBBD': 0.5, 'KBA': 0.5},\n    'BA': {'VBA': 0.5, 'KAE': 0.25, 'KBBD': 0.25},\n    'AE': {'VAE': 0.5, 'KAA': 0.5},\n    'AA': {'VAA': 0.5, 'KAV': 0.25, 'KBBD': 0.25},\n    'AV': {'VAV': 1.0},\n}\n\n# Zufällige Belegung von Ware und Zahlungsmittel bei den Agenten\nagent_k_ware = random.uniform(0, 100)\nagent_k_zahlungsmittel = 100 - agent_k_ware\nagent_v_ware = random.uniform(0, 100)\nagent_v_zahlungsmittel = 100 - agent_v_ware\n\n# Entscheidung über die Rollenverteilung basierend auf Ware und Zahlungsmittel\nif agent_k_ware > agent_v_ware:\n    agent_k_role = 'Käufer'\n    agent_v_role = 'Verkäufer'\nelse:\n    agent_k_role = 'Verkäufer'\n    agent_v_role = 'Käufer'\n\n# Ausgabe der Rollenverteilung und der Belegung von Ware und Zahlungsmittel\nprint(\"Agent K: Rolle =\", agent_k_role, \"| Ware =\", agent_k_ware, \"| Zahlungsmittel =\", agent_k_zahlungsmittel)\nprint(\"Agent V: Rolle =\", agent_v_role, \"| Ware =\", agent_v_ware, \"| Zahlungsmittel =\", agent_v_zahlungsmittel)\nprint()\n\n# Agent K startet den Dialog mit dem Terminalzeichen 'KBG'\nlast_terminal = 'KBG'\n\n# Maximale Anzahl von Terminalzeichen im Dialog\nmax_terminals = 10\n\n# Dialog-Schleife\nfor i in range(max_terminals):\n    # Agent K generiert das nächste Terminalzeichen basierend auf der Grammatik und dem letzten Terminalzeichen\n    next_terminal = random.choices(list(grammar[last_terminal].keys()), weights=list(grammar[last_terminal].values()))[0]\n    \n    # Agent V generiert das nächste Terminalzeichen basierend auf der Grammatik und dem letzten Terminalzeichen\n    next_terminal = random.choices(list(grammar[last_terminal].keys()), weights=list(grammar[last_terminal].values()))[0]\n\n    # Aktualisierung des letzten Terminalzeichens\n    last_terminal = next_terminal\n    \n    # Ausgabe des aktuellen Terminalzeichens\n    print(\"Agent K:\", next_terminal)\n\n    # Break, wenn das Terminalzeichen 'VAV' erreicht ist\n    if next_terminal == 'VAV':\n        break\n",
            "metadata": [],
            "outputs": [],
            "execution_count": null
        },
        {
            "id": "3a902c66-db46-4d56-8163-6939e2ef580c",
            "cell_type": "markdown",
            "source": "Agent K: Rolle = Verkäufer | Ware = 60.935380690830155 | Zahlungsmittel = 39.064619309169845\nAgent V: Rolle = Käufer | Ware = 46.51117771417693 | Zahlungsmittel = 53.48882228582307\n\nAgent K: KBBD\nAgent V: VBBD\nAgent K: KBA\nAgent V: VAE\nAgent K: KBBD\nAgent V: VBBD\nAgent K: KBA\nAgent V: VBBD\nAgent K: KBA\nAgent V: VAE\nAgent K: KAA\nAgent V: VAA\nAgent K: KBBD\nAgent V: VBBD\nAgent K: KBA\nAgent V: VAE\nAgent K: KAA\nAgent V: VAA\nAgent K: KAA\nAgent V: VAA\nAgent K: KAA\nAgent V: VAA\nAgent K: KAV\nAgent V: VAV\n",
            "metadata": []
        }
    ]
}