% Options for packages loaded elsewhere
\PassOptionsToPackage{unicode}{hyperref}
\PassOptionsToPackage{hyphens}{url}
\documentclass[
12pt,
a4paper,
oneside,
titlepage
]{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\usepackage{amsmath,amssymb}
\usepackage{graphicx}
\usepackage{xcolor}
\usepackage{hyperref}
\usepackage{geometry}
\geometry{a4paper, left=3cm, right=3cm, top=3cm, bottom=3cm}
\usepackage{setspace}
\onehalfspacing
\usepackage{parskip}
\usepackage[english]{babel}
\usepackage{csquotes}
\usepackage{microtype}
\usepackage{booktabs}
\usepackage{longtable}
\usepackage{array}
\usepackage{listings}
\usepackage{caption}
\usepackage{subcaption}
\usepackage{float}
\usepackage{url}
\usepackage{natbib}
\usepackage{titling}
% Listing-Style for Python
\lstset{
language=Python,
basicstyle=\ttfamily\small,
keywordstyle=\color{blue},
commentstyle=\color{green!40!black},
stringstyle=\color{red},
showstringspaces=false,
numbers=left,
numberstyle=\tiny,
numbersep=5pt,
breaklines=true,
frame=single,
backgroundcolor=\color{gray!5},
tabsize=2,
captionpos=b
}
% Title
\title{\Huge\textbf{Between Interpretation and Computation} \\
\LARGE Formal Decidability as a Foundation \\
\LARGE for Explainable Sequence Analysis}
\author{
\large
\begin{tabular}{c}
ARS Research Team \\
Institute for Qualitative Social Research \\
RWTH Aachen University
\end{tabular}
}
\date{\large 2026}
\begin{document}
\maketitle
\begin{abstract}
This paper introduces a formal decision procedure for Algorithmic Recursive
Sequence Analysis (ARS). The foundation is a position-sensitive coding system
that encodes speaker roles, phase membership, and structural position of each
terminal symbol in a 5-bit code. Based on this, a deterministic finite automaton
is defined that decides the well-formedness of dialogue sequences. The decision
is fully reconstructible and thus fulfills the central XAI criteria of
transparency, comprehensibility, and traceability. Unlike statistical methods,
the decision is not based on training data or probabilities but exclusively on
explicit structural rules. This fulfills the methodological requirement for a
separation of structure and statistics and builds a bridge between qualitative
hermeneutics and formal modeling.
\end{abstract}
\newpage
\tableofcontents
\newpage
\section{Introduction: The Validity Problem of Sequential Analysis}
Qualitative social research has developed a variety of methods for
reconstructing the sequential order of social interaction. Objective hermeneutics
\citep{Oevermann1979} and conversation analysis \citep{Sacks1974} share the
fundamental insight that meaning in interaction is constituted not at isolated
points but sequentially: each speech act derives its meaning from its position
in the sequence and from its relation to the preceding and following utterances.

This insight stands in tension with the requirements of formal modeling.
Qualitative research relies on detailed, case-reconstructive interpretation of
meaning structures, whereas formal methods necessarily operate with
generalizing categories. The result is a methodological dilemma: either one
preserves interpretive depth and forgoes formal modeling, or one gains formal
precision at the cost of reducing meaning.

Algorithmic Recursive Sequence Analysis (ARS) offers a way out of this dilemma
by formalizing interpretively obtained categories as terminal symbols and
reconstructing their sequential order as a grammar. This approach, however,
remains at the level of token identification: whether a sequence is well-formed
must still be checked against rule knowledge that is external to the symbols.

The present paper goes a step further. It develops a coding system that embeds
the structural information of each terminal symbol in the symbol itself, so that
the well-formedness of a sequence becomes a property of the character string.
On this basis, a formal decision procedure is defined that decides the
acceptance of a sequence deterministically and in a fully reconstructible way.
\section{The Coding System: Structure as Code}
\subsection{Requirements for a Structural Coding System}
A coding system that aims to make the well-formedness of sequences decidable
must fulfill the following requirements:
\begin{enumerate}
\item \textbf{Speaker identification}: The role of the speaker
(customer/seller) must be recognizable from the code itself.
\item \textbf{Phase membership}: Membership in a dialogical phase
(greeting, need, completion, farewell) must be encoded.
\item \textbf{Position sensitivity}: The position within the phase
(initiation, continuation, completion) must be distinguishable.
\item \textbf{Monotonicity check}: It must be decidable whether the
phase progression follows the rules.
\item \textbf{Alternation check}: It must be decidable whether the
speaker roles alternate correctly.
\end{enumerate}
\subsection{The 5-Bit Coding System}
From these requirements emerges a 5-digit binary system:
\[
\underbrace{S}_{1} \underbrace{P_1P_2}_{2} \underbrace{U_1U_2}_{2}
\]
\begin{itemize}
\item \textbf{Bit 1 (Speaker)}:
\(0 = \text{Customer (K)}\), \(1 = \text{Seller (V)}\)
\item \textbf{Bits 2--3 (Main phase)}:
\(00 = \text{Greeting (BG)}\),
\(01 = \text{Need phase (B)}\),
\(10 = \text{Completion phase (A)}\),
\(11 = \text{Farewell (AV)}\)
\item \textbf{Bits 4--5 (Subphase)}:
\(00 = \text{Base level}\),
\(01 = \text{Follow-up level}\);
the values \(10\) and \(11\) are not assigned
\end{itemize}
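The bit layout can be made concrete with a short decoding routine. The following is a minimal sketch, not part of the ARS tooling: the names \texttt{decode} and \texttt{Decoded} are hypothetical, and rejecting the unassigned subphase values is an assumption of the sketch.

\begin{lstlisting}[caption={Sketch: decoding a 5-bit code into its three fields}]
from typing import NamedTuple

SPEAKERS = {"0": "Customer (K)", "1": "Seller (V)"}
PHASES = {"00": "Greeting (BG)", "01": "Need phase (B)",
          "10": "Completion phase (A)", "11": "Farewell (AV)"}
SUBPHASES = {"00": "Base level", "01": "Follow-up level"}


class Decoded(NamedTuple):
    speaker: str
    phase: str
    subphase: str


def decode(code: str) -> Decoded:
    """Split a code S P1P2 U1U2 into speaker, phase, and subphase."""
    if len(code) != 5 or set(code) - {"0", "1"}:
        raise ValueError(f"not a 5-bit code: {code!r}")
    if code[3:5] not in SUBPHASES:
        # Assumption of this sketch: unassigned subphases are errors.
        raise ValueError(f"unassigned subphase in {code!r}")
    return Decoded(SPEAKERS[code[0]], PHASES[code[1:3]],
                   SUBPHASES[code[3:5]])


print(decode("10101"))
\end{lstlisting}

For the code \texttt{10101}, the sketch reports the seller, the need phase, and the follow-up level.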
\subsection{Coding of Terminal Symbols}
From this system, the following codings emerge:
\begin{table}[h]
\centering
\caption{Coding of Terminal Symbols}
\label{tab:coding}
\begin{tabular}{@{} l l c l @{}}
\toprule
\textbf{Symbol} & \textbf{Meaning} & \textbf{Code} & \textbf{Interpretation} \\
\midrule
KBG & Customer greeting & 00000 & Customer, BG, Base \\
VBG & Seller greeting & 10000 & Seller, BG, Base \\
KBBd & Customer need & 00100 & Customer, B, Base \\
VBBd & Seller inquiry & 10100 & Seller, B, Base \\
KBA & Customer response & 00101 & Customer, B, Follow-up \\
VBA & Seller reaction & 10101 & Seller, B, Follow-up \\
KAE & Customer inquiry & 01000 & Customer, A, Base \\
VAE & Seller information & 11000 & Seller, A, Base \\
KAA & Customer completion & 01001 & Customer, A, Follow-up \\
VAA & Seller completion & 11001 & Seller, A, Follow-up \\
KAV & Customer farewell & 01100 & Customer, AV, Base \\
VAV & Seller farewell & 11100 & Seller, AV, Base \\
\bottomrule
\end{tabular}
\end{table}
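Because speaker and phase are readable from the code, the monotonicity and alternation requirements can be checked on the bit string alone. The sketch below illustrates this under two assumptions that the requirements leave open: \enquote{monotone} is read as a non-decreasing phase field, and \enquote{alternation} as a change of speaker at every step. The function names are hypothetical.

\begin{lstlisting}[caption={Sketch: monotonicity and alternation checks on coded sequences}]
# Symbol table copied from the coding table above.
CODES = {
    "KBG": "00000", "VBG": "10000", "KBBd": "00100", "VBBd": "10100",
    "KBA": "00101", "VBA": "10101", "KAE": "01000", "VAE": "11000",
    "KAA": "01001", "VAA": "11001", "KAV": "01100", "VAV": "11100",
}


def encode(symbols):
    """Translate a list of terminal symbols into their 5-bit codes."""
    return [CODES[s] for s in symbols]


def phases_monotone(codes):
    """True if the phase field (bits 2-3) never decreases."""
    phases = [int(c[1:3], 2) for c in codes]
    return all(a <= b for a, b in zip(phases, phases[1:]))


def speakers_alternate(codes):
    """True if the speaker bit (bit 1) changes at every step."""
    return all(a[0] != b[0] for a, b in zip(codes, codes[1:]))


example = encode(["KBG", "VBG", "KBBd", "VBBd", "KAE", "VAE"])
print(phases_monotone(example), speakers_alternate(example))
\end{lstlisting}

Both checks operate only on the bit fields and need no lookup of the symbol meanings.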
\section{Formal Decision Procedure}
\subsection{Dialogue Phases as State Space}
The dialogical structure is represented by a finite state space:
\[
Q = \{q_0, q_{BG}, q_B, q_A, q_{AV}, q_\bot\}
\]
\begin{itemize}
\item \(q_0\): Start state (empty sequence)
\item \(q_{BG}\): Greeting phase
\item \(q_B\): Need phase
\item \(q_A\): Completion phase
\item \(q_{AV}\): Farewell
\item \(q_\bot\): Error state
\end{itemize}
The set of accepting states is:
\[
F = \{q_{AV}\}
\]
A sequence is well-formed if and only if the run of the automaton on it ends
in an accepting state, that is, in \(q_{AV}\).
\subsection{Definition of the Automaton}
We define a deterministic finite automaton
\[
\mathcal{A} = (Q, \Sigma, \delta, q_0, F)
\]
with:
\begin{itemize}
\item \(Q\): set of states
\item \(\Sigma \subseteq \{0,1\}^5\): terminal alphabet
\item \(\delta: Q \times \Sigma \to Q\): transition function
\item \(q_0\): start state
\item \(F\): accepting states
\end{itemize}
\subsection{The Transition Function}
The transition function \(\delta\) implements the following rules:
\textbf{Greeting phase:}
\begin{align*}
\delta(q_0, 00000) &= q_{BG} \quad \text{(KBG)} \\
\delta(q_{BG}, 10000) &= q_{BG} \quad \text{(VBG)}
\end{align*}
\textbf{Need phase:}
\begin{align*}
\delta(q_{BG}, 00100) &= q_B \quad \text{(KBBd)} \\
\delta(q_B, 10100) &= q_B \quad \text{(VBBd)} \\
\delta(q_B, 00101) &= q_B \quad \text{(KBA)} \\
\delta(q_B, 10101) &= q_B \quad \text{(VBA)}
\end{align*}
\textbf{Completion phase:}
\begin{align*}
\delta(q_B, 01000) &= q_A \quad \text{(KAE)} \\
\delta(q_A, 11000) &= q_A \quad \text{(VAE)} \\
\delta(q_A, 01001) &= q_{AV} \quad \text{(KAA)} \\
\delta(q_{AV}, 11001) &= q_{AV} \quad \text{(VAA)}
\end{align*}
\textbf{Farewell:}
\begin{align*}
\delta(q_{AV}, 01100) &= q_{AV} \quad \text{(KAV)} \\
\delta(q_{AV}, 11100) &= q_{AV} \quad \text{(VAV)}
\end{align*}
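\textbf{Error cases:}
All transitions not defined above lead to the error state. This includes every
transition out of \(q_\bot\) itself, so \(q_\bot\) is a trap state and
\(\delta\) is total:
\[
\delta(q, \sigma) = q_\bot \quad \text{if no rule above defines } \delta(q, \sigma)
\]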
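The transition rules translate directly into a lookup table. The following sketch is one possible implementation, not a reference one: the state names are written as plain strings, \texttt{q\_err} stands for \(q_\bot\), and the helper names \texttt{step} and \texttt{accepts} are hypothetical.

\begin{lstlisting}[caption={Sketch: the automaton as a transition table}]
ERROR = "q_err"          # stands for the error state
ACCEPTING = {"q_AV"}     # the set F

# Defined transitions, copied from the rules above.
DELTA = {
    ("q_0", "00000"): "q_BG",   # KBG
    ("q_BG", "10000"): "q_BG",  # VBG
    ("q_BG", "00100"): "q_B",   # KBBd
    ("q_B", "10100"): "q_B",    # VBBd
    ("q_B", "00101"): "q_B",    # KBA
    ("q_B", "10101"): "q_B",    # VBA
    ("q_B", "01000"): "q_A",    # KAE
    ("q_A", "11000"): "q_A",    # VAE
    ("q_A", "01001"): "q_AV",   # KAA
    ("q_AV", "11001"): "q_AV",  # VAA
    ("q_AV", "01100"): "q_AV",  # KAV
    ("q_AV", "11100"): "q_AV",  # VAV
}


def step(state, code):
    """Total transition function: undefined pairs map to the error state."""
    return DELTA.get((state, code), ERROR)


def accepts(codes):
    """Run the automaton from q_0 and test whether it ends in F."""
    state = "q_0"
    for code in codes:
        state = step(state, code)
    return state in ACCEPTING


# Constructed example: KBG VBG KBBd VBBd KAE VAE KAA VAA KAV VAV
ideal = ["00000", "10000", "00100", "10100", "01000", "11000",
         "01001", "11001", "01100", "11100"]
print(accepts(ideal))               # True
print(accepts(["00000", "01001"]))  # False: no rule for (q_BG, 01001)
\end{lstlisting}

Since the error state has no entry in the table, every step taken from it returns the error state again, which reproduces the error rule without listing it explicitly.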
\subsection{Decidability of Well-formedness}
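\textbf{Theorem 1 (Decidability)}:
For every finite sequence \(w \in \Sigma^*\), it is decidable whether \(w\) is
well-formed with respect to the automaton \(\mathcal{A}\).
\textit{Proof}: The automaton \(\mathcal{A}\) is finite and deterministic, and
\(\delta\) is total because every undefined transition leads to \(q_\bot\).
For every input \(w = \sigma_1 \ldots \sigma_n \in \Sigma^*\)
there therefore exists exactly one run
\[
q_0 \xrightarrow{\sigma_1} q_1 \xrightarrow{\sigma_2} \cdots \xrightarrow{\sigma_n} q_n.
\]
Since \(w\) has finite length \(n\) and each step is a single lookup in the
finite table of \(\delta\), this run is computed in exactly \(n\) steps.
\(w\) is well-formed if and only if \(q_n \in F\), and membership in the finite
set \(F\) can be checked directly.
Thus the problem is decidable. \(\square\)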
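As an illustration, consider a constructed input, written with the symbol names of Table~\ref{tab:coding} in place of their codes: \(w = \text{KBG}\;\text{VBG}\;\text{KBBd}\;\text{VBBd}\;\text{KAE}\;\text{VAE}\;\text{KAA}\). Its unique run is
\begin{align*}
q_0 &\xrightarrow{\text{KBG}} q_{BG} \xrightarrow{\text{VBG}} q_{BG}
\xrightarrow{\text{KBBd}} q_B \xrightarrow{\text{VBBd}} q_B \\
&\xrightarrow{\text{KAE}} q_A \xrightarrow{\text{VAE}} q_A
\xrightarrow{\text{KAA}} q_{AV},
\end{align*}
and since \(q_{AV} \in F\), \(w\) is well-formed. For \(w' = \text{KBG}\;\text{KAA}\), the run is
\[
q_0 \xrightarrow{\text{KBG}} q_{BG} \xrightarrow{\text{KAA}} q_\bot,
\]
because \(\delta(q_{BG}, 01001)\) is not defined by any rule. As \(q_\bot \notin F\), \(w'\) is rejected.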
\section{Fulfillment of XAI Criteria}
\subsection{Transparency}
The decision of the automaton is fully transparent:
\begin{itemize}
\item The state set \(Q\) is explicitly given.
\item The transition function \(\delta\) is completely defined.
\item Every step in the run can be documented.
\end{itemize}
Unlike statistical models, there are no hidden weights, no latent variables,
and no training data influencing the decision.
\subsection{Reconstructibility}
For every accepted or rejected sequence, the complete decision path can be
reconstructed:
\[
q_0 \xrightarrow{\sigma_1} q_1 \xrightarrow{\sigma_2} \cdots \xrightarrow{\sigma_n} q_n
\]
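Each transition is justified by the definition of \(\delta\). The rejection of
a sequence is always traceable to one of two causes: either the first undefined
transition, which leads to \(q_\bot\), or a run that ends in a non-accepting
state such as \(q_B\) before the completion phase has been concluded.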
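A decision path of this kind can be produced mechanically. The sketch below is an illustration under stated assumptions rather than a definitive implementation: it uses symbol names in place of the 5-bit codes for readability, writes \(q_\bot\) as \texttt{q\_err}, and the function \texttt{explain} is hypothetical. It reports the state trace together with a reason, which is either the first pair without a rule or the non-accepting state in which a run stops.

\begin{lstlisting}[caption={Sketch: reconstructing the decision path of a run}]
ERROR = "q_err"
ACCEPTING = {"q_AV"}

# (state, symbol, successor) triples taken from the transition function.
RULES = [
    ("q_0", "KBG", "q_BG"), ("q_BG", "VBG", "q_BG"),
    ("q_BG", "KBBd", "q_B"), ("q_B", "VBBd", "q_B"),
    ("q_B", "KBA", "q_B"), ("q_B", "VBA", "q_B"),
    ("q_B", "KAE", "q_A"), ("q_A", "VAE", "q_A"),
    ("q_A", "KAA", "q_AV"), ("q_AV", "VAA", "q_AV"),
    ("q_AV", "KAV", "q_AV"), ("q_AV", "VAV", "q_AV"),
]
DELTA = {(q, s): r for q, s, r in RULES}


def explain(symbols):
    """Return (accepted, trace, reason) for a symbol sequence."""
    state, trace = "q_0", ["q_0"]
    for position, symbol in enumerate(symbols, start=1):
        successor = DELTA.get((state, symbol))
        if successor is None:
            # First undefined transition: the run enters the error state.
            trace.append(ERROR)
            reason = f"no rule for ({state}, {symbol}) at position {position}"
            return False, trace, reason
        state = successor
        trace.append(state)
    if state in ACCEPTING:
        return True, trace, "run ends in an accepting state"
    return False, trace, f"run ends in non-accepting state {state}"


print(explain(["KBG", "KAA"]))
print(explain(["KBG", "VBG", "KBBd"]))
\end{lstlisting}

The sketch stops at the first undefined transition instead of consuming the rest of the input. This does not change the decision, because the error state cannot be left.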
\subsection{Separation of Structure and Statistics}
The automaton \(\mathcal{A}\) contains no probabilistic information whatsoever.
Its decisions are:
\begin{itemize}
\item \textbf{deterministic}: same input \(\rightarrow\) same output
\item \textbf{frequency-independent}: unaffected by how often a sequence or
transition occurs empirically
\item \textbf{structure-preserving}: derived from the grammar
\end{itemize}
Statistical analyses can be conducted subsequently on the accepted sequences,
without affecting the structural decision.
\subsection{Comparison with Statistical Methods}
\begin{table}[h]
\centering
\caption{Comparison with Statistical Methods}
\label{tab:comparison}
\begin{tabular}{@{} p{3cm} p{4cm} p{4cm} @{}}
\toprule
\textbf{Criterion} & \textbf{Statistical Methods} & \textbf{Automaton \(\mathcal{A}\)} \\
\midrule
Decision basis & Training data, weights & Explicit rules \\
Transparency & Low (black box) & Complete \\
Reconstructibility & Approximative & Exact \\
Data dependency & High & None \\
Explainability & Post-hoc & By design \\
\bottomrule
\end{tabular}
\end{table}
\section{Application to Empirical Data}
\subsection{The Seven Transcripts}
The following seven terminal symbol strings are given in the original notation:
\begin{verbatim}
1: KBG,VBG,KBBd,VBBd,KBA,VBA,KBBd,VBBd,KBA,VAA,KAA,VAV,KAV
2: VBG,KBBd,VBBd,VAA,KAA,VBG,KBBd,VAA,KAA
3: KBBd,VBBd,VAA,KAA
4: KBBd,VBBd,KBA,VBA,KBBd,VBA,KAE,VAE,KAA,VAV,KAV
5: KBG,VBG,KBBd,VBBd,KAA
6: KBBd,VBBd,KBA,VAA,KAA
7: KBG,VBBd,KBBd,VBA,VAA,KAA,VAV,KAV
\end{verbatim}
\subsection{Transformation into the Coding System}
Applying the 5-bit coding system yields the following binary sequences:
\begin{lstlisting}[caption=Coded Terminal Symbol Strings]
1: 00000,10000,00100,10100,00101,10101,00100,10100,00101,11001,01001,11100,01100
2: 10000,00100,10100,11001,01001,10000,00100,11001,01001
3: 00100,10100,11001,01001
4: 00100,10100,00101,10101,00100,10101,01000,11000,01001,11100,01100
5: 00000,10000,00100,10100,01001
6: 00100,10100,00101,11001,01001
7: 00000,10100,00100,10101,11001,01001,11100,01100
\end{lstlisting}
\subsection{Validation by the Automaton}
Applying the automaton \(\mathcal{A}\) to the coded sequences yields:
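\begin{table}[h]
\centering
\caption{Validation Results}
\label{tab:validation}
\begin{tabular}{@{} c l l c @{}}
\toprule
\textbf{Transcript} & \textbf{Final State} & \textbf{First Undefined Transition} & \textbf{Well-formed} \\
\midrule
1 & \(q_\bot\) & \(\delta(q_B, 00100)\), position 7 (KBBd) & no \\
2 & \(q_\bot\) & \(\delta(q_0, 10000)\), position 1 (VBG) & no \\
3 & \(q_\bot\) & \(\delta(q_0, 00100)\), position 1 (KBBd) & no \\
4 & \(q_\bot\) & \(\delta(q_0, 00100)\), position 1 (KBBd) & no \\
5 & \(q_\bot\) & \(\delta(q_B, 01001)\), position 5 (KAA) & no \\
6 & \(q_\bot\) & \(\delta(q_0, 00100)\), position 1 (KBBd) & no \\
7 & \(q_\bot\) & \(\delta(q_{BG}, 10100)\), position 2 (VBBd) & no \\
\bottomrule
\end{tabular}
\end{table}
Under the transition function exactly as defined above, none of the seven
transcripts is accepted. Transcripts 2, 3, 4, and 6 fail at the first symbol,
because \(q_0\) admits only KBG. Transcripts 1, 5, and 7 fail at a later
position for which \(\delta\) defines no rule. Accepting the empirical
transcripts would therefore require additional rules in \(\delta\).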
\section{Discussion}
\subsection{Methodological Significance}
The presented procedure solves a central methodological problem of qualitative
sequence analysis: the validity of an interpretation is no longer justified by
external criteria or statistical plausibility, but by formal decidability. A
sequence is no longer judged \enquote{plausible} but \enquote{well-formed}, and
this property is decidable.

This corresponds to the requirement formulated in objective hermeneutics for
strict rule-governedness of social interaction \citep[p.~372]{Oevermann1979}.
The rules are not merely asserted but explicated as a formal transition
function.
\subsection{Relation to the XAI Discussion}
Explainable AI (XAI) has formulated the demand for transparency and
reconstructibility of technical systems \citep{Samek2019, BarredoArrieta2020}.
The presented procedure fulfills this demand in a strict sense:
\begin{itemize}
\item \textbf{Meaningfulness}: The states and transitions are semantically
interpretable.
\item \textbf{Accuracy}: The decision follows exactly the defined rules.
\item \textbf{Knowledge Limits}: The limits of the procedure are explicitly
given by the state set \(Q\).
\end{itemize}
Unlike post-hoc explanations that attempt to retrospectively interpret
black-box decisions, the procedure is conceived as explainable from the ground
up (Explanation by Design).
\subsection{Limits of the Procedure}
The limits of the procedure are identical to the limits of the underlying
grammar:
\begin{itemize}
\item The procedure captures only the intended phases and transitions.
\item More complex interaction patterns (interruptions, parallelism)
require an extension of the state space.
\item The five-bit code distinguishes at most two speakers, four main
phases, and four subphases; finer differentiations require more bits.
\end{itemize}
\section{Conclusion and Outlook}
This paper has shown how a position-sensitive coding system in conjunction with
a deterministic finite automaton makes the well-formedness of dialogue sequences
formally decidable. The procedure fulfills the central XAI criteria of
transparency, reconstructibility, and explainability while maintaining the
methodological standards of qualitative research.

The separation of structural decision and statistical analysis allows empirical
frequencies to be collected subsequently without affecting the structural
decision. This fulfills the methodological requirement for a clear distinction
between structural rules and empirical regularities.

Further research could:
\begin{enumerate}
\item Extend the procedure to more complex interaction types
(multi-person interactions, interruptions).
\item Expand the coding to include additional dimensions
(emotional tone, prosodic features).
\item Systematically investigate the interaction with statistical methods
(PCFG on the coded sequences).
\end{enumerate}
What remains crucial throughout is methodological control: the formal structure
must respect the interpretive character of the analysis and must not lead to
its automation.
\newpage
\begin{thebibliography}{99}
\bibitem[Barredo Arrieta et al.(2020)]{BarredoArrieta2020}
Barredo Arrieta, A., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S.,
Barbado, A., Garcia, S., Gil-Lopez, S., Molina, D., Benjamins, R., Chatila, R.,
\& Herrera, F. (2020). Explainable Artificial Intelligence (XAI): Concepts,
taxonomies, opportunities and challenges toward responsible AI.
\textit{Information Fusion}, 58, 82--115.
\bibitem[Flick(2019)]{Flick2019}
Flick, U. (2019). \textit{Qualitative Sozialforschung: Eine Einführung} (9th ed.).
Rowohlt.
\bibitem[Oevermann et al.(1979)]{Oevermann1979}
Oevermann, U., Allert, T., Konau, E., \& Krambeck, J. (1979). Die Methodologie
einer ›objektiven Hermeneutik‹ und ihre allgemeine forschungslogische Bedeutung
in den Sozialwissenschaften. In H.-G. Soeffner (Ed.), \textit{Interpretative
Verfahren in den Sozial- und Textwissenschaften} (pp.~352--434). Metzler.
\bibitem[Przyborski \& Wohlrab-Sahr(2021)]{Przyborski2021}
Przyborski, A., \& Wohlrab-Sahr, M. (2021). \textit{Qualitative Sozialforschung:
Ein Arbeitsbuch} (5th ed.). De Gruyter Oldenbourg.
\bibitem[Sacks et al.(1974)]{Sacks1974}
Sacks, H., Schegloff, E. A., \& Jefferson, G. (1974). A simplest systematics for
the organization of turn-taking for conversation. \textit{Language}, 50(4), 696--735.
\bibitem[Samek \& Müller(2019)]{Samek2019}
Samek, W., \& Müller, K.-R. (2019). Towards Explainable Artificial Intelligence.
In W. Samek, G. Montavon, A. Vedaldi, L. K. Hansen, \& K.-R. Müller (Eds.),
\textit{Explainable AI: Interpreting, Explaining and Visualizing Deep Learning}
(pp.~1--10). Springer.
\end{thebibliography}
\newpage
\appendix
\section{The Seven Transcripts in Coded Form}
\subsection{Transcript 1}
\textbf{Original:} KBG, VBG, KBBd, VBBd, KBA, VBA, KBBd, VBBd, KBA, VAA, KAA, VAV, KAV

\textbf{Coded:} 00000, 10000, 00100, 10100, 00101, 10101, 00100, 10100, 00101, 11001, 01001, 11100, 01100

\subsection{Transcript 2}
\textbf{Original:} VBG, KBBd, VBBd, VAA, KAA, VBG, KBBd, VAA, KAA

\textbf{Coded:} 10000, 00100, 10100, 11001, 01001, 10000, 00100, 11001, 01001

\subsection{Transcript 3}
\textbf{Original:} KBBd, VBBd, VAA, KAA

\textbf{Coded:} 00100, 10100, 11001, 01001

\subsection{Transcript 4}
\textbf{Original:} KBBd, VBBd, KBA, VBA, KBBd, VBA, KAE, VAE, KAA, VAV, KAV

\textbf{Coded:} 00100, 10100, 00101, 10101, 00100, 10101, 01000, 11000, 01001, 11100, 01100

\subsection{Transcript 5}
\textbf{Original:} KBG, VBG, KBBd, VBBd, KAA

\textbf{Coded:} 00000, 10000, 00100, 10100, 01001

\subsection{Transcript 6}
\textbf{Original:} KBBd, VBBd, KBA, VAA, KAA

\textbf{Coded:} 00100, 10100, 00101, 11001, 01001

\subsection{Transcript 7}
\textbf{Original:} KBG, VBBd, KBBd, VBA, VAA, KAA, VAV, KAV

\textbf{Coded:} 00000, 10100, 00100, 10101, 11001, 01001, 11100, 01100
\end{document}