---
title: "What English interjections let us predict: Stable causal-pragmatic clustering and path dependence"
author: "Brett Reynolds"
year: "2026"
status: "Submitted to Journal of Pragmatics"
canonical_url: "https://lingbuzz.net/lingbuzz/009852"
website_url: "https://brettreynolds.ca/papers/english-interjections/"
markdown_url: "https://brettreynolds.ca/papers/english-interjections/paper.md"
version: "author-manuscript mirror"
version_date: "2026-06-04"
keywords: ["interjections", "theoretical pragmatics", "projectibility", "path dependence"]
---
# What English interjections let us predict: Stable causal-pragmatic clustering and path dependence

**Author-manuscript mirror.** This Markdown file is provided for accessibility, search, and machine readability. The canonical public record is linked in the metadata above.

## Abstract
This paper contributes to theoretical pragmatics by asking what English interjections let speakers, hearers, and analysts predict in use. Identifying a form as an interjection supports inferences about prosodic packaging, syntactic supplementarity, use-conditional stance, sequential position, response function, and interactional uptake. These inferences are pragmatic rather than truth-conditional: they concern how a form enters an utterance, what stance or response it displays, and what uptake it invites.

The paper argues that these inferences are licensed by a stable causal-pragmatic cluster. Invocative, verbal, and onomatopoeic routes recruit forms into a similar pragmatic profile, while leaving different pathway residues. Regional/indexical conditioning supplies a further source of boundary structure. Supplementary Corpus of Global Web-Based English (GloWbE) and Corpus of Historical American English (COHA) probes illustrate boundary behaviour, exit residue, and attrition. The result is a path-dependent account of pragmatic category membership: English interjections are projectible because their prosodic, syntactic, semantic, and interactional diagnostics are linked, though the evidence doesn’t establish strict homeostatic control.


# Introduction

English interjections are small forms with large pragmatic consequences. A speaker’s *oh*, *wow*, *hey*, *damn*, or *mm-hm* can package a response, display stance, open or sustain a sequence, initiate repair, or invite uptake. They belong to the resources by which speakers produce and interpret language in context. This paper asks what follows from treating them as a pragmatic category: what does identifying a form as an interjection let speakers, hearers, and analysts predict?

This question is difficult because interjections have often been treated as peripheral to grammar and theory. In 1586, William Bullokar wrote the earliest grammar of English. It included interjections, which Bullokar defined as words that betoken “a sudden passion of the mind” (Bullokar \[1586\] 1980, 373).

Four centuries later, the category’s status has declined. Jespersen (1924, 90) treated interjection as a manner in which words of other categories may be used. Quirk et al. (1985, 67) included the category but called it “marginal and anomalous.” Huddleston and Pullum (2002, 22) listed interjection as one of nine lexical categories in *The Cambridge Grammar of the English Language* (*CGEL*) (*ah*, *damn*, *gosh*, *hey*, *oh*, *ooh*, *ouch*, *whoa*, *wow*), but Huddleston and Pullum (2005, 16) dropped it from the textbook because “there really isn’t anything interesting for a grammar to say” about interjections. A category real enough for the definitive grammar but thin enough for the textbook to cut.

The empirical picture hasn’t simplified. English interjections span phonology, morphology, syntax, semantics, and pragmatics. The category resists clean definition, since no single property is necessary and sufficient, and its boundaries with nouns, verbs, adverbs, fillers, and routine formulae are graded rather than sharp. Cross-linguistically, “interjection” is at best a comparative concept (Haspelmath 2010).

Those difficulties matter for pragmatics, not only for grammar. If interjections are a category, the relevant payoff isn’t an essence or a single defining feature. It’s <span class="smallcaps">projectibility</span>: the inferential payoff by which observing some properties of a new case licenses predictions about others (Goodman 1955). For interjections, prosodic isolation, syntactic supplementarity, or stance display should license expectations about non-inflection, non-referential or use-conditional meaning, sequential position, response function, and interactional uptake.

The object of analysis is the category position, not the membership list alone. Individual items such as *wow*, *damn*, *bruh*, and *fie* matter because they show what the category lets speakers and hearers infer: likely prosody, syntactic independence, limits on argumenthood, stance or response function, and possible pathways into or out of the profile.

By <span class="smallcaps">path dependence</span>, I mean that a form’s route into the interjectional profile continues to shape its later boundary behaviour. Convergence on the profile doesn’t erase source history: invocative, verbal, and onomatopoeic recruitment leave different residues, and regional/indexical conditioning adds another source of boundary structure. Those residues support diagnostic historical inference.

Throughout, I use <span class="smallcaps">property cluster</span> descriptively for the empirical pattern of co-occurring features. English interjections form a <span class="smallcaps">stable projectible cluster</span> and a <span class="smallcaps">causal-pragmatic kind</span>: a category whose strongest inferences concern use, stance, sequence, and uptake. Whether the cluster is <span class="smallcaps">homeostatic</span> in the stricter sense remains a further maintenance hypothesis.

This inferential framing motivates a causal-network analysis. Khalidi (2013, 2018) argues that natural kinds are nodes in causal structures rather than sets defined by essences. On this account, a category is theoretically useful when its place in a causal network supports reliable induction. Applied to interjections, the claim is that several diagnostics, including prosodic isolation, syntactic independence, non-referentiality, non-inflection, and interactional force, cohere because several causal pathways converge on the same pragmatic node. The resulting profile is thin but projectible.

The evidential order matters. The analysis first identifies a <span class="smallcaps">projection target</span>: what categorization as an interjection lets speakers, hearers, and analysts infer. It then asks whether a <span class="smallcaps">stable profile</span> supports that target, what <span class="smallcaps">network order</span> makes the profile non-accidental, what <span class="smallcaps">stabilizers</span> may maintain it, and whether any stabilizer amounts to <span class="smallcaps">corrective control</span>. Only the final step would license strict homeostatic language.

Diachrony matters because pragmatic profiles have histories. Forms such as *gee*, *damn*, or *bruh* change category position as historically contentful or addressive expressions come to manage stance, sequence, and interactional alignment in recurrent contexts (cf. Jucker and Taavitsainen 2013). Change over time helps explain why the pragmatic profile is stable while individual forms enter, leave, and retain source-specific residues.

The network classifies lexical items in grammatical environments, not word families as wholes. A word family can contain distinct historically related lexemes. The *damn* word family includes verbal *damn* in *I damn you* and standalone interjectional *damn!*. Prenominal *damn* in *the damn car* belongs to that family too, but the analysis needn’t decide here whether it’s a noun-phrase-internal extension of the interjection lexeme or a separate adjectival lexeme. What matters is that it’s syntactically integrated inside a noun phrase while retaining use-conditional expressive force.

The label “interjection” covers the overlap among three partly overlapping nodes: a syntactic node, a semantic or use-conditional node, and an interactional-pragmatic node. I write these as <span class="smallcaps">interjection<sub>syn</sub></span>, <span class="smallcaps">interjection<sub>sem</sub></span>, and <span class="smallcaps">interjection<sub>int</sub></span>. The sections below first describe the diagnostic cluster, then explain why these nodes overlap and why they come apart.

Section <a href="#sec:cluster" data-reference-type="ref" data-reference="sec:cluster">2</a> documents the cluster. Section <a href="#sec:causal-framework" data-reference-type="ref" data-reference="sec:causal-framework">3</a> introduces the causal-network framework that explains why the interjection profile supports pragmatic inference. The rest of the paper identifies the network order that structures the cluster, extends Gehweiler (2008)’s single-case analysis to a general claim about path dependence, and argues that interjections exhibit a distinctive form of projectibility, primarily pragmatic rather than truth-conditional.

# The property cluster

English interjections exhibit a cluster of co-occurring properties across phonology, morphology, syntax, semantics, and pragmatics. This section surveys those properties, drawing primarily on the empirical base documented in Ameka (1992), Quirk et al. (1985), and Huddleston and Pullum (2002). It answers two questions: what properties cluster, and what local diagnostic inferences each property supports. These inferences show that individual properties are structured rather than accidental. Section <a href="#sec:projectibility" data-reference-type="ref" data-reference="sec:projectibility">5</a> asks the stronger question: when does a partial observation license inference across linguistic domains? The causal question, why these properties travel together, is deferred to Section <a href="#sec:causal-structure" data-reference-type="ref" data-reference="sec:causal-structure">4</a>.

## Phonology

Interjections are prosodically isolated. They typically occupy their own intonation unit, separated from surrounding discourse by pauses (Ameka 1992, 108). In writing, this isolation surfaces as punctuation (commas, dashes, exclamation marks), setting the interjection off from the clause it accompanies (Huddleston and Pullum 2002, 1350).

The corpus evidence below is written, so prosodic isolation is tracked through proxies: punctuation, standalone position, and absence of integration into the host clause. These proxies are useful for historical and large-corpus work, but they’re proxies rather than direct acoustic evidence.

Interjections may also contain phonemes outside the speaker’s dialect. The interjection *uh-oh* (\[əʔo\]) is one of the few cases where a glottal stop appears in dialects that otherwise lack it. Other atypical sounds include the alveolar clicks in *tut-tut* (\[ǀǀ\]), the voiceless bilabial fricative in *whew* (\[ɸɪu\]), and (for dialects that no longer use it) the voiceless velar fricative in *ugh* (\[əx\]) (Quirk et al. 1985, 853). Spelling pronunciations often arise for these forms (e.g., \[tət tət\] for *tut-tut*, \[əɡ\] for *ugh*), suggesting that the atypical phonemes are marked even for native speakers.

Prosodic isolation is a strong cue: if a novel form occupies its own intonation unit and resists integration into the surrounding prosodic contour, a listener can predict syntactic non-integration and non-referentiality. The atypical phonemes reinforce this inference. But prosodic isolation isn’t unique to interjections, since vocatives, appositives, and other supplements share it (cf. Wilkins 1992). What distinguishes interjections is the convergence of prosodic isolation with the properties below.

## Morphology

English interjections characteristically don’t inflect or derive. They resist the morphological operations that characterize other open-class categories: no plural (\**ouches*), no tense (\**ouched*), no comparative (\**oucher*), no nominalization (\**ouchness*) (Ameka 1992, 106). Since Libert (2019) notes that “few, if any, papers focus on the morphology of interjections”, the apparent absence of morphological productivity may partly reflect a gap in the research rather than a firm empirical generalization.

Some interjections retain remnant morphology from earlier categorial membership. *Heavens* and *bollocks* preserve the nominal plural suffix, and *goddammit* preserves a verb-object-pronoun structure, but none of these remnants are productive. The plural in *heavens* doesn’t contrast with a singular \**heaven!* (as an interjection), and speakers don’t extend the pattern to new forms.

Non-inflection predicts non-argumenthood. Forms that don’t inflect can’t agree with a verb or take a determiner, so they can’t fill the syntactic slots that require these operations. But non-inflection alone isn’t diagnostic: adverbs also don’t inflect (Ameka 1992, 106). The stronger diagnostic is syntactic: adverbs characteristically form constituents with other words, while canonical interjections occur as standalone supplements and resist integration with neighbouring material (Meinard 2015, 152).

## Syntax

Interjections characteristically function as <span class="smallcaps">supplements</span>, “parenthetical strings that are not integrated in clause structure” (Huddleston and Pullum 2002, 1350). In *Damn, we’re going to be late!*, the interjection isn’t a complement or modifier of anything in the clause. It’s appended to the beginning as a parenthetical.

Supplement function isn’t unique to interjections. Relative clauses, noun phrases, adjective phrases, and adverb phrases can all function as supplements (Huddleston and Pullum 2002, 1356):

<span id="ex:supplements" label="ex:supplements"></span> *Ah*, so you were there after all! (interjection) We called in to see Sue’s parents, *which made us rather late*. (relative clause) A university professor, *Dr. Brown*, was arrested. (noun phrase) What distinguishes interjection supplements is their combination with the other cluster properties: non-referentiality, non-inflection, prosodic isolation, and emotive force. The function is shared; the category differs.

A handful of interjections exceptionally take complements. Huddleston and Pullum (2002, 1361fn) analyse *damn* in *damn these mosquitoes!* as the head of an interjection phrase (IntjP) taking a noun phrase (NP) complement, one of the very few cases where an interjection projects beyond a single word. But even here the syntax is constrained: no subject is present or intended, and there’s no tense inflection. The construction is better treated as a residue of verb syntax than as full verbal behaviour, a point that matters for the recruitment pathways discussed in Section <a href="#sec:causal-structure" data-reference-type="ref" data-reference="sec:causal-structure">4</a>. (The internal structure of lexicalized pairs like *oh boy* and *holy cow* is less clear and won’t be resolved here.)

Supplement function predicts prosodic isolation and resistance to syntactic integration:

<span id="ex:syntactic-predictions" label="ex:syntactic-predictions"></span> *I think that, damn, we’re going to be late.* Supplement function also predicts a semantic consequence: if the form is syntactically non-integrated, its contribution is independent of the proposition expressed by the host clause. Remove the interjection and the truth conditions don’t change, the hallmark of non-truth-conditional meaning (cf. Potts 2007, 166).

## Semantics

English interjections characteristically don’t refer. Here, <span class="smallcaps">non-referentiality</span> means absence of ordinary entity/event denotation and argumental construal while allowing semantic content. Nouns and verbs typically pick out entities or events in the world. Interjections express internal states of their users (Leech 2006): anger (*damn*), disgust (*eww*, *yuck*), surprise (*wow*), regret (*alas*), pain (*ouch*), realization (*eureka*). Some interjections are more propositional than emotive: *yes* and *no* affirm or deny propositions, and *oh* can signal recognition of new information rather than expressing feeling.

Wierzbicka (1992) argues that interjections have genuine semantic content, decomposable via Natural Semantic Metalanguage into deictic primitives (*I*, *you*, *now*, *here*). That remains compatible with the definition above. The forms may have content, and *yes* and *no* may operate on propositions, while still lacking ordinary entity/event denotation. Wierzbicka’s explications characterize the use-conditional content that <span class="smallcaps">interjection<sub>sem</sub></span> (Section <a href="#sec:fieldrelative-nodes" data-reference-type="ref" data-reference="sec:fieldrelative-nodes">3.4</a>) tracks. Non-referentiality is a cluster property rather than a definition.

The contrast with nouns is sharpest where both share a lexical source. *Jesus* as a noun refers to a person; *Jesus!* as an interjection doesn’t. The interjection use has been “bleached of \[its\] original meaning”, and it no longer refers to the entity the noun originally picked out (Gehweiler 2008, 72). This semantic shift is part of the recruitment pathway from noun to interjection (Section <a href="#sec:causal-structure" data-reference-type="ref" data-reference="sec:causal-structure">4</a>).

The projectibility here is mostly negative. A form that doesn’t refer resists being made definite, quantified over, or used as an argument: \**the damn!*, \**every ouch*, \**Damn surprised me*, and \**There’s damn*. These restrictions are non-trivial because they distinguish interjections from nouns that have undergone only partial bleaching. Section <a href="#sec:synchronic-coupling" data-reference-type="ref" data-reference="sec:synchronic-coupling">4.5</a> treats the same relation from the causal side: non-referentiality helps keep the form out of argument structure, which in turn supports prosodic isolation.

## Pragmatics

Interjections typically carry exclamatory or imperative force: *ouch!* expresses pain, *shh!* demands silence (Quirk et al. 1985, 88). But the pragmatic range extends beyond exclamation. Interjections greet (*hey*, *hello*), mark agreement (*yes*, *amen*), indicate understanding (*oh*, *uh-huh*), and signal backchanneling, the listener’s ongoing attention to a speaker’s turn (Dingemanse 2020; Stivers 2019).

The most common interjections tagged in the Corpus of Global Web-Based English (GloWbE; Davies and Fuchs 2015) illustrate this pragmatic breadth: *yes*, *no*, *oh*, *yeah*, *hi*, *hey*, *wow*, *hello*, *ah*, *ha*. The list is dominated by interactional items, forms whose primary function is to manage the conversation rather than to contribute propositional content. Dingemanse (2024) estimates that roughly one in seven conversational turns is an interjection, and interactional tools dominate the most frequent cases: continuers like *mm-hm* that scaffold extended turns, repair initiators like *huh?* that calibrate mutual understanding, and change-of-state tokens like *oh* that display evolving knowledge (cf. Norrick 2009).

The boundary with <span class="smallcaps">routine formulae</span> is instructive. Coulmas (1981, 2–3) defines routine formulae as “highly conventionalized prepatterned expressions whose occurrence is tied to more or less standard communication situations”: *bye*, *hello*, *sorry*, *thank you*. Ameka (1992, 109–10) argues these differ from interjections in three ways: they include addressees, they’re predictable responses to social conventions, and they’re always speech acts rather than mere reflections of the speaker’s mental state. Wilkins (1992, 142) counters that these differences make routine formulae “a distinct pragmatic and semantic subtype of interjections”, not a separate category.

The disagreement is field-relative in a causal-network sense. For a grammar, routine formulae pattern near interjections because they share prosodic isolation and pragmatic function. For speech-act analysis or interactional pragmatics, they support different predictions: addressee orientation, occurrence in standard social sequences, and conventional felicity conditions. The same form can sit close to one node for one inquiry and farther away for another, depending on which causal relations and predictions the inquiry tracks.

The projective claim should be level-specific. A novel non-propositional stance form licenses expectations about standalone positioning, supplement-like syntax, backchannel availability, and resistance to embedding under reported speech (*\*She said that damn*). A novel routine formula licenses expectations about addressee-directedness and placement in conventional social sequences. The overlap explains why routine formulae sit at the boundary; the divergence explains why the boundary is disputed. Section <a href="#sec:causal-structure" data-reference-type="ref" data-reference="sec:causal-structure">4</a> asks which links make those claims reliable.

## Co-occurrence, not necessity

These properties co-occur statistically, not necessarily. The literature’s characteristic hedges are the point: interjections “commonly” occupy their own intonation unit, “typically” lack inflection, “usually” resist syntactic integration, and “generally” express emotive meaning (cf. Ameka 1992; Wilkins 1992). No single property is necessary and sufficient.

The pattern calls for explanation. Network order couples the properties: prosodic isolation reinforces syntactic non-integration, non-referentiality removes the need for argument structure, and pragmatic conventionalization entrenches the emotive function.

The coupling is visible when it breaks. *Damn* as an interjection (*Damn!*) has the full cluster: prosodic isolation, syntactic non-integration, non-referentiality, and emotive force. Prenominal *damn* in *the damn car* retains emotive force but loses prosodic-syntactic independence because it’s embedded inside an noun phrase (cf. Potts 2007). When that link is neutralized, the properties erode together. A bare property list can’t explain this pattern; a process-linked cluster explains it.

Table <a href="#tab:diagnostic-matrix" data-reference-type="ref" data-reference="tab:diagnostic-matrix">1</a> makes the cluster inspectable. Each row gives a lexical item in a grammatical environment rather than a word family as a whole (see Section <a href="#sec:intro" data-reference-type="ref" data-reference="sec:intro">1</a>), and the columns show its profile against the five cluster properties and the three field-relative nodes previewed in the introduction and defined in Section <a href="#sec:fieldrelative-nodes" data-reference-type="ref" data-reference="sec:fieldrelative-nodes">3.4</a>. The compact labels Intj<sub>syn</sub>, Intj<sub>sem</sub>, and Intj<sub>int</sub> refer to syntactic, semantic/use-conditional, and interactional-pragmatic interjection nodes, respectively. The table shows convergence and decoupling. Standalone *damn!* and *oh* show convergence. In *the damn car*, prenominal *damn* has the semantic node but lacks the syntactic one. *mm-hm* has the interactional node, while its syntactic status is debatable. Boundary items (*bye*, *um*) share some but not all cluster properties, the pattern that Section <a href="#sec:boundaries" data-reference-type="ref" data-reference="sec:boundaries">6.3</a> explains as causal-link overlap.

<div id="tab:diagnostic-matrix">

|  |  |  |  |  |  |  |  |  |  |
|:---|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| *gee!* |  |  |  |  |  |  |  |  |  |
| *damn!* |  |  |  |  |  |  |  |  |  |
| *damn these …!* |  | ($`\checkmark`$) | ($`\checkmark`$) |  |  |  | ($`\checkmark`$) |  |  |
| *the damn car* |  |  |  |  |  |  |  |  |  |
| *oh* |  |  |  |  |  |  |  |  |  |
| *mm-hm* |  |  |  |  | ($`\checkmark`$) |  | ($`\checkmark`$) |  |  |
| *huh?* |  |  |  |  | ($`\checkmark`$) |  | ($`\checkmark`$) |  |  |
| *bruh* |  |  |  |  |  |  |  |  |  |
| *bye* |  |  |  | ($`\checkmark`$) |  |  |  |  | ($`\checkmark`$) |
| *um* | ($`\checkmark`$) |  |  |  |  |  | ($`\checkmark`$) |  | ($`\checkmark`$) |

Diagnostic matrix: constructions against cluster properties and field-relative nodes.  = property present; ($`\checkmark`$) = partial or debated; = absent. Intj<sub>syn</sub>, Intj<sub>sem</sub>, and Intj<sub>int</sub> abbreviate syntactic, semantic/use-conditional, and interactional-pragmatic nodes. Sources: properties from Ameka (1992), Quirk et al. (1985), Huddleston and Pullum (2002); <span class="smallcaps">interjection<sub>sem</sub></span> from Potts (2007); <span class="smallcaps">interjection<sub>int</sub></span> from Stivers (2019), Dingemanse (2020).

</div>

The next two sections develop the theoretical framework (Section <a href="#sec:causal-framework" data-reference-type="ref" data-reference="sec:causal-framework">3</a>) and the causal structure itself (Section <a href="#sec:causal-structure" data-reference-type="ref" data-reference="sec:causal-structure">4</a>).

# Explaining pragmatic projectibility

The property cluster documented above has a distinctive shape: statistically reliable co-occurrence, no necessary and sufficient conditions, graded boundaries, and hedged universals. That profile is exactly what makes interjections valuable for pragmatics. The label “interjection” supports predictions about use, but those predictions cross phonology, syntax, semantics, interaction, and diachrony. Causal-network analysis explains why those inferences are available without treating the category as an essence.

## The framework

Khalidi (2013, 2018) argues that natural kinds are nodes in causal structures. A kind can be unified by causal relations rather than by an essence. Those relations may be local mutual-reinforcement relations, one-directional pathways, or converging historical routes. What matters is that the category marks a place in a causal network: a position where properties, processes, and interventions are connected tightly enough to support induction. The clustering matters because it lets us predict properties of new instances from observed properties of old ones.

The claim is methodological rather than metaphysically heavyweight. Interjections need not be “genuine kinds” in the strongest sense. The category only needs to be a projectible kind for linguistic inquiry: a node whose causal links explain why particular observations support particular predictions.

I use the framework as a diagnostic ladder (Reynolds 2026). First, what’s the <span class="smallcaps">projection target</span>: the field-relative inference the category is supposed to support? Second, what <span class="smallcaps">stable profile</span> supports that target? Third, what <span class="smallcaps">network order</span> makes the profile non-accidental? Fourth, what <span class="smallcaps">stabilizers</span>, if any, keep the profile available across relevant contexts? Fifth, is any stabilizer a <span class="smallcaps">controller</span>: a perturbation-sensitive process that corrects departures from the profile? Projectibility is the payoff; strict homeostasis is the final and most demanding evidential claim.

The framework also clarifies how the argument relates to <span class="smallcaps">homeostatic-property-cluster theory</span>. Boyd (1999) showed that fuzzy categories can be useful and projectible when their properties are connected non-accidentally. Interjections have that shape: they lack necessary and sufficient conditions, but show recurrent co-occurrence among prosodic, syntactic, semantic, and pragmatic diagnostics. The question is how strong a causal claim that recurrence supports.

Some relevant properties form local stabilizing relations. Prosodic isolation and syntactic non-integration reinforce one another: a form produced as its own intonation unit is less available for complementation or modification, while a form outside clause structure is more likely to receive independent prosodic packaging. Repeated use in that position can make standalone interpretation increasingly expectable for speakers and hearers.

Other processes are interactional or directional. Uptake affects future use; repeated use affects expectation; expectation affects later uptake. Loss of referential content makes argument structure less motivated; syntactic deactivation pushes a form toward standalone use; repeated conversational deployment entrenches stance, response, repair, or sequence-management functions. These relations explain why the diagnostics travel together. They establish network order and plausible maintenance rather than homeostasis (cf. Weinberger 2026).

The historical evidence adds a further point. Semantic bleaching, syntactic deactivation, and onomatopoeic or interactional coinage can all recruit forms into a similar pragmatic profile, but they don’t do so by the same route or with the same residues. A bleached invocation, a deactivated imperative, and an onomatopoeic or interactional form may converge on prosodic independence, syntactic supplementarity, non-referential or use-conditional meaning, and interactional force while retaining different traces of their sources. The resulting category is organized by causal-pragmatic relations with different temporal and explanatory profiles.

That structure is enough for pragmatic projectibility, but not enough for strict homeostasis. The evidence considered here doesn’t identify a tracked category-level variable, an error signal, or a response pathway that restores the interjectional profile when it is perturbed. Interjections are stable and projectible because their diagnostics are linked within a causal-pragmatic structure.

## Causal networks vs. prototype theory

Prototype theory (Rosch 1975; Taylor 2003) was developed for exactly the kind of data Section <a href="#sec:cluster" data-reference-type="ref" data-reference="sec:cluster">2</a> documents, and it handles that data well: graded membership, better-and-worse exemplars, and fuzzy boundaries via overlapping similarity spaces. It also correctly predicts that boundary disputes (interjections vs. fillers, vs. routine formulae) should recur instead of being resolvable by definition. These are genuine explanatory achievements, and the causal-network account doesn’t replace them.

A causal-network account adds a different kind of question. Prototype theory explains why borderline cases should be expected: categories are organized around better and worse exemplars. It doesn’t explain why the properties cluster in this particular pattern or what should happen when one of their links is disrupted.

The difference matters because the causal-network account generates predictions about change, conditioning, and residues. It predicts that forms recruited by different pathways should retain different traces of their source constructions, that apparent gradience should decompose under variety conditioning, and that some properties may be more central to network order than others (hub-property asymmetry). These predictions are probed for interjections in Section <a href="#sec:discussion" data-reference-type="ref" data-reference="sec:discussion">6</a>.

Pathway residues provide a further discriminator. If the proposed causal links are real, messy boundary cases shouldn’t be messy in arbitrary ways. Retained complements should point to verbal sources; euphemistic phonological deformation should point to taboo-managed invocative sources; regional concentration should point to regional or indexical recruitment. Prototype theory offers no analogous retrodictive test.

## What causal structure adds

Several analytical stances generate different predictions for interjections. The weakest is a <span class="smallcaps">property list</span>: interjections share prosodic isolation, non-inflection, non-referentiality, and emotive force. That’s descriptively useful, but it makes no prediction about what happens when one property changes.

Feature specifications are stronger alternatives, for example \[<span class="smallcaps">+peripheral</span>, <span class="smallcaps">-argument-taking</span>, <span class="smallcaps">+expressive</span>\]. This generates some of the same distributional consequences: no inflection follows from the absence of phi-features, and no argument-taking follows from the relevant syntactic feature bundle. But feature specifications treat those consequences as entailments of the feature bundle, not as effects of identifiable processes.

A constructional account is a stronger competitor. It can represent many of the same pairings of form, meaning, and use (Goldberg 1995, 2006; Croft 2001). The causal-network claim is compatible with that representational move. It asks a different question: which positions in a constructional inventory are projectible, why do their properties cohere, and which changes should make them come apart? Construction grammar models the inventory; the causal-kind question asks where the inventory supports cross-constructional and cross-field inference. The two main pressure points are pathway residues and field-relative decomposition. If those payoffs fail under further testing, the causal-network account collapses to a descriptive inventory rather than an explanatory category analysis.

The causal-network account adds five claims that property lists, feature bundles, and constructional representations don’t settle by themselves:

1.  Co-movement. Prosodic isolation, syntactic supplementarity, non-referentiality, and pragmatic force recur in a patterned way. The account asks which links explain why observing one property licenses expectations about another.

2.  Asymmetry. Some properties are upstream or more diagnostic than others. Prosodic packaging can make syntactic integration less available, while the reverse relation has a different status. A flat feature bundle can’t represent that ordering.

3.  Persistence. The category profile can remain available while individual forms enter and leave. That makes a stability claim while leaving homeostasis open.

4.  Boundary behaviour. Different recruitment pathways should leave different residues at the category boundary. Feature theory can list residual properties but doesn’t explain why they align with particular histories.

5.  Failure. When a prediction fails, the pattern of failure should identify which link or projection target has broken. If *fie* loses all three nodes together under community-wide attrition, the result is a boundary condition on ordered erosion rather than evidence of corrective control.

If a feature-based account can replicate co-movement, asymmetry, persistence, boundary behaviour, and failure without network order, causal structure is doing no additional explanatory work. The empirical pressure point is whether the probes in Section <a href="#sec:discussion" data-reference-type="ref" data-reference="sec:discussion">6</a> support those claims.

## Field-relative projectibility

The causal-network account developed by Khalidi (2013, 2018) also explains why projectibility is field-relative. If kinds are nodes in causal structures, different fields may track different nodes because they ask different explanatory and predictive questions. The resulting categories remain realist: each inquiry tracks the causal relations that matter for its explanations and predictions.

A parallel formulation in homeostatic-kind theory makes the point explicit: “A kind may be natural ‘from the point of view of’ some discipline or disciplinary matrix but not ‘from the point of view of’ another” (Boyd 1999, 150). The present paper treats that formulation as a useful gloss on the causal-network account.

The familiar linguistic parallel is the interrogative/question distinction. Syntacticians track <span class="smallcaps">interrogative</span>: a clause type with properties such as subject–auxiliary inversion, fronting of interrogative words, and polar-question syntax. Pragmatists track <span class="smallcaps">question</span>: a speech act that requests information, projects an answer, and organizes uptake. The two often coincide, and they also come apart: *Can you pass the salt?* is syntactically interrogative and pragmatically a request, while *I wonder where she went* can raise an information issue without having interrogative clause syntax.

Interjections create a similar problem with less terminological help. Syntax, semantics, and interactional pragmatics all use the word “interjection”, but each field tracks a slightly different node: different properties, different causal links, different predictions. I distinguish these with subscripts: [^2]

- <span class="smallcaps">interjection<sub>syn</sub></span>, the syntactician’s category (*CGEL*, Quirk et al.). Distributional cluster: supplement function, non-inflecting, non-argument-taking. Linked by prosodic-syntactic coupling. Projects syntactic behaviour.

- <span class="smallcaps">interjection<sub>sem</sub></span>, the semanticist’s category, overlapping with Potts (2007)’s <span class="smallcaps">expressives</span>. Use-conditional meaning cluster: independent of at-issue content, nondisplaceable, perspective-dependent, descriptively ineffable (Potts 2007, 166). Often recruited through semantic bleaching. Projects meaning type. Different extension: prenominal *damn* in *the damn car* is <span class="smallcaps">interjection<sub>sem</sub></span> but not <span class="smallcaps">interjection<sub>syn</sub></span>.

- <span class="smallcaps">interjection<sub>int</sub></span>, the interactional-pragmatic category, overlapping with Stivers (2019)’s response tokens and Dingemanse (2020)’s liminal signs. Sequential/stance cluster: turn-initial, backchannel, stance display. Linked by routine entrenchment and possibly by regional/indexical conditioning. Projects pragmatic behaviour in interaction. Different extension: *mm-hm* is <span class="smallcaps">interjection<sub>int</sub></span> but debatably <span class="smallcaps">interjection<sub>syn</sub></span>.

The subscript notation makes explicit that these are overlapping but non-identical nodes, each linked by different ordering or stabilizing relations, and projectible for different purposes. The traditional label “interjection” names their zone of overlap, and the paper’s remaining sections analyze that overlap from multiple angles.

# Causal structure: recruitment and coupling

The property cluster documented in Section <a href="#sec:cluster" data-reference-type="ref" data-reference="sec:cluster">2</a> raises two questions that co-occurrence alone can’t answer: how do items enter the interjectional region of the network, and which causal links make the properties travel together? In what follows, “the interjection cluster” refers to the broad property profile. I use “the intersection” only for cases that belong simultaneously to <span class="smallcaps">interjection<sub>syn</sub></span>, <span class="smallcaps">interjection<sub>sem</sub></span>, and <span class="smallcaps">interjection<sub>int</sub></span>. A causal-network account requires both a diachronic account of recruitment and a synchronic account of the network order among properties.

<div id="tab:link-status">

| Link | Status | Evidence type | Main vulnerability |
|:---|:---|:---|:---|
| Semantic bleaching from invocatives | Historically documented | Gehweiler-style corpus trajectory | Operationalizing invocation versus exclamation. |
| Syntactic deactivation from verbs and imperatives | Plausible and reconstructive | Residual complement structure, imperative frames, discourse-marker uses | Needs fuller diachronic series. |
| Onomatopoeic or liminal coinage | Plausible entry route | Synchronic profile of coined forms | Coinage isn’t the same kind of process as bleaching. |
| Prosodic-syntactic coupling | Stabilizer candidate, partly proxy-based | Supplement syntax, punctuation, standalone position | Written corpora underrepresent prosody. |
| Non-referentiality ratchet | Ordering link | Argument-structure restrictions | *Yes*, *no*, and response tokens complicate the property. |
| Routine entrenchment | Stabilizer candidate for <span class="smallcaps">interjection<sub>int</sub></span> | Conversation-analytic and pragmatic literature | Needs more spoken-interaction data for the present cases. |
| Regional/indexical conditioning | Maintenance hypothesis | Regional concentration and register sensitivity | Identity-work evidence remains outside the present data. |

Status of the proposed ordering and stabilizing links

</div>

## Recruitment pathways

Section <a href="#sec:cluster" data-reference-type="ref" data-reference="sec:cluster">2</a> already pointed to three recruitment pathways: semantic bleaching from nouns, syntactic deactivation from verbs, and nonce coinage via onomatopoeia. The first is historically demonstrated; the second and third are proposed pathways with clear examples but without the extended diachronic documentation that Path 1 has. The first two are analysed in detail here; the third is taken up in Section <a href="#sec:turnover" data-reference-type="ref" data-reference="sec:turnover">4.7</a>.

The pragmatic issue is how changes in communicative use become categorially consequential: grammaticalization and pragmaticalization matter here because they alter a form’s communicative job as well as its position in a syntactic tree (cf. Traugott and Dasher 2002; Heine and Kuteva 2002).

## Path 1: religious noun to interjection

Gehweiler (2008) traces a four-stage trajectory from proper name to primary interjection, using the development of *gee* from *Jesus* as the central case:

<span id="ex:bleaching" label="ex:bleaching"></span> Stage 1: Proper name used as invocation (*Jesus!* addressed to the deity) Stage 2: Exclamatory use, no longer addressing the deity (*Jesus!* as an expression of surprise) Stage 3: Phonological modification (*Jeez*, *Gee*) Stage 4: Primary interjection (*gee!*), fully detached from the source noun The stages layer cluster properties unevenly. Prosodic isolation is already present at stage 1, because a vocative like *Jesus!* occupies its own intonation unit (contrast the integrated argument in *Put your faith in Jesus*). What changes is the content: non-referentiality arrives at stage 2, when the speaker is no longer addressing the deity. Phonological divergence at stage 3 makes the etymological connection opaque. By stage 4, the form is a standalone interjection that many speakers don’t connect to its source. The vocative is the gateway: it provides the prosodic isolation that the bleaching pathway exploits. The prosodic slot was there first; the semantic content drained out of it (Gehweiler 2008, 85).

The trajectory is constrained, not a random walk. Each stage makes the next more probable: once the invocation loses its referential force (stage 2), the phonological connection to the source is the last link to the nominal category, and weakening that link (stage 3) enables full independence (stage 4). Gehweiler’s corpus data documents the constraint: the phonological modifications (*Jeez*, *Gee*, *Gosh*, *Golly*) cluster in the seventeenth and eighteenth centuries, with the transition from invocation to primary interjection happening relatively rapidly once the modifications appear (Gehweiler 2008, 80–85).

Two cautions matter for the general claim. The boundary between invocation and exclamation is hard to operationalize in historical texts, since prosodic and interactional evidence is indirect. The seventeenth- and eighteenth-century clustering of euphemistic forms may reflect external taboo pressure as well as internal category pressures. The causal-network claim doesn’t require Gehweiler’s sequence to be an autonomous internal cline. It requires the weaker and more historically plausible claim that social pressure, invited inference, and reanalysis can jointly push a source noun toward the same interjectional profile.

Two processes operate within this path. Gehweiler (2008, 76) distinguishes gradual Invited Inferencing (speaker-based: the expressive use is conversationally implicated, then conventionalized) from abrupt reanalysis (hearer-based: the hearer parses the exclamation as a standalone unit rather than an invocation). Both feed the same cluster, but they operate at different timescales and involve different agents. Their convergence supports the network-order claim: the same node can be reached by different routes.

The constrained trajectory generates a specific prediction: partial bleaching (stage 2) should be detectable by the absence of downstream properties. A form at stage 2 should be non-referential but should retain the source phonology and resist full prosodic independence. The prediction is testable and goes beyond historical description.

## Path 2: verb to interjection

A second trajectory enters the cluster via syntactic deactivation rather than phonological modification. *Damn* originates as a verb but in its interjection use has lost subject and tense while retaining (in a few constructions) an NP complement (*damn these mosquitoes!*). The complement retention shows that the syntactic deactivation is partial: the form carries some verb properties into the interjection cluster. In *CGEL*’s terms, it’s the head of an IntjP, not a verb phrase (VP) (Huddleston and Pullum 2002, 1361fn).

The process is decategorialization: loss of verbal morphosyntax (subject, tense) and gain of expressive force. *Damn* as a verb is a speech act (content meaning: condemning), while *damn!* as <span class="smallcaps">interjection<sub>syn</sub></span> is a stance marker (procedural meaning: frustration).

This doesn’t count as degrammaticalization in the strict sense, since the form doesn’t gain the morphosyntactic substance of a major lexical category (cf. Norde 2009). It’s better treated as pragmatic conventionalization with syntactic deactivation: the verbal source loses argument-structural obligations as the interjectional use gains procedural and interpersonal force.

For other verb-derived interjections, the imperative is the gateway, just as the vocative is for Path 1. Imperatives already lack an overt subject and can stand alone prosodically; the interjection use completes the deactivation. *Come on* and *look* (in their discourse-marker uses) have lost subject, tense, and complement-taking ability. They’re standalone interjections with no residual argument structure.

*Damn* shows partial syntactic deactivation (complement retained, subject and tense lost), while *come on* and *look* represent full deactivation (no complement, no argument structure at all). The degree of deactivation correlates with the degree of integration into the interjection cluster.

This correlation generates a graded prediction: a form that retains complement-taking ability should also retain other verb-like properties (e.g., prosodic integration in some constructions), while a fully deactivated form should show the complete interjection profile. The ordering relation predicts the inferential package available at each stage.

## Convergent pathways

The two paths enter the cluster from different directions, one through phonological modification of a bleached noun, the other through syntactic deactivation of a verb, but both exploit a gateway construction that already provides prosodic isolation: the vocative for Path 1, the imperative for Path 2. Both converge on the same property profile. The convergence is what the causal-network account predicts: different recruitment processes can bring items into the same projectible profile, producing the same cluster of co-occurring properties through different routes.

Both paths can be situated within Traugott and Dasher (2002)’s subjectification trajectory (content $`>`$ procedural, nonsubjective $`>`$ intersubjective), but they instantiate it differently. Path 1 is grammaticalization in Traugott’s technical sense: stage 1 (invocation) carries content meaning (addressing an entity) plus subjective stance (reverence); stage 2 (exclamation) loses content and gains procedural meaning (expressive force); stages 3–4 (phonological modification) complete the shift to fully procedural, intersubjective meaning, with hearers interpreting the form as a conventional expression, not an invocation. Path 2, as described above, is decategorialization: loss of verbal morphosyntax with gain of expressive force.

Onomatopoeia (Section <a href="#sec:turnover" data-reference-type="ref" data-reference="sec:turnover">4.7</a>) is categorically different. Nonce coinages enter the cluster with no prior lexical history (no bleaching, no deactivation, no subjectification). Three diachronically distinct processes (grammaticalization, pragmaticalization, nonce coinage) converge on the same synchronic category. Grammaticalization theory alone can’t subsume all three. A causal-network account can: the shared point is the node they converge on, with outcomes recruited by different processes but linked by the same synchronic ordering relations.

## Synchronic coupling

Recruitment explains how items enter the cluster. The synchronic question is what gives the cluster network order now, and what might keep that order stable. Four synchronic links are relevant: prosodic-syntactic coupling, a non-referentiality ratchet, interactional routine entrenchment, and regional/indexical conditioning. They don’t have the same status. The first two order structural and semantic diagnostics. The third is the central stabilizer candidate for <span class="smallcaps">interjection<sub>int</sub></span>. The fourth remains a maintenance hypothesis.

### Prosodic isolation and syntactic non-integration

These two properties reinforce each other, though the evidence is stronger in one direction. Prosodic separation makes syntactic integration harder: a form in its own intonation unit is less available as a complement or modifier. Gehweiler (2008, 85) provides direct evidence for this direction: sentence-initial position of invocations contributed to the emergence of the interjection, because the prosodic slot preceded and enabled the syntactic independence.

The same coupling is visible synchronically. Standalone *Damn!* has both properties: it occupies its own intonation unit and has no role in clause structure. In *I think that, damn, we’re going to be late*, *damn* remains possible only as a parenthetical supplement, set off from the *that*-clause. In *the damn car*, the form loses both properties at once: it’s inside the noun phrase and receives the prosody of the phrase.

The prediction is concrete. If a novel form is reliably packaged as its own intonation unit, it should resist complementation, modification by ordinary phrase-internal material, and coordination with integrated phrases. If it’s pulled into a host phrase, prosodic isolation should weaken together with <span class="smallcaps">interjection<sub>syn</sub></span> status.

### The non-referentiality ratchet

Bleaching removes referential content, which weakens the semantic motivation for argument structure. A form with no entity or event to construe is less available as an argument, complement, or modifier. Once that syntactic role is lost, prosodic isolation becomes easier to keep in place. The result is a one-directional cascade. Once argument structure is lost, the form occupies a different functional niche, so ordinary pressures toward referential syntax are reduced.

This ratchet supports a different set of predictions from the prosodic-syntactic link. A non-referential stance form should resist definiteness, quantification, and argument position: \**the damn!*, \**every ouch*, \**Damn surprised me*. It should also resist embedding as propositional content (*\*She said that damn*) unless it’s quoted as a form. If referential content becomes active again, the form should move toward noun, verb, or routine-formula regions of the network. The ratchet keeps the form near the interjection node by blocking return to the source category, while prosodic-syntactic coupling links the central structural properties.

### Interactional routine entrenchment

The interactional-pragmatic node needs a candidate stabilizer of its own. Conversation repeatedly places interjectional forms in recognizable sequential slots: continuers, repair initiators, change-of-state tokens, stance displays, and response tokens (Stivers 2019; Dingemanse 2020). Dingemanse (2024) argues that these interactional uses are central rather than peripheral for high-frequency interjections.

Repeated deployment in these slots favours a particular package of properties. A continuer such as *mm-hm* works because it can be recognized without taking over the turn or adding propositional content. A change-of-state token such as *oh* works because it can mark an information update before the speaker builds further at-issue content. A stance token such as *bruh* works because it can display evaluation while leaving the surrounding proposition untouched. The interactional routine favours short, standalone, prosodically independent, non-referential material.

This candidate stabilizer supports the projectibility of <span class="smallcaps">interjection<sub>int</sub></span>. If a novel form is recruited as a minimal stance, response, repair, or backchannel resource, one should expect turn-initial or standalone availability, independent prosodic packaging, resistance to embedding under reported speech, and availability for uptake as an interactional move. The same candidate stabilizer also explains why the function can persist when category membership weakens: a full clause such as *I agree* can do agreement work, but it lacks the reduced, non-propositional packaging that makes *mm-hm* a central <span class="smallcaps">interjection<sub>int</sub></span> case.

### Regional/indexical conditioning

Some interjections are sharply conditioned by region, register, or community, and some may also do identity work. *Lah* in Singapore and Malaysia, *yaar* in India and Pakistan, *haba* in Nigeria, and *bruh* in African American English (Louf et al. 2023) are plausible cases. The GloWbE evidence in Section <a href="#sec:discussion" data-reference-type="ref" data-reference="sec:discussion">6</a> shows country and variety conditioning. It doesn’t by itself establish social indexing in the stronger sociolinguistic sense: uptake, stance, enregisterment, style shifting, metapragmatic awareness, or distribution by speaker group within comparable contexts.

Regional/indexical conditioning is a maintenance hypothesis. The open question is whether it stabilizes the cluster profile (property co-occurrence) or individual items (inventory persistence). If it stabilizes the profile, regional/indexical conditioning helps constitute the node. If it stabilizes only the item, it operates as a lexical-persistence process alongside the interjection cluster.

When a regional interjection is adopted by out-group speakers as ironic quotation, the causal-network account predicts a contrast between inventory persistence and profile stabilization. If out-group *lah* loses prosodic isolation or emotive force while retaining its social-marking function, regional/indexical conditioning stabilizes the inventory alone. If it retains the full profile, the link is part of the causal structure of the kind. This comparison remains for future work.

The interjection cluster has two structural ordering links (prosodic-syntactic coupling and the non-referentiality ratchet), one interactional-pragmatic stabilizer candidate (routine entrenchment), and one regional/indexical maintenance hypothesis. Together, the first three explain why prosodic, syntactic, semantic, and pragmatic diagnostics travel together synchronically. They also underwrite the probes in Section <a href="#sec:discussion" data-reference-type="ref" data-reference="sec:discussion">6</a>: disrupting different links should produce different patterns of residue, conditioning, or erosion.

Figure <a href="#fig:causal-network" data-reference-type="ref" data-reference="fig:causal-network">1</a> summarizes the proposed causal-pragmatic network. It’s a schematic map of the links used in the argument rather than a complete model of every dependency among the diagnostics.

<figure id="fig:causal-network" data-latex-placement="htbp">

<figcaption>Schematic causal-pragmatic network for English interjections. Blue nodes mark recruitment, ordering, or stabilizing processes; white nodes mark diagnostics; and green nodes mark field-relative interjection nodes. Solid arrows indicate proposed ordering links. The double arrow marks reciprocal prosodic-syntactic coupling. Dashed orange arrows mark the regional/indexical maintenance hypothesis. The three field-relative nodes overlap in extension but support different predictions.</figcaption>
</figure>

## Causal residues at the category boundary

The interjection cluster is a convergence point, but convergence doesn’t erase history. Forms recruited through different pathways can share the same pragmatic profile while retaining traces of the source construction that brought them there. Those traces explain why the category is both recognizable and persistently irregular. The irregularities are pathway signatures rather than noise around a prototype.

This gives the category a second kind of inference. Section <a href="#sec:projectibility" data-reference-type="ref" data-reference="sec:projectibility">5</a> develops forward projectibility: observing some properties of a novel form licenses predictions about other properties. Historical pragmatics also needs a <span class="smallcaps">retrodictive counterpart to projectibility</span>: observing a residual property can license an inference about the pathway that produced it. A causal-network account can make that inference because it links properties to recruitment pathways, ordering relations, and source constructions.

<div id="tab:pathway-residues">

| Pathway | Gateway | Residue | Retrodictive prediction |
|:---|:---|:---|:---|
| Religious noun or proper name | Vocative invocation | Taboo avoidance, euphemistic phonological divergence, loss of reference without complement residue (*Jesus!* $`>`$ *jeez*, *gee*; *God!* $`>`$ *gosh*, *golly*) | Euphemistic deformation plus no argument structure points to an invocative source. |
| Verb or imperative | Directive or imperative frame | Complement residue, second-person orientation, directive force, particle retention (*damn these mosquitoes!*, *come on*, *look*) | An NP complement or directive orientation points to a verbal or deontic source. |
| Onomatopoeic or interactional coinage | Iconic or interactional vocalization | Atypical phonology, reduplication, spelling variation, weak etymological transparency (*ugh*, *whew*, *tut-tut*, *uh-oh*, *shh*) | Noncanonical phonology without a plausible lexical source points to iconic coinage. |
| Interactional recruitment or regional/indexical conditioning | Sequential slot or community marker | Regional skew, register sensitivity, indexical persistence (*lah*, *yaar*, *haba*, *bruh*, *mm-hm*) | Uneven distribution by community points to regional/indexical entrenchment or conditioning. |

Pathway residues in the English interjection cluster

</div>

The residue analysis turns apparent exceptions into evidence. *Damn these mosquitoes!* becomes a verbal-path residue rather than an anomalous interjection with a complement. *Gee* records taboo-managed invocative recruitment through euphemistic phonology. *Lah* looks like the kind of item expected if regional or indexical conditioning stabilizes some interjectional forms. Prototype theory predicts fuzzy boundaries. It doesn’t predict that different boundary failures should line up with different historical routes.

The same logic applies when forms leave the cluster. A pre-registered COHA check of verbal derivatives shows that early VP-head uses of *wowed*, *booed*, *oohed*, and *shooed* retain source-interjection semantics after syntactic integration (Appendix <a href="#app:morphgain" data-reference-type="ref" data-reference="app:morphgain">9</a>). [^3] These exit cases fit the residue account: syntactic integration removes <span class="smallcaps">interjection<sub>syn</sub></span>, while source-interjection semantics persists. They probe boundary behaviour, not corrective control.

The *fie* study marks the limit of residue reasoning under global attrition. The pre-registered prediction was that <span class="smallcaps">interjection<sub>int</sub></span> would weaken before <span class="smallcaps">interjection<sub>sem</sub></span>, and <span class="smallcaps">interjection<sub>sem</sub></span> before <span class="smallcaps">interjection<sub>syn</sub></span>. In COHA, all three node diagnostics decline together (Appendix <a href="#app:fie" data-reference-type="ref" data-reference="app:fie">10</a>). [^4]

When a community stops actively using a form, evidence for the pathway can disappear across nodes at once. The relevant process is loss of the ecology in which the ordering links operate, not pathway-specific property loss. Interjections remember how they got there, and those residues make the category’s boundary behaviour structured rather than accidental.

## Stability through turnover

Pathway residues show that interjectional forms remember how they entered the network. Turnover shows the complementary fact: the interjection profile can remain available while its lexical inventory changes. Individual forms enter and leave, but the broad profile remains available for new material.

The Corpus of Historical American English (COHA) illustrates this turnover. Among words tagged as interjections, *nay* ranks in the top 10 in the 1820s but falls to negligible frequency by the 2010s; *yeah*, entirely absent from the 1820s data, is among the five most frequent interjection-tagged forms by the 2010s. [^5] The items change while the projectible category position remains available.

Regional variation shows the profile’s availability under different local inventories. The GloWbE corpus (Davies and Fuchs 2015, 1.9 billion words across 20 English-speaking countries) shows that candidate interjectional forms are distributed unevenly across varieties, with *lah* almost exclusively in Singapore and Malaysia, *yaar* in India and Pakistan, and *haba* in Nigeria. The corpus evidence establishes distributional conditioning; the linguistic analysis identifies the shared profile that makes those forms comparable.

The cluster is also continuously replenished via onomatopoeia. Nonce interjections can be freely coined (Quirk et al. 1985, 74), and new coinages immediately adopt the cluster profile: they’re prosodically isolated, non-referential, emotively laden, and syntactically non-integrated.

This generates a testable prediction: a newly conventionalized interjectional use should exhibit the full cluster profile without having traversed the documented noun-to-interjection or verb-to-interjection paths. Standalone interjectional uses of recent forms like *bruh*, *yeet*, and *oof* fit this expectation, since those uses are prosodically isolated, non-inflecting, non-referential, and syntactically non-integrated, even when the same word families include verbal or nominal lexemes. The category position remains available even as its members turn over.

Dingemanse (2020) characterizes such items as <span class="smallcaps">liminal signs</span>, signs at the boundary between sound and speech, whose liminality is functional, not defective. Cultural evolutionary processes “sample the space of possible resources”. The causal-network account describes that space as a property space structured by the recruitment pathways and synchronic links above.

The recruited, ordered, and replenished cluster is now ready for the paper’s central inferential question: what can partial evidence license?

# Projectibility: what the category lets you predict

Section <a href="#sec:cluster" data-reference-type="ref" data-reference="sec:cluster">2</a> showed what each property, individually, lets you infer. Section <a href="#sec:causal-structure" data-reference-type="ref" data-reference="sec:causal-structure">4</a> identified the ordering and stabilizing links that couple those properties. This section asks a different question: what does the package license, and for whom? Section <a href="#sec:residues" data-reference-type="ref" data-reference="sec:residues">4.6</a> treated retrodictive inference from residue to recruitment history; here <span class="smallcaps">projectibility</span> means forward inference from observed category membership to further behaviour.

Projectibility also helps make a language learnable and usable: speakers and hearers use partial cues to anticipate how a form will behave before they’ve observed its full distribution. Linguists then exploit the same structure for field-specific purposes.

The paper’s central claim is that interjections are a <span class="smallcaps">causal-pragmatic kind</span> whose richest projectibility concerns stance expression, interactional-pragmatic distribution, and conversational management rather than truth-conditional denotation. The section first shows how partial observations support participant inference, then shows how the same structure yields different predictions for syntax, semantics, and interactional pragmatics. From here on, bare “interjection” refers to the zone where the three field-relative nodes overlap unless a subscript specifies otherwise.

The running case is the word family *damn*. Standalone interjectional *damn!*, complement-taking *damn these mosquitoes!*, and prenominal *damn* in *the damn car* belong to the same word family. They needn’t share a lexeme. That contrast makes the methodological point concrete: the network classifies lexical items in grammatical environments, not word families.

## Partial-observation inference

Projectibility becomes circular if it says only this: if X is an interjection, it has interjection properties. That’s trivially true of any category. Causal-network projectibility is different. Observing some cluster properties in a novel form licenses inference to other properties you haven’t yet observed.

Consider the standalone interjectional use of *bruh*, a recently conventionalized form. A learner or hearer encountering that use for the first time observes a partial package: standalone position, prosodic isolation, morphological bareness, and stance meaning. From those observations, the three field-relative nodes license a battery of further expectations:

<span id="ex:bruh" label="ex:bruh"></span> <span class="smallcaps">interjection<sub>syn</sub></span>: the interjectional use won’t inflect within that use (\**bruhs*, \**bruhed*), will resist syntactic integration (\**I think that bruh, we’re late*), and won’t take a determiner or function as an argument (\**the bruh*, \**Bruh surprised me*). This leaves room for separate nominal or verbal lexemes in the same word family. <span class="smallcaps">interjection<sub>int</sub></span>: the form should be available for turn-initial or standalone interactional use. <span class="smallcaps">interjection<sub>sem</sub></span>: its expressive meaning is expected to be use-conditional and speaker-oriented. These expectations vary in strength, and the asymmetry matters:

- Near-entailments (a): the <span class="smallcaps">interjection<sub>syn</sub></span> predictions mostly extrapolate within the same domain from observed distributional facts. Any framework recognizing the relevant morphosyntactic properties generates many of the same predictions. These are real but not specific to the causal-network account.

- Cross-domain inferences (b, c): the <span class="smallcaps">interjection<sub>int</sub></span> prediction (interactional-pragmatic distribution from phonological form) and the <span class="smallcaps">interjection<sub>sem</sub></span> prediction (use-conditional meaning from stance expression) involve genuine leaps across linguistic levels. These are specific to the causal-network account: they rely on the cluster’s internal links, not on any single property.

All three are licensed because identifiable links order the cluster’s properties. The non-referentiality ratchet (Section <a href="#sec:synchronic-coupling" data-reference-type="ref" data-reference="sec:synchronic-coupling">4.5</a>) links loss of referential content with syntactic non-integration. Prosodic-syntactic coupling then links syntactic non-integration with prosodic isolation. The inferences are grounded in network order, not in definitions.

## Field-relative projectibility

Different fields track categories with roughly the same extension but different causal links and different inferential payoffs. That field-relativity follows from treating kinds as nodes in causal structures: a field tracks the node whose causal relations support its explanations and predictions. The three interjection nodes introduced in Section <a href="#sec:fieldrelative-nodes" data-reference-type="ref" data-reference="sec:fieldrelative-nodes">3.4</a> do their work as different packages of inference, not as rival definitions.

<span class="smallcaps">interjection<sub>syn</sub></span> projects syntactic behaviour: supplement function, resistance to *that*-complementation, resistance to coordination with non-interjections, and no ordinary argument structure. These predictions are grounded in the prosodic-syntactic coupling link from Section <a href="#sec:synchronic-coupling" data-reference-type="ref" data-reference="sec:synchronic-coupling">4.5</a>. <span class="smallcaps">interjection<sub>sem</sub></span> projects meaning type: use-conditional contribution, speaker-oriented attitude, nondisplaceability, and descriptive ineffability (Potts 2007, 166). I treat these as a cluster linked by semantic bleaching, rather than as a stipulative definition of Potts (2007)’s <span class="smallcaps">expressives</span>. <span class="smallcaps">interjection<sub>int</sub></span> projects interactional-pragmatic behaviour: turn-initial or standalone positioning, independent intonation contour, stance display, and availability for backchanneling. These predictions are grounded in routine entrenchment and possibly in regional/indexical conditioning (Stivers 2019; Dingemanse 2020).

The nodes overlap because some links, especially prosodic isolation and non-referentiality, contribute to more than one node. They come apart because other links are field-specific. A novel form categorized as <span class="smallcaps">interjection<sub>syn</sub></span> (prosodically isolated, non-inflecting, syntactically non-integrated) should tend also to be <span class="smallcaps">interjection<sub>sem</sub></span> (use-conditional, speaker-oriented) and <span class="smallcaps">interjection<sub>int</sub></span> (turn-initial, available for backchanneling). Central cases such as *wow*, *ouch*, and *hey* fit that default. Boundary cases locate the decoupling: *mm-hm* is <span class="smallcaps">interjection<sub>int</sub></span> but debatably <span class="smallcaps">interjection<sub>syn</sub></span> (O’Connell and Kowal 2005).

The three constructions of *damn* show the same point inside a single word family.

For standalone *damn!*, all three nodes agree: the form is prosodically isolated, syntactically non-integrated, use-conditional, and available as a stance marker. For *damn these mosquitoes!*, the picture is mixed: <span class="smallcaps">interjection<sub>syn</sub></span> holds (it heads an IntjP) but with limited complement-taking; <span class="smallcaps">interjection<sub>sem</sub></span> holds (the meaning is expressive); <span class="smallcaps">interjection<sub>int</sub></span> is weakened (it’s not a backchannel or turn-management device).

For prenominal *damn* in *the damn car*, the nodes diverge further: <span class="smallcaps">interjection<sub>syn</sub></span> fails entirely (the form is syntactically integrated inside a noun phrase), but <span class="smallcaps">interjection<sub>sem</sub></span> holds (use-conditional, perspective- dependent). Neither *CGEL*’s syntax nor Potts’s semantics alone predicts this gradient. The three-node analysis does, because it tracks which links are active in each construction.

## Category vs. function

*CGEL*’s distinction between category and function sharpens the projectibility point: what does the <span class="smallcaps">interjection<sub>syn</sub></span> category label add beyond the <span class="smallcaps">supplement</span> function label? Supplements of any category (noun phrases \[NPs\], adjective phrases \[AdjPs\], relative clauses) share prosodic isolation and syntactic non-integration. But <span class="smallcaps">interjection<sub>syn</sub></span> adds a package of predictions that no other category-in-supplement-function supports:

- Non-referentiality. Supplements can be referential (*a university professor, Dr. Brown, …*); interjections characteristically can’t.

- Emotive/interpersonal force. Not all supplements express stance; interjections characteristically do.

- Free coinage. No other category in supplement function can be freely coined via onomatopoeia.

- Non-inflection. NP supplements inflect freely; interjections don’t.

The function label tells you about syntactic integration. The category label tells you about inference: it licenses a package of predictions that the function label alone can’t provide. The category/function distinction matters beyond the *CGEL* tradition: any framework that distinguishes what kind of thing an expression is from what job it’s doing in a particular clause will face the same question. The causal-network answer is that the category earns its keep by licensing inferences that the function doesn’t.

## Interjections as a causal-pragmatic kind

The three-node analysis explains why interjections are centrally pragmatic. Not all categories project in the same domain. Nouns project morphosyntactically: they take determiners, inflect for number, fill argument slots. Categorizing something as a noun tells you almost nothing about its pragmatics, since nouns aren’t a pragmatic category.

For interjections, the situation is reversed. The three nodes overlap, and the richest projectibility belongs to <span class="smallcaps">interjection<sub>int</sub></span>, the interactional-pragmatic cluster. Dingemanse (2024) reaches the same conclusion from interactional data: the most frequent and functionally central interjections are interactional tools (continuers, repair initiators, change-of-state tokens) rather than emotive exclamations, and their primary projectibility is about sequential position and conversational management. The forms at the intersection of all three nodes are a <span class="smallcaps">causal-pragmatic kind</span>: their strongest inferences are about use, not denotation.

Why pragmatic? Because causal links generate predictions within the domains they operate in. The prosodic-syntactic coupling from Section <a href="#sec:synchronic-coupling" data-reference-type="ref" data-reference="sec:synchronic-coupling">4.5</a> predicts prosodic and syntactic behaviour, with no corresponding prediction about morphological paradigms or truth-conditional content. Regional/indexical conditioning predicts uneven persistence across varieties, with no corresponding prediction about inflectional regularity. The strongest established links among the three interjection nodes are prosodic and interactional; the regional or indexical link remains a maintenance hypothesis. These links and hypotheses underwrite predictions in their respective domains.

The same point holds outside interjections. Noun projectibility is morphosyntactic because the underlying processes (agreement, structural analogy, discourse frequency) operate in morphosyntactic domains.

This connects to the content/procedural distinction. The central projectibility of interjections is procedural and use-conditional rather than denotational (Traugott and Dasher 2002). The <span class="smallcaps">interjection<sub>sem</sub></span> (expressive) cluster is the semantic reflex: use-conditional meaning is the semantic signature of procedural content. The parallel with other domain-specific kinds supports the pattern. <span class="smallcaps">Animacy</span> projects grammatically (argument realization, reference tracking); <span class="smallcaps">colour</span> projects ecologically (edibility, toxicity) but usually doesn’t project grammatically in the way animacy can (cf. Seifart 2010). In each case, the domain of projectibility follows from the underlying processes.

The richest projectibility is pragmatic, but <span class="smallcaps">interjection<sub>syn</sub></span> still belongs in the grammar because it’s projectible and has its own causal links. Neutralize prosodic isolation and the form is reclassified. The syntactic category licenses distributional inference: it predicts supplement function, embedding resistance, and coordination restrictions, even if those predictions are less rich than the pragmatic ones. A thin syntactic node is still a projectible syntactic node. The framework requires reliable inference, and <span class="smallcaps">interjection<sub>syn</sub></span> does.

# Discussion

## Empirical probes and boundary conditions

Section <a href="#sec:projectibility" data-reference-type="ref" data-reference="sec:projectibility">5</a> asked what categorizing a novel form as an interjection lets you predict about that form. At the category level, the causal-network account should also explain patterns across the interjectional region as a whole. Prototype theory accommodates graded membership, fuzzy boundaries, and better-and-worse exemplars (Rosch 1975). The causal-network account adds network-order predictions with named defeat conditions. The probes below address stable profiles, boundary behaviour, and failure. They aren’t tests of corrective control.

The strongest probe is pathway residue. If interjections are recruited by different causal pathways, residual properties should line up with recruitment history. The defeat condition is random distribution: retained complements, euphemistic deformation, atypical phonology, and regional skew should show no relation to source pathways. The evidence in Section <a href="#sec:residues" data-reference-type="ref" data-reference="sec:residues">4.6</a> supports the residue prediction, and the pre-registered morphology-gain study adds an exit-residue case: *wowed*, *booed*, *oohed*, and *shooed* gain syntactic integration while retaining source-interjection semantics. The payoff is pragmatic and diachronic: boundary failures diagnose recruitment routes and explain why later uses keep source-specific residues.

A second probe asks whether conditioning recovers structure. Apparent gradation at the interjection boundary should decompose when conditioned on register, dialect, or community. The defeat condition is parity between conditioned and pooled models. A proof-of-concept probe uses GloWbE frequency data (Davies and Fuchs 2015): for the 10 most frequent interjection-tagged items plus regional and boundary items, frequency-per-million was extracted by country (20 countries) by raw string search, and the coefficient of variation (CV; standard deviation divided by the mean) was computed across countries.

The results show three tiers of geographic conditioning. Regional interjections (*lah*, *yaar*, *haba*) are near-categorical: *lah* occurs at 9.35 per million in Malaysia and 6.72 in Singapore but near zero elsewhere; *yaar* at 2.88 in Pakistan and 2.33 in India but near zero elsewhere; *haba* at 8.78 in Nigeria but near zero elsewhere.

Core interjections show intermediate but structured variation: *ah* has the highest CV among core items (0.71), with Singapore (60.42) and Jamaica (53.16) at eight times the rate of Kenya (7.62). Even broadly distributed items like *yeah* (CV 0.55) show a 6:1 ratio between the United States (103.78) and Tanzania (16.85). Only *yes* and *no* approach uniformity (CVs of 0.21–0.22). Boundary items continue the pattern: the fillers *um* (CV 0.61) and *uh* (CV 0.76) and the routine formula *bye* (CV 0.61) all show structured geographic variation.

Conditioning on country recovers structure at every level of this sample. Variety structures the apparent gradience between core interjections, fillers, and boundary items.

A Bayesian multilevel model (negative binomial with crossed random effects for item and country) supports the same point. Adding country as a grouping factor improves prediction over an item-only baseline: the country-level random-effect standard deviation is 0.44 (95% credible interval \[CrI\]: 0.28–0.65), and the conditioned model outperforms the unconditional model by 18 expected log predictive density (ELPD) units (standard error \[SE\] 13) on exact leave-one-out cross-validation.

The descriptive CV results are the primary evidence; the model shows that the pattern survives a crossed random-effects structure. Full model specification and results are in the supplementary materials.[^6] Several methodological limitations remain. GloWbE’s part-of-speech tagger, trained on inner-circle English, fails to tag regional interjections (*yaar* and *haba* receive no interjection tag at all), which means the tagged frequencies underestimate regional items. The *ha* data is noisy because the tagger captures laughter representations (*HA! HA!*) alongside genuine interjection uses.

Country differences in GloWbE may also reflect corpus composition, genre mix, internet ecology, and orthographic convention as much as dialect or community structure. Raw strings are imperfect proxies too: *yes*, *no*, *ha*, *ah*, *oh*, *well*, *right*, and *like* are not automatically interjection tokens. Some are clausal, quoted, discourse-marker uses, literary representations, or noise. The methodological payoff is that conditioning by variety reveals structure that pooled category labels hide.

A third, weaker probe concerns hub-property asymmetry. One property may be more central to network order than others, with prosodic isolation and non-referentiality as the main candidates. The defeat condition is equal contribution by all properties. The present evidence leaves this prediction open.

Onomatopoeic interjections like *pop* can be syntactically integrated into clause structure; when prosodic isolation is neutralized, as when *pop* in *pop went the cork* is an adverb rather than a standalone interjection (Meinard 2015, 152), reclassification is immediate. Restoring partial referentiality, by contrast, allows the form to hover between noun and interjection uses (as with *God*). This suggests prosodic isolation may be the hub, but the evidence is indirect.

The probes differ in evidential weight. Pathway residue is the main explanatory payoff, conditioning by variety is a proof of concept, and hub-property asymmetry is an open conjecture. Prototype theory describes the gradience but doesn’t predict the trajectory of entry, the pathway-specific structure of boundary cases, or the structure hidden in apparent variation.

At the framework level, if a feature-based account (e.g., \[<span class="smallcaps">+peripheral</span>, <span class="smallcaps">-argument-taking</span>, <span class="smallcaps">+expressive</span>\]) generates the same predictions without causal structure, the network account would be a relabelling. The causal-network payoffs are pathway residues and the field-relative decomposition; if both fail under further testing, the case for interjections as a causal-pragmatic kind collapses to a descriptive property list.

## Interjections as a thin causal-pragmatic kind

<span class="smallcaps">interjection<sub>syn</sub></span> is thin compared with the corresponding noun and verb nodes. It lacks the dense network of agreement, argument realization, derivation, and inflectional behaviour that makes those categories so stable. Thin syntax still licenses inference. The syntactic node predicts supplement function, embedding resistance, coordination restrictions, and non-argumenthood.

The category is richest where three nodes intersect. The syntactic node supplies distributional projectibility; the semantic node supplies use-conditional and expressive projectibility; and the interactional node supplies sequential and interactional projectibility. Interjections are thin as a syntactic category and rich as a causal-pragmatic kind.

## Boundary disputes as causal-link overlap

The graded boundaries that embarrassed classical category theory (interjections vs. nouns, verbs, adverbs, fillers, routine formulae) are field-relative predictions of the causal-network account. Each boundary reflects partial causal-link overlap and differences in what a field needs the category to predict.

Routine formulae (*bye*, *hello*, *sorry*) share prosodic isolation and pragmatic function with interjections but add addressee-directedness and social predictability (Ameka 1992, 109–10). For grammar, those shared properties place them near interjections. For speech-act analysis, the additional properties matter more, because they predict conventional response slots, addressees, and felicity conditions. They’re structured by additional processes (social convention, politeness norms) that prototypical interjections don’t require.

Fillers (*um*, *uh*) share prosodic distinctiveness and syntactic non-integration but differ in emphasis, pause distribution, and citation-introduction ability (O’Connell and Kowal 2005). They occupy a neighbouring region of the property space, structured by partially overlapping links (processing-pressure management rather than emotive expression).

The Ameka-Wilkins debate can be reframed in these terms. Ameka’s separation is right for a field that needs to predict social routines and speech-act felicity (Ameka 1992, 109). Wilkins’s inclusion is right for a field that needs to track pragmatic and semantic overlap with interjections (Wilkins 1992, 142). A causal-network account accommodates both observations because projectibility is indexed to explanatory purpose.

# Conclusion

English interjections are a pragmatic category whose members support projectible inferences about use. Categorizing a form as an interjection lets speakers, hearers, and analysts predict prosodic packaging, syntactic supplementarity, use-conditional stance, sequential position, response function, and uptake. The category is thin as syntax but rich as pragmatics.

The reason is that the familiar interjection profile is a stable causal-pragmatic cluster. Diachronic pathways recruit forms into the profile: grammaticalization via semantic bleaching, pragmaticalization via syntactic deactivation, and nonce coinage via onomatopoeia. Synchronic links then structure the profile: prosodic-syntactic coupling, the non-referentiality ratchet, interactional routine entrenchment, and possibly regional/indexical conditioning. These links explain why partial observations support further inferences.

The field-relative decomposition (<span class="smallcaps">interjection<sub>syn</sub></span>, <span class="smallcaps">interjection<sub>sem</sub></span>, and <span class="smallcaps">interjection<sub>int</sub></span>) shows why the traditional label is useful but not uniform. Syntax, semantics, and interactional pragmatics track overlapping nodes with different predictive payoffs. Boundary cases such as *damn these mosquitoes!*, prenominal *damn*, fillers, routine formulae, and obsolete *fie* aren’t embarrassing exceptions; they show which links are active and which have weakened.

The broader pragmatic payoff is path dependence. Interjections don’t merely express emotion; they enter utterances with histories that shape stance, sequence, and uptake. *Gee*, *damn*, *bruh*, *wowed*, and *fie* matter because they show projectible profiles, boundary behaviour, exit residues, and failure conditions. They show how a fuzzy category can support pragmatic inference without a classical essence.

The evidence establishes projection targets, a stable profile, network order, and maintenance hypotheses. It doesn’t establish strict homeostatic control. That limit matters: pragmatic categories can be stable and inductively useful without being maintained by a perturbation-sensitive control system. English interjections are one such category.

# Acknowledgements

I compiled empirical material for the Wikipedia article “English interjections”, of which I’m the primary author. This paper reanalyses that material through the lens of pragmatic projectibility.

Supplementary materials (data and analysis code for the corpus tests) are available at <https://github.com/BrettRey/interjections-hpc>.

# Declaration of generative AI and AI-assisted technologies in the manuscript preparation process

During the preparation of this work the author used Claude (Anthropic) for drafting and revision support. After using this tool, the author reviewed and edited the content as needed and takes full responsibility for the content of the submitted article.

# Supplementary materials: GloWbE conditioning analysis

## Data

Frequency counts for 18 items were extracted from the GloWbE corpus (Davies and Fuchs 2015): 1.9 billion words across 20 countries. Items include the 10 most frequent interjection-tagged forms (*yes*, *no*, *oh*, *yeah*, *hi*, *hey*, *wow*, *hello*, *ah*, *ha*), three regional interjections (*lah*, *yaar*, *haba*), the semi-regional *eh*, the boundary items *cheers* and *bye*, and the fillers *um* and *uh*.

Figure <a href="#fig:country-variation" data-reference-type="ref" data-reference="fig:country-variation">2</a> displays raw frequency per million words by country for each item.

<figure id="fig:country-variation" data-latex-placement="ht">
<embed src="country-variation.pdf" />
<figcaption>Interjection frequency per million words across 20 GloWbE countries. Red points indicate the median across items within each country. Regional items (<em>lah</em>, <em>yaar</em>, <em>haba</em>) cluster in one or two countries. Core items show structured cross-country variation. Only <em>yes</em> and <em>no</em> approach uniformity.</figcaption>
</figure>

## Model specification

Four Bayesian negative binomial models were fitted using `brms` 2.23 in R, with 4 chains of 2,000 iterations (1,000 warmup) each. The confirmatory models (m0–m2) use exact reloo for leave-one-out cross-validation; the exploratory model (m3) is assessed by Pareto-smoothed importance sampling (PSIS) diagnostics only.

> `m0` (unconditional):
> `count ~ 1 + offset(log(words_m)) + (1 | item)`
>
> `m1` (conditioned):
> `count ~ 1 + offset(log(words_m)) +`
> `(1 | item) + (1 | country)`
>
> `m2` (item type):
> `count ~ 1 + item_type + offset(log(words_m)) +`
> `(1 | item) + (1 | country)`
>
> `m3` (exploratory):
> `count ~ 1 + item_type + offset(log(words_m)) +`
> `(1 | item) + (1 + item_type || country)`

Model m3 estimates country-level varying slopes by item type. Because several item-type cells are sparse (semi-regional: 1 item, boundary: 2, filler: 2, regional: 3), these slopes are thinly supported and m3 is treated as exploratory.

## Results

Exact leave-one-out cross-validation (confirmatory models):

<div class="center">

| Model              |         ELPD diff | SE diff |
|:-------------------|------------------:|--------:|
| m1 (conditioned)   |               0.0 |     0.0 |
| m2 (item type)     |  $`-`$<!-- -->9.3 |     7.9 |
| m0 (unconditional) | $`-`$<!-- -->17.9 |    13.3 |

</div>

The conditioned model (m1) is preferred. Adding item-type fixed effects (m2) doesn’t improve on m1 because the item random effect already captures between-item variation. The exploratory model (m3) is excluded from formal ranking because 9 observations retain Pareto $`k > 0.7`$ after moment matching, indicating unstable PSIS leave-one-out estimates.

Random effects from the conditioned model (m1):

- Country standard deviation (SD) = 0.44 (95% CrI: 0.28–0.65)

- Item SD = 2.07 (95% CrI: 1.49–2.96)

- Ratio: 4.7:1 (item identity explains more variation, while country adds structure)

The evidence is directionally consistent: conditioning on country improves prediction ($`\Delta`$ELPD = 17.9, SE 13.3), and the country random-effect SD is positive across the interval. The SE is large relative to the point estimate, reflecting the small sample (18 items $`\times`$ 20 countries = 360 observations). The descriptive CV results in the main text provide the primary evidence for the conditioning probe; the model suggests the pattern survives the crossed random-effects structure.

# Supplementary materials: morphology-gain analysis (COHA)

## Data and extraction

The morphology-gain study was pre-registered before extraction and coding (`f1a2193`, 27 May 2026). The corpus was COHA (1820–2019). The fixed source families were *wow*, *boo*, *ooh*, and *shoo*. The fixed target derivatives were *wowed*, *wowing*, *wows*; *booed*, *booing*, *boos*; *oohed*, *oohing*, *oohs*; and *shooed*, *shooing*, *shoos*. Bare source forms were excluded.

Extraction was completed on 27 May 2026. The token-level analysis used parsed key-word-in-context (KWIC) exports. Chart requests for *oohed* and *oohing* returned server errors, but KWIC exports loaded and were used for coding. Minor chart/KWIC row-count discrepancies are documented in the extraction summary.

The go/no-go threshold required at least 20 analyzable derivative tokens and at least two decades with three or more analyzable derivative tokens. All four families met the threshold.

<div class="center">

| Family | Analyzable tokens | Decades with 3+ | Status |
|:-------|------------------:|----------------:|:-------|
| *boo*  |               364 |              10 | GO     |
| *ooh*  |                52 |               6 | GO     |
| *shoo* |               410 |              13 | GO     |
| *wow*  |               143 |               9 | GO     |

</div>

## Coding

Each analyzable token was coded for two confirmatory properties. First, <span class="smallcaps">verbal syntax</span>: whether the derivative heads a VP. Second, <span class="smallcaps">semantic residue</span>: whether the derivative retains the source interjection’s meaning profile: impressed or evaluative response for *wow*; disapproval or contempt for *boo*; appreciative or surprised response for *ooh*; and directive away-sending for *shoo*.

Two secondary descriptive properties were also coded. <span class="smallcaps">Interactional residue</span> tracks whether the verbal event retains the source interjection’s social-interactional profile as a recognizable act of response, uptake, display, or direction. <span class="smallcaps">Targeted entity</span> tracks whether a target, object, addressee, or stimulus is recoverable. These secondary codes weren’t used in the confirmatory test.

## Confirmatory result

For each source family, the confirmatory window was the earliest decade with at least three analyzable derivative tokens, plus following decades until the window contained at least ten tokens. The success criterion was `verbal_syntax = 1` and `semantic_residue = 1`. The pre-registered support criterion required the lower bound of the Jeffreys 95% interval to exceed 0.5.

<div class="center">

| Family | Tokens | Window      | $`n`$ | Successes | Jeffreys 95% CI  |
|:-------|-------:|:------------|------:|----------:|:-----------------|
| *boo*  |    364 | 1920s       |    17 |        17 | \[0.865, 1.000\] |
| *ooh*  |     52 | 1960s–1980s |    10 |        10 | \[0.783, 1.000\] |
| *shoo* |    410 | 1890s–1900s |    16 |        16 | \[0.857, 1.000\] |
| *wow*  |    143 | 1930s–1940s |    11 |        11 | \[0.800, 1.000\] |

</div>

All four families support the pre-registered prediction. The secondary diachronic model wasn’t estimated because semantic residue had no variation among included VP-head tokens.

## Reliability and files

A blind 10% recode was completed ($`n=97`$, seed 2026). The confirmatory codes were fully reliable: inclusion, VP-head status, and semantic residue each had agreement = 1.00 and Cohen’s $`\kappa = 1.00`$. The secondary descriptive codes were less stable.

<div class="center">

| Code                  | Agreement | Cohen’s $`\kappa`$ |
|:----------------------|----------:|-------------------:|
| Inclusion             |     1.000 |              1.000 |
| VP-head status        |     1.000 |              1.000 |
| Semantic residue      |     1.000 |              1.000 |
| Interactional residue |     0.887 |              0.297 |
| Targeted entity       |     0.557 |              0.025 |

</div>

The analysis files are in . The main outputs are:

-
-
-
-
-
-

The scripts are:

-
-
-
-
-

# Supplementary materials: obsolescence analysis (⟨<span class="upright">fie</span>⟩ in COHA)

## Data and coding

616 tokens of *fie* were extracted from COHA (1820–2019), excluding 141 optical character recognition (OCR) artefacts (where *fie* is a misread of *he*, *the*, or *five*). Each token was coded for three binary node-membership diagnostics: syntactic, semantic/use-conditional, and interactional-pragmatic interjectionhood. These correspond to the <sub>syn</sub>, <sub>sem</sub>, and <sub>int</sub> nodes used above. Coding criteria are in the pre-registration. The decade column was hidden during coding.

Intra-coder reliability (10% blind recode, $`n = 62`$):

<div class="center">

| Diagnostic | Agreement | Cohen’s $`\kappa`$ |
|:---|---:|---:|
| <span class="smallcaps">interjection<sub>syn</sub></span> | 87.1% | 0.68 |
| <span class="smallcaps">interjection<sub>sem</sub></span> | 96.8% | 0.90 |
| <span class="smallcaps">interjection<sub>int</sub></span> | 96.8% | 0.90 |

</div>

## Raw proportions

Figure <a href="#fig:fie-raw" data-reference-type="ref" data-reference="fig:fie-raw">3</a> shows the proportion of tokens retaining each diagnostic by decade. All three diagnostics are near ceiling (85–100%) through the 1890s, decline sharply between 1900 and 1940, and fluctuate at low levels thereafter. The late rebounds (1980s, 2000s) reflect historical fiction and Shakespeare adaptations in which characters use *fie* as a live interjection.

<figure id="fig:fie-raw" data-latex-placement="ht">
<embed src="fie-raw-proportions.pdf" />
<figcaption>Proportion of <em>fie</em> tokens coded 1 for each diagnostic, by decade (COHA, 1820–2019). All three diagnostics decline in parallel.</figcaption>
</figure>

## Confirmatory model

Bayesian logistic decline models (per the pre-registration) estimate the 50% effective date (ED50; $`\tau`$: the year at which each diagnostic’s probability crosses 0.5):

<div class="center">

| Diagnostic | $`\tau`$ (median) | 95% CrI |
|:---|---:|---:|
| <span class="smallcaps">interjection<sub>syn</sub></span> | 1947 | 1937–1958 |
| <span class="smallcaps">interjection<sub>int</sub></span> | 1948 | 1939–1959 |
| <span class="smallcaps">interjection<sub>sem</sub></span> | 1949 | 1941–1960 |

</div>

Pre-committed prediction: Pr($`\tau_{\text{int}} < \tau_{\text{sem}} <
\tau_{\text{syn}}`$) = 0.13 (threshold for “inconsistent”: $`< 0.25`$). The predicted sequential erosion wasn’t observed. All three diagnostics declined together. See Figure <a href="#fig:fie-tau" data-reference-type="ref" data-reference="fig:fie-tau">4</a>.

<figure id="fig:fie-tau" data-latex-placement="ht">
<embed src="fie-tau-posteriors.pdf" />
<figcaption>Posterior distributions of ED50 (<span class="math inline"><em>τ</em></span>) for each diagnostic. The three distributions overlap almost completely, indicating simultaneous decline.</figcaption>
</figure>

<div id="refs" class="references csl-bib-body hanging-indent">

<div id="ref-ameka1992" class="csl-entry">

Ameka, Felix. 1992. “Interjections: The Universal yet Neglected Part of Speech.” *Journal of Pragmatics* 18 (2–3): 101–18. <https://doi.org/10.1016/0378-2166(92)90048-G>.

</div>

<div id="ref-boyd1999" class="csl-entry">

Boyd, Richard. 1999. “Homeostasis, Species, and Higher Taxa.” In *Species: New Interdisciplinary Essays*, edited by Robert A. Wilson. MIT Press. <https://doi.org/10.7551/mitpress/6396.003.0012>.

</div>

<div id="ref-bullokar1586" class="csl-entry">

Bullokar, William. (1586) 1980. *Pamphlet for Grammar, 1586*. Edited by J. R. Turner. Leeds Texts and Monographs, New Series 1. University of Leeds, School of English.

</div>

<div id="ref-coulmas1981" class="csl-entry">

Coulmas, Florian. 1981. “Introduction: Conversational Routine.” In *Conversational Routine: Explorations in Standardized Communication Situations and Prepatterned Speech*, edited by Florian Coulmas. Mouton.

</div>

<div id="ref-croft2001" class="csl-entry">

Croft, William. 2001. *Radical Construction Grammar: Syntactic Theory in Typological Perspective*. Oxford University Press. <https://doi.org/10.1093/acprof:oso/9780198299554.001.0001>.

</div>

<div id="ref-davies2010" class="csl-entry">

Davies, Mark. 2010. *The Corpus of Historical American English (COHA)*. <https://www.english-corpora.org/coha/>.

</div>

<div id="ref-daviesfuchs2015" class="csl-entry">

Davies, Mark, and Robert Fuchs. 2015. “Expanding Horizons in the Study of World Englishes with the 1.9 Billion Word Global Web-Based English Corpus (GloWbE).” *English World-Wide* 36 (1): 1–28. <https://doi.org/10.1075/eww.36.1.01dav>.

</div>

<div id="ref-dingemanse2020" class="csl-entry">

Dingemanse, Mark. 2020. “Between Sound and Speech: Liminal Signs in Interaction.” *Research on Language and Social Interaction* 53 (1): 188–96. <https://doi.org/10.1080/08351813.2020.1712967>.

</div>

<div id="ref-dingemanse2024" class="csl-entry">

Dingemanse, Mark. 2024. “Interjections at the Heart of Language.” *Annual Review of Linguistics* 10: 257–77. <https://doi.org/10.1146/annurev-linguistics-031422-124743>.

</div>

<div id="ref-gehweiler2008" class="csl-entry">

Gehweiler, Elke. 2008. “From Proper Name to Primary Interjection: The Case of *Gee*!” *Journal of Historical Pragmatics* 9 (1): 71–88. <https://doi.org/10.1075/jhp.9.1.05geh>.

</div>

<div id="ref-goldberg1995constructions" class="csl-entry">

Goldberg, Adele E. 1995. *Constructions: A Construction Grammar Approach to Argument Structure*. University of Chicago Press.

</div>

<div id="ref-goldberg2006" class="csl-entry">

Goldberg, Adele E. 2006. *Constructions at Work: The Nature of Generalization in Language*. Oxford University Press. <https://doi.org/10.1093/acprof:oso/9780199268511.001.0001>.

</div>

<div id="ref-Goodman1955" class="csl-entry">

Goodman, Nelson. 1955. *Fact, Fiction, and Forecast*. Harvard University Press.

</div>

<div id="ref-haspelmath2010" class="csl-entry">

Haspelmath, Martin. 2010. “Comparative Concepts and Descriptive Categories in Crosslinguistic Studies.” *Language* 86 (3): 663–87. <https://doi.org/10.1353/lan.2010.0021>.

</div>

<div id="ref-heinekuteva2002lexicon" class="csl-entry">

Heine, Bernd, and Tania Kuteva. 2002. *World Lexicon of Grammaticalization*. Cambridge University Press. <https://doi.org/10.1017/CBO9780511613463>.

</div>

<div id="ref-huddleston2002" class="csl-entry">

Huddleston, Rodney, and Geoffrey K. Pullum. 2002. *The Cambridge Grammar of the English Language*. Cambridge University Press. <https://doi.org/10.1017/9781316423530>.

</div>

<div id="ref-HuddlestonPullum2005" class="csl-entry">

Huddleston, Rodney, and Geoffrey K. Pullum. 2005. *A Student’s Introduction to English Grammar*. Cambridge University Press.

</div>

<div id="ref-jespersen1924" class="csl-entry">

Jespersen, Otto. 1924. *The Philosophy of Grammar*. George Allen & Unwin.

</div>

<div id="ref-juckertaavitsainen2013" class="csl-entry">

Jucker, Andreas H., and Irma Taavitsainen. 2013. *English Historical Pragmatics*. Edinburgh Textbooks on the English Language. Edinburgh University Press.

</div>

<div id="ref-khalidi2013" class="csl-entry">

Khalidi, Muhammad Ali. 2013. *Natural Categories and Human Kinds: Classification in the Natural and Social Sciences*. Cambridge University Press. <https://doi.org/10.1017/CBO9780511998553>.

</div>

<div id="ref-khalidi2018nodes" class="csl-entry">

Khalidi, Muhammad Ali. 2018. “Natural Kinds as Nodes in Causal Networks.” *Synthese* 195 (4): 1379–96. <https://doi.org/10.1007/s11229-015-0841-y>.

</div>

<div id="ref-leech2006" class="csl-entry">

Leech, Geoffrey. 2006. *Glossary of English Grammar*. Edinburgh University Press.

</div>

<div id="ref-libert2019" class="csl-entry">

Libert, Alan. 2019. “Interjections.” In *Oxford Bibliographies in Linguistics*. Oxford University Press. <https://doi.org/10.1093/obo/9780199772810-0230>.

</div>

<div id="ref-louf2023" class="csl-entry">

Louf, Thomas, Bruno Gonçalves, José J. Ramasco, David Sánchez, and Jack Grieve. 2023. “American Cultural Regions Mapped Through the Lexical Analysis of Social Media.” *Humanities and Social Sciences Communications* 10. <https://doi.org/10.1057/s41599-023-01611-3>.

</div>

<div id="ref-meinard2015" class="csl-entry">

Meinard, Maruszka Eve Marie. 2015. “Distinguishing Onomatopoeias from Interjections.” *Journal of Pragmatics* 76: 150–68. <https://doi.org/10.1016/j.pragma.2014.11.011>.

</div>

<div id="ref-norde2009" class="csl-entry">

Norde, Muriel. 2009. *Degrammaticalization*. Oxford University Press. <https://doi.org/10.1093/acprof:oso/9780199207923.001.0001>.

</div>

<div id="ref-norrick2009" class="csl-entry">

Norrick, Neal R. 2009. “Interjections as Pragmatic Markers.” *Journal of Pragmatics* 41 (5): 866–91. <https://doi.org/10.1016/j.pragma.2008.08.005>.

</div>

<div id="ref-oconnell2005" class="csl-entry">

O’Connell, Daniel C., and Sabine Kowal. 2005. “Uh and Um Revisited: Are They Interjections for Signaling Delay?” *Journal of Psycholinguistic Research* 34 (6): 555–76. <https://doi.org/10.1007/s10936-005-9164-3>.

</div>

<div id="ref-potts2007" class="csl-entry">

Potts, Christopher. 2007. “The Expressive Dimension.” *Theoretical Linguistics* 33 (2): 165–98. <https://doi.org/10.1515/TL.2007.011>.

</div>

<div id="ref-quirk1985" class="csl-entry">

Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik. 1985. *A Comprehensive Grammar of the English Language*. Longman.

</div>

<div id="ref-reynolds2026notEveryStableCluster" class="csl-entry">

Reynolds, Brett. 2026. “Not Every Stable Cluster Is Homeostatic: Stability, Network Order, and Control in Projectible Kinds.” <https://philarchive.org/rec/REYNES>.

</div>

<div id="ref-rosch1975" class="csl-entry">

Rosch, Eleanor. 1975. “Cognitive Representations of Semantic Categories.” *Journal of Experimental Psychology: General* 104 (3): 192–233. <https://doi.org/10.1037/0096-3445.104.3.192>.

</div>

<div id="ref-seifart2010" class="csl-entry">

Seifart, Frank. 2010. “Nominal Classification.” *Language and Linguistics Compass* 4 (8): 719–36. <https://doi.org/10.1111/j.1749-818X.2010.00194.x>.

</div>

<div id="ref-stivers2019" class="csl-entry">

Stivers, Tanya. 2019. “How We Manage Social Relationships Through Answers to Questions: The Case of Interjections.” *Discourse Processes* 56 (3): 191–209. <https://doi.org/10.1080/0163853X.2018.1441214>.

</div>

<div id="ref-taylor2003" class="csl-entry">

Taylor, John R. 2003. *Linguistic Categorization*. 3rd ed. Oxford University Press. <https://doi.org/10.1093/oso/9780199266647.001.0001>.

</div>

<div id="ref-TraugottDasher2002" class="csl-entry">

Traugott, Elizabeth Closs, and Richard B. Dasher. 2002. *Regularity in Semantic Change*. Cambridge University Press. <https://doi.org/10.1017/CBO9780511486500>.

</div>

<div id="ref-weinberger2026HomeostasisCausalControl" class="csl-entry">

Weinberger, Naftali. 2026. “Homeostasis and Causal Control.” *Biology & Philosophy* 41. <https://doi.org/10.1007/s10539-026-10018-8>.

</div>

<div id="ref-wierzbicka1992" class="csl-entry">

Wierzbicka, Anna. 1992. “The Semantics of Interjection.” *Journal of Pragmatics* 18 (2–3): 159–92. <https://doi.org/10.1016/0378-2166(92)90050-L>.

</div>

<div id="ref-wilkins1992" class="csl-entry">

Wilkins, David P. 1992. “Interjections as Deictics.” *Journal of Pragmatics* 18 (2–3): 119–58. <https://doi.org/10.1016/0378-2166(92)90049-H>.

</div>

</div>

[^1]: Contact: <brett.reynolds@humber.ca>

[^2]: The interrogative/question contrast uses different words for the syntactic and pragmatic nodes because both terms are established in their respective traditions. For interjections, every field uses the same word, so subscripts are the honest notation. An alternative would have been to adopt distinct terms from the relevant literatures: <span class="smallcaps">expressive</span> from Potts (2007) for the semantic node, and <span class="smallcaps">response token</span> from Stivers (2019) or <span class="smallcaps">liminal sign</span> from Dingemanse (2020) for the interactional-pragmatic node.

    The Latin <span class="smallcaps">ejaculatio</span> (“a throwing out”, mirroring <span class="smallcaps">interjectio</span>, “a throwing between”) would have been etymologically perfect for the interactional-pragmatic category, but modern English has rendered the term unavailable.

[^3]: The analysis plan was committed before extraction and coding (`f1a2193`, 27 May 2026). The full pre-registration, data, and code are archived at <https://github.com/BrettRey/interjections-hpc>.

[^4]: The analysis plan was committed before data extraction (`e18675b`, 15 March 2026). The full pre-registration, data, and code are archived at <https://github.com/BrettRey/interjections-hpc>.

[^5]: Based on part-of-speech–tagged frequencies in COHA (Davies 2010). Frequencies are normalized per million words. The interjection tag captures a conventional inventory rather than a theoretically motivated one; the point here is the turnover pattern, not the tagging decisions.

[^6]: Available at <https://github.com/BrettRey/interjections-hpc>.