Analyzing Nominal Morphology: Part 2

14th November 2015 / by James Tauber

In Analyzing Nominal Morphology: Part 1, I talked about putting together a list of nominal distinguishers and verifying it on the MorphGNT, generating a per-lexeme theme + distinguisher analysis. Here, I’ll outline some further steps I’ve taken.

As well as producing a YAML file with entries for each lexeme, I also now generate a (space-delimited) tabular form that looks like this:

ἀβαρής a-4a -- M n-3d(2aA) ἀβαρ AS ἀβαρῆ ἀβαρ ῆ εσ+α
ἄβυσσος n-2b -- F n-2b ἀβυσσ GS ἀβύσσου ἀβύσσ ου ο+ιο
ἄβυσσος n-2b -- F n-2b ἀβυσσ AS ἄβυσσον ἄβυσσ ον ο+ν
ἀγαθοποιέω verb PA M n=3c(5b-OU) ἀγαθοποι NS ἀγαθοποιῶν ἀγαθοποι ῶν ουντ+
ἀγαθοποιέω verb PA M n=3c(5b-OU) ἀγαθοποι NP ἀγαθοποιοῦντες ἀγαθοποι οῦντες ουντ+ες
ἀγαθοποιέω verb PA M n=3c(5b-OU) ἀγαθοποι AP ἀγαθοποιοῦντας ἀγαθοποι οῦντας ουντ+ας
ἀγαθοποιέω verb PA F n-1c ἀγαθοποιουσ NP ἀγαθοποιοῦσαι ἀγαθοποιοῦσ αι α+ι
ἀγαθοποιΐα n-1a -- F n-1a ἀγαθοποιϊ DS ἀγαθοποιΐᾳ ἀγαθοποιΐ ᾳ α+ι
ἀγαθοποιός a-3a -- M n-2a ἀγαθοποι GP ἀγαθοποιῶν ἀγαθοποι ῶν +ων
ἀγαθός a-1a(2a) -- M n-2a ἀγαθ NS ἀγαθός ἀγαθ ός ο+ς

The columns are:

lemma
Mounce category (or verb for participles) for overall lexeme
aspect / voice (for participles)
gender
Mounce category used for particular sub-paradigm (different from overall lexeme for adjectives or participles)
lexeme-level theme
case / number
form
form-specific theme
form-specific distinguisher
stem ending and suffix
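
As a rough illustration of how the eleven columns above line up with a row, here is a minimal Python sketch that splits one line into named fields. The field names are my own labels for the columns, not identifiers from the project's code, and the sketch assumes no column is ever empty (an empty one would be lost when splitting on whitespace).

```python
from collections import namedtuple

# Illustrative labels for the eleven columns, in the order listed above.
# These names are assumptions made for this sketch.
Row = namedtuple("Row", [
    "lemma", "lexeme_category", "aspect_voice", "gender", "paradigm_category",
    "lexeme_theme", "case_number", "form", "form_theme", "distinguisher",
    "stem_ending_suffix",
])


def parse_line(line):
    """Split one space-delimited line into a Row, checking the column count."""
    fields = line.split()
    if len(fields) != len(Row._fields):
        raise ValueError(
            f"expected {len(Row._fields)} columns, got {len(fields)}"
        )
    return Row(*fields)


# One of the sample rows shown earlier.
row = parse_line("ἄβυσσος n-2b -- F n-2b ἀβυσσ GS ἀβύσσου ἀβύσσ ου ο+ιο")
print(row.case_number, row.distinguisher, row.stem_ending_suffix)
```

For that sample row, the three printed fields are the case / number `GS`, the distinguisher `ου` and the stem ending and suffix `ο+ιο`.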

What’s helpful about this format is that you can use awk, grep, sort, wc and other Unix tools to extract information very quickly. (I may soon put it in SQL and expose a web interface too.) For example, you can see every use of a particular distinguisher, every use of it for a particular case / number, or what all the sandhi rules are.
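
The same kind of query can be sketched in Python for readers who prefer it to a shell pipeline. This is only an illustration: the column positions are counted from the column list above, the sample lines are copied from the table, and in practice you would read the generated file instead (its name is not given here, so none is assumed).

```python
from collections import Counter

# Sample lines copied from the table above.
SAMPLE = """\
ἄβυσσος n-2b -- F n-2b ἀβυσσ GS ἀβύσσου ἀβύσσ ου ο+ιο
ἄβυσσος n-2b -- F n-2b ἀβυσσ AS ἄβυσσον ἄβυσσ ον ο+ν
ἀγαθοποιέω verb PA M n=3c(5b-OU) ἀγαθοποι NS ἀγαθοποιῶν ἀγαθοποι ῶν ουντ+
ἀγαθοποιός a-3a -- M n-2a ἀγαθοποι GP ἀγαθοποιῶν ἀγαθοποι ῶν +ων
ἀγαθός a-1a(2a) -- M n-2a ἀγαθ NS ἀγαθός ἀγαθ ός ο+ς
"""

# Zero-based positions of the case / number and distinguisher columns.
CASE_NUMBER, DISTINGUISHER = 6, 9


def count_distinguisher_uses(lines):
    """Count how often each (distinguisher, case / number) pair occurs."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) == 11:  # skip blank or malformed lines
            counts[(fields[DISTINGUISHER], fields[CASE_NUMBER])] += 1
    return counts


counts = count_distinguisher_uses(SAMPLE.splitlines())
for (distinguisher, case_number), n in sorted(counts.items()):
    print(distinguisher, case_number, n)
```

Even in this tiny sample, the distinguisher ῶν shows up under both NS (the participle) and GP, which is exactly the sort of thing the tabular form makes easy to spot.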

I’ve already written a Python script that generates a list of paradigms from this table (keyed off Mounce category for now, until I’ve finalized my own categories, which will actually be defined by these paradigms).

The paradigms look like:

n-3b(1) M (10):
    NS:   ξ          {κ+ς}
    GS:   κος        {κ+ος}
    DS:   κι         {κ+ι}
    AS:   κα         {κ+α}
    NP:   κες        {κ+ες}
    GP:   κων        {κ+ων}
    AP:   κας        {κ+ας}
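
To make the shape of that output concrete, here is a minimal sketch of how paradigms like this could be derived from the tabular form. It is not the actual script from the repository. It assumes paradigms are keyed by sub-paradigm category plus gender, that the number in parentheses counts distinct lemmas, and that accents are stripped from distinguishers (the paradigm above shows them unaccented). All function names are hypothetical.

```python
import unicodedata
from collections import defaultdict

# Combining grave, acute and perispomeni; breathings and iota subscript are kept.
ACCENTS = {"\u0300", "\u0301", "\u0342"}

# Display order assumed from the paradigm above; unknown codes sort last.
CASE_ORDER = ["NS", "GS", "DS", "AS", "NP", "GP", "DP", "AP"]


def strip_accents(text):
    """Remove accent marks only, leaving other diacritics in place."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(c for c in decomposed if c not in ACCENTS)
    return unicodedata.normalize("NFC", kept)


def build_paradigms(lines):
    """Group rows by (sub-paradigm category, gender)."""
    cells = defaultdict(lambda: defaultdict(set))
    lemmas = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) != 11:  # skip blank or malformed lines
            continue
        lemma, gender, category = fields[0], fields[3], fields[4]
        case_number, distinguisher, ending = fields[6], fields[9], fields[10]
        key = (category, gender)
        cells[key][case_number].add((strip_accents(distinguisher), ending))
        lemmas[key].add(lemma)
    return cells, lemmas


def format_paradigm(key, paradigm_cells, lemma_count):
    """Render one paradigm in a layout similar to the one shown above."""
    category, gender = key
    out = [f"{category} {gender} ({lemma_count}):"]

    def rank(code):
        return CASE_ORDER.index(code) if code in CASE_ORDER else len(CASE_ORDER)

    for case_number in sorted(paradigm_cells, key=rank):
        for distinguisher, ending in sorted(paradigm_cells[case_number]):
            out.append(f"    {case_number}:   {distinguisher:<10} {{{ending}}}")
    return "\n".join(out)


SAMPLE = """\
ἄβυσσος n-2b -- F n-2b ἀβυσσ GS ἀβύσσου ἀβύσσ ου ο+ιο
ἄβυσσος n-2b -- F n-2b ἀβυσσ AS ἄβυσσον ἄβυσσ ον ο+ν
"""

cells, lemmas = build_paradigms(SAMPLE.splitlines())
for key in sorted(cells):
    print(format_paradigm(key, cells[key], len(lemmas[key])))
```

Storing each cell as a set means that two different distinguishers turning up for the same case / number in one paradigm are both printed, which is one way the inconsistencies would become visible.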

There’s actually a feedback loop where inconsistencies and errors spotted in this paradigm output inform corrections to the underlying distinguisher rules.

The code and data are available at https://github.com/morphgnt/morphological-lexicon/tree/master/projects/nominal_distinguishers.


J. K. Tauber

at the intersection of computing, linguistics, philology, and learning science


By day I’m an entrepreneur, web technologist and open-source developer but my academic background is in linguistics (along with some classics, comparative philology, and educational statistics) and my main avocation is working on text, annotations, analysis and software relating to historical languages with a particular interest in facilitating better learning.

While my focus has mostly been on Biblical Greek, much of the work is highly relevant to other Hellenistic Greek texts, other dialects of Ancient Greek and, indeed, texts in completely different languages as well.

All code written for this endeavour is open source and text and data is made available under a Creative Commons license to the extent allowed by the sources used.

I can be contacted at jtauber@jtauber.com.
