J. K. Tauber

Initial Thoughts on the Cost of Learning a Form

13th November 2015 / by James Tauber

Over the years, when generating vocab coverage stats or orderings for graded readers, I’ve used either lemmas or inflected forms as the items being learnt.

The problem with using inflected forms is that it assumes knowing one form of a lexeme has nothing to do with knowing any other form of that lexeme. The problem with using lemmas is that it assumes knowing one form of a lexeme is enough to know all of them.
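The gap between those two assumptions can be seen in a minimal sketch. The token list, the `(form, lemma)` pair representation, and both function names below are hypothetical illustrations, not taken from the post or from any existing library.

```python
# Hypothetical toy data: each token is a (form, lemma) pair.
tokens = [
    ("λόγος", "λόγος"),
    ("λόγου", "λόγος"),
    ("λόγον", "λόγος"),
    ("θεός", "θεός"),
]

def form_coverage(tokens, known_forms):
    """Fraction of tokens covered if each inflected form is learnt independently."""
    return sum(1 for form, _ in tokens if form in known_forms) / len(tokens)

def lemma_coverage(tokens, known_forms):
    """Fraction of tokens covered if knowing any one form means knowing the whole lexeme."""
    known_lemmas = {lemma for form, lemma in tokens if form in known_forms}
    return sum(1 for _, lemma in tokens if lemma in known_lemmas) / len(tokens)

known = {"λόγος"}
by_form = form_coverage(tokens, known)    # counts only the exact form
by_lemma = lemma_coverage(tokens, known)  # counts every form of the lexeme
```

With one form known, the form-based figure counts a single token while the lemma-based figure counts all three tokens of that lexeme. A more realistic cost for learning a form would presumably fall somewhere between the two.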

Analyzing Nominal Morphology: Part 1

12th November 2015 / by James Tauber

While much of my work going back 10 years or more was on the nominals, the last few years I’ve been focused on verbal morphology. I decided that for my SBL paper, however, I’d revisit some of my noun work and ended up exploring some ideas afresh.

Technical Aspects of Openness

11th November 2015 / by James Tauber

In my previous post, I talked about the legal and licensing aspects of open linguistic data, but there are also technical requirements that linguistic data must meet to be open.

Why I Use CC-BY-SA Licenses

10th November 2015 / by James Tauber

I don’t think I’ve ever articulated why I favour a Creative Commons CC-BY-SA license on all my New Testament Greek data.

Mean Log Frequency of Dependency Paths

9th November 2015 / by James Tauber

Adding another potential readability metric, let’s look at the mean log frequency of dependency paths.

At the Half Way Point

8th November 2015 / by James Tauber

Exactly two weeks ago I said I’d be blogging every day until my talk at SBL. Well, that’s two weeks away so I’m at the half way point. I think the blogging has gone well.

Generating Readers

7th November 2015 / by James Tauber

Back in April 2014, Brian Renshaw posted a Good Friday Greek Reader. It was presumably manually produced but I knew such things could be generated automatically and so went about building a system to do so.

Inline Annotation of Sandhi

6th November 2015 / by James Tauber

In many Greek morphology projects, I’ve wanted a way of conveying the surface form of an inflected word while also conveying the underlying components prior to the application of the sandhi rule. A couple of years ago, I came up with a simple representation for inline annotation.

Morphological Parts of Speech in Greek

5th November 2015 / by James Tauber

The parts of speech in a particular language can be drawn up on the basis of syntactic properties, morphological properties, and/or (perhaps most problematically) semantic properties.

What if we just want to classify lexemes in the MorphGNT based on what morphosyntactic and morphosemantic features they have?

Mean Log Frequency of Forms

4th November 2015 / by James Tauber

In a previous post, we looked at which chapters had the highest mean log frequency of lexemes. The code provided there was applicable to other items, though, so let’s now take a look at mean log frequency of forms.
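As a rough sketch of why the same calculation applies to any kind of item, the function below assumes that "mean log frequency" means the average, over the tokens in a chapter, of the natural log of each item's count across the whole corpus. That definition, the function name, and the dictionary input format are assumptions made for illustration; this is not the code from the earlier post.

```python
import math
from collections import Counter

def mean_log_frequency(chapters):
    """Return a dict of chapter id -> mean log corpus frequency of its items.

    `chapters` maps a chapter id to a list of items. The items can be
    lemmas, inflected forms, or anything else hashable.
    """
    # Count every item across the whole corpus.
    counts = Counter(item for items in chapters.values() for item in items)
    return {
        chapter: sum(math.log(counts[item]) for item in items) / len(items)
        for chapter, items in chapters.items()
        if items  # skip empty chapters to avoid dividing by zero
    }

# Hypothetical toy corpus: "x" occurs three times overall, "y" once.
scores = mean_log_frequency({"a": ["x", "x", "y"], "b": ["x"]})
```

Switching from lexemes to forms only changes what goes into the lists, not the function itself.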

at the intersection of computing, linguistics, philology, and learning science
