Mean Dependency Depth

With dependency paths calculated for the Greek New Testament, we can use mean dependency depth as a proxy for syntactic complexity.

In Mean Log Frequency of Lexemes I mentioned that reading comprehension measures such as the Lexile® framework use average sentence length alongside mean log word frequency. Now that we have Dependency Paths calculated, we can explore potentially more useful proxies for syntactic complexity.

As an initial experiment, we’ll simply take the mean dependency depth of each target, where our targets are chapters. By “dependency depth” I simply mean the number of labels in the dependency path, so np-O-CL-CL counts as 4. We then average across all the words in each chapter.
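The calculation can be sketched roughly as follows. This is not the script used for this post, only a minimal illustration. It assumes each word arrives as a (chapter ID, dependency path) pair and that labels never contain a hyphen themselves. The function names and the sample rows are made up.

```python
from collections import defaultdict


def dependency_depth(path):
    """Number of labels in a hyphen-separated dependency path."""
    return len(path.split("-"))


def mean_depth_by_target(rows):
    """Average dependency depth over all words in each target.

    `rows` is an iterable of (target, path) pairs, one per word.
    That input shape is an assumption made for this sketch.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for target, path in rows:
        totals[target] += dependency_depth(path)
        counts[target] += 1
    return {target: totals[target] / counts[target] for target in totals}


# Made-up sample rows, not real data from the analysis.
sample = [
    ("67009", "np-O-CL-CL"),
    ("67009", "CL"),
    ("67006", "np-S-CL"),
]

# Print targets from lowest to highest mean depth.
scores = mean_depth_by_target(sample)
for target, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(target, score)
```

Sorting by score in ascending order puts the "easiest" targets first, which is how the chapter list further down is ordered.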

An initial run reveals one interesting problem. Luke 3 is given a considerably higher score than anything else because of the analysis of the genealogy (A the son of B the son of C… and so on, which leads to very long paths). Reading that genealogy is arguably not that taxing syntactically, which highlights one flaw in the dependency depth approach (or perhaps in the analysis chosen for the genealogy).

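This aside, let’s look at what this measure identifies as the easiest chapters. Each line gives the score followed by the chapter ID, where the first two digits are the book number and the last three are the chapter (so 67009 is 1 Corinthians 9):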

2685 67009
2715 67006
2746 66014
2831 67014
2840 66013
2840 69005
2841 67007
2869 66007
2888 67016
2892 69003

Interestingly, the ten chapters with the lowest mean dependency depth are all in Romans, 1 Corinthians and Galatians.

If we instead average across entire books, the ten easiest are:

  • 3 John
  • 1 Corinthians
  • 1 John
  • James
  • Galatians
  • John
  • Romans
  • Matthew
  • Mark
  • 2 John

which is perhaps a little less surprising.
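Moving from chapters to books only changes the grouping key. Here is a minimal sketch, again not the actual script. It assumes chapter IDs like 67009, where everything before the last three digits is the book number, and it averages over all the words in a book rather than taking a mean of chapter means. The helper names and sample rows are made up.

```python
from collections import defaultdict


def book_of(chapter_id):
    """Book prefix of a chapter ID such as '67009'.

    Assumes the last three digits are the chapter number.
    """
    return chapter_id[:-3]


def mean_depth_by_book(rows):
    """Word-weighted mean dependency depth per book.

    `rows` is an iterable of (chapter_id, path) pairs, one per word.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for chapter_id, path in rows:
        book = book_of(chapter_id)
        totals[book] += len(path.split("-"))  # depth = number of labels
        counts[book] += 1
    return {book: totals[book] / counts[book] for book in totals}


# Made-up sample rows, not real data from the analysis.
sample_rows = [
    ("67009", "np-O-CL-CL"),
    ("67009", "CL"),
    ("67006", "np-S-CL"),
    ("69005", "CL-CL"),
]

book_scores = mean_depth_by_book(sample_rows)
for book, score in sorted(book_scores.items(), key=lambda kv: kv[1]):
    print(book, round(score, 3))
```

Averaging over words means long chapters count for more than short ones. A mean of chapter means would weight every chapter equally and could order the books slightly differently.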

The hardest chapters, Luke 3 aside, are the first chapters of Ephesians, 2 Timothy and Colossians, which probably isn’t much of a surprise either. The hardest books overall are Ephesians and Colossians.

The code is available here (tweak line 13 to get book-level stats).

Note that all of this may be quite sensitive to the choice of analysis. It would be an interesting exercise to see, for example, what the PROIEL dependency analysis yields.

In future posts, we’ll try a few more measures and then try to bring them together to see how chapters (or books, or authors) compare across multiple criteria.

