Mean Log Frequency of Forms
The code change is simple: just one line.
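As a rough illustration of why the change can be that small, here is a minimal sketch, not the actual script. It assumes tokens are available as hypothetical `(chapter_id, form, lemma)` tuples, and it assumes the score is the mean negative natural log of relative frequency; the log base and scaling behind the numbers in this post are not stated here. Under those assumptions, moving from lexemes to forms only changes which field gets counted.

```python
import math
from collections import Counter, defaultdict

# Field positions in the hypothetical (chapter_id, form, lemma) token tuples.
FORM, LEMMA = 1, 2

def mean_log_frequency(tokens, key):
    """Return {chapter_id: mean of -ln(relative frequency)} for the chosen field.

    Lower scores mean the chapter is made up of more frequent items.
    """
    tokens = list(tokens)
    counts = Counter(tok[key] for tok in tokens)   # corpus-wide counts
    total = len(tokens)
    per_chapter = defaultdict(list)
    for tok in tokens:
        per_chapter[tok[0]].append(-math.log(counts[tok[key]] / total))
    return {ch: sum(vals) / len(vals) for ch, vals in per_chapter.items()}

# Toy data (transliterated, invented purely for illustration).
toy = [
    ("A1", "ho", "ho"),
    ("A1", "tou", "ho"),
    ("A1", "logos", "logos"),
    ("B1", "ho", "ho"),
    ("B1", "ho", "ho"),
    ("B1", "tou", "ho"),
]

by_lemma = mean_log_frequency(toy, LEMMA)  # lexeme version
by_form = mean_log_frequency(toy, FORM)    # form version: only `key` differs

# Chapters ordered from most to least frequent vocabulary.
ranked = sorted(by_form, key=by_form.get)
```

In the toy data, chapter `B1` ranks ahead of `A1` on both measures, but its form score is higher than its lemma score because `ho` and `tou` are counted separately as forms.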
The top 10 are:
6277 2304 449
6373 2305 429
6500 2302 585
6558 0403 657
6562 2303 467
6596 1001 401
6600 0408 905
6617 2301 207
6640 0702 287
6646 2720 406
In other words:
- 1 John 4 (also 1st for lexemes)
- 1 John 5 (also 2nd for lexemes)
- 1 John 2 (8th for lexemes)
- John 3 (9th for lexemes)
- 1 John 3 (7th for lexemes)
- Ephesians 1 (11th for lexemes)
- John 8 (6th for lexemes)
- 1 John 1 (4th for lexemes)
- 1 Corinthians 2 (32nd for lexemes)
- Revelation 20 (14th for lexemes)
Generally, form frequency tracks lexeme frequency pretty closely, because a form being common makes its lexeme common. This makes 1 Corinthians 2 interesting: it is 9th for forms but only 32nd for lexemes.
Frequent words and forms obviously don’t necessarily mean shallow syntax, though. 1 John 4, 5, and 2 are respectively the 36th, 67th, and 38th by mean dependency depth. No chapter is in the top ten for both mean log form frequency and mean dependency depth.
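A comparison like this between two rankings can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, the scores are invented toy values rather than real chapter data, and it assumes a lower score ranks higher on both metrics.

```python
def top_n(scores, n):
    """Return the n keys with the lowest scores, best first."""
    return sorted(scores, key=scores.get)[:n]

def rank_of(scores, chapter):
    """Return the 1-based rank of a chapter (1 = lowest score)."""
    return sorted(scores, key=scores.get).index(chapter) + 1

def top_n_overlap(scores_a, scores_b, n):
    """Return the set of chapters in the top n of both metrics."""
    return set(top_n(scores_a, n)) & set(top_n(scores_b, n))

# Toy scores for four made-up chapters (not real data).
form_scores = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}
depth_scores = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0}

shared_top2 = top_n_overlap(form_scores, depth_scores, 2)
shared_top3 = top_n_overlap(form_scores, depth_scores, 3)
a_depth_rank = rank_of(depth_scores, "A")
```

With the real per-chapter scores, calling `top_n_overlap` with `n=10` would return an empty set if no chapter appears in both top tens.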
So we now have mean log frequencies for lexemes and forms, as well as mean dependency depth. In future posts, I’ll add parse codes and the actual dependency path to the mix, and then we can look at combining all five metrics. I’ll also look at paragraphs rather than chapters as targets.