Dependency Paths

For numerous corpus linguistics applications, it’s useful to have a word-level indication of syntax. A presentation by Vanessa and Robert Gorman gave me the idea of using dependency paths for this purpose, so I’ve now calculated them for the GNT (Greek New Testament) based on the GBI syntax trees.

The presentation by the Gormans was entitled “Greek Historiography Through Dependency Syntax Treebanking”, and they refer to the dependency paths as “syntactic words” or “swords” for short.

While their particular interest is authorship, the Gormans make an excellent point about the value of these dependency paths:

“The chief advantage of recasting dependencies as syntax words is that they are immediately valuable: with trivial modifications such texts can be put into standard text-processing software to produce type-token ratios, word frequency histograms, etc., providing detailed syntactic information about individual authors.”

I’ve previously written about “Converting the GBI Syntax Trees to a Dependency Analysis”, so it’s just a small step from there to producing dependency paths.

So if we take the output for the first part of John 3.16 from this dependency conversion:

64003016001 Οὕτως 64003016003 ADV
64003016002 γὰρ 64003016003 conj
64003016003 ἠγάπησεν None CL
64003016004 ὁ 64003016005 det
64003016005 θεὸς 64003016003 S
64003016006 τὸν 64003016007 det
64003016007 κόσμον 64003016003 O
64003016008 ὥστε 64003016013 conj
64003016009 τὸν 64003016010 det
64003016010 υἱὸν 64003016013 O
64003016011 τὸν 64003016012 det
64003016012 μονογενῆ 64003016010 np
64003016013 ἔδωκεν, 64003016003 CL

we can easily build up the dependency paths / swords:

64003016001 Οὕτως ADV-CL
64003016002 γὰρ conj-CL
64003016003 ἠγάπησεν CL
64003016004 ὁ det-S-CL
64003016005 θεὸς S-CL
64003016006 τὸν det-O-CL
64003016007 κόσμον O-CL
64003016008 ὥστε conj-CL-CL
64003016009 τὸν det-O-CL-CL
64003016010 υἱὸν O-CL-CL
64003016011 τὸν det-np-O-CL-CL
64003016012 μονογενῆ np-O-CL-CL
64003016013 ἔδωκεν, CL-CL
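
As a rough illustration of that step (a sketch, not the code linked below), each path can be built by starting at a word and following head pointers up to the root, collecting the relation labels along the way. The tuple layout `(id, word, head_id, label)` and the function name `build_paths` are assumptions made for this example; the data is the John 3.16 fragment shown above.

```python
# Rows mirror the dependency output above: (id, word, head id, label).
# A head of None marks the root. This layout is an assumption for the sketch.
rows = [
    ("64003016001", "Οὕτως", "64003016003", "ADV"),
    ("64003016002", "γὰρ", "64003016003", "conj"),
    ("64003016003", "ἠγάπησεν", None, "CL"),
    ("64003016004", "ὁ", "64003016005", "det"),
    ("64003016005", "θεὸς", "64003016003", "S"),
    ("64003016006", "τὸν", "64003016007", "det"),
    ("64003016007", "κόσμον", "64003016003", "O"),
    ("64003016008", "ὥστε", "64003016013", "conj"),
    ("64003016009", "τὸν", "64003016010", "det"),
    ("64003016010", "υἱὸν", "64003016013", "O"),
    ("64003016011", "τὸν", "64003016012", "det"),
    ("64003016012", "μονογενῆ", "64003016010", "np"),
    ("64003016013", "ἔδωκεν,", "64003016003", "CL"),
]


def build_paths(rows):
    """Return {id: path}, where path joins the labels from a word up to the root."""
    head = {node_id: head_id for node_id, _, head_id, _ in rows}
    label = {node_id: rel for node_id, _, _, rel in rows}
    paths = {}
    for node_id in head:
        labels = []
        seen = set()
        current = node_id
        while current is not None:
            if current in seen:
                # Guard against malformed input with a cycle in the heads.
                raise ValueError(f"cycle detected at {current}")
            seen.add(current)
            labels.append(label[current])
            current = head[current]
        paths[node_id] = "-".join(labels)
    return paths


paths = build_paths(rows)
for node_id, word, _, _ in rows:
    print(node_id, word, paths[node_id])
```

The sketch assumes every head id also appears as a row, which holds for this fragment; a head pointing outside the input would raise a `KeyError`.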

So the path tells you that μονογενῆ qualifies the object of a subordinate clause (at least according to the GBI analysis). We’ve thrown away the noun it modifies (υἱὸν), the verb of the subordinate clause that noun is the object of (ἔδωκεν), and the verb of the main clause (ἠγάπησεν), but np-O-CL-CL is still a decent label for its syntactic role.
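
Because each sword is just a string, the kind of counts the Gormans mention (type-token ratios, frequency histograms) need nothing beyond standard tooling. The following is a minimal sketch under that assumption; the helper name `type_token_ratio` is hypothetical, and the list is simply the thirteen swords from the John 3.16 fragment above.

```python
from collections import Counter

# The swords from the John 3.16 fragment, in text order.
swords = [
    "ADV-CL", "conj-CL", "CL", "det-S-CL", "S-CL", "det-O-CL", "O-CL",
    "conj-CL-CL", "det-O-CL-CL", "O-CL-CL", "det-np-O-CL-CL",
    "np-O-CL-CL", "CL-CL",
]


def type_token_ratio(tokens):
    """Number of distinct tokens divided by the total number of tokens."""
    if not tokens:
        raise ValueError("need at least one token")
    return len(set(tokens)) / len(tokens)


# Frequency table of swords, exactly as one would build for ordinary words.
freq = Counter(swords)
ttr = type_token_ratio(swords)

for sword, count in freq.most_common():
    print(sword, count)
print("type-token ratio:", ttr)
```

On a fragment this short every sword is distinct, so the ratio is 1.0; the numbers only become informative over a whole book or author.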

The code I used is available here.

