Parts of Speech and Number of Accents

I thought I'd write a quick Python script to check how many accents were on each of the lemmata in MorphGNT 5.06.
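
A minimal sketch of this kind of check might look like the following. It is not the original script. It assumes the lemmata are available as Unicode strings (the 5.06 files may well use a different encoding), counts only grave, acute and circumflex as accents, and works on (part of speech, lemma) pairs rather than reading the MorphGNT files directly. The names `count_accents`, `tally` and `sample` are made up for illustration.

```python
import unicodedata
from collections import Counter

# Combining marks treated as accents: grave, acute, and Greek perispomeni
# (circumflex). Breathings, diaeresis and iota subscript are excluded.
ACCENTS = {"\u0300", "\u0301", "\u0342"}


def count_accents(word):
    """Return the number of accent marks in a Unicode Greek word."""
    # NFD splits precomposed letters into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", word)
    return sum(1 for ch in decomposed if ch in ACCENTS)


def tally(pairs):
    """Count (part of speech, accent count) combinations over (pos, lemma) pairs."""
    counts = Counter()
    for pos, lemma in pairs:
        counts[pos, count_accents(lemma)] += 1
    return counts


# Hypothetical sample for illustration, not taken from the MorphGNT files.
sample = [("N", "λόγος"), ("RA", "ὁ"), ("P", "ἐν"), ("V", "λέγω"), ("N", "θεός")]

for (pos, n), total in sorted(tally(sample).items()):
    print(pos, n, total)
```

Feeding `tally` one pair per word in the text, rather than one per distinct lemma, gives per-occurrence counts.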

Here are the counts by part of speech and number of accents on lemma:

| POS |  0      |  1      |  2  |
+-----+---------+---------+-----+
| A   |  -      |  9159   |  -  |
| C   |  924    |  17361  |  -  |
| D   |  1592   |  4606   |  -  |
| I   |  -      |  17     |  -  |
| N   |  30     |  28271  |  1  |
| P   |  5433   |  5488   |  -  |
| RA  |  19862  |  4      |  -  |
| RD  |  -      |  1744   |  -  |
| RI  |  -      |  1165   |  -  |
| RP  |  -      |  11584  |  -  |
| RR  |  -      |  1677   |  -  |
| V   |  8      |  28101  |  1  |
| X   |  147    |  844    |  -  |

Some of the low numbers are definitely errors in the database. Now to investigate...

UPDATE (2005-07-16): Both 2-accent cases were mistakes. The 30 0-accent nouns and 5 of the 8 0-accent verbs were foreign loan words that intentionally weren't accented, but the other 3 0-accent verbs were mistakes. The 4 accented articles were the result of crasis with the following noun, and in each case the word should probably be analyzed as a noun rather than an article. I guess there'll be a 5.07 release soon. NOTE: I haven't looked at the particles, adverbs, conjunctions or prepositions yet.


originally published on jtauber.com

