Morphology learning using tree of aligned suffix rules
Ksenia Shalonova, Peter Flach. ICML Workshop on Challenges and Applications of Grammar Induction, June 2007.
Linguistic morphology concerns the structure of words, for instance how the plural form of nouns is obtained from their singular form, or how the past tense of verbs is generated from their infinitive. We describe an approach to function learning in morphology, where, given a basic form of a word, the goal is to generate a grammatical wordform. This task concerns learning a regular grammar or transformation of the form stem+suffix1 -> stem+suffix2, where suffix1 and suffix2 are the suffixes of the two words in a word pair and stem is the part the two words share. Our approach is based on the tree of aligned suffix rules (TASR), where suffix rules represent the left-hand and right-hand word suffixes of the input word pairs. The tree is built top-down, from general rules to specific rules, using suffix rule frequency and rule subsumption to decide which rules go where in the tree. The tree is executed bottom-up, i.e., the most specific rule that fires is chosen. In comparison to rule-induction approaches with similar functionality from the literature, the proposed method is faster, generates fewer rules, is relatively simple, achieves high performance, and has a clearer linguistic interpretation. We also describe preliminary thoughts on inducing morphological rules that are close to a context-free mechanism.
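To make the stem+suffix1 -> stem+suffix2 format concrete, the following is a minimal Python sketch, not the TASR algorithm from the paper. It assumes that a rule can be made more specific by keeping extra stem characters on both sides, and it replaces the explicit tree with a flat lookup that picks the rule with the longest matching left-hand suffix, breaking ties by frequency. The names `suffix_rule`, `learn_rules`, `apply_rules` and the parameter `max_context` are hypothetical and do not come from the paper.

```python
from collections import Counter


def suffix_rule(base, form, context=0):
    """Return (suffix1, suffix2) for a word pair.

    The stem is the longest common prefix of the two words; `context`
    stem characters are kept on both sides to make the rule more specific.
    """
    stem_len = 0
    for a, b in zip(base, form):
        if a != b:
            break
        stem_len += 1
    cut = max(stem_len - context, 0)
    return base[cut:], form[cut:]


def learn_rules(pairs, max_context=1):
    """Count suffix rules at several levels of specificity (illustrative)."""
    rules = Counter()
    for base, form in pairs:
        # A set avoids counting the same rule twice for one word pair.
        for rule in {suffix_rule(base, form, c) for c in range(max_context + 1)}:
            rules[rule] += 1
    return rules


def apply_rules(word, rules):
    """Apply the most specific rule that fires; break ties by frequency."""
    candidates = [(len(lhs), freq, lhs, rhs)
                  for (lhs, rhs), freq in rules.items()
                  if word.endswith(lhs)]
    if not candidates:
        return word
    _, _, lhs, rhs = max(candidates)
    return word[:len(word) - len(lhs)] + rhs


pairs = [("walk", "walked"), ("talk", "talked"), ("jump", "jumped"),
         ("try", "tried"), ("cry", "cried")]
rules = learn_rules(pairs)
print(apply_rules("fry", rules))   # fried  (specific rule "ry" -> "ried")
print(apply_rules("open", rules))  # opened (general rule "" -> "ed")
```

In this sketch a rule subsumes another when its left-hand suffix is a suffix of the other's, so choosing the longest matching left-hand suffix plays the role of bottom-up execution. With this toy training set, "play" would come out as "plaied", which illustrates why the training data must supply more specific rules for such exceptions.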