Modeling the phonological lexicon across multiple languages


The phonological lexicon, the list of phonological forms for a given language, is of central importance to both phonological and psycholinguistic theory. In this talk, we present a novel approach to the study of the phonological lexicon via complex network modeling. This modeling relies on the notion that each word in the lexicon is connected to its nearest phonological neighbors. By connecting all words in this manner, the lexicon can be examined as a network of interconnected nodes, and the tools of complexity science can be applied. By comparing the lexicons of nineteen genetically and typologically dissimilar languages to over four thousand pseudo-lexicons randomly generated under controlled conditions, it is possible to determine the provenance of various features of lexical structure. We confirm that, in accordance with linguists’ intuitions, phonotactics play an important role in determining the structure of lexicons. However, above and beyond the effects of phonotactics, we find evidence for two higher-level cognitive constraints operating over the lexicon: a pressure to form new words out of subcomponents of existing words, and a pressure against the existence of phonologically similar words. These pressures are hypothesized to be of functional importance for language acquisition, production, and comprehension. Future work and extensions to other domains are discussed.
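To make the network construction concrete, the following is a minimal sketch of how a lexicon might be turned into a neighbor network. It rests on assumptions not stated in the abstract: two words are treated as neighbors when they differ by exactly one segment (a single substitution, insertion, or deletion), and each character of a string stands for one segment. The function names `is_neighbor` and `build_network` and the toy lexicon are hypothetical and purely illustrative; the actual neighbor criterion used in the talk may differ.

```python
from itertools import combinations


def is_neighbor(a, b):
    """Return True if a and b differ by exactly one segment.

    Assumption: a neighbor is one substitution, insertion, or deletion away,
    and each character represents one phonological segment.
    """
    if a == b:
        return False
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one substituted segment.
        return sum(x != y for x, y in zip(a, b)) == 1
    # Lengths differ by one: make a the shorter form, then check whether
    # deleting a single segment from b yields a.
    if len(a) > len(b):
        a, b = b, a
    return any(b[:i] + b[i + 1:] == a for i in range(len(b)))


def build_network(lexicon):
    """Build an undirected graph as a dict mapping each word to its neighbors."""
    graph = {word: set() for word in lexicon}
    for a, b in combinations(lexicon, 2):
        if is_neighbor(a, b):
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical toy lexicon, not data from the study.
toy_lexicon = ["kat", "bat", "kab", "at", "kart", "dog"]
network = build_network(toy_lexicon)

# Degree (number of neighbors) is one simple network statistic.
degrees = {word: len(neighbors) for word, neighbors in network.items()}
```

In this toy example, `network["kat"]` contains `bat`, `kab`, `at`, and `kart`, while `dog` is an isolated node. The pairwise comparison is quadratic in lexicon size, which is acceptable for a sketch but would likely need indexing for full lexicons.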