
Wednesday, January 31, 2018


In linguistic typology, polysynthetic languages are highly synthetic languages, i.e. languages in which words are composed of many morphemes (word parts that have independent meaning but may or may not be able to stand alone). Polysynthetic languages typically have long "sentence-words" such as the Yupik word tuntussuqatarniksaitengqiggtuq which means "He had not yet said again that he was going to hunt reindeer." The word consists of the morphemes tuntu-ssur-qatar-ni-ksaite-ngqiggte-uq with the meanings, reindeer-hunt-future-say-negation-again-third.person.singular.indicative; and except for the morpheme tuntu "reindeer", none of the other morphemes can appear in isolation.

Whereas isolating languages have a low morpheme-to-word ratio, polysynthetic languages have a very high ratio. There is no generally agreed upon definition of polysynthesis. Generally polysynthetic languages have polypersonal agreement although some agglutinative languages that are not polysynthetic also have it, such as Basque, Hungarian and Georgian. Some authors apply the term polysynthetic to languages with high morpheme-to-word ratios, but others use it for languages that are highly head-marking, or those that frequently use noun incorporation.

Polysynthetic languages can be agglutinative or fusional depending on whether they encode one or multiple grammatical categories per affix.

At the same time, the question of whether to call a particular language polysynthetic is complicated by the fact that morpheme and word boundaries are not always clear cut, and languages may be highly synthetic in one area but less synthetic in other areas (e.g., verbs and nouns in Southern Athabaskan languages or Inuit languages). Many polysynthetic languages display complex evidentiality and/or mirativity systems in their verbs.

The term was invented by Peter Stephen Du Ponceau, who considered polysynthesis, as characterized by sentence words and noun incorporation, a defining feature of all Native American languages. This characterization was shown to be wrong, since many indigenous American languages are not polysynthetic. Polysynthetic languages are nonetheless unevenly distributed throughout the world: they are more frequent in the Americas, Australia, Siberia, and New Guinea, although there are also examples in other areas. The concept became part of linguistic typology with the work of Edward Sapir, who used it as one of his basic typological categories. Recently, Mark C. Baker has suggested formally defining polysynthesis as a macro-parameter within Noam Chomsky's principles and parameters theory of grammar. Other linguists question the basic utility of the concept for typology since it covers many separate morphological types that have little else in common.





Meaning

The word "polysynthesis" is composed of the Greek roots poly meaning "many" and synthesis meaning "placing together".

In linguistics a word is defined as a unit of meaning that can stand alone in a sentence, and which can be uttered in isolation. Words may be simple, consisting of a single unit of meaning, or they can be complex, formed by combining many small units of meaning, called morphemes. In a general non-theoretical sense polysynthetic languages are those languages that have a high degree of morphological synthesis, and which tend to form long complex words containing long strings of morphemes, including derivational and inflectional morphemes. A language then is "synthetic" or "synthesizing" if it tends to have more than one morpheme per word, and a polysynthetic language is a language that has "many" morphemes per word. The concept was originally used only to describe those languages that can form long words that correspond to an entire sentence in English or other Indo-European languages, and the word is still most frequently used to refer to such "sentence words".
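The "morphemes per word" idea above can be made concrete with a small calculation. The following is a minimal sketch, assuming words have already been segmented by hand with hyphens; the function name `morphemes_per_word` and the rough segmentation of the English translation are illustrative assumptions, not an established measure or tool.

```python
def morphemes_per_word(segmented_words):
    """Mean number of morphemes per word.

    Assumes each word is a string whose morphemes are separated by hyphens.
    """
    counts = [len(word.split("-")) for word in segmented_words]
    return sum(counts) / len(counts)


# The Yupik sentence-word cited in the introduction, segmented as in the text.
yupik = ["tuntu-ssur-qatar-ni-ksaite-ngqiggte-uq"]

# A rough, hand-made segmentation of the English translation (an assumption:
# only "go-ing" is split, and irregular forms such as "said" count as one unit).
english = ["he", "had", "not", "yet", "said", "again", "that",
           "he", "was", "go-ing", "to", "hunt", "reindeer"]

print(morphemes_per_word(yupik))    # 7.0
print(morphemes_per_word(english))  # roughly 1.08
```

The exact figures depend entirely on how morpheme and word boundaries are drawn, which is one reason the ratio is better read as a relative tendency than as a strict threshold.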

Often polysynthesis is achieved when languages have extensive agreement between verbs and their arguments, so that the verb is marked for agreement with the grammatical subject and object. In this way a single word can encode information about all the elements in a transitive clause. In Indo-European languages the verb is usually only marked for agreement with the subject (e.g. Spanish hablo "I speak" where the -o ending marks agreement with the first person singular subject), but in many languages verbs also agree with the object (e.g. the Kiswahili word nakupenda "I love you" where the n- prefix marks agreement with the first person singular subject and the ku- prefix marks agreement with a second person singular object).

Many polysynthetic languages combine these two strategies, and also have ways of inflecting verbs for concepts normally encoded by adverbs or adjectives in Indo-European languages. In this way highly complex words can be formed, for example the Yupik word tuntussuqatarniksaitengqiggtuq which means "He had not yet said again that he was going to hunt reindeer." The word consists of the morphemes tuntu-ssur-qatar-ni-ksaite-ngqiggte-uq with the meanings, reindeer-hunt-future-say-negation-again-third.person.singular.indicative, and except for the morpheme tuntu "reindeer", none of the other morphemes can appear in isolation.
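The pairing of morphemes with their meanings in the Yupik example can be sketched in code. This is a minimal illustration, assuming both lines use hyphens as morpheme separators; the helper name `align_gloss` is hypothetical.

```python
def align_gloss(morphemes, glosses):
    """Pair each hyphen-separated morpheme with its gloss.

    Raises ValueError if the two lines do not contain the same number of
    hyphen-separated items, which usually signals a segmentation or
    glossing mismatch.
    """
    morpheme_list = morphemes.split("-")
    gloss_list = glosses.split("-")
    if len(morpheme_list) != len(gloss_list):
        raise ValueError(
            f"{len(morpheme_list)} morphemes but {len(gloss_list)} glosses"
        )
    return list(zip(morpheme_list, gloss_list))


pairs = align_gloss(
    "tuntu-ssur-qatar-ni-ksaite-ngqiggte-uq",
    "reindeer-hunt-future-say-negation-again-third.person.singular.indicative",
)
for morpheme, gloss in pairs:
    print(f"{morpheme:10} {gloss}")
```

Multi-word glosses are joined with periods rather than hyphens, which is what lets a simple split on "-" keep the two lines in step.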

Another way to achieve a high degree of synthesis is when languages can form compound words by incorporation of nouns, so that entire words can be incorporated into the verb word, as baby is incorporated in the English verb babysit.

Another common feature of polysynthetic languages is a tendency to use head marking as a means of syntactic cohesion. This means that many polysynthetic languages mark grammatical relations between verbs and their constituents by indexing the constituents on the verb with agreement morphemes, and the relation between noun phrases and their constituents by marking the head noun with agreement morphemes. There are some dependent-marking languages that may be considered to be polysynthetic because they use case stacking to achieve similar effects, and very long words.

Examples

An example from Chukchi, a polysynthetic, incorporating, and agglutinating language of Russia which, unlike the majority of incorporating polysynthetic languages, also has grammatical cases:

Təmeyŋəlevtpəγtərkən.
t-ə-meyŋ-ə-levt-pəγt-ə-rkən
1.SG.SUBJ-great-head-hurt-PRES.1
'I have a fierce headache.'

From Classical Ainu of Japan, another polysynthetic, incorporating, and agglutinating language:

Usaopuspe aeyaykotuymasiramsuypa.
usa-opuspe a-e-yay-ko-tuyma-si-ram-suy-pa
various-rumors 1SG-APL-REFL-APL-far-REFL-heart-sway-ITER
'I wonder about various rumors.' (lit. 'I keep swaying my heart afar and toward myself over various rumors'.)

The Mexican language Nahuatl is also considered to be polysynthetic, incorporating and agglutinating. The following verb shows how the verb is marked for subject, patient, object, and indirect object:

Nimitztētlamaquiltīz
ni-mits-te:-tla-maki-lti:-s
I-you-someone-something-give-CAUSATIVE-FUTURE
"I shall make somebody give something to you"

The Australian language Tiwi is also considered highly polysynthetic:

Pitiwuliyondjirrurlimpirrani
Pi-ti-wuliyondji-rrurlimpirr-ani.
3PL-3SG.FEM-dead.wallaby-carry.on.shoulders-PST.HABIT
"They would carry the dead wallaby on their shoulders."

And the Canadian First Nation language Mohawk:

Sahonwanhotónkwahse
sa-honwa-nhoton-kw-a-hse
again-PAST-she/him-opendoor-reversive-un-for(PERF form)
"she opened the door for him again"

An example from Western Greenlandic, an exclusively suffixing polysynthetic language:

Aliikkusersuillammassuaanerartassagaluarpaalli.
aliikku-sersu-i-llammas-sua-a-nerar-ta-ssa-galuar-paal-li
entertainment-provide-SEMITRANS-one.good.at-COP-say.that-REP-FUT-sure.but-3.PL.SUBJ/3SG.OBJ-but
'However, they will say that he is a great entertainer, but ...'




History of the concept

Peter Stephen Du Ponceau on Native American languages

The term "polysynthesis" was first used by Peter Stephen Du Ponceau (a.k.a. Pierre Étienne Du Ponceau) in 1819 to describe the structural characteristics of American languages.

Three principal results have forcibly struck my mind... They are the following:

  1. That the American languages in general are rich in grammatical forms, and that in their complicated construction, the greatest order, method and regularity prevail
  2. That these complicated forms, which I call polysynthesis, appear to exist in all those languages, from Greenland to Cape Horn.
  3. That these forms appear to differ essentially from those of the ancient and modern languages of the old hemisphere.

The manner in which words are compounded in that particular mode of speech, the great number and variety of ideas which it has the power of expressing in one single word; particularly by means of the verbs; all these stamp its character for abundance, strength, and comprehensiveness of expression, in such a manner, that those accidents must be considered as included in the general descriptive term polysynthetic.

I have explained elsewhere what I mean by a polysynthetic or syntactic construction of language.... It is that in which the greatest number of ideas are comprised in the least number of words. This is done principally in two ways. 1. By a mode of compounding locutions which is not confined to joining two words together, as in the Greek, or varying the inflection or termination of a radical word as in the most European languages, but by interweaving together the most significant sounds or syllables of each simple word, so as to form a compound that will awaken in the mind at once all the ideas singly expressed by the words from which they are taken. 2. By an analogous combination of various parts of speech, particularly by means of the verb, so that its various forms and inflections will express not only the principal action, but the greatest possible number of the moral ideas and physical objects connected with it, and will combine itself to the greatest extent with those conceptions which are the subject of other parts of speech, and in other languages require to be expressed by separate and distinct words.... Their most remarkable external appearance is that of long polysyllabic words, which being compounded in the manner I have stated, express much at once.

The term was made popular in a posthumously published work by Wilhelm von Humboldt (1836), and it was long considered that all the indigenous languages of the Americas were of the same type. Humboldt considered language structure to be an expression of the psychological stage of evolution of a people, and since Native Americans were considered uncivilized, polysynthesis came to be seen as the lowest stage of grammatical evolution, characterized by a lack of the rigorous rules and clear organization known in European languages. Du Ponceau himself had argued that the complex polysynthetic nature of American languages was a relic of a more civilized past, and that this suggested that the Indians of his time had degenerated from a previous advanced stage. Du Ponceau's colleague Albert Gallatin contradicted this theory, arguing rather that synthesis was a sign of a lower cultural level, and that while the Greek and Latin languages were somewhat synthetic, Native American languages were much more so, and consequently polysynthesis was the hallmark of the lowest level of intellectual evolution.

This view was still prevalent when linguist William Dwight Whitney wrote in 1875. He considered polysynthesis to be a general characteristic of American languages, but he did qualify the statement by mentioning that certain languages such as Otomi and the Tupi-Guarani languages had been claimed to be basically analytic.

D. G. Brinton

The ethnologist Daniel Garrison Brinton, the first professor of anthropology in the US, followed Du Ponceau, Gallatin and Humboldt in seeing polysynthesis, which he distinguished from incorporation, as a defining feature of all the languages of the Americas. He defined polysynthesis in this way:

Polysynthesis is a method of word-building, applicable either to nominals or verbals, which not only employs juxtaposition, with aphaeresis, syncope, apocope, etc., but also words, forms of words, and significant phonetic elements which have no separate existence apart from such compounds. This latter peculiarity marks it off altogether from the processes of agglutination and collocation.

Their absence has not been demonstrated in any [language] of which we have sufficient and authentic material on which to base a decision. The opinion of Du Ponceau and Humboldt, therefore, that these processes belong to the ground-plan of American languages, and are their leading characteristics, must be regarded as still uncontroverted in any instance.

In the 1890s the question of whether polysynthesis could be considered a general characteristic of Native American languages became a hotly contested issue as Brinton debated the question with John Hewitt. Brinton, who had never done fieldwork with any indigenous group, continued to defend Humboldt and Du Ponceau's view of the exceptional nature of American languages against the claim of Hewitt, who was half Tuscarora and had studied the Iroquoian languages, that languages such as Iroquois had grammatical rules and verbs just like European languages.

Edward Sapir's morphological types

Edward Sapir reacted to the prevailing view in Americanist linguistics, which considered the languages of the Americas to belong to a single basic polysynthetic type, arguing instead that American indigenous languages were highly diverse and encompassed all known morphological types. He also built on the work of Leonard Bloomfield, who in his 1914 work "Language" dismissed morphological typology, stating specifically that the term polysynthetic had never been clearly defined.

In Sapir's 1921 book also titled "Language", he argued that instead of using the morphological types as a strict classification scheme it made more sense to classify languages as relatively more or less synthetic or analytic, with the isolating and polysynthetic languages in each of the extremes of that spectrum. He also argued that languages were rarely purely of one morphological type, but used different morphological strategies in different parts of the grammar.

Hence has arisen the still popular classification of languages into an "isolating" group, an "agglutinative" group, and an "inflective" group. Sometimes the languages of the American Indians are made to straggle along as an uncomfortable "polysynthetic" rear-guard to the agglutinative languages. There is justification for the use of all of these terms, though not perhaps in quite the spirit in which they are commonly employed. In any case it is very difficult to assign all known languages to one or other of these groups, the more so as they are not mutually exclusive. A language may be both agglutinative and inflective, or inflective and polysynthetic, or even polysynthetic and isolating, as we shall see a little later on.

An analytic language is one that either does not combine concepts into single words at all (Chinese) or does so economically (English, French). In an analytic language the sentence is always of prime importance, the word is of minor interest. In a synthetic language (Latin, Arabic, Finnish) the concepts cluster more thickly, the words are more richly chambered, but there is a tendency, on the whole, to keep the range of concrete significance in the single word down to a moderate compass. A polysynthetic language, as its name implies, is more than ordinarily synthetic. The elaboration of the word is extreme. Concepts which we should never dream of treating in a subordinate fashion are symbolized by derivational affixes or "symbolic" changes in the radical element, while the more abstract notions, including the syntactic relations, may also be conveyed by the word. A polysynthetic language illustrates no principles that are not already exemplified in the more familiar synthetic languages. It is related to them very much as a synthetic language is related to our own analytic English. The three terms are purely quantitative--and relative, that is, a language may be "analytic" from one standpoint, "synthetic" from another. I believe the terms are more useful in defining certain drifts than as absolute counters. It is often illuminating to point out that a language has been becoming more and more analytic in the course of its history or that it shows signs of having crystallized from a simple analytic base into a highly synthetic form.

Sapir introduced a number of other distinctions according to which languages could be morphologically classified, and proposed combining them to form more complex classifications. He proposed classifying languages both by the degree of synthesis, as either analytic, synthetic or polysynthetic, and by the technique used to achieve synthesis, as agglutinative, fusional, or symbolic. Among the examples of polysynthetic languages he gave were Haida, which he considered to use the agglutinative-isolating technique; Yana and Nootka, both of which he considered agglutinative; and Chinook and Algonkin, which he considered fusional. The Siouan languages he considered "mildly polysynthetic" and agglutinative-fusional.

Following Sapir's understanding of polysynthesis, his student Benjamin Lee Whorf proposed a distinction between oligosynthetic and polysynthetic languages, where the former term was applied to languages with a very small number of morphemes of which all other lexical units are composed. No language has been shown to fit the description of an oligosynthetic language, and the concept is not in general use in linguistics.




Contemporary approaches

Generative approaches

The sentence structure of polysynthetic languages has been taken as a challenge for linguists working within Noam Chomsky's generative theoretical framework that operates with the assumption that all the world's languages share a set of basic syntactic principles.

Non-configurationality and the pronominal argument hypothesis

Eloise Jelinek, having worked with Salishan and Athabascan languages, proposed an analysis of polysynthetic languages in which the morphemes that agree with the arguments of the verb are not just considered indexes of the arguments, but in fact constitute the primary expression of the arguments within the sentence. Because this theory posits that the pronominal agreement morphemes are the true syntactic arguments of the sentence, Jelinek's hypothesis was called the pronominal argument hypothesis. If the hypothesis were correct, it would mean that free standing nouns in such languages did not constitute syntactical arguments, but simply adjoined specifiers or adjuncts. This in turn explained why many polysynthetic languages seem to be non-configurational, i.e. they have no strict rules for word order and seemingly violate many of the basic rules for syntactic structures posited within the generative framework.

Mark C. Baker's polysynthesis parameter

In 1996 Mark C. Baker proposed a definition of polysynthesis as a syntactic macroparameter within Noam Chomsky's "principles and parameters" program. He defines polysynthetic languages as languages that conform to the syntactic rule that he calls the "polysynthesis parameter", and that as a result show a special set of morphological and syntactic properties. The polysynthesis parameter states that all phrasal heads must be marked with either agreement morphemes of their direct argument or else incorporate these arguments in that head. This definition of polysynthesis leaves out some languages that are commonly stated as examples of polysynthetic languages (such as Inuktitut), but can be seen as the reason for certain common structural properties in others, such as Mohawk and Nahuatl. Baker's definition, probably because of its heavy dependence on generative theory, has not been accepted as a general definition of polysynthesis.

Johanna Mattissen's affixal and compositional subtypes

Johanna Mattissen suggests that polysynthetic languages can be fundamentally divided into two typological categories, which differ in the way morphemes are organised to form words. She calls the two types affixal and compositional polysynthesis, respectively.

Affixal

Affixally polysynthetic languages, as the name suggests, are those that use only non-root-bound morphemes to express concepts that in less synthetic languages are expressed by separate words such as adjectives and adverbs. They also use these bound morphemes to make other nouns and verbs from a basic root, which can lead to very complex word forms without non-lexical suffixes. These bound morphemes often relate to body parts, other essential items of the culture of the language's speakers or features of the landscape where the language is spoken. Deictics and other spatial and temporal relations are also very common among these bound morphemes in affixally polysynthetic languages.

Affixally polysynthetic languages do not use noun incorporation or verb serialisation, since these would violate the rule concerning the number of roots allowable per word. Many make only a weak distinction between nouns and verbs, which allows affixes to be used to convert between these parts of speech.

Affixally polysynthetic languages may have a word structure that is either

  • templatic, with a fixed number of slots for different elements, which are fixed in their position and order relative to each other; or
  • scope ordered, with forms not restricted in complexity and length. The components are fixed in their relative scope and are thus ordered according to the intended meaning. Usually in this case a few components are actually fixed, such as the root in Eskimo-Aleut languages.
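The contrast between the two word structures listed above can be sketched as follows. This is only a toy illustration under stated assumptions: the slot inventory, the function names, and the morphemes ("ni", "kam", "ta", "neg", "again") are all invented for the example and do not describe any real language.

```python
# Hypothetical slot inventory for a templatic word (an assumption for illustration).
TEMPLATE = ("subject", "object", "root", "aspect", "tense")


def build_templatic(fillers):
    """Assemble a word from slot fillers in the fixed order given by TEMPLATE.

    The order in which the fillers are supplied does not matter; the
    template alone decides where each element appears.
    """
    unknown = set(fillers) - set(TEMPLATE)
    if unknown:
        raise ValueError(f"no slot for: {sorted(unknown)}")
    if "root" not in fillers:
        raise ValueError("a root is required")
    return "-".join(fillers[slot] for slot in TEMPLATE if slot in fillers)


def build_scope_ordered(root, suffixes):
    """Assemble a word by attaching suffixes to a fixed root in the order given.

    Each suffix is taken to have scope over everything to its left, so a
    different order yields a different word, and the list may be of any length.
    """
    return "-".join([root, *suffixes])


# Templatic: supplying the slots in a different order gives the same word.
print(build_templatic({"tense": "ta", "root": "kam", "subject": "ni"}))  # ni-kam-ta

# Scope ordered: swapping two suffixes changes the result.
print(build_scope_ordered("kam", ["neg", "again"]))  # kam-neg-again
print(build_scope_ordered("kam", ["again", "neg"]))  # kam-again-neg
```

In the templatic case the number of positions is bounded by the template, whereas the scope-ordered builder places no limit on word length beyond what the intended meaning requires.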

Examples of affixally polysynthetic languages include Inuktitut, Cherokee, Athabaskan languages, the Chimakuan languages (Quileute) and the Wakashan languages.

Compositional

In compositionally polysynthetic languages, there usually can be more than one free morpheme per word, which gives rise to noun incorporation and verb serialisation to create extremely long words. Bound affixes, though less important in compositionally polysynthetic languages than in affixally polysynthetic languages, tend to be equally abundant in both types.

It is believed that all affixally polysynthetic languages evolved from compositionally polysynthetic ones via the conversion of morphemes that could stand on their own into affixes.

Because they possess a greater number of free morphemes, compositionally polysynthetic languages are much more prone than affixally polysynthetic ones to evolve into simpler languages with less complex words. On the other hand, they are generally easier to distinguish from non-polysynthetic languages than affixally polysynthetic languages.

Examples of compositionally polysynthetic languages include Classical Ainu, Sora, Chukchi, Tonkawa, and most Amazonian languages.




Distribution

Siberia

  • Chukotko-Kamchatkan languages
  • Ket (probable)
  • Nivkh (possible)

North America

  • Algonquian languages
  • Caddoan languages
  • Eskimo-Aleut languages
  • Iroquoian languages
  • Na-Dene languages
  • Salishan languages
  • Siouan languages ("mildly" polysynthetic)
  • Uto-Aztecan languages
  • Wakashan languages
  • Yana/Yahi and other Hokan languages

Mesoamerica

  • Mayan languages
  • Totonacan languages
  • Mixe-Zoquean languages
  • Purépecha language
  • Tequistlatecan languages
  • Huave language

South America

  • Aymaran languages
  • Quechuan languages
  • Tupi-Guaraní languages
  • Many Amazonian languages
  • Mapudungun

Caucasus

  • Northwest Caucasian languages

East Asia

  • Ainu language
  • Jiarongic languages

South Asia

  • Munda languages

Oceania

  • many Papuan languages (e.g. Awtuw, Yimas)
  • northern Australian languages (e.g. Bininj Gun-wok, Gunwinyguan, Murrinh-patha, Ngalakgan, Rembarrnga, Tiwi)




Source of article : Wikipedia