The Balkan Linguistic Convergence Area

Andre M. Mesarovic
January 1989



Introduction

The objective of this paper is to examine the linguistic phenomenon known as the Balkan Sprachbund, or Balkan linguistic convergence area, and to explore some of the processes that could have led several widely differing languages to converge on a number of significant syntactic points.

Although the Balkan languages in question (Greek, Albanian, South Slavic (Bulgarian, Macedonian and Serbo-Croatian), and Romanian) are all distant Indo-European cousins, the number of syntactic parallels between them is so great as to allow one to speak of a unique Balkan language community. There are several alternative theories as to how these syntactic similarities originated. Perhaps there was a central area where they first developed and from which they subsequently spread, or perhaps the apparent similarity is due to parallel convergence. Indeed, the degree of typological uniformity of the Balkan languages is so striking that, typologically speaking, Bulgarian is "closer" to Romanian than it is to the non-Balkan Slavic languages.

An areal group of languages necessarily implies a significant cultural unity to one degree or another, resulting from a shared historical experience. In the case of the Balkans, the long cultural and political dominance of the Byzantine Greeks left an early imprint on the intellectual and religious spheres. Indeed, except for part of the Albanian population, the speakers of the Balkan area languages are predominantly Eastern Orthodox. But what is most interesting about the Balkan area languages is not the common base of folklore and mythology, the extensive lexical borrowing, or even the prosodic and phonetic similarities, but an actual commonality of syntactic constructs between the languages.

Definition

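Although various authors count a wide range of features as Balkanisms, the most commonly recognized syntactic characteristics, and the ones discussed in this paper, are the replacement of the infinitive with a finite subordinate clause, the postposed definite article, the formation of the future tense with the verb of volition, and the reduction of the noun case system.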

Of these features, the most widespread and most important Balkanism is the replacement of the infinitive with a subordinate clause. See the figure below for a distributional mapping of some of these features among the Balkan languages.

Language          Infinitive Loss   Particle         Postposed Article   Future with Volition
Greek             mostly            pos, oti, na     no                  yes
Albanian          partial           qe, se, te       yes                 yes
  Tosk            more
  Geg             less
  Italian         little                                                 yes
Romanian          mostly            ca, sa           yes                 yes
  Daco-           mostly
  Istro-          less
  Meglo-          very much
  Aroumanian
Bulgarian         very much         sto, ce, da      yes                 yes
Macedonian        total             deka, oti, da    yes                 yes
Serbo-Croatian    varied            da               no                  yes
  Torlak          very              da               yes                 yes
  Stokavian       partial           da               no                  yes
  Kajkavian       little            da               no                  yes
  Cakavian        little                             no                  yes


Historical Record of Language Evolution

Although today the Balkan languages all have standard literary languages, and an extensive amount of linguistic research has been undertaken, there still remain some dialects that have not been fully mapped, such as Aroumanian and Meglo-Romanian. The task of accurately tracing the diachronic evolution of syntactic Balkanisms through each language is particularly daunting, since written records appear very late for some languages (for Albanian in 1555, and for Romanian in 1521). Slavic manuscripts appear earlier, beginning in the tenth century; however, since Old Slavonic (which was based on the speech of the Slavs of the Macedonian area) was frozen as a liturgical language almost from its inception, subsequent Slavic writings did not accurately reflect the vernacular of their time. Common linguistic methods of historical reconstruction, such as toponymical data and other lexically based techniques, do not offer much help in mapping the development of syntactic Balkanisms.

Greek

Greek occupies a special position among the Balkan languages since it is by far the best documented; one can easily trace its evolution from the seventh century BC to the present day. In addition, Greek cultural influence was predominant in the Balkans throughout the Middle Ages. Modern Greek has little use of the infinitive, which was once a productive category in the late Post-Classical period (second century BC to sixth century AD); already in the Greek of the New Testament there is a marked retreat of the infinitive.

Romanian

Romanian is a direct descendant of the late Vulgar Latin that was widely spoken in the Balkan peninsula prior to the collapse of the Roman Empire. Of the four major Romanian dialects that exist today, Daco-Romanian is by far the most widespread, spoken by over twenty million inhabitants of Romania. A tiny offshoot of Daco-Romanian is Istro-Romanian, used by a few thousand speakers in the Istrian peninsula of northwest Yugoslavia who fled the Ottoman Turks in the sixteenth century. The other two dialects, Aroumanian and Meglo-Romanian, diverged from Common Romanian early in the sixth century, and are used in the southwest Balkan area by the descendants of the nomadic Vlach shepherds.

Romanian does not exhibit a total loss of the infinitive as do Bulgarian and Macedonian, although the principal use of the infinitive is limited largely to the liturgical language. As in other Balkan languages, Romanian has mostly replaced infinitive forms by a finite clause. In terms of dialectal distribution, Istro-Romanian still preserves extensive use of the infinitive; it is not clear whether this is a carryover from the time when the infinitive was a more productive category in Romanian, or a reemergence due to the influence of neighboring Slovenian or Serbo-Croatian, both of which retain the infinitive. Aroumanian and Meglo-Romanian have largely lost the infinitive, while Daco-Romanian still permits a very limited use of the infinitive. The geographical distribution would seem to lend credence to the theory that the focal point of Balkanisms was in the south-central Balkans, with change radiating northwards. Romanian also utilizes the postposed article.

Albanian

Albanian presents a more complex case, exhibiting significant dialectal variation in the use of the infinitive. The two main dialects are Geg (north) and Tosk (south), with the latter being the basis of the literary language. A third variety, a Tosk dialect surviving in small pockets of southern Italy (spoken by the descendants of Albanian refugees who fled the Turks in the sixteenth century), contains very few uses of the infinitive. In Tosk, meanwhile, after apparently withering away, the infinitive has actually reemerged to such a degree that the commonly assumed Balkan nature of Albanian has been questioned (Joseph 1983). Like that of Romanian, Albanian linguistic history is difficult to follow due to the paucity of written records.

South Slavic

The four South Slavic languages show varying degrees of Balkanization, the most intense being in the southern and eastern areas, with the intensity decreasing as one travels northwest. Macedonian and Bulgarian demonstrate the most complete cases, with Serbo-Croatian showing partial loss of the infinitive, most apparent in the Serbian areas south of Belgrade. Since Slovenian exhibits no apparent Balkanisms, it is not included in the Balkan group.

Tracing the development of the South Slavic languages presents fewer problems than it does for Albanian or Romanian, since the first attested Slavic documents are of South Slavic origin. However, Old Slavonic was quickly frozen into its ninth-century form, and documents written in it through the subsequent centuries reflect this archaic character, although they were constantly being infiltrated by vernacular forms. This was, of course, true mainly of the Orthodox area, since Slavic documents of the Catholic region were not subject to this liturgical tradition.

Bulgarian and Macedonian

Bulgarian and Macedonian are often considered to best exemplify a Balkan language, with complete loss of the infinitive in Macedonian and near-complete loss in Bulgarian, a reduction of the complex Slavic noun declensional system to two cases, development of a postpositional definite article to or ta (developed from the demonstrative adjective for that), and extensive use of the verb of volition in constructing the future tense. In these respects, the two languages differ radically from the other Slavic languages.

Serbo-Croatian

Serbo-Croatian presents a particularly interesting case of a language in the actual process of Balkanization. The three commonly recognized major dialects of Serbo-Croatian are named for the form each uses for the word what: Stokavian, Cakavian, and Kajkavian. Although the South Slavic area is normally divided into four separate languages, the actual isogloss lines are not sharply clustered around current national boundaries, and the dialectal transition between languages is gradual rather than abrupt. For instance, the Kajkavian dialect of Serbo-Croatian has many similarities with neighboring Slovenian (including kaj, the word for what). Another transitional dialect of particular relevance to the Balkan convergence question is the Torlak dialect of southern Serbia (also called the Prizren-Timok dialect). Torlak is usually classified as a subdialect of Stokavian, but nevertheless shares many features with neighboring Bulgarian and Macedonian dialects. Vuk Karadzic, the standardizer of the modern Cyrillic alphabet of Serbia, in his Serbian Dictionary called Torlak "a language that is neither proper Serbian nor Bulgarian." Torlak is especially interesting since its diachronic relationship with the rest of Serbo-Croatian is well established, but structurally and typologically it has approached the Bulgarian-Macedonian model. Torlak does not utilize tone or length as prosodic features (Serbo-Croatian has four distinct tones), has reduced the number of cases from seven to two, and lacks the infinitive form.

Use of the finite subordinate clause is also widely heard in other Stokavian areas, especially in Serbia, although no postpositional articles are used. In western Serbo-Croatian areas the infinitive is still widely utilized.

Causes

In spite of the fact that one can legitimately identify a set of languages of a Balkan type, the issue of precisely which set of grammatical categories allows a language to be classified as Balkan is less certain. One of the major problems of classification is whether all Balkanisms can be assumed to have a common origin. For example, most Balkan languages, with the exception of Greek, have developed the postposed definite article. Since Greek is often posited as the source for Balkan innovations, such as the loss of the infinitive, how does one then account for the origin of the postposed article? The loss of cases, moreover, does not uniquely distinguish the Balkan languages, since many neighboring related languages have also undergone this process. Although the Slavic languages have on the whole preserved their complex inflectional system (except, of course, Bulgarian and Macedonian), Romanian shares the phenomenon of loss of cases with all other languages descended from Latin.

The Balkans today represent an area of linguistic diversity, and this was even more so in the past. It is true that the Slavic invasions displaced much of the existing population, but significant pockets of indigenous speakers remained. Due to the late development of nation states in the nineteenth century and the accompanying language standardization, a much wider degree of dialectal diversity has been preserved in the Balkans than elsewhere in Europe. In addition, due to the existence of a large number of nomadic pastoralists such as the Vlachs, there was constant linguistic exposure and intermingling throughout the area.

A number of possible causes have been advanced to account for the Balkan Sprachbund:

Traditionally, Greek has been considered the principal source of Balkan innovations (Sandfield 1930), with documentary evidence of the weakening of the infinitive in the first few centuries AD. Since Old Slavonic maintained full use of the infinitive (ninth century), it seems plausible that it was Greek that transferred this feature to the Slavs. Although Vulgar Latin, which was widely spoken in the northern Balkans, still maintained a robust infinitive, there are indications that it was already alternating with finite clauses. Whether this was due to Greek influence or was an internal development of Latin is not clear.

The substratum theory was once commonly assumed to explain the origin of Balkanisms. Different language groups such as the Thracians, Illyrians or Pelasgians were considered to have imposed their structure on the invading Roman and Slavic tribes. The fundamental problem with this idea is that there are no records of these extinct languages, and using them as an explanation for the Balkan Sprachbund is merely replacing one unknown variable with another.

Language convergence by itself cannot be considered to fully account for the Balkan language situation, for it seems too fortuitous that all these languages would have converged on all these varied points due to internal reasons alone. However, there very well might have been similar tendencies to alternate the infinitive forms, stronger in some languages than others, and under the influence of the stronger language this phenomenon was amplified in the weaker languages.

The idea of language mixture has received recent interest, with the increased study of pidginization and creolization. Even before the fall of the Roman Empire, the Balkan peninsula was an area of language diversity, with many of the original inhabitants adopting Latin as their native language at an uneven pace. The collapse of Roman defenses on the Danube resulted in a significant influx of Slavic tribes, with much of the Romanized population either fleeing to the Adriatic coast or retreating into the mountains, eventually to appear in history as the Vlachs.

However, it would be inaccurate to regard the area as being completely depopulated, for there must have remained a large number of Romanized inhabitants who eventually intermarried with the invaders. This polyglot situation was especially prevalent in the southern Balkans, which had never been as fully Romanized as the northern Illyrian territories (Grickat). Moreover, the southern populations were also exposed to Greek influences.

The loss of the infinitive can be seen as a part of a more general analytic trend of Indo-European languages, and this process was hastened by the multilingual conditions of the Balkans. In any situation where people have to use several languages, there is a tendency not to master the grammar of the new language(s) in its entirety, and one of the first casualties would have been the complex synthetic features, such as the Slavic case system. Where language fusion occurs, simplification is an inevitable consequence. An analogy can be made with the history of English, which suffered its loss of cases under the influence of Norse languages in the ninth and tenth centuries. Up to half of Romanian vocabulary is Slavic in origin, and it should come as no surprise that substantial syntactic (as well as phonetic) qualities would have been transferred. Albanian is a similar case, with much of its lexical repertoire being of Latin origin.

Conclusion

In conclusion, it can be seen that the origin of Balkan syntactic features cannot be assigned to a single cause. Evidence does seem to point to Greek as the probable source for the infinitive loss, although in the absence of historical records for other languages this cannot be positively ascertained. But even if Greek was ultimately responsible for this change, the spread of this feature was not a simple process. The linguistic interaction between Greek and Vulgar Latin prior to the Slavic incursions must also be considered, since Vulgar Latin was already manifesting a certain weakening of the infinitive.

It is still not clear if the entire set of Balkan features can even be treated as a unified whole. Although the loss of the infinitive can be seen as a part of the analytic trend, it is difficult to see how the postposed article fits into the analytic schema. Indeed, since Greek does not manifest this feature, its origin must be sought elsewhere. This raises the question of whether there are several different linguistic processes of change taking place over roughly the same area, all having different origins and causes, yet all being categorized as Balkanisms. A similar problem is the future form and the verb of volition. Often, the loss of the infinitive and the future form innovation are held to be related; however, western Serbo-Croatian dialects utilize the verb of volition for the future, yet still have a strong infinitive.