SOUTH ASIAN HISTORY

Pages from the History of the Indian sub-continent: Indian Linguistics and Colonial Formulations


Colonial Constructs about Indian Languages
The "Indo-European" Model and Beyond

Most educated Indians know that most Indian languages are divided into two broad linguistic streams, i.e. the "Indo-European" and the "Dravidian". Tied in with this linguistic classification is the theory that the North Indian languages came with "Aryan" settlers. During colonial rule, it may have seemed comforting to North Indians to know that they enjoyed a historical genetic and cultural connection with the "superior" races of Europe who had by then come to rule much of the world. Of course, this provided little comfort to the South Indians, who were indirectly told that their own cultural history was inferior to that of the North because they lacked the all-important European connection.

To this day, influential historians (such as Romila Thapar) and others at the JNU (and several other leading Indian universities) continue to swear by this colonial-era model. Critics of this colonial-era formulation are usually dismissed as "amateurs" or "national chauvinists" who are somehow unable to comprehend the supposedly well-established "science" of "modern" linguistics.

But is this classification truly "scientific" or a construct that derives more from purely political considerations as some recent critics have argued?

Hungarian Critics of the "Indo-European" Scheme
For instance, in Hungary, there is a growing body of scholars who are extremely uncomfortable and dissatisfied with the manner in which Hungarian was excluded from the Indo-European framework.  Hungary's T. Majlath notes that "Critics of the Finno-Ugric theory argue that it became highly popular when the Hapsburgs sought to put the Hungarians in their place not long after the failed Hungarian War of Independence of 1848, when Linguistics had not as yet developed into the "exact" science it is today."

In recent decades, several Hungarian and other Eastern European scholars have attempted to build lexicons comparing Hungarian words with their Slavic counterparts. Unsurprisingly, these lexicons show that the distance between Hungarian and the Slavic languages spoken by its closest neighbors in Europe is not as large as might be implied by the conscious and deliberate exclusion of Hungarian from the "Indo-European" schemata, which include all the Slavic languages. Others have built lexicons comparing Hungarian with Sanskrit and Tamil (along the lines of the lexicons built by adherents of the "Indo-European" formula), and again, they show that a selective interpretation of these lexicons could well lead to a new classification in which both Tamil and Sanskrit would end up in the same family of languages as Hungarian.

Yet to Employ Computerized Statistical Analysis
As some modern linguists have argued, the inclusion or exclusion of a language in a particular family must be based on very precise and consistent criteria that should be backed up by computerized statistical analysis. For instance, there are some Indian language scholars who have suggested that a computerized analysis of Sanskrit and Latin lexicons might yield a far more limited overlap than would be rationally implied by the "Indo-European" classification.

In fact, such analysis might reveal a greater overlap between North Indian and South Indian languages, as well as between Adivasi languages and the neighboring Indic languages that are presently placed under the "Indo-European" umbrella.
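As a rough illustration of what such a computerized comparison could look like, the sketch below scores two lexicons by the fraction of shared meanings whose word forms are similar under a normalized edit distance. The function names, the 0.6 threshold and the sample entries are assumptions made purely for illustration (the sample words are invented placeholders, not real data), and surface string similarity is only one crude criterion among many that an actual study might adopt.

```python
from typing import Dict


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one table row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character of a
                            curr[j - 1] + 1,      # insert a character of b
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """1.0 for identical forms, 0.0 for forms with nothing in common."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def lexical_overlap(lex_a: Dict[str, str], lex_b: Dict[str, str],
                    threshold: float = 0.6) -> float:
    """Fraction of shared meanings whose forms meet the (assumed) threshold."""
    shared = set(lex_a) & set(lex_b)
    if not shared:
        return 0.0
    hits = sum(1 for meaning in shared
               if similarity(lex_a[meaning], lex_b[meaning]) >= threshold)
    return hits / len(shared)


# Invented placeholder entries, keyed by meaning; not real lexical data.
sample_a = {"meaning1": "tara", "meaning2": "mano"}
sample_b = {"meaning1": "tera", "meaning2": "pliku"}
print(lexical_overlap(sample_a, sample_b))
```

The same function can be run over any pair of word lists keyed by meaning, so that one fixed criterion is applied consistently to every language pair being compared.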

But to date, advocates of the Indo-European paradigm have strenuously resisted such calls for a fresh and unbiased scientific analysis of  their classification methods. Nor have they  been open to analyzing their conclusions in the context of geography, archaeology, anthropology, trade ties, cultural exchanges and regional political developments.

Few linguists subscribing to the Indo-European/Dravidian divide have bothered to investigate the extent of commonality between Sanskrit or Tamil or Munda and Hindi or Tibetan and Bengali. The possibility of overlapping vocabularies or shared words between languages that are currently placed in different linguistic streams has simply not interested many Western-influenced Indian linguists.

Incorporating DNA Data
Most significantly, they have yet to utilize the growing body of DNA data that  provide very useful pointers to early human migration patterns. For instance, recent DNA analysis has shown that the Indian subcontinent was populated by migrants from Africa in three phases. In the first phase, the West coast of India was populated extending up to Southern India. In the next phase, a larger group of migrants populated the Indian subcontinent arriving from Africa via the Middle East. Finally, there was a smaller migration that brought a new wave of settlers from the Caucasian region (who had reached there from Africa  via the Middle East). One branch of the Caucasian settlers entered India while other branches populated Europe. However, it should be emphasized that these migrations took place long before settled civilization - not only long before the Vedic era, but also much before the Harappan Indus-Sarasvati civilization.

This would suggest that the commonality that was noted between the North Indian and European languages may have been due to very early migration patterns - when language was still in a somewhat rudimentary phase and had not yet developed into the more complex written form that comes with urbanization and settled civilization.

Although promoters of the Indo-European scheme have shied away from saying so, the commonality between the Indian and European languages appears to be largely confined to a vocabulary that one might associate with early humans who were familiar with animal husbandry and fire and valued clan relationships. Such people had yet to develop advanced agriculture or the social systems that go with more complex societies, where a proportion of the population has become urbanized and there is a growing degree of specialization of labor accompanied by the expansion of trade and commerce. (DNA might likewise explain the similarities noted between Brahui and Tamil.)

Problems with the "Indo-European" Construct
However, languages are much more than words for earth, grass, fire, grazing animals and kinship ties. As societies develop and become increasingly complex, their vocabularies grow in proportion and they become more formal and expressive. They develop written scripts and they formulate a functional (and sometimes unique) syntax and grammar. Different languages not only develop particular idioms but also borrow words from their neighboring civilizations and trading partners. Words also spread through cultural exchanges and the spread of philosophy and religion.

Building primitive lexicons that show similar roots for certain common words can hardly be an adequate basis of linguistic classification, especially if that classification is going to be further used to generate implications about sociological and cultural development. If the commonality between Indian and European languages extends only to a small pastoral-era oral lexicon, the Indo-European theory of languages could hardly be invoked to justify the "Aryan Invasion" theory, let alone to infer that the Vedas were written by "Indo-European Aryan" migrants.

In fact, one of the unintended (or even intended) consequences of such linguistic speculation is that there has been a needless intellectual division between North Indians and South Indians, and between Adivasis and "non-Adivasis". Moreover, it has strengthened the now increasingly untenable view that there is no continuity between the Harappan Indus-Sarasvati civilization and Vedic civilization, and that India's languages (both in the oral and written forms) must have been brought to India by more "civilized" outsiders.

In accepting such constructs, not only must one throw away archaeological and anthropological evidence that points to the many continuities in Indian civilization, but one must also obscure the significance of the pioneering work done in the realm of linguistics by Panini and his predecessors.

India and the Birth of Formal Linguistics
Although there is some disagreement on when Panini lived, few modern linguists would deny him and his (lesser-known) predecessors a place at the very forefront of the science of linguistics.

Amongst the earliest known formal Sanskrit lexicons is the Nighantu (a thesaurus-like lexicon) ascribed to Yaska (7th c BC), whose work attempted to systematize the various lexicons that had been developed to aid in the understanding and interpretation of the Vedic texts. These included lexicons of rare or difficult words classified into chapters containing similes, metonyms, and other categories of related words that were used to describe physical things and objects in nature. A separate chapter contained words that related to human physical/physiological and mental/emotional qualities, and yet another chapter confined itself to words relating to abstract qualities and concepts. A separate book described homonyms that presented special difficulties in their interpretation or had ambiguous meanings. Yaska's Nighantu was accompanied by his Nirukta (a treatise on etymology and word-parsing) in which rules for deriving words from roots and affixes are described. Yaska followed Sakatayana (an older grammarian) and described four types of words: nama (nouns), akhyata (verbs), upasarga (prefixes) and nipata (particles such as prepositions). He defined verbs as those words in which the process or action predominated and nouns as those in which an entity or a being or a thing predominated. He was also cognizant of how verbs sometimes take on a noun-like form - such as "going for a walk", where the verb walk takes on a noun-like form.

Yaska also posited a semantic theory in which he argued that words had inherent meanings, in contrast to Panini, who argued that words had meanings only in their specific context. This debate appears to mirror the modern-day debate between semantic atomists and cognitive linguists. Panini's Ashtadhyayi (Eight Chapters) went deeper into linguistic morphology, defining such terms as phonemes, morphemes and roots. He also described rules/algorithms for taking material from lexical lists (dhatupatha) and generating words from them in a structured and systematic manner. Panini's influence on modern linguistics has been considerable (see notes below).
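The idea of generating words from a root list by applying ordered rules can be sketched in a few lines of code. The following is only a toy, assuming a drastically simplified pair of rules (strengthening of a final vowel, then a glide before the linking vowel "a"); it is not Panini's actual rule system, and the names TOY_DHATUPATHA and third_person_singular are hypothetical. It reproduces the familiar textbook forms bhavati, nayati and jayati from the roots bhū, nī and ji.

```python
import unicodedata

# Toy root list standing in for a dhatupatha (illustrative selection only).
TOY_DHATUPATHA = ["bh\u016b", "n\u012b", "ji"]  # bhū, nī, ji

# Rule 1 (simplified): a root-final i/ī becomes e, and u/ū becomes o.
GUNA = {"i": "e", "\u012b": "e", "u": "o", "\u016b": "o"}

# Rule 2 (simplified): e and o turn into ay and av before a following vowel.
GLIDE = {"e": "ay", "o": "av"}


def third_person_singular(root: str) -> str:
    """Apply the two toy rules in order, then add linking 'a' and ending 'ti'."""
    stem = unicodedata.normalize("NFC", root)
    if stem and stem[-1] in GUNA:
        stem = stem[:-1] + GUNA[stem[-1]]
    if stem and stem[-1] in GLIDE:
        stem = stem[:-1] + GLIDE[stem[-1]]
    return stem + "a" + "ti"


for dhatu in TOY_DHATUPATHA:
    print(dhatu, "->", third_person_singular(dhatu))
```

Roots ending in a consonant simply pass through both rules unchanged here; a fuller treatment would need many more rules and an explicit ordering among them.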

In this entire body of work, stretching from Sakatayana to Panini, there is virtually nothing to link Sanskrit to any European influence.

On the other hand, both Sanskrit and Tamil are syllabic languages and both treat consonants and vowels very similarly. Just as in Sanskrit, where aksharas (speech particles or atoms) are divided into Svarams (vowels) and Vyanjanams (consonants), in Tamil vowels (Uyir Ezhuttu) are clearly distinguished from consonants (Mey Ezhuttu).

Alphabets versus Syllables
And although linguists are divided as to which came first, both Sanskrit and Tamil are written in very similar ways. Unlike the European languages that are written using alphabets (derived from Greek, and branching off from Latin or Cyrillic), all Indian languages are written using syllables made up of (simple or compound) consonant shapes that are modified by the symbols for vowels that connect the consonants. In Sanskrit (and languages derived from it) as well as in South Indian languages like Telugu and Kannada, there is a precise and unambiguous correspondence between how words are pronounced and how they are written.
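The way a written syllable is assembled from consonant shapes and a vowel symbol can be illustrated with Unicode Devanagari, where a consonant letter carries an inherent "a", a dependent vowel sign replaces that inherent vowel, and a virama joins consonants into a compound. The sketch below is a minimal illustration under those assumptions; the romanized keys and the function name syllable are hypothetical, and only a handful of letters are included.

```python
from typing import List

VIRAMA = "\u094D"  # suppresses the inherent vowel so that consonants can join

# A few Devanagari consonant letters, keyed by an ad hoc romanization.
CONSONANTS = {
    "k": "\u0915",
    "t": "\u0924",
    "m": "\u092E",
    "r": "\u0930",
    "s": "\u0938",
}

# Dependent vowel signs; the inherent vowel "a" needs no sign at all.
VOWEL_SIGNS = {
    "a": "",
    "aa": "\u093E",
    "i": "\u093F",
    "ii": "\u0940",
    "u": "\u0941",
    "uu": "\u0942",
    "e": "\u0947",
    "o": "\u094B",
}


def syllable(consonants: List[str], vowel: str) -> str:
    """Join one or more consonants with viramas, then attach the vowel sign."""
    cluster = VIRAMA.join(CONSONANTS[c] for c in consonants)
    return cluster + VOWEL_SIGNS[vowel]


print(syllable(["k"], "i"))       # single consonant + vowel sign
print(syllable(["s", "t"], "a"))  # compound consonant with inherent vowel
```

Each call returns one written syllable, so a word is spelled by concatenating syllables rather than by listing letters one sound at a time.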

From the point of view of classifying languages based on the organizational principles that govern their written scripts, no logic would permit the Sanskrit-derived North Indian languages to be placed in the same language group as the European languages.

For instance, languages (such as Chinese or Japanese) that use pictograms, logograms and ideograms in their written form are a unique group of languages and are classified as "Semanto-phonetic". To understand the development of such languages using morphological and etymological constructs as described by Sanskrit linguists such as Yaska or Panini would be absurd.

Yet, Western scholars seem to have no difficulty in clubbing  Sanskrit with English and French even though the manner in which Sanskrit developed and was formalized was entirely unknown and alien to the Europeans.  On the other hand, structurally speaking (notwithstanding some differences),  Sanskrit and Tamil are like sisters, yet many Westerners persist with placing them in entirely different language families.

Pan-Indic and Pan-Asian Commonalities
In their manner of organizing syllables and writing, all Sanskrit- and Tamil-derived Indian languages are similar, which should place them all in a common Indic language group. Moreover, they share this organizational feature with the Ethiopic Ge'ez, Tibetan, Sinhala, Burmese, Thai, Khmer, earlier Lao, the pre-colonial Filipino Baybayin script for Tagalog, Balinese and Javanese. The Korean Hangul also shares certain commonalities. (Languages like Arabic and Hebrew are partially syllabic in that consonants are precisely denoted but vowel sounds are usually omitted and implied by the context.)

This would suggest that in the pre-colonial world, there was a broad similarity in language scripts that extended across the Indian Ocean from Ethiopia to Indonesia and extended further to the Philippines and Thailand.

Since the written form of any language represents it in its most advanced form, it is curious how Western linguists and their Indian apologists have ignored this important facet in classifying the languages of the world. Nor have they analyzed the important cultural and sociological implications of this shared heritage.

Phonetic Repertoire and Awareness
The organization of Sanskrit syllables also shows remarkable insight into the physiology of human speech production. Vowels are listed separately and divided by the time of pronunciation (short or long) and by the manner of their production (oral or nasal). In the Vedic period, vowels were also distinguished by their pitch accent (high, low or falling).  In this practice, archaic Sanskrit had more in common with languages to India's East such as Thai or Chinese.

Consonants were likewise divided by how they were sounded (as stops, approximants or sibilants). Consonants were further divided by the place of articulation (such as where a part of the tongue was placed in the mouth to create the relevant sound - velar, palatal, retroflex, dental or labial). The grammarians were aware of consonant combinations as well as of how consonants could be varied by using different parts of the tongue (root, body or tip) or the lower lip for labials. Consonants were further distinguished by the effort of articulation (internal for unaspirated, aspirated, unvoiced or voiced, and external for plosive, approximant and fricative).
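The classification of the stop consonants described above is regular enough to be written down as a small grid and queried. The sketch below lays out the traditional five-by-five arrangement of Sanskrit stops in a common romanization, with rows for place of articulation and columns for voicing and aspiration (the last column holding the nasals). The variable and function names are hypothetical, and approximants and sibilants are left out for brevity.

```python
from typing import Dict, Tuple

PLACES = ["velar", "palatal", "retroflex", "dental", "labial"]

MANNERS = [
    "unvoiced unaspirated",
    "unvoiced aspirated",
    "voiced unaspirated",
    "voiced aspirated",
    "nasal",
]

# One row per place of articulation, one column per manner, in the
# traditional order of the Sanskrit stop consonants.
ROWS = [
    "k kh g gh ṅ",
    "c ch j jh ñ",
    "ṭ ṭh ḍ ḍh ṇ",
    "t th d dh n",
    "p ph b bh m",
]

# Map each consonant to its (place, manner) pair.
FEATURES: Dict[str, Tuple[str, str]] = {
    consonant: (place, manner)
    for place, row in zip(PLACES, ROWS)
    for manner, consonant in zip(MANNERS, row.split())
}


def describe(consonant: str) -> Tuple[str, str]:
    """Look up the place and manner of a romanized stop consonant."""
    return FEATURES[consonant]


print(describe("gh"))
print(describe("p"))
```

Because the grid is rectangular, any consonant's features follow from its row and column alone, which is the kind of systematic organization the paragraph above describes.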

This creates a repertoire of consonant sounds that finds no exact parallel in any European language but is partially or wholly replicated in the South Indian languages.

For instance, consonants classified as {Sparsham, Nadam, Mahapranam} - i.e. <stopped voiced aspirated> consonants derived from the voiced and unaspirated g, j, b or soft and hard d - are alien to English, as are the <unvoiced aspirated> forms of k, p and soft/hard t.

Phonetically speaking, from North to South, the languages of the Indian subcontinent have more in common with each other than with any European language (especially English and French).

Pan-Indic Linguistic Features
Writing in Language in India (9, Jan, 2002), G. Sankaranarayanan observes how repeating words and forms is a significant feature that extends across the Indian subcontinent and includes  not only the Sanskrit and Tamil derivatives but also Munda and languages from the Tibetan-Burmese group.

While some forms of rhyming reduplication are also to be found in English, such as bow-wow or willy-nilly, other types of reduplication appear to be entirely absent or very rare in English. For instance, the expression "Ram Ram" may be used to express anguish in Hindi, but its analog "God God" or "Jesus Jesus" would not be idiomatic in English. Likewise Hay-re-Hay or Baap-re-Baap, used to express shock or dismay, would be hard to replicate in English - the latter translating to father-oh-father.

In both Tamil and Hindi, a guest may be welcomed with the expression "va:nga va:nga" or "aiye  aiye" - i.e. "come, come" to suggest a special enthusiasm and graciousness. The correct analog for such a greeting in English might be "please do come", but not come come. Repeated words may be routinely used to designate emphasis - "piyo piyo" (drink drink) or "jaldi jaldi" (quick quick) or "dekho dekho" (look look).  Such usage is also to be found in other Asian languages such as Bahasa Indonesia where "tengo tengo" (look look) is a perfect translation of "dekho dekho".

In other contexts a repeated word (whether noun, pronoun, adjective, adverb, or verb) acquires a special semantic significance. 

Consider the Tamil " ra:tri ra:tri  maLHai peyyutu"  (night night it rains ) meaning that it rains frequently - every night or every other night.

Or the Hindi "apne apne vichar hain" (their their views/thoughts/opinions are) meaning that people have their own opinions.

In the interrogative form, in Hindi one might ask "kya kya kiya" - (what what did) meaning what all did you do? Or, "kahan kahan gaye" (where where went) meaning where all did you go?

One could also repeat a verbal participle: "bolte bolte thak gaye" or "kahete kahete thak gaye" - (talking talking got tired or telling telling got tired), i.e. (I/we) got tired telling (him/her/them) again and again.

Thus word repetition is an economical but meaningful way of expressing varied forms of frequency, plurality or multiplicity.

Note too that Indic languages permit the dropping of pronouns (which become implied). In the previous example both the subject (I/we) and object pronouns (him/her/them) may be dropped, but (got tired telling) would be impermissible in English.

Another form of repetition is the use of an echo word to suggest a broader category than the word echoed. Note that the echo word need not be a word itself; its only requirement is to partially repeat the first word. Thus we may have "cha:y sha:y" to suggest (tea etc), or (tea and something with it), or (tea or something like it).

Or, "kuch kaam vaam kiya" to ask if (you/he/she) did any work or anything else constructive? Here "kaam" is work but "vaam" is used to denote something comparable in significance to work, such as studying, completing a chore or performing some other important task.
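The echo-word pattern in these two examples is mechanical enough to sketch in code: the part of the word from its first vowel onward is kept, and the opening consonants are swapped for a fixed replacement such as "v" or "sh". The function names and the rough vowel test below are assumptions made for illustration; real usage varies by language, region and speaker.

```python
VOWELS = set("aeiou")


def echo_word(word: str, onset: str = "v") -> str:
    """Replace everything before the first vowel with the given onset."""
    i = 0
    while i < len(word) and word[i].lower() not in VOWELS:
        i += 1
    return onset + word[i:]


def echo_pair(word: str, onset: str = "v") -> str:
    """Return the word followed by its echo, as in 'kaam vaam'."""
    return f"{word} {echo_word(word, onset)}"


print(echo_pair("kaam"))          # kaam vaam
print(echo_pair("cha:y", "sh"))   # cha:y sha:y
```

The echo itself carries no dictionary meaning; it only needs to rhyme with the base word, which is why a simple substitution of the opening sound suffices.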

Here again, we observe a linguistic feature that extends across all Indic languages (and even to other Asian languages) and to a European non "Indo-European" language like Hungarian, but is rare or entirely missing in an "Indo-European" language like English.

Sentence Word Order
It may also be noted that across India, both Sanskrit- and Tamil-derived languages use SOV (Subject Object Verb) word order as a default. But several Indo-European languages such as English, French, Portuguese and Bulgarian use SVO word order.

However, in colloquial or theatrical speech (or even in poetic/literary texts), Hindi (like Arabic) also permits VSO. Moreover, when repeated words are used, all Indian languages permit the omission of the subject and the word order becomes flexible - either OV or VO.

Word order also becomes flexible in the context of question and answer exchanges. Thus in Hindi "Gaye the Tum?" (Went did you?), "Tum Gaye The?" (You went did?) and "Tum Gaye?" (You went?) are all possible. Replies to "where did you go?" could be equally varied, from the standard SOV "Main Allahabad gaya tha" (I Allahabad went) to an OVS "Allahabad gaya tha main" (Allahabad went I) or simply OV "Allahabad gaya tha" (Allahabad went) or even VO "Gaya tha Allahabad" (Went Allahabad).

In this respect, Indian languages are similar to each other but not to less flexible "Indo-European" languages like English. On the other hand, Russian and Czech (like Hungarian) do  not require a fixed or default word order.

In conclusion, it might be stated that the present scheme of bifurcating Indian languages into "Indo-European" and "Dravidian" is unsatisfactory in many ways. Not only does it ignore vital commonalities between the languages of Northern and Southern India, it has also precluded comprehensive comparative studies between these Indic languages and other Indic languages such as the Munda or those from the Tibetan-Burmese stream.

Not only is the "Indo-European" classification based on very narrow grounds, it privileges an archaic oral history over later (and more important) developments when Indic languages were studied systematically and formalized. Moreover, it entirely ignores the development of writing in the Indian subcontinent and also the linguistic exchanges and enrichment that occurred between the Sanskrit- and Tamil-derived languages, as well as borrowings that must have occurred between these languages and their Adivasi cousins. The classification also tends to minimize commonalities and exchanges between the Indic languages and the languages of India's land-connected neighbors and oceanic neighbors.

Also obscured is the scientific analysis and rational organization that went into the formalization of Sanskrit (in both spoken and written forms) and other Indic languages that created a solid foundation for India's largely self-propelled progress in philosophy, epistemology, law and governance, mathematics, art, theatre and music, and the biological and physical sciences.

Consciously or unconsciously, the "Indo-European" scheme not only divided India from within but also set it apart from its intellectually-linked Asian brethren and oceanic neighbors in Africa.

Undoubtedly, theories such as this complemented Britain's colonial "divide and conquer" strategy. Such disingenuous constructs (whether by accident or design) allowed the English to colonize, subjugate, and finally loot the Indian subcontinent. They robbed it not only of its legendary wealth but, by distorting its linguistic heritage, also robbed the Indian people of their very essence and self-esteem.

It is high time that linguistic scholars in India revisit afresh this entire field and rescue it from  inappropriate and outdated colonial constructs.

About the Author
Shishir Thadani has an Undergraduate degree from IIT Delhi and a Post-Graduate degree in Computer Science from Yale, where his area of specialization included Theoretical Computer Science, the Syntax and Semantics of Computer Languages and Natural Language Processing.

Acknowledgements
Giti Thadani, who is intimately familiar with several European languages including German, French and Hungarian (as well as Sanskrit), also contributed through several conversations with the author.


References:

Lakshman Sarup, The Nighantu and The Nirukta (London, H. Milford 1920-29), Repr. Motilal Banarsidass 2002, ISBN 81-208-1381-2.

Bimal Krishna Matilal (1990). The word and the world: India's contribution to the study of language. Oxford. Yaska is dealt with in Chapter 3.

"Siddhanta Kaumudi" by Bhattoji Diksita and "Laghu Siddhanta Kaumudi", by Varadaraja.

"Telugulo Chandovisheshaalu" (In Telugu).

Repetitive Forms in Indian Languages by G. Sankaranarayanan, Language in India (9, Jan, 2002)

Thirumalai, M.S. 2002. How to Learn Another Language? in Language in India (http://www.languageinindia.com)

Abbi, Anvita. 1980. Semantic Grammar of Hindi: A Study in Reduplications. New Delhi.

Apte, M.L. 1968. Reduplication, Echo, Formation and Onomatopoeia in Marathi. Pune.

Gnanasundaram, V. 1985. Onomatopoeia in Tamil. Annamalainagar.

Nayak, H. M. 1967. Kannada Literary and Colloquial: A Study of Two Styles. Mysore.

Sankaranarayanan, G. 1976. "Associative Pairs in Tamil," in The Eighth India University Tamil Teachers' Conference, Mysore.

Sankaranarayanan. G. 1983. "Reduplication in Tamil," in To Greater Heights (1969-79). Mysore.


Notes:

  • Certain pan-Indian aspirated consonants (dh, gh, bh etc) that are not to be found in "Indo-European"  languages such as English,   occur in some African languages and Arabic.
  • The paper “A megalithic pottery inscription and a Harappa tablet: a case of extraordinary resemblance,” published in the Journal of Tamil Studies, Volume No.71, June 2007 (amongst others) reveals startling similarities between the Indus script and megalithic and chalcolithic Tamil pottery markings.

Additional References and Comments (as provided by S. Kalyanaraman)

The following works postulate an Indian linguistic area, that is an area of ancient times when various language-speakers interacted and absorbed language features from one another and made them their own:

Emeneau, MB, India as a linguistic area, Language 32, 1956, 3-16

Kuiper, FBJ, Proto-Munda words in Sanskrit, Amsterdam, 1948

Kuiper, FBJ, The genesis of a linguistic area, IIJ 10, 1967, 81-102

Masica, CP, Defining a Linguistic area. South Asia. Chicago: The University of Chicago Press, 1971

Przyluski, J., Further notes on non-aryan loans in Indo-Aryan in Bagchi, P. C. (ed.), Pre-Aryan and Pre-Dravidian in Sanskrit. Calcutta: University of Calcutta 1929: 145-149

Southworth, F., Linguistic archaeology of South Asia, London, Routledge-Curzon, 2005

Ancient texts of India are replete with brilliant insights into the formation and evolution of languages. Some examples are: Bharata’s Natya Shastra, Patanjali’s Mahabhashya, Hemachandra’s Deshi naamamaalaa, the Nighantus, Panini’s Ashtadhyayi, and the Tolkappiyam (Tamil grammar). Manu (10.45) notes the linguistic area: aarya vaacas mleccha vaacas te sarve dasyuvah smrtaah [both aarya speakers and mleccha speakers (literary and colloquial dialects) are all remembered as dasyu].

(From  http://sites.google.com/site/kalyan97/divinity-of-vaak-sarasvati-videos)


General Resources

Central Institute of Indian Languages
Manasagangotri
Mysore 570006, India

Wikipedia's Linguistic Resources, esp. those pertaining to Abugida Languages

Various on-line lexicons that compare Hungarian to various "Indo-European" and other languages



Last updated: Aug , 2009