Zambezia (2002), XXIX (ii)

Lexicographical Developments in the Shona Language as Reflected in the Making of the Duramazwi Guru ReChiShona (DGS)¹

N. MPOFU
African Languages Research Institute, University of Zimbabwe

Abstract

This article traces lexicographical developments in Shona, one of the major languages of Zimbabwe, with particular focus on corpus building and the role the corpus has played in Shona lexicography in the past hundred or so years and recent developments as reflected in the making of Duramazwi Guru ReChiShona by the African Languages Research Institute (ALRI) team of the University of Zimbabwe.

Background

Lexicography in Shona is not a new discipline. It dates as far back as the 1850s when missionaries began constructing orthographies for Shona speakers in the areas in which the missionaries were stationed. These early orthographies were to be used to construct vocabularies that would enable the translation of religious texts from English into Shona. From then until the 1990s, several glossaries and dictionaries were produced. As Fortune (1979, 1992) correctly observed, Shona dictionaries compiled in this period were all bilingual in nature. Their primary purpose was to provide a written basis for the lexical items of the language as a whole (Fortune 1992: 18), and they were targeted at foreign mission workers, settlers, miners, and prospectors in order to aid them in their interactions and contacts with the local people. Most of these early publications were essentially grammar texts that merely described the nature of the language to non-Shona speakers.

According to Fortune, these early publications revealed the compilers' very limited knowledge both of the language and of the techniques of dictionary making (1992: 17). The fact that compilers of these early publications were describing a language that had not been written before, often worked in isolation in their remote mission stations and relied mainly on their own Bible translations for headwords or lexical items to use in the
glossaries or manuals, led to the production of different orthographies, some of which distorted the Shona language. Moreover, the compilers did not give adequate coverage to cultural items, mainly because they were either not aware of them, did not understand Shona culture and, therefore, could not explain it, or regarded African culture as inferior and in need of replacement by 'civilised' cultural items and notions.

1. Herbert Chimhundu ed. 2001, Duramazwi Guru ReChiShona, Harare: College Press.

Soon after Doke developed a Shona orthography in 1931, Reverend Barnes published A Vocabulary of the Dialects of Mashonaland (1932), which laid the foundation for subsequent serious lexicographic work in Shona. Barnes took the initiative to order entries alphabetically and to organise the words and their meanings in such a way that he was able to break away from the tradition of explaining words by providing examples of sentences in which they could be used.

Hannan and Dale published the two best-known bilingual dictionaries in Shona in 1959 (revised and expanded in 1974) and 1981, respectively. These dictionaries demonstrated the compilers' knowledge of the techniques of dictionary making and showed that they had a more profound grasp of the language than their predecessors. Dale's 1981 dictionary, Duramazwi, though comprising a mere 249 pages and, therefore, not comprehensive in its coverage of Shona words, proved to be a useful record of the Shona lexicon.

Dale's dictionary was different from its predecessors in that it gave headwords and definitions in Shona before translating the definitions into English, 'thus paving the way for entirely monolingual Shona dictionaries' (Fortune 1992: 20).
Unlike earlier publications, Dale's dictionary also provided synonyms, antonyms, and variants of the headwords, as well as illustrations to complement the given definitions.

More recently, with the publication of two monolingual Shona dictionaries, Shona lexicography has developed from merely being a means through which a non-Shona speaker can learn the language to being a record of the language in its own right. Compiled by Shona speakers, using modern techniques, the new dictionaries differ from all previous Shona dictionaries in that they treat Shona as both the object and the instrument of description.

The Corpus in Shona Lexicography

A corpus is a collection of texts, collected to facilitate the study of a language or part of a language. In order to construct a dictionary for any language, it is necessary to build up and analyse a corpus in order to establish which words are actually used by the native speakers and how they are used (Ore 1992: 20).

Previous Shona dictionaries were handicapped by the fact that they relied heavily on the Biblical literature that the compilers thought was relevant, to the exclusion of other important literatures and sources. Moreover, the lack of the appropriate technology to generate a reliable electronically processed corpus compounded the problems of dictionary making. Hannan and Dale used an index-card system to order and process the corpus they were using and relied mainly on biblical literature and Bible translations, whose language was not always reliable or appropriate. Not surprisingly, Hannan's dictionary contains some obscure and unrepresentative words, such as: angeve (angel) instead of the known form ngirozi, hafubhaki (half-back), and endekesi (a volume of the Bible).
Had he been using an electronically generated and processed corpus, it would have been very clear to him that such words were very uncommon and he might have entered them for historical interest only.

The great leap forward in dictionary making in recent times has been made possible by the use of information technology in both lexicographical research and the production and presentation of lexicographical material. Since the 1980s, there has been massive investment in the construction and exploitation of computerised corpora of naturally occurring language, both spoken and written (Singleton 2000: 198). The African Languages Lexical Project (ALLEX) of the University of Zimbabwe, now the African Languages Research Institute (ALRI), pioneered the use of electronic corpora in dictionary making in Zimbabwe and developed an electronically processed Shona corpus, generated from both written and oral sources, which, as of 2002, has over 2.2 million running words. Oral sources include interviews, informal conversations, church services, classroom lessons and debates, while written sources include fictional material, ranging from prose to poetry and plays, and non-fictional material such as school textbooks and other Shona non-fiction literature, including literature in foreign languages that has been translated into Shona, such as Tsanga Yembeu (A Grain of Wheat by Ngugi wa Thiong'o). All this material was then encoded or scanned, tagged, proofread and parsed and then included in the corpus.

The advantage of utilising computers in dictionary making is that compilers of dictionaries are easily able to identify the instances and contexts in which different words are used.
Furthermore, with an electronically processed corpus, it is possible to make concordance files, which record relevant information about words that the definer can then use to construct definitions.

The ALLEX team gathered its data from the various Shona-speaking areas of Zimbabwe with the help of research assistants who tape-recorded interviews and activities at churches, schools, sporting events and at individual homesteads. Thus the team was able to collect materials on different topics and issues in varied settings so as to capture as complete a range of regional variations as possible. This was consistent with Kipfer's (1984: 32) observation that 'the primary source of data for the dictionary maker is the utterances of the speakers of that language ... the differences and changes that occur in a language must be recorded by the lexicographer'.

The ALLEX team also developed a concordance programme in order to identify the frequency of headword occurrences in the corpus and the different contexts in which particular words are used. Concordances were also found to be useful because they enabled the team to deduce the various meanings and styles of use associated with each word, making it easier to recognise which words were used in which contexts. The ALLEX team used the corpus it had built up in the compilation of two Shona dictionaries, Duramazwi reChiShona (1996) and Duramazwi Guru reChiShona (2001). The dictionaries are slightly different in scope and coverage because they are aimed at different audiences, the former being a general-sized monolingual dictionary, while the latter is a much more advanced dictionary. The corpus was used as a source for headwords, for identifying the different senses of each word and for citations.
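The two jobs attributed to the concordance programme above, counting how often a word form occurs and listing the contexts in which it occurs, can be illustrated with a minimal Python sketch. This is not the ALLEX programme, whose internals are not described here. The function names, the simple letters-only tokeniser and the three placeholder "texts" are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return runs of letters (hypothetical, simplistic tokeniser)."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def frequencies(texts):
    """Count how often each word form occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def concordance(texts, keyword, width=3):
    """Return (left context, keyword, right context) for every occurrence of keyword."""
    lines = []
    for text in texts:
        tokens = tokenize(text)
        for i, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                lines.append((left, token, right))
    return lines


# Placeholder corpus: the surrounding "words" are dummy tokens, not real Shona text.
sample_texts = ["aaa nzara bbb ccc", "ddd zhara eee", "fff nzara ggg"]

counts = frequencies(sample_texts)
for left, word, right in concordance(sample_texts, "nzara", width=1):
    print(f"{left:>10} | {word} | {right}")
```

With a real corpus, the frequency table would show which of several competing forms is the most common, and the concordance lines would give the definer the contexts from which senses and citations can be drawn.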
However, in both Duramazwi reChiShona (DRC) and Duramazwi Guru reChiShona (DGS), although material was collected in the various dialectal regions of the country, the headwords were not marked for dialect, nor is there any indication in these books to suggest the area(s) in which any of the words are spoken. All terms were treated equally and neutrally. With the help of the corpus, it was possible to identify the most common form of a word, and it was this form that was given as the main entry, while the less common forms were listed as variants under that entry. These variants were also entered as headwords in their own right and cross-referenced to the main entry. The dictionaries, thus, give a range of the variants and synonyms of the word, regardless of the districts in which they are used. For example:

nzara [zhara] D- z9 Nzara kunzwa kuda kudya ... (Hunger)
zhara D- z9 Ona nzara 9. (See nzara)
-ngandudza [-ngandutsa] D it Kungandutsa kutungidza ... (To set alight)
-ngandutsa D it Ona -ngandudza. (See -ngandudza)

Main entry definitions in Duramazwi reChiShona and Duramazwi Guru reChiShona are in sentence form, as are the citations or examples of usage, which give the user the context in which the particular word has been used. Not all citations were generated from the corpus, for, as those involved in dictionary making will know, a corpus can never contain everything that is to be found in a language. Moreover, common day-to-day words, sometimes referred to as 'toothbrush words', have very low frequency counts or may not even be there at all. Nevertheless, they cannot be left out simply because they do not appear in the corpus.

The Use of Databases in Shona Lexicography

An innovation in DRC and DGS was that they were developed through databases and not through word processing.
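To make the database approach more concrete, the following Python sketch shows one way in which main entries and cross-referenced variants, like the nzara/zhara pair in the example entries, could be stored as records and checked for consistency. It is only an illustration. The record layout, the field and function names and the frequency figures are assumptions and are not taken from the ALLEX database.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    headword: str
    definition: str = ""                           # empty for cross-reference-only entries
    variants: list = field(default_factory=list)   # less common forms listed under a main entry
    see: str = ""                                  # main entry that a variant points to


def build_entries(forms, counts, definition):
    """Make the most frequent form the main entry and the rest cross-references to it."""
    main = max(forms, key=lambda form: counts.get(form, 0))
    others = [form for form in forms if form != main]
    entries = {main: Entry(main, definition, variants=others)}
    for form in others:
        entries[form] = Entry(form, see=main)
    return entries


def check_cross_references(entries):
    """Return a list of inconsistencies between main entries and their variants."""
    problems = []
    for entry in entries.values():
        if entry.see:
            target = entries.get(entry.see)
            if target is None:
                problems.append(f"{entry.headword}: target {entry.see} is missing")
            elif entry.headword not in target.variants:
                problems.append(f"{entry.headword}: not listed as a variant of {entry.see}")
        for variant in entry.variants:
            if variant not in entries or entries[variant].see != entry.headword:
                problems.append(f"{entry.headword}: variant {variant} has no cross-reference")
    return problems


# Invented corpus frequencies, for illustration only.
toy_counts = {"nzara": 120, "zhara": 15}
entries = build_entries(["zhara", "nzara"], toy_counts, "Hunger")
print(entries["nzara"])
print(entries["zhara"])
print(check_cross_references(entries))
```

Because every entry has the same fixed set of fields, a check of this kind can be run over the whole dictionary at once, which is much harder to do reliably in a word-processed manuscript.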
The database used in the compilation of Duramazwi reChiShona had fields for the headword, the variants, tone, word class, noun class, verb information (namely, whether a particular verb was transitive or intransitive), the plural form of the word, the definition, and synonyms and antonyms. Global definition and 'compare' fields were added. The dictionaries were produced without using any word-processing programme. The advantage of this method is that there are no shifts in the structure of the entries, thus making the final production process easier.

There are many advantages of using a database programme in dictionary making, including the ease of movement from one entry to another in the defining and editing stages and the greater consistency in handling words that are in the same category. More accurate cross-referencing between variants and synonyms and consistency checks for words of the same syntactic or semantic type are also possible. In addition, it is possible to suppress information that should not appear in the final manuscript but which can be recalled later for use in future dictionaries.

Other Developments

Shona lexicography has developed, not only in terms of the techniques of dictionary making, but also in respect of the innovations that have been introduced in the dictionaries themselves. Duramazwi Guru reChiShona was innovative in terms of lemma status and the structure of the dictionary itself. The dictionary is in two parts: Part 1 is the A-Z section of the dictionary, including idioms, while Part 2 is the section with proverbs and figures of speech. Proverbs and idioms were both given lemma status, while the tradition with other Shona dictionaries has been to give them as run-on entries under the most dominant noun or verb. The issue of how to handle or where to place multi-word lexical units posed problems to the editors.
The decision to enter idioms and proverbs as headwords was intended to make the dictionary more user-friendly.

Another innovation was that, for lemmas with more than two senses, a global definition, in the form of a paraphrase and not a complete definition, was given. The global definition helps the user who wants to get a sense of the general meaning of a word to do so. For instance, for the verb -bata, the global definition is:

-bata K it Kugunzva noruoko kana kuisa muchanza ... (to touch with the hand or to hold in your hand).

Other words related to the headword were indicated by the use of the tarisa [TAR] (compare) marker.

Several style markers were also incorporated to mark special senses. These style markers include manje (chimanjemanje), for slang or colloquial uses; kare, for an archaic sense of the word; [illegible], for trademarks; tuko (chituko), for swear or offensive words; [illegible], for taboo words; and [illegible], for baby talk. These markers were designed to indicate the various nuances of particular words.

Through its contact with other languages, Shona has acquired and naturalised many non-Shona terms, mostly from the English language. These loanwords posed challenges in the compilation of DGS because of their orthographic make-up, as they contained some letters that are not in the current orthography. The problem with the current Shona orthography is that it does not contain the letters l, q, x and the digraphs th and rh. One thus finds that some loanwords only exist in speech, though they may be everyday usages, but are not accepted when written down. Examples are thiyeta (operating room and movie hall), thiyori (theory), themomita (thermometer), losheni (body lotion), and laibhurari (library).

The problem that faced the compilers of DGS was whether to leave out such words completely or to include them and, if so, how to handle them. Previous dictionaries had completely left out such forms.
The compromise that was finally reached was to enter l words as variants of r. Thus losheni was entered under l with an asterisk (*), to indicate that this form was not acceptable in the current orthography, and then cross-referenced to rosheni, under which it was defined. The th words were also entered with an asterisk (*) and defined, because there was no other way of handling them. This problem brought to light the discrepancy that exists between speech and writing in Shona, which requires urgent attention in order to cater for such loanwords, which have no Shona equivalent but which are increasingly used by Shona speakers.

There are some loanwords in Shona that are phonetically aspirated, such as resipi, ragibhi, rege. These have continually been written without an h because of the current orthography, yet in speech they are aspirated. In DGS, both aspirated and unaspirated forms were entered, with the aspirated form being spelt with rh, carrying an asterisk (*) and being cross-referenced to the unaspirated form.

Duramazwi Guru reChiShona also went beyond previous Shona dictionaries by providing a comprehensive back matter section, comprising no fewer than 41 pages and providing information on names of African countries, scales of measurement, judiciary terms, colour terms, times of the day, days of the week, months of the year, seasons of the year, names of chiefs, their areas of jurisdiction and their totems and clan names, as well as literary and grammatical terms.

Conclusion

With the making of DGS, Shona lexicography has indeed come a long way, from being simply a tool which a non-Shona speaker could use in his/her quest to learn a new language, to the current situation in which Shona is presented and defined in Shona, rather than English.

References

BARNES, B. H. 1932, A Vocabulary of the Dialects of Mashonaland, London: The Sheldon Press.
CHIMHUNDU, H. ed. 1996, Duramazwi reChiShona, Harare: The College Press.
CHIMHUNDU, H. ed. 2001, Duramazwi Guru reChiShona, Harare: The College Press.
DALE, D.
1975, A Basic Shona-English Dictionary, Gweru: Mambo Press.
DALE, D. 1981, Duramazwi: A Shona-English Dictionary, Gweru: Mambo Press.
FORTUNE, G. 1979, 'Shona lexicography', in Zambezia, 11 (i), University of Rhodesia.
FORTUNE, G. 1992, 'General lexicographic experiences - Zimbabwe', in H. Chimhundu ed., African Languages Lexical Project First Workshop Report, Harare: University of Zimbabwe.
HANNAN, M. 1959, Standard Shona Dictionary, Salisbury: The Literature Bureau.
HANNAN, M. 1974, Standard Shona Dictionary, Salisbury: The Literature Bureau.
HARTMANN, A. M. 1893, An Outline Grammar of the Mashona Language, Cape Town: Juta.
HARTMANN, A. M. 1894, English-Mashona Dictionary, Cape Town: Juta.
KIPFER, B. 1984, Workbook on Lexicography, Vol. 8, University of Exeter.
LOUW, C. S. 1930, Manual of ChiKaranga, Bulawayo: Philpott and Collins.
MARCONNES, F. 1929, A Manuscript Grammar of Karanga, Johannesburg: Witwatersrand University Press.
SINGLETON, D. 2000, Language and the Lexicon, UK: Arnold Publishers.