The vocabulary contains 1619 meaning-word pairs ("entries") corresponding to core LWT meanings from the recipient language Selice Romani. The corresponding text chapter was published in the book Loanwords in the World's Languages. The language page Selice Romani contains a list of all loanwords arranged by donor languoid.
|Word form||LWT code||Meaning||Core list||Borrowed status||Source words|
Alternative forms of a lexeme are separated by a comma: e.g. felhó, felhóva ‘cloud’. Optional parts of the lexeme’s form are bracketed, e.g. daj (taj) dad ‘parents’.
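The two conventions above (comma-separated alternatives, bracketed optional parts) can be read mechanically. The following is a minimal sketch only: the function name `parse_word_form` and the returned keys `full` and `short` are hypothetical, and it assumes that commas never occur inside brackets and that brackets are not nested.

```python
import re


def parse_word_form(field: str) -> list[dict[str, str]]:
    """Split a word-form field into alternatives and expand optional parts.

    Hypothetical helper: assumes commas only separate alternative forms
    and that bracketed (optional) parts are never nested.
    """
    results = []
    for alternative in field.split(","):
        alternative = alternative.strip()
        if not alternative:
            continue
        # Variant with every optional part kept: drop the brackets only.
        full = re.sub(r"[()]", "", alternative)
        # Variant with every optional part left out: drop brackets and content.
        short = re.sub(r"\([^)]*\)", "", alternative)
        results.append(
            {
                "full": re.sub(r"\s+", " ", full).strip(),
                "short": re.sub(r"\s+", " ", short).strip(),
            }
        )
    return results


alternatives = parse_word_form("felhó, felhóva")
optional = parse_word_form("daj (taj) dad")
```

Here `alternatives` holds two entries (felhó and felhóva), while `optional` holds one entry whose `full` variant is daj taj dad and whose `short` variant is daj dad.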
The "meaning" of the Selice Romani lexeme is mostly not entered if it corresponds precisely to the pre-defined LWT meaning of the Meanings table. With loanwords, the field is also used to highlight meaning differences from the source form. The field thus sometimes ends up filled in even if there is precise correspondence between the Selice Romani meaning and the LWT meaning: for example, the noun tollo ‘pen’ is a loanword of Hungarian toll ‘feather; pen’, and so this field contains “pen (not *feather)”.
With inflecting words, the field contains information on inflectional irregularities (oblique stems of nouns, comparatives of adjectives, perfective stems of verbs etc.). With nouns, the field almost always indicates their gender. With verbs, the field sometimes indicates their transitivity. With function words and adverbs, the field indicates the class of the word: pronoun (personal or reflexive), demonstrative, pro-word (interrogative or indefinite), preposition, adverb, co-verb (preverb, adverbial verb modifier), numeral and quantifier, particle etc. Occasionally, those functions of function words that are not sampled in the database are also mentioned. Especially with phrasal forms but also elsewhere, the field may contain information on syntactic construction. The field was also used to highlight mismatches between the pre-defined semantic category of the Meanings table and the grammatical word-class of the Romani word form, which are very rare: e.g. there is no adjective ‘stinking’ in Selice Romani, and so the verb khanden ‘to stink’ was used as an equivalent.
|Comment on word form||
This field contains various kinds of comments on the Selice Romani lexeme, its form and meaning, including:
The category ANALYZABLE PHRASAL is used for lexemes consisting of two or more words. The categories ANALYZABLE COMPOUND and ANALYZABLE DERIVED are used for compound and derived lexemes, respectively, whose morphological structure is fully transparent synchronically. The categories SEMI-ANALYZABLE and UNANALYZABLE require more detailed comments. The former category is assigned to several types of morphologically complex lexemes:
1. To synchronically non-transparent compounds: for example, the form per-nang-o [ROOT/PREFIX-naked-INFL] ‘barefoot’ is, diachronically, a compound of the noun pr-o ‘foot’ and the adjective nang-o ‘naked’, but its first morpheme per- is not regularly related to the nominal root.
The category UNANALYZABLE is assigned to the following types of lexemes:
1. To lexemes with monomorphemic inflectional stems. Importantly, pre-inflectional adaptation markers of loanwords are excluded from consideration: for example, although the inflectional stem of the verb čukl-in-en [hiccough-LOAN-INFL] ‘to hiccough’, a loanword of Hungarian csukl-ik, is bimorphemic, the stem before the adaptation suffix -in is monomorphemic.
This field is filled in for all analyzable lexemes and for semi-analyzable lexemes of the types 1 through 4 (see comments on the analyzability field). The abbreviations used are those of the Leipzig Glossing Rules plus those listed under "Abbreviations".
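The relation between a segmented form and its morpheme-by-morpheme gloss can be sketched as follows. This is an illustration under assumptions, not the procedure used to compile the database: the function names `align_gloss` and `base_stem` are hypothetical, morpheme boundaries are assumed to be written with hyphens in both the form and the gloss, and only the labels LOAN and INFL are treated as markers to be set aside when counting stem morphemes.

```python
def align_gloss(segmented: str, gloss: str) -> list[tuple[str, str]]:
    """Pair each morpheme with its gloss label.

    Assumes hyphens mark morpheme boundaries in both strings and that
    the gloss may be wrapped in square brackets.
    """
    morphemes = segmented.split("-")
    labels = gloss.strip("[]").split("-")
    if len(morphemes) != len(labels):
        raise ValueError("form and gloss have different numbers of morphemes")
    return list(zip(morphemes, labels))


def base_stem(pairs: list[tuple[str, str]]) -> list[str]:
    """Return the morphemes left once inflection (INFL) and the loanword
    adaptation marker (LOAN) are set aside (hypothetical helper)."""
    return [morpheme for morpheme, label in pairs if label not in {"INFL", "LOAN"}]


hiccough = align_gloss("čukl-in-en", "[hiccough-LOAN-INFL]")
barefoot = align_gloss("per-nang-o", "[ROOT/PREFIX-naked-INFL]")

# One morpheme remains for 'to hiccough', two for 'barefoot'.
hiccough_stem = base_stem(hiccough)
barefoot_stem = base_stem(barefoot)
```

With the adaptation marker set aside, the stem of čukl-in-en consists of a single morpheme, which matches its treatment as UNANALYZABLE, whereas per-nang-o keeps two.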
Age refers to the (diachronic) syntactic or derivational structure of the item, but not to its phonological form or to its meaning. The age of analyzable lexemes (collocations, compounds, and derivations) reflects the time of creation of such complex expressions rather than the age of their parts. The relevant units of temporal continuity of univerbal lexemes are inflectional stems. For example, although the ‘canonical’ forms of the Selice Romani verb d-en ‘to give’ (citation form: third plural present indicative) do not directly continue the ‘canonical’ forms of the Old Indo-Aryan verb dā-da-ti ‘to give’ (citation form: third singular present indicative), but rather certain inflectional forms without the initial reduplication, there is continuity of the verbs’ inflectional stems between Old Indo-Aryan and Selice Romani, viz. da > d-, and so the Selice Romani verb d-en ‘to give’ may be considered to go back, via Old Indo-Aryan, to Proto-Indo-European deh- ‘to give’. Two types of age categories are used:
• First, there are genealogical age categories (1–6). Some of these (1–3 and 5) represent nodes on the tree model of the genealogical affiliation of Selice Romani, and are assigned to lexemes that can be reconstructed for these stages of the language. I should note that some lexemes assigned to the OLD INDO-ARYAN category might actually be older, i.e. PROTO-INDO-IRANIAN or even PROTO-INDO-EUROPEAN, as I have checked for pre-Indo-Aryan etymologies of Old Indo-Aryan etymons only selectively. The category LATER THAN EARLY ROMANI includes lexemes that are dialect-specific within Romani, i.e. not reconstructable for Early Romani, and that, in addition, are not loanwords, based on loanwords, or calqued from a post-Early Romani L2.
• Second, there are two subtypes of contact-related age categories: those that are defined with reference to the period of contact with an L2 or a cluster of L2s (7–13) and those that are defined with reference to the beginning of such a period only (14–16). The former are assigned not only to loanwords but also to calques from a given L2. The latter subtype of contact-related age categories is only relevant for past L2s; they are assigned to lexemes that are based on loanwords from a given L2 or that contain derivational markers from that L2. Certain arbitrary decisions had to be taken with regard to the assignment of lexemes to concrete L2s (e.g. a loanword that may originate in Slovak or Czech has been assigned to the age category SLOVAK; see also Chapter).
NO. AGE CATEGORY FROM TO
The dates of the age categories are generally very approximate (see the book chapter for discussion). Note that some age categories show temporal overlap or even coincidence (e.g. EARLY ROMANI and GREEK L2).
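The numbering of the age categories described above can be summarized in a small lookup. This is a sketch only: the function names and the returned labels are hypothetical and merely restate the number ranges given in the text (1–6 genealogical, 7–13 contact period, 14–16 beginning of contact; 1–3 and 5 as nodes of the tree model).

```python
def age_category_type(number: int) -> str:
    """Return the type of an age category from its number (1-16).

    The labels are illustrative; only the ranges come from the description.
    """
    if 1 <= number <= 6:
        return "genealogical"
    if 7 <= number <= 13:
        return "contact period"
    if 14 <= number <= 16:
        return "contact beginning"
    raise ValueError(f"unknown age category number: {number}")


def is_tree_node(number: int) -> bool:
    """True for the genealogical categories that represent nodes on the
    tree model of the affiliation of Selice Romani (1-3 and 5)."""
    return number in {1, 2, 3, 5}


types = [age_category_type(n) for n in (1, 7, 14)]
```

The list `types` then contains one label for each of the three ranges.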
This field is filled in for very few records. There is no differentiation between colloquial and formal registers in Selice Romani. However, I have used this field to indicate the following register-like distinctions:
This field has not been filled in for items with no evidence for calquing. Further details on calquing are given in the field "Created on loan basis".
For some words, the field contains information on the origin and structure of those Selice Romani words that are themselves not loanwords but have been created on the basis of loanwords (marked as loan basis): for example, the noun kiráckiňa ‘queen’ is not a loanword but its derivational base, the noun királi ‘king’, is; the particle kampe ‘need, should’ is not a loanword but has developed through grammaticalization of the verb kamen ‘to want etc.’, which is a loanword, and the accusative form of the indigenous reflexive pronoun pe; etc.
|Comment on borrowed||
This field contains four kinds of information:
221 etymologies are based on previous etymological works or remarks on Romani (Berger 1959; Boretzky & Igla 1994; Hancock 1995; Kostić 1994; Matras 2002; Mānušs et al. 1997; Tálos 1999; Turner 1926; Tzitzilis 2001; Vekerdi 2000) and Indo-Aryan (Beníšek 2006; Kuiper 1948; Lubotsky 2001; Mayrhofer 1996; Turner 1962–1966; Witzel 1999). The field contains my surname (882 records) if the etymologies are my own, which is especially the case with all loanwords from Hungarian, Slovak and Czech, but also if I have made a selection among several previously suggested etymologies.
Beníšek, Michael. 2006. “Ke kořenům slova rom.” [On the roots of the word rom.] Romano džaniben, jevend, 9–28.
Berger, Hermann. 1959. “Die Burušaski-Lehnwörter in der Zigeunersprache.” Indo-Iranian Journal 3: 17–43.
Boretzky, Norbert & Igla, Birgit. 1994. Wörterbuch Romani–Deutsch–Englisch für den südosteuropäischen Raum: mit einer Grammatik der Dialektvarianten. Wiesbaden: Harrassowitz.
Hancock, Ian. 1995. “On the migration and affiliation of the Ḍōmba: Iranian words in Rom, Lom and Dom Gypsy.” In: Matras, Yaron (ed.) Romani in contact: the history and sociology of a language. Amsterdam: Benjamins. 25–51.
Kostić, Svetislav. 1994. “Romani čhib a jazykový kontakt” [Romani čhib and language contact] Romano džaniben 1: 42–54.
Kuiper, Franciscus B. J. 1948. Proto-Munda words in Sanskrit. Amsterdam: N. V. Noord-Hollandsche Uitgevers Maatschappij.
Lubotsky, Alexander M. 2001. “The Indo-Iranian substratum.” In: Carpelan, Chr., Parpola, A. & Koskikallio, P. (eds.) Early contacts between Uralic and Indo-European: linguistic and archaeological considerations. Papers presented at an international symposium held at the Tvärminne Research Station of the University of Helsinki 8–10 January 1999. (Mémoires de la Société Finno-ougrienne 242.) Helsinki 2001. 301–317.
Matras, Yaron. 2002. Romani: a linguistic introduction. Cambridge: Cambridge University Press.
Mānušs, Leksa, Neilands, Jānis & Rudevičs, Kārlis. 1997. Čigānu–latviešu–angļu etimoloģiskā vārdnīca un latviešu–čigānu vārdnīca. [Gypsy–Latvian–English etymological dictionary and Latvian–Gypsy dictionary.] Rīgā: Zvaigzne ABC.
Mayrhofer, Manfred. 1986–2001. Etymologisches Wörterbuch des Altindoarischen. 3 volumes. Heidelberg: Carl Winter.
Tálos, Endre. 1999. “Etymologica Zingarica.” Acta Linguistica Hungarica 46: 215–268.
Turner, Ralph L. 1926. “The position of Romani in Indo-Aryan.” Journal of the Gypsy Lore Society, Third series 5: 145–189.
Turner, Ralph L. 1962–1966. A comparative dictionary of the Indo-Aryan languages. Oxford: Oxford University Press.
Tzitzilis, Christos. 2001. “Mittelgriechische Lehnwörter im Romanes.” In: Igla, Birgit & Stolz, Thomas (eds.) “Was ich noch sagen wollte...” A Multilingual Festschrift for Norbert Boretzky on the Occasion of His 65th Birthday (Sprachtypologie und Universalienforschung, Supplements, Studia typologica 2). Berlin: Akademie Verlag. 328–340.
Vekerdi, József; with the assistance of Zsuzsa Várnai. 2000. A comparative dictionary of Gypsy dialects in Hungary. Gypsy–English–Hungarian dictionary with English to Gypsy and Hungarian to Gypsy word lists. Budapest: Terebess Publications.
Witzel, Michael. 1999. “Substrate languages in Old Indo-Aryan (Rgvedic, Middle and Late Vedic).” Electronic Journal of Vedic Studies 5: 1–67.
– The borrowing effect has been categorized as COEXISTENCE if, alongside the relevant loanword, there is an older form of the same or very similar meaning (roughly: in the scope of the pre-defined LWT meaning);
This field is filled in for borrowed nouns (and for adverbs that are based on case forms of borrowed nouns), adjectives and verbs; it is not filled in for borrowed adverbs (with the exception of the above) and function words, which generally do not allow for morphological integration in Selice Romani.
This field has been filled in so that its content depends logically on the content of the Effect field. The category NOT PRESENT has not been used.
Effect Environmental salience
ABL2 Old Ablative (synchronically, an adverbial marker)
LOC2 Old Locative (synchronically, an adverbial marker)
SOC sociative (instrumental/comitative)
Further category labels used in morpheme-by-morpheme glosses:
ABSTRACT abstract or collective de-adjectival or de-nominal nominalization
ACTION action or product de-verbal nominalization
ADDITIVE additive numeral connector
DIRECTIVE directive orientation, movement towards a localization
DISTAL distal deictic root
EXTRAESSIVE extraessive localization (‘outside’)
INESSIVE inessive localization (‘in’)
INFERIOR inferior localization (‘under’)
INTERROGATIVE interrogative root
LOAN loanword adaptation marker
MIDDLE middle, “mediopassive”
PLAIN_DEICTIC plain (non-specific) deictic root
POSTERIOR posterior localization (‘behind’)
PREFIX prefix with a hard-to-describe function
PROXIMAL proximal deictic root
REDUPLICATION reduplicating morpheme
ROOT semi-analyzable root
SPECIFIC_DEICTIC specific deictic root
STATIVE stative orientation
SUFFIX suffix with a hard-to-describe function
SUPERIOR superior localization (‘above’)
VERB verb; verb-deriving marker
SR Selice Romani
ER Early Romani