Glossary
Affiliation
Affiliation refers to the larger genealogical group (genus, family) that a
language belongs to.
Age
For most words, the World Loanword Database gives the time at which the word was
first attested or reconstructed in the language. For loanwords, we give the time
when the word was borrowed. For non-loanwords, we give the time of earliest
attestation or reconstruction. The age
is indicated by year numbers or by language-particular age names (e.g. "Early Modern
Japanese", "Sranan Stratum"). In languages with no earlier attestation, age names
are often reconstructed proto-languages (e.g. "Proto-Tara-Cahitan",
"Proto-Tibeto-Burman").
Age score
For individual meanings and semantic fields, we give an average age score, averaging
over all the words corresponding to the meaning (or to the semantic field).
The following age scores are assigned to words depending on their (estimated) age:
1. first attested or reconstructed earlier than 1000 | 1.00
2. earlier than 1500 | 0.90
3. earlier than 1800 | 0.80
4. earlier than 1900 | 0.70
5. earlier than 1950 | 0.60
6. earlier than 2007 | 0.50
Thus, the higher the average age score of a meaning, the older the corresponding
words tend to be.
In all average scores, words that correspond to multiple LWT meanings do not
count fully. Thus, if a word corresponds to both the meanings 'air' and 'wind',
it counts 50% for the average score of 'air' and 50% for the average score of 'wind'.
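The age bands above can be sketched as a simple lookup. This is a minimal Python illustration, not the database's own code: the names `AGE_BANDS` and `age_score` are hypothetical, the sketch handles year numbers only (not age names such as "Early Modern Japanese"), and returning `None` for years from 2007 onward is an assumption, since the table does not cover them.

```python
from typing import Optional

# (exclusive upper bound in years, age score), in the order given in the table.
AGE_BANDS = [
    (1000, 1.00),
    (1500, 0.90),
    (1800, 0.80),
    (1900, 0.70),
    (1950, 0.60),
    (2007, 0.50),
]


def age_score(year: int) -> Optional[float]:
    """Return the age score for a word first attested or borrowed in `year`.

    A word falls into the first band whose upper bound lies after `year`.
    Years from 2007 onward are not covered by the table, so this sketch
    returns None for them (an assumption).
    """
    for upper_bound, score in AGE_BANDS:
        if year < upper_bound:
            return score
    return None
```

Under this reading, a word dated to exactly 1500 is not "earlier than 1500" and therefore falls into the 0.80 band.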
Analyzability
Here we indicate for each word whether it is
- (1) unanalyzable (if the form cannot be analyzed into two or more constituents);
- (2) semi-analyzable (if one can identify a constituent structure, but not all constituents have meanings, such as cran in cranberry);
- (3) analyzable derived;
- (4) analyzable compound;
- (5) analyzable phrasal.
Author
The World Loanword Database is an edited work, consisting of 41 individual
vocabularies with individual authors. When citing the World Loanword Database for
only one language or a
few languages, you need to cite the individual vocabulary and thus give credit to
the individual author or author team.
Borrowed score
For individual meanings and semantic fields, we give an average borrowed score,
averaging over all the words corresponding to the meaning (or to the semantic field).
The following borrowed scores are assigned to words depending on their borrowed status:
1. clearly borrowed | 1.00
2. probably borrowed | 0.75
3. perhaps borrowed | 0.50
4. very little evidence for borrowing | 0.25
5. no evidence for borrowing | 0.00
Thus, the higher the average borrowed score of a meaning, the greater its borrowability.
In all average scores, words that correspond to multiple LWT meanings do not count
fully. Thus, if a word corresponds to both the meanings 'air' and 'wind', it counts
50% for the average score of 'air' and 50% for the average score of 'wind'.
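One way to read the fractional counting described above is as a weighted mean, in which a word linked to n LWT meanings contributes to each of them with weight 1/n. The Python sketch below illustrates that reading only; it is an assumption about how the 50% rule enters the average, not the database's actual computation, and the names `BORROWED_SCORES` and `average_borrowed_score` are hypothetical.

```python
from typing import Iterable, Optional, Sequence, Tuple

# Borrowed status -> borrowed score, as listed in the table above.
BORROWED_SCORES = {
    "clearly borrowed": 1.00,
    "probably borrowed": 0.75,
    "perhaps borrowed": 0.50,
    "very little evidence for borrowing": 0.25,
    "no evidence for borrowing": 0.00,
}


def average_borrowed_score(
    words: Iterable[Tuple[str, Sequence[str]]], meaning: str
) -> Optional[float]:
    """Weighted mean borrowed score of all words linked to `meaning`.

    `words` holds (borrowed status, LWT meanings) pairs. A word with n
    distinct meanings counts with weight 1/n (assumed reading of the rule).
    Returns None if no word corresponds to `meaning`.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for status, meanings in words:
        distinct = set(meanings)
        if meaning not in distinct:
            continue
        weight = 1.0 / len(distinct)
        weighted_sum += weight * BORROWED_SCORES[status]
        total_weight += weight
    if total_weight == 0.0:
        return None
    return weighted_sum / total_weight


# Hypothetical data: one word covers both 'air' and 'wind', another only 'air'.
example_words = [
    ("clearly borrowed", ["air", "wind"]),
    ("no evidence for borrowing", ["air"]),
]
air_average = average_borrowed_score(example_words, "air")
```

In this made-up example the shared word counts half for 'air', so the average for 'air' is (0.5 × 1.00 + 1.0 × 0.00) / 1.5, roughly 0.33.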
Borrowed status
There are five borrowed statuses, reflecting decreasing likelihood that the word
is a loanword:
- clearly borrowed
- probably borrowed
- perhaps borrowed
- very little evidence for borrowing
- no evidence for borrowing
This field does not allow values like "Clearly not borrowed" or "Clearly inherited"
because any word could have been borrowed at some prehistoric time, so we can never
be sure that a word is not an old loanword. And even loanwords can be inherited,
e.g. a word borrowed into Proto-Uralic can be inherited by Hungarian.
We are dealing basically with lexemes which are transferred or copied from one lect
into another. Words from a substrate language are considered to be loanwords, even
though some linguists do not use the term "borrowing" for transfer from substrates.
Excluded from the class of loanwords are neologisms (= productively created lexemes),
even those which consist partly or entirely of foreign material, because they are
created in the recipient language, not in the donor language.
Citation
The World Loanword Database is an edited work, consisting of 41 individual vocabularies
with individual authors. When citing the World Loanword Database for only one
language or a few languages, you need
to cite the individual vocabulary and thus give credit to the individual author or author team.
Description
Under "description", we list various comments that the individual authors have
provided concerning the individual fields of their vocabularies. Here you also find
the list of works that are referred to.
Donor language, donor languoid
The donor language for a loanword is the language from which the word was borrowed.
Sometimes the language is not known, only the family (e.g. when it is clear that
the word was borrowed from a Bantu language, but it is not clear which Bantu language).
In such cases we can talk about a "donor family", even though strictly speaking the
word must of course have been borrowed from a single language. The term
"donor languoid" is a cover term for "donor language" and "donor family".
Earlier donor language (or donor languoid)
For many loanwords, we know not only the immediate donor language from which the
word was borrowed, but also an earlier donor language, if the word was itself a
loanword from some other language. And there are also quite a few loanwords for
which we can identify an earlier donor, but not the immediate donor language. For
example, for many Indonesian words it is clear that they must have come from Arabic,
but probably not directly, so there must have been intermediate languages. In such
cases, we only give the earlier donor language, not the immediate donor language.
Effect
This field contains information on
- whether the word replaced an earlier word (1: replacement),
- whether it was simply added where no earlier word existed with the same meaning (2: insertion),
- whether it coexists with an earlier word of roughly the same meaning (3: coexistence),
- or whether there is no information about its effect (0: no information).
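For readers working with exported data, the four numeric codes above could be modelled as an enumeration. This is a minimal Python sketch; the class name `Effect` and its member names are hypothetical, and only the numeric values and their meanings come from the list above.

```python
from enum import IntEnum


class Effect(IntEnum):
    """Numeric codes of the "Effect" field, as listed above."""

    NO_INFORMATION = 0  # no information about the word's effect
    REPLACEMENT = 1     # the word replaced an earlier word
    INSERTION = 2       # the word was added where no earlier word existed
    COEXISTENCE = 3     # the word coexists with an earlier word


# Converting a raw code from a data file into a readable label.
label = Effect(3).name.lower()
```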
Field number
The number in this column is the semantic field number. It is the first part of the
Loanword Typology Code of the words in the corresponding field.
Genus
A genus is a relatively shallow kind of genealogical group. The "genus" level is used
in the classification of the World Atlas of Language Structures Online in addition
to the "family" level. Families with considerable time depth consist of different genera.
Grammatical information
The field "Grammatical info" contains grammatical information such as word class,
gender, inflection class.
ID
For the recipient languages, the language ID number corresponds to the ordering
of the chapters in the book "Loanwords in the World's Languages". Languages are
listed in rough geographical order from west to east, from Africa via Europe to
Asia and the Americas, so that geographically adjacent languages are next to
each other. For the other languages, the ID number has no particular significance.
Language family
In this field, we give the name of the highest generally accepted family
to which the language belongs.
Language name
In this field, we give the name of the language (or family, in the case of
donor languages) that was adopted in the World Loanword Database.
Alternative names can be found on the individual language pages.
Languoid
"Languoid" is a (relatively new) cover term for "language" and "language family".
Loanword
A loanword is a word that was copied from another language, either by adoption
or by retention, at some point in the history of the language. Even if a loanword
is fully integrated, it is still a loanword, and a loanword never ceases to be a
loanword.
LWT
LWT is an abbreviation for "Loanword Typology", the name of the Loanword Typology
Project (2004-2009) that resulted in the World Loanword Database.
LWT meaning list
The LWT meaning list is the list of 1460 core lexical meanings that served as the
basis for the vocabularies of the World Loanword Database. It is based on the IDS list
created by Mary Ritchie Key, which in turn is based on the list in Carl Darling Buck's
"Dictionary of Selected Synonyms in the Principal Indo-European Languages" (1949).
Meaning
By "meaning", we mean lexical meanings, i.e. meanings of lexical items. For each
word, there is a corresponding LWT meaning, and often there are several
corresponding LWT meanings. For many words, there is additional language-particular
information in the field "Word meaning".
Original script
This gives the usual written form for languages that do not use the Latin script.
Recipient language
The recipient language for a loanword is the language into which the word was
borrowed, i.e. the language whose lexicon the word is part of.
Reference
This field often contains bibliographic information about works that were used
as a source by the authors.
Representation
This column shows how many counterparts for this meaning there are in the 41
languages. The number can be higher than 41 because a language may have several
counterparts for one meaning ("synonyms"), and it may be lower than 41, because
not all languages may have a counterpart for a meaning.
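The count described above can be sketched as a sum over languages. This is a minimal Python illustration with a hypothetical function name and placeholder data; it only shows why the total can be higher or lower than the number of languages.

```python
from typing import Dict, List


def representation(counterparts_by_language: Dict[str, List[str]]) -> int:
    """Total number of counterparts for one meaning across all languages.

    Languages with several counterparts ("synonyms") add more than one,
    and languages without a counterpart add nothing.
    """
    return sum(len(words) for words in counterparts_by_language.values())


# Placeholder data for three languages: two synonyms, no counterpart, one word.
example = {
    "language A": ["word1", "word2"],
    "language B": [],
    "language C": ["word3"],
}
total = representation(example)
```

With the placeholder data, the total is 3 even though only two of the three languages have a counterpart.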
Salience
This field gives information about the degree to which a word's meaning is
relevant to the speakers. "Environment" refers both to the natural and to the
cultural environment. The three values are:
- Present in pre-contact environment
- Present only since contact
- Not present
By ‘contact’, we mean the first contact between speakers of the project language
and the donor language. This contact could have been with speakers of the donor
language, but it could also have been with written sources in the donor language.
Semantic category
Meanings were assigned to semantic categories with word-class-like labels:
nouns, verbs, adjectives, adverbs, function words. No claim is made about the
grammatical behavior of words corresponding to these meanings. The categories
are intended to be purely semantic.
Semantic field
The 1460 meanings of the LWT list are divided into 24 semantic fields, following
Carl Darling Buck's original classification.
Simplicity score
For individual meanings and semantic fields, we give an average simplicity score,
averaging over all the words corresponding to the meaning (or to the semantic field).
The following simplicity scores are assigned to words depending on their analyzability:
1. unanalyzable (= simple) | 1.00
2. semi-analyzable | 0.75
3. analyzable | 0.50
Thus, the higher the average simplicity score of a meaning, the fewer complex
words correspond to it.
In all average scores, words that correspond to multiple LWT meanings do not
count fully. Thus, if a word corresponds to both the meanings 'air' and 'wind',
it counts 50% for the average score of 'air' and 50% for the average score of 'wind'.
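The link between the analyzability values and the simplicity scores can be sketched as a lookup. This is a minimal Python illustration with hypothetical names; it assumes that the three analyzable subtypes (derived, compound, phrasal) all receive the "analyzable" score of 0.50, which the table suggests but does not state explicitly.

```python
# Analyzability value -> simplicity score. Mapping the three analyzable
# subtypes to 0.50 is an assumption of this sketch.
SIMPLICITY_BY_ANALYZABILITY = {
    "unanalyzable": 1.00,
    "semi-analyzable": 0.75,
    "analyzable derived": 0.50,
    "analyzable compound": 0.50,
    "analyzable phrasal": 0.50,
}


def simplicity_score(analyzability: str) -> float:
    """Return the simplicity score for an analyzability label.

    The label is normalized for case and surrounding whitespace; an
    unknown label raises KeyError.
    """
    return SIMPLICITY_BY_ANALYZABILITY[analyzability.strip().lower()]
```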
Source word
The source word of a loanword is the word that served as the model during the
borrowing process, i.e. from which the loanword was copied.
Text chapter
Each vocabulary has a corresponding text chapter in the book
"Loanwords in the World's Languages".
Word
The word is given in the usual orthography or transcription, and in the usual citation form.
Word meaning
For many words, we have information in this field, giving the translation of the
word into English. This field was by no means obligatory, however, because the
meaning of a word is very often sufficiently described by giving the corresponding
LWT meaning(s).