A common problem in terminology work is that the importance and indeed the very nature of terminology are poorly understood. Many people simply have no idea at all of what it is, while others, searching for an explanation of some sort, end up associating it with "thermal science" and hence radiators(1). Those working in related professions in the communications field, such as translation and technical writing, will often be aware of the word without having precise knowledge of what it entails (cf. Chapter 3.1: "Actors and Working Conditions" for a more detailed discussion of this point).
In fact, terminology is a many-faceted subject. Depending on the perspective from which it is approached and the affiliations of the person discussing it, it may be:
To avoid confusion during its work, in particular when talking to non-specialists, the POINTER Project adopted a pragmatic definition of the word. In the context of this document and the POINTER Terms of Reference, therefore, "terminology" (or, in the plural, "terminological resources") has been defined as:
Three major points need to be made here:
In addition, the word "structured" needs some explanation: it should be noted that, in practice, terminological collections may well contain not only well structured standardised terms and concepts, but also innovative, vague and unstructured conceptual and linguistic information.
This basic definition of terminology is supplemented in this Final Report by two other terms:
and
One particular area of confusion highlighted by the POINTER Project is that of the differences between terminology and lexicology, and terminography and lexicography. Not only many non-specialists, but even many individuals working in such fields as language engineering and translation frequently confuse these concepts, and it is hoped that the explanations given below will contribute to a clearer understanding of the distinctions between these fields of activity.
While lexicology is the study of words in general, terminology is the study of special-language words or terms associated with particular areas of specialist knowledge. Neither lexicology nor terminology is directly concerned with any particular application. Lexicography, however, is the process of making dictionaries, most commonly of general-language words, but occasionally of special-language words (i.e. terms). Most general-purpose dictionaries also contain a number of specialist terms, often embedded within entries together with general-language words. Terminography (or, often misleadingly, "terminology"), on the other hand, is concerned exclusively with compiling collections of the vocabulary of special languages. The outputs of this work may be known by a number of different names - often used inconsistently - including "terminology", "specialised vocabulary", "glossary", and so on.
The work and objectives of lexicographers and terminographers are in many ways complementary, but there are a number of important differences which need to be noted.
Dictionaries are word-based: lexicographical work starts by identifying the different senses of a particular word form. The overall presentation to the user is generally alphabetical, reflecting the word-based working method. Synonyms (different form, same meaning) are therefore usually scattered throughout the dictionary, whereas polysemes (same form, related but different senses) and homonyms (same form, unrelated meanings) are grouped together.
While a few notable attempts have been made to produce conceptually-based general-language dictionaries - or "thesauri" - the results of such attempts are bound to vary considerably according to the cultural and chronological context of the author.
By contrast, high-quality terminologies are always in some sense concept-based, reflecting the fact that the terms which they contain map out an area of specialist knowledge in which encyclopaedic information plays a central role. Such areas of knowledge tend to be highly constrained (e.g. "viticulture"; "viniculture"; "gastronomy"; and so on, rather than "food and drink"), and therefore more amenable to a conceptual organisation than is the case with the totality of knowledge covered by general language. The relations between the concepts which the terms represent are the main organising principle of terminographical work, and are usually reflected in the chosen manner of presentation to the user of the terminology. Conceptually-based work is usually presented in the paper medium in a thesaurus-type structure, often mapped out by a system of classification (e.g. UDC) accompanied by an alphabetical index to allow access through the word form as well as the concept. In terminologies, synonyms therefore appear together as representations of the same meaning (i.e. concept), whereas polysemes and homonyms are presented separately in different entries.
In the electronic medium, similar considerations apply in principle to the organisation of entries with reference to synonyms and polysemes/homonyms. However, the retrieval of data still operates at present largely through the term (or a component of the term) rather than through the concept. Conceptually-based solutions for the representation and retrieval of data are being sought in the techniques of artificial intelligence.
Work organised conceptually may also be presented alphabetically, whereas the converse, i.e. the presentation of work originally organised according to the form of the word in a thesaurus-type structure, is highly problematic.
In dictionaries, related but different senses (or "polysemes") of the same word form are usually presented within one entry, e.g. bridge (of a violin, crossing a river, over a gap in teeth); unrelated different senses ("homonyms") of the same word form are normally presented as separate head words or entries, e.g. pupil (of the eye) and pupil (in a school). Synonym relations are not always made explicit in dictionaries, and the division of word forms into different senses tends to vary considerably between dictionaries. This lack of clear division into senses reflects the "slippery" nature of general-language words, compared to the more precise nature of terminological meaning.
In terminologies, homonyms and polysemes within the same subject field are treated as separate entries (because the definition of the concept is different), e.g. in Automotive Engineering emission (the process of emitting exhaust gases) and emission (the exhaust gases themselves). Homonyms and polysemes from other subject fields are excluded. Synonyms, on the other hand, are always included in the same entry (being alternative representations of the same concept), e.g. automotive catalyst, catalytic converter.
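The concept-based organisation described above, together with the alphabetical index that gives access through the word form, can be sketched as a small data structure. This is a minimal illustration in Python and not a data model taken from the POINTER Project: the class name `ConceptEntry`, the field names, the identifiers `C1` to `C3` and the definition given for the third entry are all assumptions made for the example. The terms themselves are those quoted in the text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ConceptEntry:
    """One terminological entry: a single concept and every term naming it."""
    concept_id: str       # hypothetical identifier scheme
    subject_field: str
    definition: str
    terms: list[str]      # synonyms live together in the same entry


ENTRIES = [
    # Same word form, two concepts: two separate entries.
    ConceptEntry("C1", "Automotive Engineering",
                 "the process of emitting exhaust gases", ["emission"]),
    ConceptEntry("C2", "Automotive Engineering",
                 "the exhaust gases themselves", ["emission"]),
    # Two word forms, one concept: one entry (definition is illustrative).
    ConceptEntry("C3", "Automotive Engineering",
                 "device that treats exhaust gases",
                 ["automotive catalyst", "catalytic converter"]),
]


def alphabetical_index(entries):
    """Map each term form to the ids of the concepts it represents,
    with the term forms in alphabetical order."""
    index = defaultdict(list)
    for entry in entries:
        for term in entry.terms:
            index[term].append(entry.concept_id)
    return dict(sorted(index.items()))


index = alphabetical_index(ENTRIES)
```

In this sketch the alphabetical view is derived from the concept-based entries in a single pass. Going the other way, from a word-based dictionary to concept entries, would require deciding which senses belong to which concept, which is the difficulty noted in the text.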
The "headwords" or rather "entry terms" in terminologies are all open-class words, i.e. nouns (the vast majority), some adjectives, verbs and adverbs. The headwords in general-language dictionaries cover all word classes, including so-called grammatical words such as modal auxiliaries (e.g. can, must), prepositions (e.g. on, with), articles (e.g. the, an), certain adverbs (e.g. very), and so on. In terminologies, such words may appear as a component of the term or be shown as a part of the term's phraseology (i.e. the usual pattern of its immediate linguistic environment), but never as independent entry terms.
Dictionaries of the general language are descriptive in their orientation, arising from the lexicographer's observation of usage. Terminologies may also be descriptive in certain cases (depending on subject field and/or application), but prescription (also: "normalisation" or "standardisation") plays an essential role, particularly in scientific, technical and medical work where safety is a primary consideration. Standardisation is normally understood as the elimination of synonymy and the reduction of polysemy/homonymy, or the coinage of neologisms to reflect the meaning of the term and its relations to other terms. Terminologies - the outcome of this work, often in electronic form as termbases - are then the principal means of dissemination. In other words, in certain circumstances, terminologists may attempt to regulate language (in this case, the vocabularies of special languages), whereas lexicographers describe the words of general language.
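The elimination of synonymy can be illustrated with a short Python sketch. It assumes, purely for the sake of the example, that a standardising body has chosen "catalytic converter" as the preferred term and deprecated "automotive catalyst"; the source does not say which of the two is preferred. The names `PREFERRED`, `build_replacements` and `standardise` are hypothetical.

```python
# Hypothetical standardisation decision: one preferred term per concept,
# each listed with the synonyms it is meant to replace.
PREFERRED = {
    "catalytic converter": ["automotive catalyst"],
}


def build_replacements(preferred):
    """Invert the table so that each deprecated synonym points
    to its preferred term."""
    return {
        synonym: term
        for term, synonyms in preferred.items()
        for synonym in synonyms
    }


def standardise(text, preferred):
    """Replace deprecated synonyms in `text` with the preferred term.

    Matching is plain, case-sensitive substring replacement; a real
    tool would also need to handle inflection and word boundaries.
    """
    replacements = build_replacements(preferred)
    # Longest synonyms first, so a short synonym never rewrites
    # part of a longer one.
    for synonym in sorted(replacements, key=len, reverse=True):
        text = text.replace(synonym, replacements[synonym])
    return text
```

For instance, `standardise("Check the automotive catalyst.", PREFERRED)` returns `"Check the catalytic converter."`.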
Lexicographers have at their disposal a number of "style labels" which aim to distinguish between, for instance, informal, slang, or vulgar expressions, archaisms, and so on. Terminologists also need to distinguish between different communicative situations, although in a rather different way. While traditional terminology work is concerned mainly with the terms which characterise communication between subject experts, a broader view also incorporates less abstract levels of communication, e.g. between technicians, or between expert and layperson (such as doctor-patient; lawyer-client). In high-quality terminography, such variants must also be labelled or assigned to a particular source in order to identify the appropriate communicative context for their use.
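Labelling variants by communicative context might look as follows in Python. The sketch is illustrative only: the class `TermVariant`, the context labels, the two medical terms and their sources are assumptions chosen to mirror the doctor-patient case mentioned above, not data from the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermVariant:
    """A term form together with the situation in which it is appropriate."""
    form: str
    context: str   # communicative situation (hypothetical label set)
    source: str    # where the usage was attested (hypothetical)


VARIANTS = [
    TermVariant("myocardial infarction", "expert-expert",
                "hypothetical journal article"),
    TermVariant("heart attack", "expert-layperson",
                "hypothetical patient leaflet"),
]


def variants_for(context, variants):
    """Return the term forms labelled as suitable for `context`."""
    return [v.form for v in variants if v.context == context]
```

A translator or technical writer could then query the collection by situation, e.g. `variants_for("expert-layperson", VARIANTS)`, instead of choosing among undifferentiated synonyms.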
The following table summarises the above comparison:
                                Lexicography                            Terminography
Variety of language:
  general language              YES                                     NO
  special language              (YES as special-purpose lexicography)   YES
Subject matter:
  broad areas of knowledge      YES                                     NO
  delimited domain              (RARE)                                  YES
  use of classification system  NO                                      YES
Method of working:
  word-based                    YES                                     NO
  concept-based                 (RARE EXAMPLES ONLY)                    YES
Presentation to user:
  alphabetical                  YES                                     (YES if reorganised)
  thesaurus-type structure      (RARE)                                  YES
Headword/entry term:
  closed class                  YES                                     NO
  open class                    YES                                     YES
Presentation of entries:
  polysemes/homonyms            PRESENTED TOGETHER                      PRESENTED SEPARATELY
  synonyms                      PRESENTED SEPARATELY                    PRESENTED TOGETHER in same entry
Orientation:
  Prescriptive                  NO                                      YES (largely depending on domain)
  Descriptive                   YES                                     YES
Table 1 : Comparison between Lexicography and Terminography
A large majority of documents today are designed for specialist communication (including business and commercial texts). They are thus written in specialist language, 30-80% of which (depending on the particular domain and type of text in question) is composed of terminology(2). In other words, terminology (which as we have seen may also include non-linguistic items such as formulae, codes, symbols and graphics) is the main vehicle by which facts, opinions and other "higher" units of knowledge are represented and conveyed. Sound terminology work reduces ambiguity and increases clarity - in other words, the quality of specialist communication depends to a large extent on the quality of the terminology employed, and terminology can thus be a safety factor, a quality factor and a productivity factor in its own right.
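A figure such as "30-80% of a text is terminology" presupposes some way of measuring it. The following Python sketch shows one possible measure, the share of word tokens that belong to a known term. It is not the method behind the figures quoted above; the function name `term_density`, the crude ASCII tokenisation and the sample sentence are assumptions made for illustration.

```python
import re


def _tokenise(text):
    """Lower-case the text and split it into simple ASCII word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_density(text, terms):
    """Return the share of word tokens in `text` that belong to a term.

    Every occurrence counts, so a term repeated many times raises the
    density each time it appears.
    """
    tokens = _tokenise(text)
    # Try longer terms first so that multi-word terms win over their parts.
    candidates = sorted((_tokenise(t) for t in terms), key=len, reverse=True)
    covered = 0
    i = 0
    while i < len(tokens):
        for candidate in candidates:
            n = len(candidate)
            if n and tokens[i:i + n] == candidate:
                covered += n
                i += n
                break
        else:
            i += 1  # no term starts at this token
    return covered / len(tokens) if tokens else 0.0


sample = "The catalytic converter reduces emission levels."
density = term_density(sample, ["catalytic converter", "emission"])  # 0.5
```

Here three of the six tokens ("catalytic", "converter", "emission") belong to terms, giving a density of 0.5.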
The communication of specialist knowledge and information, whether monolingual or multilingual, is thus irretrievably bound up with the creation and dissemination of terminological resources and with terminology management in the widest sense of the word. This process is not restricted to science and engineering, but is also vital to law, public administration, and health care, to quote just three examples. In addition, terminology plays a key role in the production and dissemination of documents, and in workflow. Terminology as an academic discipline offers concepts and methodologies for high-quality, effective knowledge representation and transfer. These methodologies can be used both by language specialists and by domain specialists after appropriate training. In addition, they form the basis for an increasing number of tools for the identification, extraction, ordering, transfer, storage and maintenance of terminological resources and other types of knowledge.
Terminological resources are also valuable in many other ways: as collections of names or other representations, as the object of standardisation and harmonisation activities, and as the input (or output) of a wide range of applications and disciplines, whether human or machine-based (see the Figure below). The range of applications to which terminology is of direct relevance was a primary motivating factor at the inception of the POINTER Project with its brief to analyse the situation of terminology in Europe, and to make concrete suggestions for a future infrastructure and activities.
Figure 1 : Terminology Applications and Products
This wide range of applications and products is all the more important given the current technological and political developments in Europe. The last few decades have been characterised by the exponential spread and implementation of the concept of "globalisation". Although international activities and multinational trade existed well before this period, a new quality has recently emerged. Not only are raw materials sourced, and products sold, on a supranational scale; products are now increasingly developed, manufactured, marketed and sold for a global audience. Global competition and global co-operation - both of which presuppose global communication - are now common concepts. In the cultural arena, too, we can trace the development of what is often called the "global village", with greatly increased social and cultural contact, both active and passive(3).
At the same time, rapid technological development in general, and the rise of whole new fields and industries in particular, has led to shorter and shorter innovation cycles and to an exponential growth in knowledge and the need for its rapid and effective communication. Thus the total amount of specialist knowledge is currently thought to be doubling every five to fifteen years, depending on the area concerned(4).
This explosion in communication has been facilitated and driven by the computing and telecommunications revolutions, which have provided cheap processing power and new technologies for document processing. Vast databases can now be processed efficiently, and their contents transported effortlessly across national and geographical boundaries. Information is now commonly regarded as a fourth production factor alongside property, labour and capital. The number of intangible products is increasing rapidly, in contrast to the number of tangible ones. The practical effects of this can be seen, among other things, in the vast increase in the creation, capture, processing, storage, archiving, retrieval and subsequent evaluation of documents. For example, the Danzin Report [Danz 92] estimated that the European economies (calculated before the latest enlargement of the European Union in January 1995) would spend 650 million ECU on this in 1994. Equally, the number of major different subject fields (or "domains") for which terminology exists is estimated at several hundred or many thousand, depending on the degree of detail of the classification system used(5). In turn, each of these domains contains between several hundred and over ten million (e.g. chemistry) terms, again depending on the granularity of the system. The number of terms in each of the highly developed languages is commonly estimated at 50 million, excluding product names, which account for roughly another 100 million terms.
A point to be remembered here is that specialist (and indeed general) communication is normally an iterative and multilinear process, since knowledge is generally created in an evolutionary process and in several different places at once. Thus potential sources of uncertainty and misunderstanding arise in the form of homonyms (i.e. words that are used to denote more than one concept) and synonyms (i.e. more than one word for the same concept). This problem is becoming particularly acute with the strong tendency to interdisciplinarity in important modern scientific disciplines such as biotechnology, environmental science and materials science (it is a paradox that in this age of increasing specialisation science is becoming more and more interdisciplinary). At the same time, the risks involved in failing to communicate unambiguously and in a timely manner have often increased dramatically (two classic examples of this are the aerospace and environmental industries).
For all these reasons, contents-based information management is a prerequisite for improving the efficiency of communication. In addition, it should be borne in mind that communication is not solely monolingual, especially not within Europe. In fact, there is a clear trend at the moment towards an increased awareness of multilingual issues, despite the predominance or at least lead function of English in the technical, business, economic, political and - to a lesser extent - cultural fields.
One factor influencing this trend is the concern of a number of national and regional governments to ensure the long-term viability of their official languages in the face of competition from English and to ensure equal access for all citizens and social and economic groupings to new ideas and other information. Other significant factors are product liability and similar consumer protection legislation, as well as a more general wish among enterprises in particular to increase efficiency by improving internal and external communication and information flows. In addition, consumer goods manufacturers in particular are discovering the competitive advantage which products can achieve (especially in saturated or highly competitive markets) when localised into the languages spoken by their target groups.
The importance of these developments for a multilingual political federation such as Europe with its eleven official working languages and countless lesser-used ones(6) cannot be overemphasised. In fact, the European Commission sees itself as living in what it calls the Multilingual Information Society(7). Europe's dual position as a world player (and the original home of three world languages) and a multilingual collection of states means that effective multilingual communication on a vast scale is a prerequisite for both internal and external success. To quote only one statistic: the European Commission alone already has more than one million pages of text translated per year. Add to this the appropriate national figures for both the private and public sectors, and it soon becomes apparent that multilingual communication is already big business(8). However, it is equally clear that new, automatic methods and tools for multilingual information management (i.e. ones that go beyond current language-neutral ideas such as workflow, imaging and electronic document management) are urgently required if communication across linguistic, sector, regional and domain boundaries is to be optimised.
Since a great deal of this - specialist - communication relies on the vocabulary of a vast number of subject fields to convey its content, readily-accessible, up-to-date terminology will play an increasingly important role in (multilingual) information management in the 21st century.
Figure 2 : Terminology : A Key Discipline for the Information Society
2. Thus, for example, patent applications and technical standards have an extremely high percentage of terms (even though the same term may be repeated many times), while general business correspondence will have a lower one.
3. For a discussion of the subject, see [Wal 95].
7. e.g. in the announcement of its Multilingual Information Society (MLIS) Programme on 8 November 1995. [MLIS 95]