BIK Terminology—

Solving the terminology puzzle, one posting at a time

  • Author

    Barbara Inge Karsch - Terminology Consulting and Training

  • Images

    Bear cub by Reiner Karsch


Posts Tagged ‘TKE’

Terminology and Knowledge Engineering Conference 2012

Posted by Barbara Inge Karsch on November 16, 2011

There are not too many conferences that carry terminology in their title and that offer continuing education and fruitful exchanges to terminologists. The TKE conference is one of them. Here is the announcement of the TKE Conference 2012.

New frontiers in the constructive symbiosis of terminology and knowledge engineering

TKE (Terminology and Knowledge Engineering) Conference

In 2012, the conference will take place in Madrid, Spain, from June 19 through 22. This conference will mainly focus on those theoretical, methodological and practical aspects that show the symbiosis of terminology and knowledge engineering by highlighting the recent advances in these related fields. The first call for papers can be accessed here.

 


Posted in Events | 1 Comment »

How many terms do we need to document?

Posted by Barbara Inge Karsch on December 17, 2010

Each time a new project is kicked off, this question is on the table. Content publishers ask how much they are expected to document. Localizers ask how many new terms will be used.

Who knows these things when each project is different, deadlines and scopes change, and everyone understands “new term” to mean something else? And yet, all that is needed is agreement on a ballpark volume and schedule. With a bit of experience and a look at some key criteria, expectations can be set for the project team.

In a Canadian study, shared by Kara Warburton at TKE in Dublin, the authors found that texts consist of 2-4% terms. If you take a project of 500,000 words, that would be roughly 15,000 terms (at about 3%, the middle of that range). In contrast, product glossaries prepared for end-customers in print or online contain 20 to 100 terms. So, the discrepancy between what could be defined and what is generally defined for end-customers is large.
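The arithmetic behind that ballpark can be sketched in a few lines of Python. This is only an illustration: the function name and its parameters are made up for this example, and the 2-4% range is simply the figure quoted from the study above.

```python
def estimate_term_count(word_count, low_pct=2, high_pct=4):
    """Return (low, midpoint, high) estimates of the number of terms in a text.

    The default 2-4% range is the figure quoted in the post; the function
    itself is a hypothetical helper, not part of any terminology tool.
    """
    low = word_count * low_pct // 100
    high = word_count * high_pct // 100
    midpoint = (low + high) // 2
    return low, midpoint, high


# A project of 500,000 words, as in the example above
low, midpoint, high = estimate_term_count(500_000)
print(f"Between {low} and {high} terms, roughly {midpoint}")
```

Compared with the 20 to 100 terms of a typical product glossary, even the low end of that estimate is larger by two orders of magnitude.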

A product glossary is a great start. Sometimes, even that is not available. And, yet, I hear from at least one customer that he goes to the glossary first and then navigates the documentation. OK, that customer is my father. But juxtapose that with the remark by a translator at a panel discussion at the ATA about a recent translation project (“aha, the quality of writing tells me that this falls in the category of ‘nobody will read it anyway’”), and I am glad that someone is getting value out of documentation.

In my experience, content publishing teams are staffed and ready to define about 20% of what localizers need. Ideally, 80% of new terms are documented systematically in the centralized terminology database upfront and the other 20% of terms submitted later, on an as-needed basis. Incidentally, I define “new terms” as terms that have not been documented in the terminology database. Anything that is part of a source text of a previous version or that is part of translation memories cannot be considered managed terminology.

Here are a few key criteria that help determine the number of terms to document in a terminology database:

  • Size of the project: small, medium, large, extra-large…?
  • Timeline: Are there five days or five months to the deadline?
  • Version: Is this version 1 or 2 of the product? Or is it 6 or 7?
  • Number of terms existing in the database already: Is this the first time that terminology has been documented?
  • Headcount: How many people will be documenting terms and how much time can they devote?
  • Level of complexity: Are there more new features? Is the SME content higher than normal?

These criteria can serve as guidelines, so that a project team knows whether it is aiming at documenting 50 or 500 terms upfront. If memory serves me right, we added about 2700 terms to the database for Windows Vista. 75% was documented upfront. It might be worthwhile to keep track of historic data. That enables planning for the next project. Of course, upfront documentation of terms takes planning. But answering questions later is much more time-consuming, expensive and resource-intensive. Hats off to companies, such as SAP, where the localization department has the power to stop a project when not enough terms were defined upfront!
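As a small illustration of what keeping historic data could look like, the sketch below splits a project's term count into terms documented upfront and terms submitted later. The function name is invented for this example, and the Windows Vista figures are the approximate ones recalled above, not exact records.

```python
def split_upfront(total_terms, upfront_pct):
    """Split a term count into (documented upfront, submitted later).

    Hypothetical helper for illustration; upfront_pct is a whole-number
    percentage such as 75 or 80.
    """
    upfront = total_terms * upfront_pct // 100
    return upfront, total_terms - upfront


# Approximate Windows Vista figures from the post: about 2700 terms, 75% upfront
upfront, later = split_upfront(2700, 75)
print(f"{upfront} terms upfront, {later} terms later")
```

Running the same split with the 80% ideal mentioned earlier gives a target to compare the next project against.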

Posted in Content publisher, Selecting terms, Translator | Leave a Comment »

TKE 2010—A Short Report

Posted by Barbara Inge Karsch on September 2, 2010

TKE (International Conference of Terminology and Knowledge Engineering) was recently held in Dublin. The title this year was “Presenting Terminology and Knowledge Engineering Resources Online: Models and Challenges”. Here are my thoughts on three presentations on large database projects and one workshop.

One of the invited talks was given by Michal Boleslav Měchura and Brian Ó Raghallaigh, who are the technical brains behind the Irish National Terminology Database that serves a stunning 600,000 users. Much like the Rikstermbanken of the Swedish Center for Terminology discussed in Quantity matters, this project makes a (corporate) terminologist’s mouth water for its funding. According to the project website, there are no fewer than 18 people on the project team. Michal shared how the team is using statistics and user feedback to improve the search capabilities, the user interface, and the data presentation.

BACUS (Base de Coneixement Universitari) is a terminology database created at the Universitat Autònoma de Barcelona by students as part of their course work. Students work with subject matter experts to create entries in at least three languages. Two of them must be languages taught at the Faculty of Translation and Interpreting: Catalan, Spanish, English, French, German, Portuguese, Italian, Russian, Arabic, Chinese, or Japanese. The third may be a language not taught at the Universitat, such as Basque, Bulgarian, Danish, Slovak, Galician, Greek, Dutch, Norwegian, Latin, Pulaar and Swedish. In their paper, Aguilar-Amat, Mesa-Lao, and Pahisa-Solé describe in detail the high-quality approach that students are taking to arrive at their entries. For example, “all linguistic data included in the BACUS project are obtained from corpora of original texts in different languages on the same specialized subject.” The work on the database has been discontinued, but it is well worth a look.

Such a high-quality approach cannot be expected for entries from a federated term bank, such as EuroTermBank. This project, developed and managed by Tilde, is probably not new to you. Andrejs Vasiljevs presented the results of a survey of different groups of potential system users. In his paper, Andrejs discusses the need to open up term banks to user participation.

At J.D. Edwards, user participation in the form of entry requests and comments was implemented in a format that allowed for prescriptive terminology management, as is necessary in the corporate environment. There is no reason, though, that federated term banks should not adopt Wikipedia-style knowledge sharing, approval mechanisms known from commercial sites, and the like. Once sharing, voting or commenting mechanisms are implemented, the key might be to get as many experts to use the database as possible, so that unreliable data is found and eliminated quickly. It would be interesting to discuss entry reliability with regard to these projects and the ones mentioned in Quantity matters.

The main workshop I would like to mention is, of course, the discussion of standard ISO 704. Thank you for participating through the survey and comments in Who cares about ISO 704, which I mentioned in my presentation. During the workshop, we agreed to suggest to the respective workgroup in ISO TC 37 to streamline the current content, review the example used, and add parts geared towards the different user groups. I very much enjoyed the work in this group and feel that it will lead to a better standard down the road.

The TKE organizing committee decided to expand membership of GTW (association of knowledge transfer), the organization behind TKE. A new subgroup of the Terminology group on LinkedIn is being formed specifically for that purpose. If you are interested, join the group called Association for Terminology and Knowledge Transfer; just allow a bit of time for approval.

To conclude my little TKE report: It was a particular pleasure to witness Gerhard Budin bestow the Eugen Wüster Prize upon Sue Ellen Wright and Klaus-Dirk Schmitz from Kent State University and Cologne University of Applied Sciences, respectively. It couldn’t have gone to two more deserving individuals.

Posted in BACUS, EuroTermBank, Events, Irish National Terminology Database, J.D. Edwards TDB, Rikstermbanken, Terminology portals | 1 Comment »

Who cares about ISO 704?

Posted by Barbara Inge Karsch on July 29, 2010

The next standard to talk about is ISO 704 “Terminology work—Principles and methods.” It is an interesting one for a variety of reasons. For one, I have more questions than answers.

At the TKE (Terminology and Knowledge Engineering) Conference in Dublin, my esteemed colleagues, Hanne Erdman Thomsen, Sue Ellen Wright, Gerhard Budin and Loïc Depecker will devote a workshop to ‘Accommodating User Needs for ISO 704: Towards a New Revision of the Core International Standard on Terminology Work’. I will have a short time slot to provide input myself and therefore have been re-reviewing ISO 704 over the last few days.

As I am putting my thoughts together, I have been wondering: Who knows or uses ISO 704? I would like to invite you to do two things: Click on the little survey below in this posting. And, if you haven’t done so, please tell me about yourself by participating in the survey on the Survey tab. Both surveys are anonymous and might help me understand what this standard could do. If you know the standard and have something to share about it, please leave a comment below. I would be very grateful to get your input.

Because, quite frankly, I am puzzling over this standard. I have read it three times over the past year, and every time a few weeks go by, I have to think about what this standard is actually for. I believe this stems from the fact that it is a bit wordy at the moment. It contains a lot of good information, but the presentation is ineffective.

But now, what can it do for the reader? As its title says, it lays out the various principles underlying terminology management. For example, it tells us what objects, concepts, concept relations and concept systems are. It then goes into definitions and definition writing, before the subject of designations is discussed. Remember this little graphic from What is a Term? As an aside, we talk about terms many times when we actually mean designations; in German, we even find the ugly Anglicism Term and its plural Terme.

So, ISO 704 really does do what the title says: it presents us with principles and methods. It just doesn’t seem to stick with me. Yet.

Posted in Events, Terminologist, Terminology methods, Terminology principles | 4 Comments »