Merkel, M. (1992). Recurrent Patterns in Technical Documentation. Technical Report LiTH-IDA-R-92-31, Department of Computer and Information Science, Linköping University, Sweden.
Abstract: This paper addresses some of the problems involved in the production and translation of technical documentation. The techniques and methods developed within Natural Language Processing in general and Machine Translation in particular still have a long way to go before we can see any commercial products general enough to translate unrestricted text automatically. Instead of merely aiming for the perfect MT system, we should also focus on how to make use of existing, simple techniques and the capacity of today's hardware to make the production of technical documentation faster, better and cheaper. Even a twenty per cent gain in efficiency compared to manual translation is considerable by any industry standard. In this paper I describe a tool that pre-processes the source text and provides various kinds of information that support the decision on whether translation tools should be applied at all. Examples from analyses show that up to 43 per cent of a text could be repetitious and that this should be utilised before the translator starts translating. If we consider both repetitions within one document and repeated patterns across documents, there is evidence in the corpus that 55 per cent of the text in one document can be regarded as recurring. The tool has been run on several real handbook texts from major computer software companies and a summary of the results is presented.
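To make the idea of a repetition share concrete, here is a minimal sketch of how such a figure might be estimated. It is not the tool described in the report: the segmentation rule, the normalisation, and the function names `split_segments` and `repetition_share` are all assumptions made for illustration, and the report may define repetition differently (for example over phrases rather than whole sentences).

```python
import re
from collections import Counter


def split_segments(text):
    """Split text into rough sentence-like segments (assumed rule:
    break after . ! ? or at line breaks), lower-cased with
    whitespace collapsed so identical sentences compare equal."""
    parts = re.split(r"(?<=[.!?])\s+|\n+", text)
    return [" ".join(p.lower().split()) for p in parts if p.strip()]


def repetition_share(text):
    """Return the fraction of words that belong to segments
    occurring more than once in the text (all occurrences counted)."""
    segments = split_segments(text)
    counts = Counter(segments)
    total_words = sum(len(s.split()) for s in segments)
    if total_words == 0:
        return 0.0
    repeated_words = sum(len(s.split()) for s in segments if counts[s] > 1)
    return repeated_words / total_words


if __name__ == "__main__":
    # Hypothetical handbook-style sample, not taken from the report's corpus.
    sample = "Press Enter. Select the file. Press Enter. Close the window."
    print(f"{repetition_share(sample):.0%} of the words are in repeated segments")
```

Counting whole normalised sentences keeps the sketch short; extending it to repeated patterns across documents would mean building the counter over several texts before scoring any one of them.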