Over the past few years, digitization has become essential, and it is creating a
growing need for computer systems that manage texts electronically.
The DIGISCRIB Society, which specializes in the digitization of books and documents and is
encouraged by its partners the Centre d'Etudes Supérieures de la Renaissance (CESR) and its
Virtual Humanistic Libraries (BVH) team and RE-TranscriPro, has invested in research on computerized solutions for encoding,
analysis, management and manipulation of texts and documents after OCR processing or after
This research goes hand in hand with the DIGISCRIB Society's work on Linux-based
OCR tools and image management programs such as Tesseract and ImageMagick.
Building on the XML/TEI encoding method, on the possibilities it offers and on the way it
answers a growing demand, the DIGISCRIB Society has devoted itself to developing
text-encoding business software(1).
« EditTEI » is the name of this new text encoder. It is written in Java, which makes it compatible
with several platforms (Linux, Windows, Mac, etc.). It is fully trilingual (French, English and
This first complete version, « EditTEI 1.6.5 », offers text-editing functionality: a layout for
interactive tagging that requires neither knowing nor typing the XML/TEI tags. Tagging can start
from a data header(2), from an existing XML/TEI file(3), or simply from a new file(4).
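
To illustrate what interactive tagging saves the user from typing, here is a minimal sketch in Python. It is not EditTEI's implementation (EditTEI is written in Java and its internals are not described here); the function name `tag_selection` and the sample sentence are hypothetical. It only shows the idea of wrapping a selected span of text in an XML/TEI element, here the TEI element `persName`, while escaping characters that are special in XML.

```python
from xml.sax.saxutils import escape


def tag_selection(text: str, start: int, end: int, element: str) -> str:
    """Wrap text[start:end] in <element>...</element> (hypothetical helper).

    The surrounding text and the selection are XML-escaped so that
    characters such as '&' or '<' cannot break the markup.
    """
    if not (0 <= start <= end <= len(text)):
        raise ValueError("selection out of range")
    before, selected, after = text[:start], text[start:end], text[end:]
    return f"{escape(before)}<{element}>{escape(selected)}</{element}>{escape(after)}"


# Assumed sample line; the user "selects" the first 8 characters.
line = "Rabelais was born near Chinon."
tagged = tag_selection(line, 0, 8, "persName")
print(tagged)
```

In an editor, `start` and `end` would come from the user's selection and `element` from a menu or button, so the tag name is never typed by hand.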
This encoder offers commonly used text-editing tools, such as opening, saving and printing a file,
copying, cutting and pasting text, inserting or deleting pages, and inserting special characters.
In addition to the basic text-editing tools, the business software « EditTEI » allows XML/TEI tags
to be added and existing ones to be deleted, and characters to be encoded in ASCII(5) and UTF-8(6),
among others. Subject to permission, it may also offer the use of an online correcting dictionary,
the « un-tilde-ing » of texts (expanding words that are abbreviated with a tilde) and the
concealment of abbreviations on demand.
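
As a rough illustration of « un-tilde-ing », the Python sketch below expands a tilde placed over a letter into a following nasal consonant. It rests on a simplifying assumption that is not taken from EditTEI: a tilde stands for an omitted "n", or for "m" when the next letter is b, p or m. Real expansions depend on the language and the context, and a letter such as Spanish "ñ" would be wrongly expanded by this rule. The function name `expand_tildes` is hypothetical.

```python
import unicodedata

COMBINING_TILDE = "\u0303"


def expand_tildes(text: str) -> str:
    """Replace each tilde mark with 'n' or 'm' (simplified, assumed rule)."""
    # NFD splits a precomposed letter such as 'õ' into 'o' + combining tilde.
    chars = list(unicodedata.normalize("NFD", text))
    out = []
    for i, ch in enumerate(chars):
        if ch == COMBINING_TILDE:
            nxt = chars[i + 1] if i + 1 < len(chars) else ""
            # Assumption: 'm' before b, p or m; 'n' everywhere else.
            out.append("m" if nxt and nxt in "bpm" else "n")
        else:
            out.append(ch)
    # NFC recomposes any other accents that NFD took apart.
    return unicodedata.normalize("NFC", "".join(out))


# Abbreviated forms written with escapes so the file encoding does not matter.
samples = ["c\u00f5me", "gr\u00e3d", "t\u1ebdps"]
expanded = [expand_tildes(word) for word in samples]
print(expanded)
```

A production tool would more likely record both the abbreviated and the expanded form in the encoded text, so that abbreviations can be shown or hidden on demand rather than overwritten.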