working documents of the Conseil d'Etat
Printed documents of the Conseil d'Etat: Gérando, the digitised collection
Presentation of the digitised collection
The specifications of the digitised collection
Reason for choosing the Gérando collection
The most complete unbound
Physical description of the collection of originals
The Gérando collection of the Conseil d'État
was chosen for practical reasons: it consists of unbound fascicles
measuring approximately 25 x 21 cm, with a broad margin at both the outer
edge and the bottom of the page, leaving a text area of only about 18 x
10 cm. Although some of the printed documents are in a larger format of
approximately 32 x 21 cm, the text usually has the same layout as in the
smaller format. The fascicles, always made up of bifolios, vary a great
deal in bulk, from a single printed sheet (4 pages in total) to several
hundred pages, and the pagination is in most cases printed. The most
voluminous fascicles have sewn sections; in the others, the bifolios are
simply folded one inside the other.
Some very large format tables (A2 or A3) were folded inside a few of
The technical options for digitisation and giving access to the documents
The digitisation was performed by an external
service provider. The Gérando collection was taken to the premises of the
Berger-Levrault company in Nancy to be scanned. The documents were scanned
manually, page by page, in A3 format, in black and white (two-tone) at a
resolution of 300 dpi. The double-page images thus obtained were then split
into single pages and stored in TIFF format with Group IV compression.
The page images of each document were then combined into a multipage PDF
file, named after the barcode affixed by the library to the first page of
the original document.
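To give an idea of the image sizes involved, the following is a minimal sketch of the geometry of such a scan. It assumes an A3 sheet in landscape orientation (420 x 297 mm) and a gutter lying exactly at the horizontal midpoint of the image; both are assumptions made for illustration, and the function names are hypothetical, not those of the service provider's software.

```python
MM_PER_INCH = 25.4

# A crop box in the usual (left, upper, right, lower) pixel convention.
Box = tuple[int, int, int, int]


def mm_to_pixels(mm: float, dpi: int = 300) -> int:
    """Convert a length in millimetres to whole pixels at the given resolution."""
    return round(mm / MM_PER_INCH * dpi)


def split_boxes(width: int, height: int) -> tuple[Box, Box]:
    """Return crop boxes for the left and right pages of a double-page scan.

    Assumption: the gutter lies at the horizontal midpoint of the image.
    """
    mid = width // 2
    return (0, 0, mid, height), (mid, 0, width, height)


# Assumed scan area: A3 landscape, 420 x 297 mm, at 300 dpi.
scan_width = mm_to_pixels(420)
scan_height = mm_to_pixels(297)
left_box, right_box = split_boxes(scan_width, scan_height)
```

In practice the gutter of a hand-placed document rarely falls exactly in the middle of the scan, so a real workflow would probably detect or adjust the split position for each image.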
These image files were then sent to a workshop in Madagascar, a French-speaking country, and used as the source for keying the text in manually (1), using Unicode. Quality control guaranteed 99.99% accuracy, i.e., at most one error per 10,000 characters. Typographical errors, common in these documents, which were often put together hurriedly, have been retained. Forty-five documents containing complex tables, for which HTML entry proved particularly problematic, were not fully keyed in. These documents are available for consultation in image form only.
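The accuracy figure can be read as an error budget. The sketch below, whose function names are hypothetical, works in whole characters rather than percentages, so that the limit of one error per 10,000 characters is not distorted by floating-point rounding.

```python
def max_errors(char_count: int, chars_per_error: int = 10_000) -> int:
    """Largest number of keying errors allowed in a text of char_count characters."""
    return char_count // chars_per_error


def meets_target(char_count: int, errors: int, chars_per_error: int = 10_000) -> bool:
    """True if the error rate is no worse than one error per chars_per_error characters."""
    return errors * chars_per_error <= char_count


# Hypothetical document of 250,000 characters.
budget = max_errors(250_000)
```

For a hypothetical document of 250,000 characters, the budget is 25 errors; a twenty-sixth error would put the text below the 99.99% threshold.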
The napoleonica.org site gives online access to the digitised documents in text mode for two reasons: (1) ease of navigation, since text files are small, and (2) greater search efficiency, since the text can be indexed. Complex tables and documents containing handwritten annotations have been made available in image form and can be viewed using the free Acrobat Reader software.
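To illustrate why keyed text lends itself to indexing, here is a minimal sketch of an inverted index, which maps each word to the documents that contain it. It is not the search engine used by the site; the document identifiers and sample texts are invented for the example.

```python
import re
from collections import defaultdict


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each lower-cased word to the set of document ids containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the ids of the documents containing every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample texts, keyed by invented identifiers.
documents = {
    "doc-001": "Projet de loi sur les douanes",
    "doc-002": "Rapport sur le projet de décret",
}
index = build_index(documents)
```

A query such as `search(index, "projet loi")` then only has to intersect two small sets of identifiers, whereas page images would first have to be converted to text before they could be searched at all.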
Characteristics of the digitised collection
(1) The use of OCR (Optical Character Recognition) was rejected after tests showed that the quality of the texts thus produced was very poor as a result of the old typeface, the damp patches on some of the documents and the complex layout of certain pages.