How the Shoah Foundation Works

Cataloguing and Indexing

A Shoah staff person assists a researcher in retrieving information from the database
Photo courtesy Shoah Foundation

To catalogue a particular testimony, a staff member enters brief biographical information about the survivor or witness. The testimony is then indexed using specific keywords selected from the Shoah Foundation's 30,000-word, controlled-vocabulary, English-language thesaurus. The thesaurus was also created in-house and has developed over time as indexers watch actual testimony. Because the keywords come from the testimony itself, the thesaurus continues to expand as more testimony is indexed. Index terms are mainly geographic place names, such as the names of cities, villages and other locations, but they also cover experiential content, such as "sense of time in the camps."
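To make the idea of a controlled vocabulary concrete, here is a minimal Python sketch. It is not the Shoah Foundation's software. The class name `Thesaurus`, its methods and the normalization rule are assumptions made for illustration, and the place name "Warsaw" is a hypothetical entry. The sketch shows only the two behaviors described above: indexers may use approved terms only, and the list of approved terms grows over time.

```python
class Thesaurus:
    """Toy controlled vocabulary (hypothetical; not the Foundation's system)."""

    def __init__(self, terms=()):
        # Store terms in a normalized form so lookups ignore case and spacing.
        self._terms = {self._normalize(t) for t in terms}

    @staticmethod
    def _normalize(term):
        # Assumed rule: lowercase and collapse runs of whitespace.
        return " ".join(term.lower().split())

    def add(self, term):
        """Approve a new term, as happens when new testimony is indexed."""
        self._terms.add(self._normalize(term))

    def __contains__(self, term):
        """Support `term in thesaurus` checks before a term is used."""
        return self._normalize(term) in self._terms

    def __len__(self):
        return len(self._terms)


# Illustrative usage with one hypothetical place name and the experiential
# term quoted in the article.
thesaurus = Thesaurus(["Warsaw", "sense of time in the camps"])
```

An indexer's tool built on this sketch would check `term in thesaurus` before accepting a keyword, and call `add` only when a new term is formally approved.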

Because the thesaurus is in English, all current indexing is done in English. Testimony given in other languages is handled by bilingual indexers.

At first, each video testimony was indexed in three- to five-minute segments. Much of the indexing time, however, was lost deciding where one segment ended and the next began: something like 75 percent of it was spent rewinding and fast-forwarding tape. Now the testimonies are broken down into one-minute segments.

Each video has a running time code, so each one-minute segment is identified by a particular time code. The indexer attaches index terms to that time code. Depending on what is mentioned in the segment, more than one index term can be associated with it. The cataloguing software is designed so that the indexer simply selects appropriate terms from a pull-down menu and drags them into another window, which automatically links each keyword to the time code.
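The link between time codes and index terms can be pictured as a simple lookup table. The Python sketch below is an illustration under stated assumptions, not the Foundation's cataloguing software. The names `SegmentIndex`, `segment_start` and `format_time_code` are hypothetical, as is the HH:MM:SS display format. Each moment in a video is rounded down to the start of its one-minute segment, and any number of terms can be attached to that segment.

```python
from collections import defaultdict

SEGMENT_SECONDS = 60  # one-minute segments, as described in the article


def segment_start(seconds):
    """Return the start, in seconds, of the one-minute segment containing `seconds`."""
    return (seconds // SEGMENT_SECONDS) * SEGMENT_SECONDS


def format_time_code(seconds):
    """Format a second count as HH:MM:SS (an assumed display format)."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


class SegmentIndex:
    """Hypothetical index for one testimony: segment start -> set of terms."""

    def __init__(self):
        self._terms_by_segment = defaultdict(set)

    def attach(self, seconds, term):
        """Link a term to the segment containing the given moment."""
        self._terms_by_segment[segment_start(seconds)].add(term)

    def terms_at(self, seconds):
        """Return, in sorted order, all terms attached to the segment at this moment."""
        return sorted(self._terms_by_segment.get(segment_start(seconds), set()))

    def segments_for(self, term):
        """Return the time codes of every segment tagged with the term."""
        return [
            format_time_code(start)
            for start in sorted(self._terms_by_segment)
            if term in self._terms_by_segment[start]
        ]
```

With an index like this, a researcher searching for a term gets back time codes and can jump straight to the matching minute of video instead of watching the whole testimony.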