Cheshire3 Tutorials - Configuring Databases

Introduction

Databases are primarily collections of Records and Indexes along with the associated metadata and objects required for processing the data.

Configuration is typically done in a single file, with all of the dependent components included within it and stored in a directory devoted to just that database. The file is normally called, simply, config.xml.

Example

An example Database configuration:

<config type="database" id="db_ead">
    <objectType>cheshire3.database.SimpleDatabase</objectType>
    <paths>
        <path type="defaultPath">dbs/ead</path>
        <path type="metadataPath">metadata.bdb</path>
        <path type="indexStoreList">eadIndexStore</path>
        <object type="recordStore" ref="eadRecordStore"/>
    </paths>
    <subConfigs>
    ...
    </subConfigs>
    <objects>
    ...
    </objects>
</config>

Explanation

Line 1 opens the configuration for a new object of type database, with the identifier db_ead. You should replace db_ead with the identifier you want for your Database.

Line 2 defines the <objectType> of the Database (which will normally be a class from the cheshire3.database module). There is currently only one recommended implementation, cheshire3.database.SimpleDatabase, so this line should be copied in verbatim, unless you have defined your own sub-class of cheshire3.baseObjects.Database (in which case you’re probably more advanced than the target audience for this tutorial!)

Lines 4 to 7 define three <path>s and one <object>. To explain each in turn:

defaultPath
the path to the directory where the database is being stored. It will be prepended to any further paths in the database or in any subsidiary object.
metadataPath
the path to a datastore in which the database will keep its metadata. This includes things like the number of records, the average size of the records and so forth. As it’s a file path, defaultPath is prepended to it, so it would end up being dbs/ead/metadata.bdb; in other words, in the same directory as the rest of the database files.
indexStoreList
a space-separated list of references to all IndexStores the Database will use. This is needed if we intend to index any Records later, as it tells the Database which IndexStores to register each Record in.

The <object> element refers to an object called eadRecordStore which is an instance of a RecordStore. This is important for future Workflows, so that the Database knows which RecordStore it should put Records into by default.
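
As a rough illustration of the points above, the following sketch shows a <paths> section for a Database that registers Records in two IndexStores. The second identifier, eadExtraIndexStore, is hypothetical and not part of the original example. The comments restate how each relative path would be resolved against defaultPath.

```xml
<paths>
    <!-- Base directory; prepended to the relative paths below -->
    <path type="defaultPath">dbs/ead</path>
    <!-- Resolves to dbs/ead/metadata.bdb -->
    <path type="metadataPath">metadata.bdb</path>
    <!-- Space-separated references; eadExtraIndexStore is a hypothetical second IndexStore -->
    <path type="indexStoreList">eadIndexStore eadExtraIndexStore</path>
    <!-- Default RecordStore that Records are put into -->
    <object type="recordStore" ref="eadRecordStore"/>
</paths>
```

Every identifier listed in indexStoreList would need its own configuration, for example inside <subConfigs>.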

Line 10 would be expanded to contain a series of <subConfig> elements, each of which is the configuration for a subsidiary object such as the RecordStore and the Indexes to store in the IndexStore, eadIndexStore.
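
A minimal sketch of what an expanded <subConfigs> section might look like is given below. Only the identifiers eadRecordStore and eadIndexStore come from the example above. The class names (cheshire3.recordStore.BdbRecordStore, cheshire3.indexStore.BdbIndexStore), the path types and the file names are assumptions made for illustration, so check them against the documentation for your Cheshire3 version.

```xml
<subConfigs>
    <!-- RecordStore referenced by the Database's recordStore object (class name assumed) -->
    <subConfig type="recordStore" id="eadRecordStore">
        <objectType>cheshire3.recordStore.BdbRecordStore</objectType>
        <paths>
            <!-- Assumed path type and file name; relative to the Database's defaultPath -->
            <path type="databasePath">recordStore.bdb</path>
        </paths>
    </subConfig>
    <!-- IndexStore named in indexStoreList (class name assumed) -->
    <subConfig type="indexStore" id="eadIndexStore">
        <objectType>cheshire3.indexStore.BdbIndexStore</objectType>
        <paths>
            <!-- Assumed sub-directory for index files -->
            <path type="defaultPath">indexes</path>
        </paths>
    </subConfig>
    <!-- Index configurations would follow here -->
</subConfigs>
```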

Line 13 could be expanded to contain a series of <path> elements, each of which has a reference to a Cheshire3 object that has been previously configured. These lines instruct the server to actually instantiate the object in memory. While this is not strictly necessary, it may occasionally be desirable; see <objects> for more information.
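
For illustration, an expanded <objects> section might look like the sketch below. The use of a ref attribute on <path> is an assumption here, mirroring the ref attribute on the <object> element in the example. The identifiers are the ones used earlier in this tutorial.

```xml
<objects>
    <!-- Instantiate the previously configured RecordStore in memory -->
    <path ref="eadRecordStore"/>
    <!-- Instantiate the previously configured IndexStore in memory -->
    <path ref="eadIndexStore"/>
</objects>
```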