Greenstone tutorial exercise

Prerequisite: A large collection of HTML files—Tudor
Devised for Greenstone version: 2.60
Modified for Greenstone version: 2.70w

Formatting the HTML collection—Tudor

  1. Open your tudor collection, go to the Design panel (by clicking on its tab), and select Format Features from the left-hand list. Leave the editing controls at their default values, so that Choose Feature remains blank and VList is selected as the Affected Component. The text in the HTML Format String box reads as follows:

    <td valign=top>[link][icon][/link]</td>
    <td valign=top>[ex.srclink]{Or}{[ex.thumbicon],[ex.srcicon]} [ex./srclink]</td>
    <td valign=top>[highlight]
    {Or}{[dls.Title],[dc.Title],[ex.Title],Untitled}
    [/highlight]{If}{[ex.Source],<br><i>([ex.Source])</i>}</td>

    This displays something that looks like this:

    A discussion of question five from Tudor Quiz: Henry VIII
    (quizstuff.html)

    for a particular document whose Title metadata is A discussion of question five from Tudor Quiz: Henry VIII and whose Source metadata is quizstuff.html.

    This format appears in the search results list, in the Titles A-Z list, and also when you get down to individual documents in the Titles A-Z hierarchy. This is Greenstone's default format statement.

Greenstone's default format statement is complex—even baroque—because it is designed to produce something reasonable under almost any conditions, and also because for practical reasons it needs to be backwards compatible with legacy collections.
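To see why the default statement copes with missing metadata, it can help to model its two constructs in ordinary code. The Python sketch below is an illustration only, not Greenstone's implementation: the function names first_available and default_title_cell are made up for this example, metadata is represented as a plain dictionary, and the [highlight] tags and the two icon cells are ignored.

```python
def first_available(metadata, names, default):
    """Model of {Or}{[a],[b],...,default}: return the first non-empty value."""
    for name in names:
        value = metadata.get(name, "")
        if value:
            return value
    return default


def default_title_cell(metadata):
    """Approximate the text produced by the third <td> of the default statement."""
    title = first_available(
        metadata, ["dls.Title", "dc.Title", "ex.Title"], "Untitled"
    )
    source = metadata.get("ex.Source", "")
    # Model of {If}{[ex.Source],<br><i>([ex.Source])</i>}:
    # the filename part is emitted only when ex.Source is set.
    suffix = f"<br><i>({source})</i>" if source else ""
    return title + suffix


doc = {
    "ex.Title": "A discussion of question five from Tudor Quiz: Henry VIII",
    "ex.Source": "quizstuff.html",
}
print(default_title_cell(doc))
print(default_title_cell({}))
```

The first call prints the title followed by the filename in italics. The second prints just Untitled, with no empty parentheses, because the {If} drops the filename part when ex.Source is missing.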

  1. Delete the contents of the HTML Format String box and replace it with this simpler version:

    <td>[link][icon][/link]</td>
    <td>[ex.Title]<br>
        <i>([ex.Source])</i>
    </td>

    Remember to click <Replace Format>.

    Preview the result (you don't need to build the collection, because changes to format statements take effect immediately). Look at some search results and at the Titles A-Z list. They are just the same as before! Under most circumstances this far simpler format statement is entirely equivalent to Greenstone's more complex default.

    But there's a problem. In the Subjects browser, a mysterious "()" appears beneath the subject name beside each bookshelf. What is printed for bookshelves is governed by the same format statement. Bookshelf nodes of the hierarchy do have Title metadata (their title is the metadata value associated with that bookshelf), but they do not have ex.Source metadata, so [ex.Source] comes out blank and only the empty parentheses remain.
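The stray "()" is easy to reproduce with a rough model of unconditional substitution. The Python sketch below is an assumption-laden illustration rather than Greenstone's code: the helper substitute is hypothetical, the [link][icon][/link] cell is left out, and the bookshelf name is simply one subject value taken from this collection.

```python
import re

# Second cell of the simpler format statement, written on one line.
SIMPLE_FORMAT = "<td>[ex.Title]<br><i>([ex.Source])</i></td>"


def substitute(format_string, metadata):
    """Replace each [name] with its metadata value, or with nothing if missing."""
    return re.sub(
        r"\[([^\[\]]+)\]",
        lambda match: metadata.get(match.group(1), ""),
        format_string,
    )


document = {
    "ex.Title": "A discussion of question five from Tudor Quiz: Henry VIII",
    "ex.Source": "quizstuff.html",
}
bookshelf = {"ex.Title": "Tudor period"}  # bookshelves have no ex.Source

print(substitute(SIMPLE_FORMAT, document))
print(substitute(SIMPLE_FORMAT, bookshelf))
```

For the bookshelf, the literal parentheses in the format string survive while [ex.Source] contributes nothing, which is exactly the "()" seen in the Subjects browser.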

  1. In the Format Features section of the Design panel, the Choose Feature menu (just above the Affected Component menu) is blank. That means the same format is used for the search results, the titles list, and all nodes in the subject hierarchy, including internal nodes (that is, bookshelves). The Choose Feature menu can be used to restrict a format statement to one specific list; when it is blank, the VList specification applies throughout. We will override this format statement for the hierarchical subject classifier. In the Choose Feature menu, scroll down to the item that says

    CL2: Hierarchy -metadata dc.Subject and Keywords

    and select it. This is the format statement that affects the second classifier (i.e., "CL2"), which is a Hierarchy classifier based on dc.Subject and Keywords metadata.

    Edit the HTML Format String box below to read

    <td>[link][icon][/link]</td>
    <td>[ex.Title]</td>

    and click <Add Format>.

  1. Now go to the Create panel and click <Preview Collection>. First, the offending "()" has disappeared from the bookshelves. Second, when you get down to a list of documents in the subject hierarchy, the filename does not appear beside the title, because ex.Source is not specified in the format statement and this format statement applies to all nodes in the subject classifier. Note that the search results and titles lists have not changed: they still display the filename underneath the title.
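The override just made can be pictured as a lookup with a fallback. The Python sketch below is a simplified model, not how Greenstone stores or resolves format statements internally: the dictionary layout and the function pick_format are assumptions for illustration, and the blank Choose Feature entry is represented by an empty string.

```python
def pick_format(formats, feature, component="VList"):
    """Return the format for (feature, component), else the blank-feature one."""
    general = formats[("", component)]
    return formats.get((feature, component), general)


formats = {
    # Blank Choose Feature: applies wherever nothing more specific exists.
    ("", "VList"): "<td>[link][icon][/link]</td>"
                   "<td>[ex.Title]<br><i>([ex.Source])</i></td>",
    # Added for the subject hierarchy classifier.
    ("CL2", "VList"): "<td>[link][icon][/link]</td><td>[ex.Title]</td>",
}

print(pick_format(formats, "CL2"))     # the specific statement, no filename
print(pick_format(formats, "Search"))  # falls back to the general statement
```

This mirrors what the preview shows: the subject hierarchy uses its own statement, while the search results still fall back to the general VList statement and display the filename.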

  1. Let's change the search results format so that dc.Subject and Keywords metadata is displayed here instead of the filename. In the Choose Feature menu (under Format Features on the Design panel), scroll down to the item Search and select it. Change the HTML Format String box below to read

    <td>[link][icon][/link]</td>
    <td>[ex.Title]<br>
        [dc.Subject]
    </td>

    and click <Add Format>.

  1. To insert the [dc.Subject], position the cursor at the appropriate point and either type it in or use the Variables drop-down menu (the one that says [Text]). Make it say [dc.Subject] and click <Insert> to insert it into the HTML Format String. This menu lists many of the items that you can put in square brackets in a format statement.

  1. Now go to the Create panel and click <Preview Collection>. Documents in the search results list will be displayed like this:

    A discussion of question five from Tudor Quiz: Henry VIII
    Tudor period|Others

    (The vertical bar appears because this dc.Subject and Keywords metadata is hierarchical: the bar separates the levels of the hierarchy. Unfortunately there is no way to get at the individual components of the hierarchy in a format statement. For most metadata, such as title and author, this isn't a problem.)

  1. Finally, let's return to the subjects hierarchy and learn how to do different things to the bookshelves and to the documents themselves. In the Choose Feature menu, re-select the item

    CL2: Hierarchy -metadata dc.Subject and Keywords

    Edit the HTML Format String box below to read

    <td>[link][icon][/link]</td>
    <td>{If}{[numleafdocs],<b>Bookshelf title:</b> [ex.Title],
                           <b>Title:</b> [ex.Title]}
    </td>

    and click <Replace Format>. Again, you can insert the items in square brackets by selecting them from the Variables drop-down menu (don't forget to click <Insert>).

    The If statement tests the value of the variable numleafdocs. This variable is set only for internal nodes of the hierarchy (that is, bookshelves) and gives the number of documents below that node. If it is set, the first branch is used; otherwise the second is used. Commas separate the test from the branches and the branches from each other. The curly brackets indicate that the If is special; without them the word "If" itself would be output.

  1. Go to the Create panel, click <Preview Collection>, and examine the subject hierarchy again to see the effect of your changes. Bookshelves should say Bookshelf title: and then the title, while documents will display Title: and the title. Note that the number of documents in the bookshelf is not displayed: we are using [numleafdocs] to test what kind of item in the list we are at, but we are not displaying it.
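The two-branch If used for the subject hierarchy can be modelled in a few lines. The Python sketch below is illustrative only and is not Greenstone's implementation: the function subject_cell is hypothetical, nodes are plain dictionaries, and the numleafdocs value of "12" is an invented figure.

```python
def subject_cell(node):
    """Model of {If}{[numleafdocs],<bookshelf branch>,<document branch>}."""
    title = node.get("ex.Title", "")
    # numleafdocs is only tested here; its value is never written out.
    if node.get("numleafdocs"):
        return f"<b>Bookshelf title:</b> {title}"
    return f"<b>Title:</b> {title}"


bookshelf = {"ex.Title": "Tudor period", "numleafdocs": "12"}
document = {"ex.Title": "A discussion of question five from Tudor Quiz: Henry VIII"}

print(subject_cell(bookshelf))
print(subject_cell(document))
```

The bookshelf takes the first branch because numleafdocs is set, and the document takes the second because it is not. The count itself never appears in the output, matching what the preview shows.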