Greenstone tutorial exercise

Prerequisite: A collection of Word and PDF files

Formatting the Word and PDF collection

In this exercise, we play around with the format statements in the Word and PDF collection.

  1. Open the reports collection in the Librarian Interface and go to the Format Features section of the Design panel.

Tidying up the default format statement

  1. In this part of the exercise, we make the format statement simpler without changing the resulting display.

    Greenstone's default format statement is complex because it is designed to produce something reasonable under almost any conditions, and also because for practical reasons it needs to be backwards compatible with legacy collections. For this collection, we don't need all of the complexity.

    The default VList format statement looks like the following:

    <td valign="top">[link][icon][/link]</td>
    <td valign="top">[ex.srclink]{Or}{[ex.thumbicon],[ex.srcicon]}[ex./srclink]</td>
    <td valign="top">[highlight]
    {Or}{[dls.Title],[dc.Title],[ex.Title],Untitled}
    [/highlight]{If}{[ex.Source],<br><i>([ex.Source])</i>}</td>

    This format statement is the default used for any vertical list, such as search results, classifiers, and document tables of contents.

    {Or}{[ex.thumbicon],[ex.srcicon]} chooses ex.thumbicon metadata if it's there, and otherwise chooses ex.srcicon metadata. If neither is present, nothing is displayed. For this collection there is no ex.thumbicon metadata, so the choice is not needed.
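    The selection rule that {Or} applies can be pictured with a short Python sketch. This is only an illustrative model of the behaviour described above, not Greenstone code: the function name first_available and the sample metadata values are hypothetical.

```python
def first_available(metadata, names, fallback=""):
    """Return the first non-empty value among `names`, like {Or}{...}.

    `fallback` plays the role of a trailing literal such as "Untitled".
    """
    for name in names:
        value = metadata.get(name, "")
        if value:
            return value
    return fallback


# Hypothetical metadata for one document: there is no ex.thumbicon or dls.Title.
doc = {"ex.srcicon": "<img src='ipdf.gif'>", "ex.Title": "Annual report"}

icon = first_available(doc, ["ex.thumbicon", "ex.srcicon"])
title = first_available(doc, ["dls.Title", "dc.Title", "ex.Title"], fallback="Untitled")
```

    Because ex.thumbicon and dls.Title are never present in this collection, dropping them from the list of candidates leaves the result unchanged, which is why the simplification is safe.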

    Replace {Or}{[ex.thumbicon],[ex.srcicon]} with [ex.srcicon].

    There is no dls.Title metadata, so remove that element from {Or}{[dls.Title],[dc.Title],[ex.Title],Untitled}.

    The resulting format statement looks like the following:

    <td valign="top">[link][icon][/link]</td>
    <td valign="top">[ex.srclink][ex.srcicon][ex./srclink]</td>
    <td valign="top">[highlight]
    {Or}{[dc.Title],[ex.Title],Untitled}[/highlight] {If}{[ex.Source],<br><i>([ex.Source])</i>}</td>

    Click <Replace Format>.

    Preview the collection to make sure the display hasn't changed. You shouldn't notice any difference when looking at search results, classifiers etc.

Linking to Greenstone version or original version of documents

  1. For collections with documents that undergo a conversion process during importing (e.g. Word, PDF, PowerPoint documents, but not text, HTML documents), the original file is stored in the collection along with the converted version. The default VList format statement links to both versions:

    [link][icon][/link] links to the Greenstone HTML version, while [ex.srclink][ex.srcicon][ex./srclink] links to the original.

    Choose SearchVList in Format Features by selecting Search from the Choose Feature drop down list, and VList from the Affected Component list. Click <Add Format> to add the SearchVList format statement into the list of assigned formats. Experiment with removing either of the two links from the format statement. (Remember to click <Replace Format> after any changes.)

    To see the results of your changes, preview the collection and do a search. You are making changes to SearchVList, which means the changes will only apply to search results.
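    The relationship between a general format such as VList and a more specific one such as SearchVList can be modelled as a simple fallback lookup. The sketch below is a simplified assumption about the override behaviour described in this exercise, not Greenstone's actual lookup code; resolve_format and the placeholder strings are hypothetical.

```python
def resolve_format(assigned, feature, component="VList"):
    """Pick the name of the format statement used for a feature.

    A feature-specific entry (e.g. "SearchVList") wins if it has been
    assigned; otherwise the general component format (e.g. "VList") applies.
    """
    specific = feature + component
    if specific in assigned:
        return specific
    return component


# Hypothetical list of assigned formats after adding SearchVList.
assigned = {
    "VList": "general vertical list layout",
    "SearchVList": "layout used only for search results",
}

search_format = resolve_format(assigned, "Search")   # specific entry exists
classifier_format = resolve_format(assigned, "CL1")  # falls back to VList
```

    Under this model, editing SearchVList leaves classifiers untouched, because they still resolve to the general VList statement.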

    Storing and displaying the original allows users to see the correct format, but requires the user to have the relevant program installed. It also increases the size of the collection. The Greenstone version can be viewed in a browser, but may not look as nice.

Making bookshelves show how many items they contain

  1. Next, we'll customize the format for the Authors A-Z list. Classifier bookshelves have only a few pieces of metadata to display: [ex.Title] and [numleafdocs]. Whatever metadata the classifier has been built on, the bookshelf label is always stored as [ex.Title]. This is why a Creator is printed out for each bookshelf even though [dc.Creator] is not specified in the format statement. [numleafdocs] is only defined for bookshelves, so this metadata can be used in an {If} statement to make bookshelves and documents display differently in the list.

    Make each bookshelf in the Creator classifier show how many entries it contains. In the Format Features section of the Design panel, select the CL2 AZCompactList classifier (which is based on dc.Creator metadata) from the Choose Feature drop down list, and select VList from the Affected Component list. Click the <Add Format> button to add this format into the list of assigned formats. Note that it gets added as CL2VList in this list: it's the VList format for the second (CL2) classifier.

    Append the following text and click <Replace Format>:

    {If}{[numleafdocs],<td><i>([numleafdocs])</i></td>}

    Switch to the Create panel and click <Preview Collection> (no need to rebuild). Click on the Authors A-Z list and notice that the bookshelves now display how many documents they contain.

    This revised format statement has the effect of specifying in brackets how many items are contained within a bookshelf. Since only bookshelves define [numleafdocs], only they will display this. By modifying CL2VList instead of VList, the change will only apply to the second classifier (Creators).
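    The effect of the {If} test on [numleafdocs] can be sketched in Python as follows. This is a minimal model of the behaviour described above, not Greenstone's implementation; bookshelf_count_cell and the two sample nodes are hypothetical.

```python
def bookshelf_count_cell(node):
    """Mimic {If}{[numleafdocs],<td><i>([numleafdocs])</i></td>}.

    The cell is produced only when numleafdocs is defined, which the text
    above says is the case for bookshelves but not for documents.
    """
    count = node.get("numleafdocs")
    if count:
        return f"<td><i>({count})</i></td>"
    return ""


# Hypothetical list entries: one bookshelf and one document.
bookshelf = {"ex.Title": "Smith, J.", "numleafdocs": "3"}
document = {"ex.Title": "Annual report", "ex.Source": "report.doc"}

bookshelf_cell = bookshelf_count_cell(bookshelf)
document_cell = bookshelf_count_cell(document)
```

    The document entry yields an empty string, so nothing extra is displayed for it.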

Displaying multi-valued metadata

  1. Next we modify the document entries in the Creator classifier to display all authors. Back in Format Features, select the CL2VList format in the list of assigned formats. After {If}{[ex.Source],<br> in the format statement, add [sibling:dc.Creator]. Click <Replace Format>.

    [ex.Source] is not defined for bookshelves, so it can also be used to differentiate between bookshelves and documents.

    The resulting format statement looks like:

    <td valign="top">[link][icon][/link]</td>
    <td valign="top">[ex.srclink][ex.srcicon][ex./srclink]</td>
    <td valign="top">[highlight]
    {Or}{[dc.Title],[ex.Title],Untitled}[/highlight]
    {If}{[ex.Source],<br>[sibling:dc.Creator]
    <i>([ex.Source])</i>}</td>
    {If}{[numleafdocs],<td><i>([numleafdocs])</i></td>}

    This will display the Greenstone link, the link to the original, then the Title. For bookshelves, it will also display how many documents the bookshelf contains. For documents, it will display all the Authors (Creators), and the source document. [sibling:dc.Creator] displays all the Creator metadata for the document, separated by a space (" "). Preview the Authors A-Z list and make sure that all authors are displayed for documents.

  2. You can change the separator between the authors. Modify the format statement, and replace [sibling:dc.Creator] with [sibling(All'<br/>'):dc.Creator]. This will add a new line after each author (<br/> specifies a line break in HTML). Don't forget to click <Replace Format>. Preview the Authors A-Z list.

    If you have done the exercise Enhanced Word document handling, the collection will have both dc.Creator and ex.Creator metadata. To display both, you can use

    [sibling:dc.Creator] [sibling:ex.Creator]

    To display dc.Creator if it's present, and otherwise display ex.Creator, use

    {Or}{[sibling:dc.Creator],[sibling:ex.Creator]}
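
    How the sibling modifier combines multiple values, and how it interacts with {Or}, can be modelled in a few lines of Python. This is an illustrative sketch based on the description in this exercise (values joined by a space unless another separator is given), not Greenstone's implementation; the sibling function and the author names are hypothetical.

```python
def sibling(metadata, name, separator=" "):
    """Join every value of a multi-valued metadata field.

    Models [sibling:name] (space separator, as described above) and
    [sibling(All'<sep>'):name] when a separator is supplied.
    """
    values = metadata.get(name, [])
    return separator.join(values)


# Hypothetical document with two dc.Creator values and one ex.Creator value.
doc = {
    "dc.Creator": ["Ann Author", "Bob Writer"],
    "ex.Creator": ["A. Author"],
}

default_join = sibling(doc, "dc.Creator")           # [sibling:dc.Creator]
line_breaks = sibling(doc, "dc.Creator", "<br/>")   # [sibling(All'<br/>'):dc.Creator]

# {Or}{[sibling:dc.Creator],[sibling:ex.Creator]}: use dc.Creator when it
# has values, otherwise fall back to ex.Creator.
preferred = sibling(doc, "dc.Creator") or sibling(doc, "ex.Creator")
```

    In this model an absent field joins to an empty string, which is what lets the {Or} form fall through to ex.Creator for documents that have no dc.Creator metadata.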