Greenstone tutorial exercise
Building a small collection of HTML files
You will need some HTML files, such as those in the hobbits folder in sample_files.
Running the Greenstone Librarian Interface
-
Start the Greenstone Librarian Interface:
Start → All Programs → Greenstone Digital Library Software v2.70 → Greenstone Librarian Interface
After a short pause a startup screen appears, and then after a slightly longer pause the main Greenstone Librarian Interface appears. (A command prompt is also opened in the background.)
Starting a new collection
-
Start a new collection within the Librarian Interface:
File → New...
-
You will create a collection based on a few HTML web pages that describe some hobbits in The Lord of the Rings.
A window pops up. Fill it out with appropriate values—for example,
Collection title: About Hobbits
Description of content: A collection about hobbits.
Leave the "Base this collection on" setting at its default, "-- New Collection --", and click <OK>.
-
Another window pops up, from which you select the metadata set (or sets) to use. This is discussed in other exercises. For now, select Dublin Core Metadata Element Set Version 1.1 and click <OK>.
If this is the first time you have opened a collection in the Librarian Interface, two popup progress bars will appear, to show progress while loading plugins and classifiers.
-
Next you must gather together the files that will constitute the collection. A suitable set has been prepared ahead of time in sample_files → hobbits. Using the left-hand side of the Librarian Interface's Gather panel, interactively navigate to the sample_files folder.
Adding documents to the collection
-
Now drag the hobbits folder from the left-hand side and drop it on the right. The progress bar at the bottom shows some activity. Gradually, duplicates of all the files will appear in the collection panel.
You can inspect the files that have been copied by double-clicking on the folder in the right-hand side.
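The effect of the drag-and-drop step can be pictured with a minimal Python sketch. It is only an illustration of "copy, don't move": the function name gather and the temporary folder layout (including the import folder name) are assumptions made for this example, not a description of how the Librarian Interface is implemented.

```python
import shutil
import tempfile
from pathlib import Path


def gather(source_folder: Path, import_dir: Path) -> list[str]:
    """Copy source_folder into import_dir; return the copied files' relative paths."""
    target = import_dir / source_folder.name
    shutil.copytree(source_folder, target)  # duplicates the files, originals stay put
    return sorted(
        str(p.relative_to(import_dir)) for p in target.rglob("*") if p.is_file()
    )


# Demo in a throwaway directory (hypothetical layout, for illustration only).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "sample_files" / "hobbits"
    source.mkdir(parents=True)
    (source / "bilbo.html").write_text(
        "<html><head><title>Bilbo</title></head></html>"
    )
    import_dir = root / "collection" / "import"
    import_dir.mkdir(parents=True)

    copied = gather(source, import_dir)
    originals_intact = (source / "bilbo.html").exists()
```

After the copy, the collection holds its own duplicate of each file and the source folder is unchanged, which is why the same sample files can be reused in later exercises.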
-
Since this is our first collection, we won't complicate matters by manually assigning metadata or altering the collection's design. Instead we rely on default behaviour. So pass directly to the Create panel by clicking its tab.
Building the collection
-
To start building the collection, click the <Build Collection> button.
-
Once the collection has built successfully, a window pops up to confirm this. Click <OK>.
-
Click the <Preview Collection> button to look at the end result. This loads the relevant page into your web browser (starting it up if necessary). Look around the collection and learn about Hobbits!
Viewing the extracted metadata
-
Back in the Librarian Interface, click the Enrich tab to view the metadata associated with the documents in the collection.
-
At present there is no manually assigned metadata, but building the collection has extracted metadata from the documents. Double-click the hobbits folder to expand its contents. Then single-click bilbo.html to display all its metadata in the right-hand side of the panel. The initial fields, starting "dc.", are empty. These are Dublin Core metadata fields for manually entered data (you selected this metadata set when you created the collection).
-
Use the scroll bar on the extreme right to view the bottom part of the list. There you will see fields starting "ex." that hold the extracted metadata: for example ex.Title, based on the text within the HTML title tags, and ex.Language, the document's language (represented as a two-letter ISO 639 code), which Greenstone determines by analyzing the document's text.
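To make the idea of an extracted title concrete, here is a minimal Python sketch that pulls the text between the HTML title tags and stores it under the name ex.Title. It uses Python's standard html.parser module; the class and function names (TitleParser, extract_title, extracted_metadata) and the sample page are assumptions made for this illustration, and this is not Greenstone's own extraction code.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text that appears between <title> and </title>."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":  # the parser reports tag names in lower case
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)


def extract_title(html_text):
    """Return the page title, or an empty string if the page has none."""
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.title_parts).strip()


def extracted_metadata(html_text):
    """Build a metadata record using the 'ex.' prefix for extracted fields."""
    return {"ex.Title": extract_title(html_text)}


# Hypothetical sample page, for illustration only.
sample = (
    "<html><head><title> Bilbo Baggins </title></head>"
    "<body><p>A hobbit.</p></body></html>"
)
metadata = extracted_metadata(sample)
```

A page without title tags yields an empty ex.Title in this sketch; how Greenstone itself handles that case is not covered in this exercise.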
-
Close the collection by clicking File → Close. This automatically saves the collection to disk.
Setting up a shortcut in the Librarian Interface
-
To set up a shortcut to the source files, in the Gather panel navigate to the folder in your local file space that contains the files you want to use—in our case, the sample_files folder. Select this folder and then right-click it, and choose Create Shortcut from the menu. In the Name field, enter the name you want the shortcut to have, or accept the default sample_files. Click <OK>. Close all the folders in the file tree in the left-hand pane, and you will see the shortcut to your source files.