High-Level Integration Process

  1. Create and then freeze an agreement.
  2. One party builds a consuming application compliant with that agreement. Another party builds provider code compliant with that agreement.
  3. Install and configure.
  4. Verify.

What Content Can Be Searched

A user-facing application can issue search queries to the Edgerton Digital Collections repositories via a standardized connector (an OSID implementation). Current assumptions:

  1. There will be a connector for a video-only repository (TechTV, which offers a search and Flash-encoded video streams to its embeddable player).
  2. There will be a connector for museum images (MOBIUS).
  3. There will be a connector for library-managed content: notebook pages and correspondence copied (images and metadata) into an OEIT repository (t.b.d., but under consideration are Lucene, Alfresco, and MySQL for parts of the solution).
  4. There will be a connector for shadow metadata, field data, and comments contributed by the community. The repository for this is t.b.d.
  • The consuming application can direct a search at one or more of the OSID implementations.
  • There are assets of different types defined by the OSIDs. The current broad asset types are images, videos, and documents.
  • The application that searches via an OSID implementation will be able to distinguish finer-grained types of assets, say a slide from something else. Specialized forms of documents are notebook pages, notebooks, and correspondence. This detail is reflected in the media or format field in asset metadata. Asset metadata will be based on Dublin Core wherever possible.
    Note: we assume that, for presentation, an image is displayed in a viewer / browser, a video in a player, and a notebook page in a page viewer.
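
As a rough illustration of directing one search at several connectors, the sketch below uses a hypothetical `Connector` interface, `Asset` class, and `AssetType` enum. None of these names come from the OSID API; they are assumptions made only to show the shape of the idea, with the broad type and the finer-grained format kept as separate fields.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these types are part of the OSID API. */
class FederatedSearch {

    /** The broad asset types named in the text. */
    enum AssetType { IMAGE, VIDEO, DOCUMENT }

    /** Minimal asset: a broad type plus a finer-grained format such as "slide" or "notebook page". */
    static final class Asset {
        final String title;
        final AssetType type;
        final String format;

        Asset(String title, AssetType type, String format) {
            this.title = title;
            this.type = type;
            this.format = format;
        }
    }

    /** Stand-in for one repository connector (assumed name and signature). */
    interface Connector {
        List<Asset> search(String query);
    }

    /** Sends the same query to each selected connector and concatenates the results. */
    static List<Asset> searchAll(List<Connector> connectors, String query) {
        List<Asset> results = new ArrayList<>();
        for (Connector connector : connectors) {
            results.addAll(connector.search(query));
        }
        return results;
    }
}
```

The consuming application chooses which connectors to pass in, so a search can target a single repository or all of them.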

    Kinds Of Searches

    Connectors will support two kinds of searches, basic and advanced. A basic search takes search terms as input and seeks a match in any field; an advanced search takes as input criteria for one or more specific metadata fields.
    Both searches use similar rules for matching. Results are returned in no particular order.
    The Basic Search
    The input to this search is a string of space-delimited terms. Each term is compared for equality with the content of the metadata fields, under these rules: terms can be found in any order; there can be any number of matches per asset; matches are for whole words only; equality is case-insensitive; and a match can occur in any field. As the examples below show, every term must match somewhere in the asset's metadata.
    Within the application, we assume there will be a search edit control to capture the query.

    Basic Search Examples

    Query is: Playing Card

Matching Metadata
Description: This image shows a bullet passing through a playing card.

Non-Matching Metadata
Description: This image shows a bullet passing through a pack of playing cards.

Matching Metadata
Description: This video shows Yitzhak Perlman playing a violin.
Author: Michael Card

Non-Matching Metadata
Description: This video shows Yitzhak Perlman playing a violin.
Author: Michael Cardinal

Query is: "Playing Card"

Matching Metadata
Description: This image shows a bullet passing through a playing card.

Non-Matching Metadata
Description: This image shows a bullet passing through a pack of playing cards.

Non-Matching Metadata
Description: This video shows Yitzhak Perlman playing a violin.
Author: Michael Card

Non-Matching Metadata
Description: This video shows Yitzhak Perlman playing a violin.
Author: Michael Cardinal
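
The basic search examples above can be reproduced with a small matcher. This is a minimal sketch, not a connector implementation. The class and method names are hypothetical, and two rules are assumptions inferred from the examples rather than stated in the text: every term must match, and a double-quoted query is treated as a phrase whose words must appear adjacent and in order within a single field.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of the basic search matching rules. */
class BasicSearch {

    // A query item is either a double-quoted phrase or a run of non-space characters.
    private static final Pattern ITEM = Pattern.compile("\"([^\"]*)\"|(\\S+)");

    /** Lower-cases the text and splits it into whole words, dropping punctuation. */
    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!w.isEmpty()) {
                out.add(w);
            }
        }
        return out;
    }

    /** Parses a query into items; each item is a list of one word (a term) or more (a phrase). */
    static List<List<String>> parse(String query) {
        List<List<String>> items = new ArrayList<>();
        Matcher m = ITEM.matcher(query);
        while (m.find()) {
            String raw = m.group(1) != null ? m.group(1) : m.group(2);
            List<String> item = words(raw);
            if (!item.isEmpty()) {
                items.add(item);
            }
        }
        return items;
    }

    /** True if every item occurs, as consecutive whole words, in at least one field value. */
    static boolean matches(String query, Map<String, String> metadata) {
        for (List<String> item : parse(query)) {
            boolean found = false;
            for (String value : metadata.values()) {
                if (Collections.indexOfSubList(words(value), item) >= 0) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false; // assumption: all items must match (AND)
            }
        }
        return true;
    }
}
```

Under these assumptions an empty query matches every asset. Whether that is the desired behavior is left open here.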

The Advanced Search

The input to this search is a set of field / value pairs. This can be a Java Properties object, but an XML string may be more flexible.
Within the application, we assume there will be an Advanced Search Page to capture the query (this page is based on a single, statically defined, metadata schema).
In the advanced search, there can be a value for each metadata field. Field values can be enclosed in double-quotes, for a literal match, or unquoted for a less demanding match.
For date metadata, there will be the option to provide a value on or before a date or on or after a date.
For people and place metadata, there may be a controlled vocabulary, presented as a picklist.
Advanced Search Examples
Query is:
Description contains Playing

Matching Metadata
Description: This image shows a bullet passing through a playing card.

Query is:
Description contains Playing
Author contains Card

Matching Metadata
Description: This video shows Yitzhak Perlman playing a violin.
Author: Michael Card
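
A minimal sketch of the advanced search with the criteria held in a `java.util.Properties` object follows. The class name is hypothetical, and the matching rules are assumptions, since the text does not define them precisely: an unquoted value is a case-insensitive "contains" test, a double-quoted value must appear verbatim (case-sensitive), and all supplied criteria must hold.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

/** Hypothetical sketch of field / value matching for the advanced search. */
class AdvancedSearch {

    /** True if every criterion in the Properties object is satisfied by the asset metadata. */
    static boolean matches(Properties criteria, Map<String, String> metadata) {
        for (String field : criteria.stringPropertyNames()) {
            String wanted = criteria.getProperty(field).trim();
            String actual = metadata.get(field);
            if (actual == null) {
                return false; // assumption: a missing field never matches
            }
            boolean quoted = wanted.length() >= 2
                    && wanted.startsWith("\"") && wanted.endsWith("\"");
            boolean ok;
            if (quoted) {
                // Assumed "literal" match: the quoted text must appear verbatim.
                ok = actual.contains(wanted.substring(1, wanted.length() - 1));
            } else {
                // Assumed "less demanding" match: case-insensitive containment.
                ok = actual.toLowerCase(Locale.ROOT)
                        .contains(wanted.toLowerCase(Locale.ROOT));
            }
            if (!ok) {
                return false;
            }
        }
        return true;
    }
}
```

Plain containment is looser than the whole-word rule of the basic search: under this reading, Author contains Card would also match Michael Cardinal. Whether the advanced search should apply whole-word matching instead is one of the questions the agreement needs to settle.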
Sample Queries:

Find any image with the title matching Playing Card

This search implies that results can be limited by media type, such as image, video, notebook, notebook page, etc. Is the set of media types defined by the consumer, by the provider, or under the control of a third party or standard? Can a search match any of a set of types?

This search implies that results can be sought to match a specific metadata field, such as the title. Is the set of fields defined by the consumer, by the provider, or under the control of a third party or standard? Can a search match a set of fields? Do multiple field matches allow for boolean operators such as AND, OR, and NOT?
This search implies that a match is found with Playing Card. What do we mean by match, specifically:

  • Are matches case sensitive, insensitive, or under the direction of the consumer, provider, or a third-party or standard?
  • Must Playing Card be the entire and exact title, or a subset of the title?
  • Must both Playing and Card appear in the title?
  • Must both appear in this order?
  • Must both appear adjacent, with only the single space as delimiter?
  • Must both match fully, or would "Cards" also match "Card"?

What are the consumer's expectations, if any, about the order in which results are returned? For example, if three assets match the title, what can the consumer request or demand about the order, if anything? What can the provider declare about the order, if anything? Does a provider support searching within a result set?

  • Can the consumer request which metadata are returned for assets in the result set?
  • Can the consumer request the order of metadata for assets in the result set?
  • Can the consumer request a maximum number of results?
  • Can the consumer request a subset of all results, say, by start and end index or by start index and max count?
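
To make the last question concrete, here is a minimal sketch of the "start index and max count" option. The class and method names are hypothetical, and the choice to return a short or empty page rather than raise an error when the window runs past the end is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of requesting a subset of results by start index and max count. */
class ResultPage {

    /** Returns at most maxCount results beginning at the zero-based start index. */
    static <T> List<T> page(List<T> results, int start, int maxCount) {
        if (start < 0 || maxCount < 0) {
            throw new IllegalArgumentException("start and maxCount must be non-negative");
        }
        // Clamp both ends to the list size so a window past the end yields fewer (or no) results.
        int from = Math.min(start, results.size());
        int to = (int) Math.min((long) from + maxCount, results.size());
        return new ArrayList<>(results.subList(from, to));
    }
}
```

The "start and end index" option carries the same information. With an exclusive end index, the equivalent call is `page(results, start, end - start)`.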

What is the specific syntax specification for queries?

What was done for "strobe disks" between the time of Edgerton's marriage and the birth of his first child?
"strobe disks" is a criteria expression
controlled vocabulary
case sensitivity
match all, match part, match any

results can be filtered by date range
date expression
date end-points (inclusive / exclusive)
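
The date-range notes above, together with the "on or before" and "on or after" options of the advanced search, could be modeled roughly as follows. This is a sketch under assumptions: the `DateRange` class and its factory methods are hypothetical names, dates are whole days (`java.time.LocalDate`), and a null end-point means the range is open on that side.

```java
import java.time.LocalDate;

/** Hypothetical sketch of a date-range filter with inclusive or exclusive end-points. */
class DateRange {
    final LocalDate start;          // null means no lower bound
    final boolean startInclusive;
    final LocalDate end;            // null means no upper bound
    final boolean endInclusive;

    DateRange(LocalDate start, boolean startInclusive, LocalDate end, boolean endInclusive) {
        this.start = start;
        this.startInclusive = startInclusive;
        this.end = end;
        this.endInclusive = endInclusive;
    }

    /** Range matching the given date and everything after it. */
    static DateRange onOrAfter(LocalDate date) {
        return new DateRange(date, true, null, true);
    }

    /** Range matching the given date and everything before it. */
    static DateRange onOrBefore(LocalDate date) {
        return new DateRange(null, true, date, true);
    }

    /** True if the date lies within the range, honoring each end-point's inclusiveness. */
    boolean contains(LocalDate date) {
        if (start != null) {
            int c = date.compareTo(start);
            if (c < 0 || (c == 0 && !startInclusive)) {
                return false;
            }
        }
        if (end != null) {
            int c = date.compareTo(end);
            if (c > 0 || (c == 0 && !endInclusive)) {
                return false;
            }
        }
        return true;
    }
}
```

A query such as the "strobe disks" example would combine a criteria expression with one such range. The two biographical dates would have to be supplied by the consumer or resolved from metadata; this sketch does not attempt that.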

Result Metadata

What metadata are returned about the result set? For example, is there a time; a total result count; a "cursor" or start and end index?

What metadata are returned about an individual result?
Are all metadata returned directly, or do any require subsequent requests?

fields
field order
field syntax
missing data

Result Order

how to navigate

Interoperation Validation

A second consumer can integrate as easily with the first provider.
The consumer can consume a second, different provider as easily.
A consumer can integrate with multiple providers.
