This article is a general discussion of the two principal methods used to search for and, more importantly, retrieve document information on the web. It points to where search technology is heading and to the increasing importance of standard document definitions, such as the Dublin Core Metadata Initiative (DCMI), as a means of document discovery on the internet.
Some explanation first: Hierarchical Searches
- Typically a computer user has a clear idea of what they are searching for.
- Frequently a user is searching for a short and simple summary.
These requirements are easy to cater for with just about any hierarchical search mechanism, such as Google and similar engines.
- Hierarchical searches are restricted to the set of all things matching (typically containing) the search parameters.
- Complex hierarchical searches require Boolean logic, which is typically hidden from the user in a quick search but has to be spelled out in a detailed search.
- Hierarchical searches have no concept of association.
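The points in the list above can be illustrated with a minimal sketch of a keyword search over an inverted index. The documents and the helper names (`build_index`, `search_all`, `search_any`) are hypothetical, made up for illustration only; real search engines are far more elaborate.

```python
from collections import defaultdict

# Hypothetical mini-corpus: document id -> text (invented for illustration).
DOCS = {
    1: "mozart piano sonata score",
    2: "mozart movie review",
    3: "radio station playing classical music",
}

def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search_all(index, *terms):
    """Boolean AND: documents containing every term."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()

def search_any(index, *terms):
    """Boolean OR: documents containing at least one term."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))

INDEX = build_index(DOCS)
print(search_all(INDEX, "mozart", "movie"))  # documents with both words
print(search_any(INDEX, "movie", "radio"))   # documents with either word
```

The result is only ever a subset of the documents that contain the search terms. A search for "music" finds document 3 but not document 1, because nothing in the index associates a sonata score with music.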
Associative Searches
- You may have noticed that some people get very frustrated when someone else has tidied their desk. Had you asked them precisely where a document was before the desk was cleaned they could miraculously reach into just the right pile to just the right location and retrieve the document. Now that it's clean they don't know where anything is.
An associative search replicates the untidy desk behavior.
- Associative searches provide a simple means of retrieving complex, rich content and are best used when a quick hierarchical search has failed to provide useful results.
In programming terminology the desk is an associative object and each pile of documents is an associative class belonging to the object. A mathematician may refer to the piles as sets of documents. Most people, however, will simply refer to them as the stuff on their desk, or something similar.
It's a relatively easy task to visually represent each document set and apply a weight to each document within a set and to each set of sets.
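As a rough sketch of the weighting idea, the desk can be modelled as a set of sets in which every pile and every document carries a weight. The `DocumentSet` class, the sample piles and the choice to multiply weights along the path are all assumptions made for illustration; the article does not prescribe a particular weighting scheme.

```python
from dataclasses import dataclass, field

@dataclass
class DocumentSet:
    """A named pile of documents, with a weight per document and per sub-pile."""
    name: str
    weight: float = 1.0
    documents: dict = field(default_factory=dict)  # title -> weight
    subsets: list = field(default_factory=list)    # nested DocumentSet objects

    def ranked(self, inherited=1.0):
        """Return (title, effective weight) pairs, heaviest first.

        Assumption: effective weight is the document's own weight multiplied
        by the weight of every set that encloses it.
        """
        scale = inherited * self.weight
        results = [(title, w * scale) for title, w in self.documents.items()]
        for sub in self.subsets:
            results.extend(sub.ranked(scale))
        return sorted(results, key=lambda pair: pair[1], reverse=True)

# Hypothetical desk with two piles.
desk = DocumentSet("desk", subsets=[
    DocumentSet("invoices", weight=0.5,
                documents={"march invoice": 0.8, "april invoice": 0.4}),
    DocumentSet("projects", weight=1.0,
                documents={"search proposal": 0.9}),
])

for title, weight in desk.ranked():
    print(f"{weight:.2f}  {title}")
```

Multiplying the weights means a lightly weighted pile pulls all of its documents down the ranking, which is one simple way to let the weight of a set influence the documents inside it.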
For example, if you were to search Google for 'mozart', the result set would list documents that contain the word mozart, ranked broadly by popularity (how often a page is linked to or visited). This is an excellent way of representing the results, as probability suggests that you are seeking the same result as the majority of people who have used the same search phrase in the past.
If, however, you are looking for mozart the movie, or mozart the radio station, or mozart and the benefits of playing music to unborn babies, you would need to refine your search by applying set theory to your search criteria. This is a process we've all become familiar with, even when we're not aware that's what we're doing, and it requires abstract thinking and not a little trial and error.
This is where associative searches come in. An associative search divides the result sets into their respective classes and shows the principal 'categories', so that you can see at a glance all sets belonging to movies, businesses or literature, including where they share information such as music scores, biographies or projects. It also lets the user drill down visually to, say, the price lists associated with scores.
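A minimal sketch of this grouping, assuming each result already carries a category and a set of descriptive tags. The sample results and the function names (`group_by_category`, `shared_tags`, `drill_down`) are hypothetical and only illustrate the idea; they are not taken from any particular product.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical results for the query "mozart": title -> (category, tags).
RESULTS = {
    "Mozart the movie": ("movies", {"biography", "music score"}),
    "Mozart biography": ("literature", {"biography"}),
    "Score shop": ("businesses", {"music score", "price list"}),
}

def group_by_category(results):
    """Split a flat result list into one set of titles per category."""
    groups = defaultdict(set)
    for title, (category, _tags) in results.items():
        groups[category].add(title)
    return dict(groups)

def shared_tags(results):
    """For each pair of categories, the tags their documents have in common."""
    tags = defaultdict(set)
    for category, doc_tags in results.values():
        tags[category] |= doc_tags
    return {
        (a, b): tags[a] & tags[b]
        for a, b in combinations(sorted(tags), 2)
        if tags[a] & tags[b]
    }

def drill_down(results, tag):
    """Titles in any category that carry the given tag."""
    return {title for title, (_cat, doc_tags) in results.items() if tag in doc_tags}

print(group_by_category(RESULTS))
print(shared_tags(RESULTS))
print(drill_down(RESULTS, "price list"))
```

Here `shared_tags` plays the part of the links between categories on the map, and `drill_down` follows one of those links to the documents behind it.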
One would assume that this type of search methodology would be a researcher's dream, as a very imprecise or even incorrect search will rapidly direct them to a precise, accurate location. I suspect that in some respects it's unintuitive, insofar as researchers may be used to working with two-dimensional lists rather than viewing three-dimensional illustrative representations. Associative search results can of course be represented in two-dimensional lists as well, but in the translation they lose the visual component we associate with depth and breadth.
When documents are well categorised (for example, under a specialised, predetermined cataloguing system) associative searches really come into their own, as the image maps can conform perfectly to precise meta category structures. This is by no means essential to produce very useful maps, however, as the search will naturally seek and define its own patterns based on the elements within each document.
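The two cases can be sketched side by side: use a predetermined category where the document supplies one, and otherwise derive a category from the document's own words. In the sketch below the `subject` field mimics the Dublin Core "subject" element, while the records, the stopword list and the fallback rule (pick the word most common across the corpus) are assumptions chosen purely for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical records. "subject" mimics the Dublin Core "subject" element;
# records without it have only their text to go on.
RECORDS = [
    {"title": "Sonata in C", "subject": "music scores", "text": "piano sonata score"},
    {"title": "Requiem", "subject": "music scores", "text": "choral score"},
    {"title": "Film notes", "text": "movie review of a movie about mozart"},
    {"title": "Cinema listing", "text": "movie times"},
]

STOPWORDS = {"a", "an", "about", "of", "the", "in"}

def fallback_category(text, corpus_counts):
    """Pick the word in this text that is most common across the whole corpus."""
    words = [w for w in text.lower().split() if w not in STOPWORDS]
    return max(words, key=lambda w: corpus_counts[w]) if words else "uncategorised"

def categorise(records):
    """Group titles by catalogued subject, or by a derived category if none exists."""
    corpus_counts = Counter(
        w for r in records for w in r["text"].lower().split() if w not in STOPWORDS
    )
    groups = defaultdict(list)
    for r in records:
        category = r.get("subject") or fallback_category(r["text"], corpus_counts)
        groups[category].append(r["title"])
    return dict(groups)

print(categorise(RECORDS))
```

The catalogued records land exactly where their metadata says they belong, while the uncatalogued ones still cluster together under a category derived from their content.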
Try:
Alexander Munro (CTO, Inzen Pty Ltd)