ColdFusion, Solr, and Killer Robots (RIA Unleashed: Boston 2010)
ColdFusion, Solr, and Killer Robots by Raymond Camden
Compared with SQL, Solr gives us better searching plus indexing of both files and data.
Verity (now dead to us) was the old-school way of doing full-text searching. It was really good but had data limits.
A SQL LIKE query is bad because it lacks relevancy, is complex, misses context, can't handle binary files, and misses a lot of language kung fu.
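To make the relevancy point concrete, here is a minimal Python sketch (not ColdFusion or Solr code). The sample documents and the functions like_search and ranked_search are made up for illustration, and the scoring is a toy word count, not how Solr actually ranks results.

```python
import re
from collections import Counter

# Hypothetical sample data, keyed by primary key.
docs = {
    1: "Solr is a search server built on Lucene",
    2: "Search tips: search by title, search by body, search by category",
    3: "Researching ColdFusion history",
}

def like_search(term):
    # Roughly what WHERE body LIKE '%term%' does: substring match, no ranking.
    return [key for key, text in docs.items() if term.lower() in text.lower()]

def score(term, text):
    # Toy relevancy: count whole words equal to the term.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(words)[term.lower()]

def ranked_search(term):
    # Keep only documents that score above zero, best score first.
    hits = [(score(term, text), key) for key, text in docs.items()]
    hits = [(s, key) for s, key in hits if s > 0]
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [key for _, key in hits]
```

With this data, like_search("search") returns all three keys in table order, including the "Researching" document that only matches by accident, while ranked_search("search") returns document 2 first and drops document 3.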
Solr is an open source Java search engine built on top of Lucene that provides searching, highlighting, file indexing, and REST APIs among other features.
ColdFusion supports Solr with admin features, language support (3 tags), and includes Solr itself.
It is important to update ColdFusion to 9.0.1 with the cumulative hot fix; the update also helps with AJAX.
Definitions:
- Collections (think: DSN)
- Indexes (think: Table)
- Separate from your actual data (must update index after updating data)
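The "separate from your actual data" point is easy to trip over, so here is a small Python sketch of it. TinyIndex is a hypothetical stand-in for a search index (it is not the Solr or ColdFusion API); it only shows that a change to the data is invisible to search until the index entry is added again.

```python
import re
from collections import defaultdict

class TinyIndex:
    """Hypothetical word index kept separately from the data it describes."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> set of record keys

    def remove(self, key):
        # Drop the key from every word it was listed under.
        for keys in self.postings.values():
            keys.discard(key)

    def add(self, key, body):
        # Re-adding a key replaces its old entry (an incremental update).
        self.remove(key)
        for word in set(re.findall(r"[a-z0-9]+", body.lower())):
            self.postings[word].add(key)

    def search(self, word):
        return sorted(self.postings.get(word.lower(), set()))

articles = {1: "Robots attack Boston"}     # the "actual data"
index = TinyIndex()
index.add(1, articles[1])

articles[1] = "Kittens attack Boston"      # data changes, index does not
stale = index.search("robots")             # still finds record 1

index.add(1, articles[1])                  # update the index for that record
fresh = index.search("robots")             # record 1 is no longer a match
```

Here stale is [1] and fresh is [], which is the whole reason the index has to be updated after the data.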
CF Admin
- create, edit, delete collections
- optimize collections
- focused on file based collections
- No native ability to search but there is an open source add-on
Can't turn on content highlighting with a tag
cfcollection tag
- create, edit, delete, and optimize
- Need to include the engine attribute even though the docs say otherwise
- Programmatic collections should go in CF default
Can only search against index, must keep it up to date
cfindex tag
- purge or refresh
- incremental adds are important
- results are totally broken; don't use it
When indexing files
- Key - primary key for index
- Title - acts as title
- Body - the main thing you search, can be multiple columns
- Plus additional things like custom fields and categories or category trees
Categories are like keywords (football) and category trees are like hierarchies (sports/world vs. sports/american).
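A rough Python sketch of the difference, assuming each indexed record carries a flat list of categories and a single slash-separated category tree path. The record layout and the helper names in_category and in_tree are hypothetical, chosen only to mirror the football and sports examples above.

```python
# Hypothetical indexed records.
records = [
    {"key": 1, "categories": ["football"], "category_tree": "sports/american"},
    {"key": 2, "categories": ["football"], "category_tree": "sports/world"},
    {"key": 3, "categories": ["baseball"], "category_tree": "sports/american"},
]

def in_category(record, category):
    # Categories behave like keywords: exact membership only.
    return category in record["categories"]

def in_tree(record, prefix):
    # Category trees behave like paths: match the node or anything below it.
    tree = record["category_tree"]
    return tree == prefix or tree.startswith(prefix.rstrip("/") + "/")

football = [r["key"] for r in records if in_category(r, "football")]
american = [r["key"] for r in records if in_tree(r, "sports/american")]
all_sports = [r["key"] for r in records if in_tree(r, "sports")]
```

The keyword filter puts both kinds of football together (keys 1 and 2), while the tree filter separates sports/american (keys 1 and 3) from sports/world and still lets "sports" match everything.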
cfsearch tag
- provide one or more collections
- criteria is what you are searching for
- results in a ColdFusion query object
The Solr relevancy score runs from 0 to some number. Bigger is better, but it is not something you show the user.
Providing context
- context provides information about the match (term highlighting)
- turning it on requires reindexing
- modify the context markers to get around the broken HTML marker issue
- No way to set author with database context
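The notes do not say exactly what the HTML marker issue is, so the following Python sketch is only one plausible reading of the workaround: highlight with neutral placeholder markers, escape the passage, then swap the placeholders for real tags. The functions highlight and to_safe_html and the [[ ]] markers are assumptions for illustration, not ColdFusion or Solr behavior.

```python
import html
import re

def highlight(text, term, begin="[[", end="]]"):
    # Wrap whole-word, case-insensitive matches in placeholder markers.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return pattern.sub(lambda m: begin + m.group(0) + end, text)

def to_safe_html(marked, begin="[[", end="]]"):
    # Escape the passage first, then turn the placeholders into real tags.
    escaped = html.escape(marked)
    return escaped.replace(begin, "<b>").replace(end, "</b>")

marked = highlight("Killer robots <3 Solr", "solr")
safe = to_safe_html(marked)
```

With this input, marked is "Killer robots <3 [[Solr]]" and safe is "Killer robots &lt;3 <b>Solr</b>", so the stray < in the content is escaped while the highlight tags survive.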
Solr supports a spelling suggestion system, but it must be turned on in the search request.
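As a sketch of what "on the search request" means at the Solr level, the Python snippet below builds a query URL with and without Solr's spellcheck parameter. The host, port, collection name "demo", and the helper build_search_url are assumptions for illustration; whether a given Solr install answers spellcheck requests also depends on how its request handler is configured, and this is not how the ColdFusion tags are invoked.

```python
from urllib.parse import urlencode

def build_search_url(base_url, criteria, suggest=False):
    # q carries the search criteria; spellcheck=true asks Solr for suggestions.
    params = {"q": criteria}
    if suggest:
        params["spellcheck"] = "true"
    return base_url + "?" + urlencode(params)

# Hypothetical endpoint for a collection named "demo".
base = "http://localhost:8983/solr/demo/select"
plain_url = build_search_url(base, "kiler robots")
suggest_url = build_search_url(base, "kiler robots", suggest=True)
```

The only difference between the two URLs is the extra spellcheck=true parameter, which is why a search that never asks for suggestions never gets any.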
The ColdFusion wrapper is limited to 4 custom fields.