Search Appliance SBE
Syntax: textual name and URL pattern pairs. Additional input boxes appear as you fill in the ones provided.
The Search Appliance can create searchable sub-categories that appear
in a drop-down box on the Search page. Enter the name of the category
on the left, and its corresponding URL pattern on the right. URL
patterns must match the entire URL (including the protocol), and may
contain an asterisk (*) to indicate "anything" or a question mark
(?) to indicate any single character. There may be more than one
pattern for each category; separate multiple patterns with a space.
Category names must not contain the pipe ("|") character, as
it may be used to separate multiple categories in the category
search parameter. A category should also not be named "Everything", as the search interface provides that option in the
category selection box to search everything (i.e. any category), which
might be confused with a specific category of the same name.
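The naming rules above can be checked before category names are entered. The following is a minimal sketch, not part of the Search Appliance itself: the function name category_name_problems is hypothetical, and treating "Everything" case-insensitively is an assumption made here to be cautious.

```python
def category_name_problems(name: str) -> list[str]:
    """Return a list of reasons why `name` is a poor category name."""
    problems = []
    # The pipe may separate multiple categories in the category search parameter.
    if "|" in name:
        problems.append('contains the pipe ("|") character')
    # "Everything" is the search interface's own "any category" option.
    # Assumption: compare case-insensitively to avoid look-alike names.
    if name.strip().lower() == "everything":
        problems.append('collides with the built-in "Everything" option')
    return problems


if __name__ == "__main__":
    for candidate in ["Manuals", "Books|Demos", "Everything"]:
        print(candidate, category_name_problems(candidate))
```

An empty list means the name passes both checks.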
The following table provides an example.
Category       | URL Pattern
Demonstrations | http://www.example.com/demos/*
Manuals        | http://www.example.com/manual/*
Books          | http://www.example.com/a1/* http://example.com/b3/*
This example creates a category named Demonstrations that searches only http://www.example.com/demos/ and any files under that directory, narrowing results to that part of the site. The same is true for Manuals. The Books category, however, includes pages from both the /a1 and /b3 directories. The user can then search within just one of these categories or across the entire database.
A pattern should not name a single page unless you want a category containing just that page; for example, http://www.example.com/manual/index.html or http://www.example.com/manual/ would generally be incorrect. A pattern should typically be the prefix of a directory that holds multiple pages, followed by an asterisk (*).
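To try out patterns before entering them, the matching rules described above can be approximated in a few lines. This is only a sketch under stated assumptions, not the Search Appliance's actual matcher: the helper names are hypothetical, an asterisk is assumed to match zero or more characters, and matching is assumed to be case-sensitive.

```python
import re

# Category table from the example; multiple patterns are space-separated.
CATEGORIES = {
    "Demonstrations": "http://www.example.com/demos/*",
    "Manuals": "http://www.example.com/manual/*",
    "Books": "http://www.example.com/a1/* http://example.com/b3/*",
}


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a URL pattern into a regular expression.

    '*' becomes "anything" (assumed: zero or more characters),
    '?' becomes any single character, everything else is literal.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def categories_for(url: str) -> list[str]:
    """Return the names of all categories with a pattern that fully matches url."""
    matched = []
    for name, patterns in CATEGORIES.items():
        # fullmatch enforces that the pattern covers the whole URL, protocol included.
        if any(pattern_to_regex(p).fullmatch(url) for p in patterns.split()):
            matched.append(name)
    return matched


if __name__ == "__main__":
    print(categories_for("http://www.example.com/demos/intro.html"))
    print(categories_for("www.example.com/demos/intro.html"))
```

Under these assumptions the first call reports Demonstrations, while the second reports no category because the URL lacks the protocol that the pattern requires.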
Note that URL patterns are not used to determine categories if any Data from Field rule sets Category. See the Data from Field settings for more details.
For best search performance, avoid categories that overlap one another
(i.e. that contain walked pages in common) where possible.
If overlapping categories are used, list them
most-commonly-searched first. Also, the CatnoLowest field
should be selected as one of the Compound Index Fields;
this is the default. These guidelines allow the Auto-detect
mode to optimize the greatest number of searches for speed.
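One way to spot overlapping categories ahead of time is to compare the sets of URLs each category would contain. The sketch below is an illustration only: the function name and the sample data are hypothetical, and it assumes you already have a list of URLs per category (for example, from a site map).

```python
from itertools import combinations


def overlapping_categories(
    pages_by_category: dict[str, set[str]],
) -> list[tuple[str, str, int]]:
    """Return (category_a, category_b, shared_page_count) for each overlapping pair."""
    overlaps = []
    for (a, urls_a), (b, urls_b) in combinations(pages_by_category.items(), 2):
        shared = len(urls_a & urls_b)  # pages the two categories have in common
        if shared:
            overlaps.append((a, b, shared))
    return overlaps


if __name__ == "__main__":
    # Hypothetical sample data for illustration.
    sample = {
        "Manuals": {
            "http://www.example.com/manual/a.html",
            "http://www.example.com/manual/b.html",
        },
        "Books": {"http://www.example.com/a1/c.html"},
        "Docs": {
            "http://www.example.com/manual/a.html",
            "http://www.example.com/a1/c.html",
        },
    }
    for a, b, shared in overlapping_categories(sample):
        print(f"{a} and {b} share {shared} page(s)")
```

Pairs reported here are candidates for merging, narrowing, or reordering so that the most commonly searched category comes first.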
Also note that changing, deleting, or adding a Category or URL Pattern after a walk has been performed triggers a recategorization. This procedure, which runs in the background, re-applies the category changes to the walked data. It is faster than a full walk, because pages do not need to be fetched and fully processed, but it can still take some time, particularly for large walks. For best performance, wait for the recategorization to complete (it can be monitored as a task on the Dashboard or Walk Status) before starting another walk.