Projects/Full Text Search/User Guide
Contents |
Full Text Search - User Guide
Overview
This document aims to help developers that need to develop a functionality that uses this technology. This is a very fast/simple introduction, for more details at database level regarding this technology, please refer to the Postgres' Full Text Search Documentation
Technical concepts
The filtering of Full Text Search in Postgres uses a special data types and Postgres provides functions to convert what we usually have, which is text into what we need. The filtering happens as following, with the text in which we want to search we need to create a TSVector and to operate with a TSVector we need a TSQuery and @@ special operator.
- Use to_tsvector('<language>', '<text_in_which_to_search>') to convert a text into a TSVector to be able to search into that text.
- Use to_tsquery('<language>', '<query text>') to convert a text into a searching query to be able to search this into the text above.
- Use @@ operator to operate TSVector with TSQuery. Example:
SELECT ... FROM ... WHERE ... AND to_tsvector('<language>', '<text_in_which_to_search>') @@ to_tsquery('<language>', '<query text>')
- Use setweight(TSVector, '[A - D]') to give more importance to the weights of a TSVector than to the weights of another. TSVectors can be appended with || operator and this produces a new TSVector with the Vector of the two original strings. Weight can be used to give more importance to matches with @@ operator that belong to one of the vectors.
- Use ts_rank_cd(TSVector, TSQuery) to obtain a numeric value that is higher the more the searched text resembles the query. This is useful to order results of the search according to similarity.
Database elements
Normally searches are made in great collections and this in database means tables with many rows. Being like this it's desirable to have an index, there are special indexes to operate with TSVectors but then this requires to create a column.
How to create the column
ALTER TABLE Table_Name ADD COLUMN FTS_Col TSVECTOR;
How to create the index
CREATE INDEX FTS_Col_idx ON Table_Name USING GIN (FTS_Col);
How to populate the newly created column
When creating a module that creates a TSVector column in a table it might be necessary to create a modulescript that populates the existing rows.
UPDATE Table_Name SET FTS_Col = to_tsvector('<language>', '<text_in_which_to_search>') WHERE condition; UPDATE Table_Name SET FTS_Col = to_tsvector('<language>', '<text_column_which_to_search>') WHERE condition; UPDATE Table_Name SET FTS_Col = to_tsvector('<language>', '<text_column1_which_to_search>') || to_tsvector('<language>', '<text_column2_which_to_search>') WHERE condition; UPDATE Table_Name SET FTS_Col = setweight(to_tsvector('<language>', '<text_column1_which_to_search>'), 'A') || setweight(to_tsvector('<language>', '<text_column2_which_to_search>'), 'D') WHERE condition;
How to rank/order by matches
SELECT ts_rank_cd(FTS_Col, to_tsquery('<language>', '<query_text>')), ... FROM Table_Name WHERE text_column1_which_to_search @@ to_tsquery('<language>', '<query_text>') ORDER BY 1;
Functions created at hibernate level
Although the TSVector columns are not available at hibernate level, there are two functions that have been created to allow to use filtering and ranking at hibernate level. These functions do nothing at hibernate level but parse the SQL necessary to filter or rank.
The functions created are fullTextSearchFilter and fullTextSearchRank and they can be called like this:
AND fullTextSearchFilter(<alias_of_hibernate_entity_in_hql>, <sql_name_of_column>, '<language>', '<query_text>') IS true fullTextSearchRank(<alias_of_hibernate_entity_in_hql>, <sql_name_of_column>, '<language>', '<query_text>')