Saturday, February 25, 2012

Contains()

I have the phrase
'One Two Three'
as the value of a column that is indexed for full text searching.
I run the query
SELECT * FROM table WHERE CONTAINS(column, 'One NEAR Two' );
and it returns the row with the above value, which is all well
and fine. Now, I need it so that the row would NOT be
returned for the following query:
SELECT * FROM table WHERE CONTAINS(column, 'Two NEAR One' );
Basically, I want it so that it looks for the words NEAR each
other but only in the order specified in the query. Such that it
returns rows where the first word is NEAR the second word
but also preceeds it as well.
Looking in the documentation, I didn't see how or if this is
possible. Is it? If so, how?
thnx,
Christoph
Christoph,
Unfortunately, this is not possible using the SQL Server 2000 & MSSearch
service implementation of the CONTAINS* or FREETEXT* predicates. While not
documented, the FTS/MSSearch solution defines NEAR as any two words or
phrases that are *near* each other in any order, but within 50 words of each
other. At best you would need to roll your own FTS engine that identity's
the all words, as well as their sequence in the row and then develop an
index that can be queried to get your results. It is doable, but is a
non-trivial effort.
Regards,
John
"Christoph Boget" <jcboget@.yahoo.com> wrote in message
news:#2nl1dNyEHA.1392@.TK2MSFTNGP14.phx.gbl...
> I have the phrase
> 'One Two Three'
> as the value of a column that is indexed for full text searching.
> I run the query
> SELECT * FROM table WHERE CONTAINS(column, 'One NEAR Two' );
> and it returns the row with the above value, which is all well
> and fine. Now, I need it so that the row would NOT be
> returned for the following query:
> SELECT * FROM table WHERE CONTAINS(column, 'Two NEAR One' );
> Basically, I want it so that it looks for the words NEAR each
> other but only in the order specified in the query. Such that it
> returns rows where the first word is NEAR the second word
> but also preceeds it as well.
> Looking in the documentation, I didn't see how or if this is
> possible. Is it? If so, how?
> thnx,
> Christoph
>
|||> other. At best you would need to roll your own FTS engine that identity's
> the all words, as well as their sequence in the row and then develop an
> index that can be queried to get your results. It is doable, but is a
> non-trivial effort.
Actually, I found a really useful work around.
SELECT
*
FROM
table
WHERE
CONTAINS(column, 'One NEAR Two' )
AND
PATINDEX('%One%Two%', column) != 0;
Using PATINDEX() ensures that word one preceeds word two if the
words are NEAR each other.
thnx,
Christoph
|||Christoph,
A most interesting approach to this problem! While this seems to he a very
useful workaround, I've done some quick testing using SQL Server 2000 on
Win2003 with the pubs database and FT-enabled table pub_info and my results
vary somewhat. If you use "moon" and "books" as actual examples for word One
and Two respectively, and then vary the contains and patindex order, I get
the following results:
-- Test #1 with the order changed in the PATINDEX clause changed
SELECT pub_id, pr_info from pub_info -- order of PATINDEX same that order of
CONTAINS
WHERE CONTAINS(pr_info, 'books NEAR moon') AND PATINDEX('%books%moon%',
pr_info) != 0
-- returns: 1 row, pub_id = 0736 - expected results are correct.
/* -- actual text:
pub_id pr_info
-- ---
0736 This is sample text data for New Moon Books, publisher 0736 in the
pubs database. New Moon Books is
(1 row(s) affected)
*/
SELECT pub_id, pr_info from pub_info -- order of PATINDEX different that
order of CONTAINS
WHERE CONTAINS(pr_info, 'books NEAR moon') AND PATINDEX('%moon%books%',
pr_info) != 0
-- returns: 1 row, pub_id = 0736 -- expected results are not correct
-- Test #2 with the order changed in the CONTAINS clause changed
SELECT pub_id, pr_info from pub_info -- order of CONTAINS same that order of
PATINDEX, but reversed from actual order
WHERE CONTAINS(pr_info, 'moon NEAR books') AND PATINDEX('%moon%books%',
pr_info) != 0
-- returns: 1 row, pub_id = 0736 -- expected results are not correct
SELECT pub_id, pr_info from pub_info -- order of CONTAINS different that
order of PATINDEX, but reversed from actual order
WHERE CONTAINS(pr_info, 'books NEAR moon') AND PATINDEX('%moon%books%',
pr_info) != 0
-- returns: 1 row, pub_id = 0736 -- expected results are not correct
I have a feeling that while this approach may be useful, it still may be
dependent upon the actual text and what specific order the searcher is
requesting. Still a very good approach with additional refine might be most
useful!
Regards,
John
"Christoph Boget" <jcboget@.yahoo.com> wrote in message
news:OsRdnxxyEHA.3408@.tk2msftngp13.phx.gbl...[vbcol=seagreen]
identity's
> Actually, I found a really useful work around.
> SELECT
> *
> FROM
> table
> WHERE
> CONTAINS(column, 'One NEAR Two' )
> AND
> PATINDEX('%One%Two%', column) != 0;
> Using PATINDEX() ensures that word one preceeds word two if the
> words are NEAR each other.
> thnx,
> Christoph
>

No comments:

Post a Comment