Saturday, February 25, 2012

Containstable problem

Hi,
I have a question regarding the contains and containstable.
I have a table with about 2.000.000 records.
A the begining we simply use a "select top 1000 * from tablex where
contains(columnx,'xxx')" to find records matching the keywords enter by
users.
But, we find some performance problems when performing queries on keywords
that appears a lot of time
in the table (like "men", "women", "car", ...). The query take more than a
minute.
So we change our query to use the containstable to limit the reseach and it
worked fine and it only take few seconds. (Perfect)
But now, we wanted to add a where clause in the select to limit the reseach
according to a value of a field (not full text)
So, the query was something like this :
SELECT TOP 1000 ID FROM tableX AS FT_TBL
INNER JOIN CONTAINSTABLE (tableX, FT_INDEX , 'men',1000) AS KEY_TBL ON
FT_TBL.ID = KEY_TBL.KEY
WHERE SOURCE = 'news'
But, I notice that the containstable limit the result the 1000 first
occurences of 'men' and after that the where source='news' is applied.
So, of the 1000 records I only have about 500 with a "men" in the ft_index
and source='news'. But in the DB I have more than 2000 records with "men"
and "news"
Is there any way to combine the where clause and benefit of the limit of the
containstable ?
My hope is to have a select distinct top 1000 it from tablex where
source='news' and contains(ft_index,'men',1000)
Thanks
Stephane
No, unfortunately not. When you have restictions in your where clause you
should if at all possible partition your data into different tables.
So you would have a table which would contain all the rows with a source
value of news.
The problem is you get 1000 or less rows returned from MSSearch and then you
trim based on the value of the source column.
So suppose you have 2000 rows which match your search condition, and the
first 1000 rows don't have a source value of news. Such a search would
return 0 results.
The other option is to increase the value of your top_n_by_rank to a value
which guarantees 1000 rows after being filtered by the source restiction.
Normally you don't have the luxury of knowing a value of top_n_by_rank to
guarantee this though.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
"Stdu" <stdu@.myrealbox.com> wrote in message
news:%233pOC93wEHA.1988@.TK2MSFTNGP12.phx.gbl...
> Hi,
> I have a question regarding the contains and containstable.
> I have a table with about 2.000.000 records.
> A the begining we simply use a "select top 1000 * from tablex where
> contains(columnx,'xxx')" to find records matching the keywords enter by
> users.
> But, we find some performance problems when performing queries on keywords
> that appears a lot of time
> in the table (like "men", "women", "car", ...). The query take more than a
> minute.
> So we change our query to use the containstable to limit the reseach and
it
> worked fine and it only take few seconds. (Perfect)
> But now, we wanted to add a where clause in the select to limit the
reseach
> according to a value of a field (not full text)
> So, the query was something like this :
> SELECT TOP 1000 ID FROM tableX AS FT_TBL
> INNER JOIN CONTAINSTABLE (tableX, FT_INDEX , 'men',1000) AS KEY_TBL ON
> FT_TBL.ID = KEY_TBL.KEY
> WHERE SOURCE = 'news'
> But, I notice that the containstable limit the result the 1000 first
> occurences of 'men' and after that the where source='news' is applied.
> So, of the 1000 records I only have about 500 with a "men" in the ft_index
> and source='news'. But in the DB I have more than 2000 records with "men"
> and "news"
> Is there any way to combine the where clause and benefit of the limit of
the
> containstable ?
> My hope is to have a select distinct top 1000 it from tablex where
> source='news' and contains(ft_index,'men',1000)
> Thanks
> Stephane
>
>
|||> Is there any way to combine the where clause and benefit of the limit of the
> containstable ?
> My hope is to have a select distinct top 1000 it from tablex where
> source='news' and contains(ft_index,'men',1000)
In Sql2000 I had to do the following (using your example).
Create another column that appended the column(s) that contained the
searched text with a special word or two like "xxxSource_news" and
include this new keyword in the search.
I just posted a question myself... if what we're looking for is
available in Sql2005.
|||This is a nice solution. However it only works with equality. For instance
you can't use if for quantities, ie where you have a where quantity> 1000.
But a very nice solution none the less.
Hilary Cotter
Looking for a SQL Server replication book?
Now available for purchase at:
http://www.nwsu.com/0974973602.html
"Brian Rice" <mail@.brianrice.com> wrote in message
news:dce86229.0411060843.2d432c70@.posting.google.c om...[vbcol=seagreen]
the
> In Sql2000 I had to do the following (using your example).
> Create another column that appended the column(s) that contained the
> searched text with a special word or two like "xxxSource_news" and
> include this new keyword in the search.
> I just posted a question myself... if what we're looking for is
> available in Sql2005.
|||I will try the Brian's solution as I already have a field where I store all
my keywords to avoid the problem of the contains on multiple fields
Great idea
Thanks
"Hilary Cotter" <hilary.cotter@.gmail.com> wrote in message
news:OZ0sgfDxEHA.824@.TK2MSFTNGP11.phx.gbl...
> This is a nice solution. However it only works with equality. For
> instance
> you can't use if for quantities, ie where you have a where quantity> 1000.
> But a very nice solution none the less.
> --
> Hilary Cotter
> Looking for a SQL Server replication book?
> Now available for purchase at:
> http://www.nwsu.com/0974973602.html
>
> "Brian Rice" <mail@.brianrice.com> wrote in message
> news:dce86229.0411060843.2d432c70@.posting.google.c om...
> the
>

No comments:

Post a Comment