Saturday, February 25, 2012

containstable across columns

We're having some difficulty getting adequate search results. We're trying to search across 3 columns, "author", "title", and "subtitle" (no other columns are part of FT index). The problem relates to searching for an author/title phrase at the same tim
e. Say the user types "chemistry brown" into a search box, (the Author being "Brown" and the title "Chemistry"--but there's no way to make that distinction at run time). If you use CONTAINSTABLE, (...from containstable(<ourtable>,*,'"brown" near "chemis
try"') it only return rows where "brown" and "chemistry" are in the SAME column (where what we need is for it to return high rankings where "brown" and "chemistry" are in different columns but the same rows). Freetexttable is no better, since again it's
ranking based on the number of occurrences of a word in a SINGLE column, not in all the FT-indexed columns per row.
The key factor is that the type of word (author, title, subtitle) is not known at search time, which it difficult. Any suggesstions?
Platform: SQl 2K sp3, Windows Server 2000 Sp4
you could try this
select * from containstable(fulltext,*,'chemistry or brown')
or
select * from containstable(fulltext,*,'"chemistry" or "brown"')
Hilary Cotter
Looking for a book on SQL Server replication?
http://www.nwsu.com/0974973602.html
"John C" <John C@.discussions.microsoft.com> wrote in message
news:36E8719A-769E-45D1-A05B-FD8BC0B170E5@.microsoft.com...
> We're having some difficulty getting adequate search results. We're
trying to search across 3 columns, "author", "title", and "subtitle" (no
other columns are part of FT index). The problem relates to searching for
an author/title phrase at the same time. Say the user types "chemistry
brown" into a search box, (the Author being "Brown" and the title
"Chemistry"--but there's no way to make that distinction at run time). If
you use CONTAINSTABLE, (...from containstable(<ourtable>,*,'"brown" near
"chemistry"') it only return rows where "brown" and "chemistry" are in the
SAME column (where what we need is for it to return high rankings where
"brown" and "chemistry" are in different columns but the same rows).
Freetexttable is no better, since again it's ranking based on the number of
occurrences of a word in a SINGLE column, not in all the FT-indexed columns
per row.
> The key factor is that the type of word (author, title, subtitle) is not
known at search time, which it difficult. Any suggesstions?
> Platform: SQl 2K sp3, Windows Server 2000 Sp4
|||Hilary,
While the below containstable query examples provide a solution, it does not
meet John's requirement of 'search across 3 columns, "author", "title", and
"subtitle"' and I submit that using the following FT-enabled Northwind table
(Employees) and multiple column specific column names (vs. using "*" or
asterisk for all FT-enabled columns) does meet John's request, for example:
use Northwind
go
SELECT LastName, FirstName, Title, Notes from Employees
/* returns:
LastName FirstName Title
-- -- --
Davolio Nancy Sales Representative
Fuller Andrew Vice President, Sales
Leverling Janet Sales Representative
Peacock Margaret Sales Representative
Buchanan Steven Sales Manager
Suyama Michael Sales Representative
King Robert Sales Representative
Callahan Laura Inside Sales Coordinator
Dodsworth Anne Sales Representative
(9 row(s) affected)
*/
-- an expanded version of your example. Note, that this query will return
all 9 rows that contain Fuller (LastName column) OR Sales (Title column)
SELECT e.LastName, e.FirstName, e.Title
from Employees AS e,
containstable(Employees,*,'Fuller or Sales') as A
where
A.[KEY] = e.EmployeeID
--or this example that will also return all 9 rows that contain Fuller
(LastName column) OR Sales (Title column)
SELECT e.LastName, e.FirstName, e.Title
from Employees AS e,
containstable(Employees,*,'"Fuller" or "Sales"') as A
where
A.[KEY] = e.EmployeeID
However, John wants to 'search across 3 columns, "author", "title", and
"subtitle"' and the user types "chemistry brown" into a search box, (the
Author being "Brown" and the title "Chemistry"). Note the request is for an
AND'ing between the FT-enabled columns and not an OR between the columns.
This can be satsified via using multiple CONTAINSTABLE clauses and AND'ing
the join between Containstable predicates, for example
SELECT e.LastName, e.FirstName, e.Title
from Employees AS e,
containstable(Employees, LastName, 'Fuller') as A,
containstable(Employees, Title, 'Sales') as B
where
A.[KEY] = e.EmployeeID AND
B.[KEY] = e.EmployeeID
go
/* -- returns:
LastName FirstName Title
-- -- --
Fuller Andrew Vice President, Sales
*/
John, you can also use the workarounds that are documented for this issue in
SQL Server 7.0 KB article "286787 (Q286787) FIX: Incorrect Results From
Full-Text Search on Several Columns" at
http://support.microsoft.com/default...b;en-us;286787 as well as
SQL Server 7.0 KB article "294809 (Q294809) FIX: Full-Text Search Queries
with CONTAINS Clause Search Across Columns" at
http://support.microsoft.com/default...;en-us;294809. You should
note that while these SQL Server 7.0 specific KB articles are "fixes" for
SQL Server 7.0, this is the default behavior for SQL Server 2000 and SQL
Server 7.0 was "fixed" to comply with the default behavior of SQL Sever
2000. The workarounds in the KB articles will also work in SQL Server 2000
as well as the above solution that I provided above.
Regards,
John
"Hilary Cotter" <hilaryk@.att.net> wrote in message
news:#Zt$PpiZEHA.2216@.TK2MSFTNGP10.phx.gbl...
> you could try this
> select * from containstable(fulltext,*,'chemistry or brown')
> or
> select * from containstable(fulltext,*,'"chemistry" or "brown"')
> --
> Hilary Cotter
> Looking for a book on SQL Server replication?
> http://www.nwsu.com/0974973602.html
>
> "John C" <John C@.discussions.microsoft.com> wrote in message
> news:36E8719A-769E-45D1-A05B-FD8BC0B170E5@.microsoft.com...
> trying to search across 3 columns, "author", "title", and "subtitle" (no
> other columns are part of FT index). The problem relates to searching for
> an author/title phrase at the same time. Say the user types "chemistry
> brown" into a search box, (the Author being "Brown" and the title
> "Chemistry"--but there's no way to make that distinction at run time). If
> you use CONTAINSTABLE, (...from containstable(<ourtable>,*,'"brown" near
> "chemistry"') it only return rows where "brown" and "chemistry" are in the
> SAME column (where what we need is for it to return high rankings where
> "brown" and "chemistry" are in different columns but the same rows).
> Freetexttable is no better, since again it's ranking based on the number
of
> occurrences of a word in a SINGLE column, not in all the FT-indexed
columns
> per row.
> known at search time, which it difficult. Any suggesstions?
>
|||Thank you John. However, we've tried using multiple containstable and the problem is that you get, as you noted, an ANDing of your results. Returning to our previous example (user types "brown chemistry" in search box, with "brown" the implicit author a
nd "chemistry" the implicit title), we'd like to have chemistry books written by Brown as the highest ranked results, but also return other chemistry books (not by Brown) and books authored by Brown (but w/o "Chemistry") in the title. The JOINing of 2 co
ntainstable queries rules that out. When reading your reply I first thought that using a LEFT join would solve that problem, but that assumes you know which search term(s) are the most significant (i.e., which side to make the right, and which to make th
e left).
John
"John Kane" wrote:

> Hilary,
> While the below containstable query examples provide a solution, it does not
> meet John's requirement of 'search across 3 columns, "author", "title", and
> "subtitle"' and I submit that using the following FT-enabled Northwind table
> (Employees) and multiple column specific column names (vs. using "*" or
> asterisk for all FT-enabled columns) does meet John's request, for example:
> use Northwind
> go
> SELECT LastName, FirstName, Title, Notes from Employees
> /* returns:
> LastName FirstName Title
> -- -- --
> Davolio Nancy Sales Representative
> Fuller Andrew Vice President, Sales
> Leverling Janet Sales Representative
> Peacock Margaret Sales Representative
> Buchanan Steven Sales Manager
> Suyama Michael Sales Representative
> King Robert Sales Representative
> Callahan Laura Inside Sales Coordinator
> Dodsworth Anne Sales Representative
> (9 row(s) affected)
> */
> -- an expanded version of your example. Note, that this query will return
> all 9 rows that contain Fuller (LastName column) OR Sales (Title column)
> SELECT e.LastName, e.FirstName, e.Title
> from Employees AS e,
> containstable(Employees,*,'Fuller or Sales') as A
> where
> A.[KEY] = e.EmployeeID
> --or this example that will also return all 9 rows that contain Fuller
> (LastName column) OR Sales (Title column)
> SELECT e.LastName, e.FirstName, e.Title
> from Employees AS e,
> containstable(Employees,*,'"Fuller" or "Sales"') as A
> where
> A.[KEY] = e.EmployeeID
> However, John wants to 'search across 3 columns, "author", "title", and
> "subtitle"' and the user types "chemistry brown" into a search box, (the
> Author being "Brown" and the title "Chemistry"). Note the request is for an
> AND'ing between the FT-enabled columns and not an OR between the columns.
> This can be satsified via using multiple CONTAINSTABLE clauses and AND'ing
> the join between Containstable predicates, for example
> SELECT e.LastName, e.FirstName, e.Title
> from Employees AS e,
> containstable(Employees, LastName, 'Fuller') as A,
> containstable(Employees, Title, 'Sales') as B
> where
> A.[KEY] = e.EmployeeID AND
> B.[KEY] = e.EmployeeID
> go
> /* -- returns:
> LastName FirstName Title
> -- -- --
> Fuller Andrew Vice President, Sales
> */
> John, you can also use the workarounds that are documented for this issue in
> SQL Server 7.0 KB article "286787 (Q286787) FIX: Incorrect Results From
> Full-Text Search on Several Columns" at
> http://support.microsoft.com/default...b;en-us;286787 as well as
> SQL Server 7.0 KB article "294809 (Q294809) FIX: Full-Text Search Queries
> with CONTAINS Clause Search Across Columns" at
> http://support.microsoft.com/default...;en-us;294809. You should
> note that while these SQL Server 7.0 specific KB articles are "fixes" for
> SQL Server 7.0, this is the default behavior for SQL Server 2000 and SQL
> Server 7.0 was "fixed" to comply with the default behavior of SQL Sever
> 2000. The workarounds in the KB articles will also work in SQL Server 2000
> as well as the above solution that I provided above.
> Regards,
> John
>
>
> "Hilary Cotter" <hilaryk@.att.net> wrote in message
> news:#Zt$PpiZEHA.2216@.TK2MSFTNGP10.phx.gbl...
> of
> columns
>
>
|||You're welcome, John
You can also "OR" the multiple containstable clauses, for example:
-- Note, the OR and the addition of distinct e.LastName and searching on
'Inside' from the Title column.
SELECT distinct e.LastName, e.FirstName, e.Title
from Employees AS e,
containstable(Employees, LastName, 'Fuller') as A,
containstable(Employees, Title, 'Inside') as B
where
A.[KEY] = e.EmployeeID OR
B.[KEY] = e.EmployeeID
/* -- returns:
LastName FirstName Title
-- -- --
Callahan Laura Inside Sales Coordinator
Fuller Andrew Vice President, Sales
*/
However, I believe that by " highest ranked results", you are really looking
for cross-column FT Search results and for what is also know as "best bets"
or forcing you're artificial or "editorial best selection" and that you
would like this to be keyword and column specific. While SharePoint Portal
Server has such as concept as best bets, SQL Server FTS does not directly
support this without having to pay a scalability penalty. However, I've
recently worked out a single column keyword specific solution to this, for
details see my replies under the fulltext newsgroup subject thread
"Full-text SharePoint". I just recently worked out the details to a scalable
single table, single column keyword-specific KeywordRank "booster" that can
also work with the ExtendedRank value in the example provided in the
"Full-text SharePoint" thread. Please, review this thread and post back here
any questions you might have about the procedures. What would be required of
you or your users, is a keyword or phrase which can be linked to each of row
in your FT-enable table, this can also be linked back to a query log table
that your searchers are using that will record the keywords that they are
using and what pages they have clicked on. Let me know if you want to pursue
this further.
Thanks,
John
"John C" <John C@.discussions.microsoft.com> wrote in message
news:1BDCCD8F-EA03-463E-BA6C-8D62BB0AE0E8@.microsoft.com...
> Thank you John. However, we've tried using multiple containstable and the
problem is that you get, as you noted, an ANDing of your results. Returning
to our previous example (user types "brown chemistry" in search box, with
"brown" the implicit author and "chemistry" the implicit title), we'd like
to have chemistry books written by Brown as the highest ranked results, but
also return other chemistry books (not by Brown) and books authored by Brown
(but w/o "Chemistry") in the title. The JOINing of 2 containstable queries
rules that out. When reading your reply I first thought that using a LEFT
join would solve that problem, but that assumes you know which search
term(s) are the most significant (i.e., which side to make the right, and
which to make the left).[vbcol=seagreen]
> John
> "John Kane" wrote:
not[vbcol=seagreen]
and[vbcol=seagreen]
table[vbcol=seagreen]
example:[vbcol=seagreen]
return[vbcol=seagreen]
an[vbcol=seagreen]
columns.[vbcol=seagreen]
AND'ing[vbcol=seagreen]
issue in[vbcol=seagreen]
as[vbcol=seagreen]
Queries[vbcol=seagreen]
should[vbcol=seagreen]
for[vbcol=seagreen]
2000[vbcol=seagreen]
(no[vbcol=seagreen]
for[vbcol=seagreen]
"chemistry[vbcol=seagreen]
If[vbcol=seagreen]
near[vbcol=seagreen]
the[vbcol=seagreen]
where[vbcol=seagreen]
number[vbcol=seagreen]
not[vbcol=seagreen]

No comments:

Post a Comment