What is the best way to rank a SQL varchar column by the number (count) / match of words in a parameter, with four distinct criteria?
This is probably not a trivial question, but I need to order rows by "best match" using my criteria.
Column: description VARCHAR(100)
Parameter: @MyParameter VARCHAR(100)
Output with this order preference:
- Exact match (the entire string matches) - always first
- Starts with (ranked by the length of the parameter that matches)
- Count of words matched, with contiguous words ranking higher for the same count of matched words
- Words matched anywhere (not contiguous)
Words may NOT match exactly: partial word matches are allowed and likely. A lesser value should be applied to partial words in the ranking, but that is not critical (for instance, pot would match each of: pot, potter, potholder, depot). Starts-with matches that are followed by other word matches should rank higher than those with no subsequent matches, but that is not a deal-killer / super important.
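To make the four tiers concrete, here is a minimal sketch of how they could be expressed as a single CASE expression. The table variable @Candidates and its sample rows are made up for illustration, the comparison assumes a case-insensitive collation, and wildcard characters inside @MyParameter are not escaped.

```sql
-- Sketch only: @Candidates is a hypothetical stand-in for the real table(s).
DECLARE @MyParameter VARCHAR(100) = 'gastric intubation';

DECLARE @Candidates TABLE (description VARCHAR(100));
INSERT INTO @Candidates (description) VALUES
    ('Gastric Intubation'),
    ('Gastric Intubation & Aspiration'),
    ('Repeat gastric intubation'),
    ('Nasal intubation, gastric');

SELECT c.description,
       CASE
           WHEN c.description = @MyParameter THEN 1                 -- exact match
           WHEN c.description LIKE @MyParameter + '%' THEN 2        -- starts with
           WHEN c.description LIKE '%' + @MyParameter + '%' THEN 3  -- whole phrase, contiguous, anywhere
           ELSE 4                                                   -- left for a per-word score
       END AS matchTier
FROM @Candidates AS c
ORDER BY matchTier, c.description;
```

Because each row gets exactly one tier, a row cannot appear twice. Tier 4 still needs a word-count score as a secondary sort key to be useful.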
I would like a ranking method where the column "starts with" the value in the parameter. Say I have the following string:
'This is my value string as a test template to rank on.'
I would like, in the first instance, to rank the column/row where the greatest number of words exists.
Second, I would like to rank based on occurrence (best match) at the start, as:
'This is my string as a test template to rank on.' - first
'This is my string as a test template to rank on even though not exact.' - second
'This is my string as a test template to rank' - third
'This is my string as a test template to' - next
'This is my string as a test template' - next etc.
Secondly (possibly as a second set/group of data after the first "starts with" group, which is desirable):
I want to rank (sort) the rows by the count of words in @MyParameter that occur in the description column, with contiguous words ranking higher than the same count of separated words.
Thus, for the example string above, 'is my string as shown'
would rank higher than 'is not my other string as'
because of the "better match" of the contiguous string (words together) with the same count of words. Rows with a higher match (count of words that occur) would be ranked in descending order, best match first.
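As an illustration of that second grouping, the following sketch counts how many words of the parameter occur in each row and uses contiguity as a tie-breaker. It assumes SQL Server 2016 or later (for STRING_SPLIT) and a case-insensitive collation; @Candidates is a hypothetical stand-in for the real data, and repeated words in the parameter would be counted more than once.

```sql
-- Sketch only: assumes SQL Server 2016+ for STRING_SPLIT.
DECLARE @MyParameter VARCHAR(100) = 'is my string as';

DECLARE @Candidates TABLE (description VARCHAR(100));
INSERT INTO @Candidates (description) VALUES
    ('is my string as shown'),
    ('is not my other string as');

SELECT c.description,
       COUNT(*) AS wordsMatched,                       -- parameter words found (partial matches count)
       CASE WHEN c.description LIKE '%' + @MyParameter + '%'
            THEN 1 ELSE 0 END AS isContiguous          -- whole phrase found as one run
FROM @Candidates AS c
CROSS APPLY STRING_SPLIT(@MyParameter, ' ') AS w
WHERE w.value <> ''
  AND c.description LIKE '%' + w.value + '%'
GROUP BY c.description
ORDER BY wordsMatched DESC, isContiguous DESC;
```

Both sample rows contain all four words, so the contiguity flag decides the order and 'is my string as shown' sorts first.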
If possible, I would like to do this in a single query.
A row should not occur twice in the result.
Regarding performance, the table will hold no more than 10,000 rows.
The values in the table are fairly static, with little change, but not entirely so.
I cannot change the structure at this time, but I will consider that later (such as a word/phrase table).
To make this slightly more complicated, the word list is in two tables. I could create a view for that, but results from one table (the smaller list) should appear before those from the second, larger data set when the match is the same. There will be duplicates across these tables as well as within a table, and I only want distinct values. SELECT DISTINCT is not easy, because I want to return one column (sourceTable) that may well make otherwise identical rows distinct; in that case I want the row from the first (smaller) table only. DISTINCT on all the other columns is what I want (do not consider that column in the "distinct" evaluation).
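One way to express "distinct on everything except sourceTable, preferring the smaller table" is ROW_NUMBER() partitioned by the other columns. This is a minimal sketch, not a tested solution: @Combined and its rows are invented for illustration, only a few of the columns are shown, and it relies on the sourceTable values sorting with the smaller table first (as '0CPTMore' does before '2MasterCPT').

```sql
-- Sketch only: @Combined stands in for the UNION ALL of the two tables.
DECLARE @Combined TABLE (
    procedureCode VARCHAR(50),
    description   VARCHAR(100),
    category      VARCHAR(50),
    sourceTable   VARCHAR(50)
);
INSERT INTO @Combined (procedureCode, description, category, sourceTable) VALUES
    ('A100', 'Gastric intubation', 'GI', '0CPTMore'),
    ('A100', 'Gastric intubation', 'GI', '2MasterCPT'),
    ('A200', 'Gastric aspiration', 'GI', '2MasterCPT');

WITH Ranked AS (
    SELECT procedureCode, description, category, sourceTable,
           ROW_NUMBER() OVER (
               PARTITION BY procedureCode, description, category  -- every column except sourceTable
               ORDER BY sourceTable                               -- smaller table sorts first
           ) AS rn
    FROM @Combined
)
SELECT procedureCode, description, category, sourceTable
FROM Ranked
WHERE rn = 1;   -- keep one row per duplicate group
```

This also removes duplicates that occur within a single table, since they fall into the same partition.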
Pseudo columns in the table:
procedureCode VARCHAR(50),
description VARCHAR(100), -- this is the sort/evaluation column
category VARCHAR(50),
relvu VARCHAR(50),
charge VARCHAR(15),
active BIT,
sourceTable VARCHAR(50) -- just shows which of the two tables the row comes from
There is NO unique index such as an ID column.
Matches must NOT be in a third table; rows found there are to be excluded:
SELECT * FROM (
    select * from tableone where procedureCode not in (select procedureCode from tablethree)
    UNION ALL
    select * from tabletwo where procedureCode not in (select procedureCode from tablethree)
) AS combined
EDIT: In an attempt to solve this, I created a table-valued parameter as follows:
0 Gastric Intubation & Aspiration/Lavage, Treatmen
1 Gastric%Intubation%Aspiration%Lavage%Treatmen
2 Gastric%Intubation%Aspiration%Lavage
3 Gastric%Intubation%Aspiration
4 Gastric%Intubation
5 Gastric
6 Intubation%Aspiration%Lavage%Treatmen
7 Intubation%Aspiration%Lavage
8 Intubation%Aspiration
9 Intubation
10 Aspiration%Lavage%Treatmen
11 Aspiration%Lavage
12 Aspiration
13 Lavage%Treatmen
14 Lavage
15 Treatmen
where the actual phrase is in row 0.
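For reference, rows 1 through 15 are every contiguous run of the five words, ordered by starting word and then by decreasing length. A sketch of how such a list could be generated in T-SQL follows. It assumes the phrase has already been split into words with positions (the @Words table variable is hypothetical), and it names the output columns CODE and [LEVEL] only because the procedure refers to PP.CODE and PP.[LEVEL].

```sql
-- Sketch only: @Words is a hypothetical, already-split version of the phrase.
DECLARE @Words TABLE (pos INT PRIMARY KEY, word VARCHAR(100));
INSERT INTO @Words (pos, word) VALUES
    (1, 'Gastric'), (2, 'Intubation'), (3, 'Aspiration'), (4, 'Lavage'), (5, 'Treatmen');

SELECT ROW_NUMBER() OVER (ORDER BY s.pos, e.pos DESC) AS CODE,
       -- Concatenate the words from position s.pos to e.pos, separated by '%'
       STUFF((SELECT '%' + w.word
              FROM @Words AS w
              WHERE w.pos BETWEEN s.pos AND e.pos
              ORDER BY w.pos
              FOR XML PATH(''), TYPE).value('.', 'VARCHAR(100)'), 1, 1, '') AS [LEVEL]
FROM @Words AS s
JOIN @Words AS e
  ON e.pos >= s.pos          -- every (start, end) pair is one contiguous run
ORDER BY CODE;
```

With five words this yields 15 rows (5 + 4 + 3 + 2 + 1), in the same order as the list above.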
Here is my current attempt:
CREATE PROCEDURE [GetProcedureByDescription]
(
@IncludeMaster BIT,
@ProcedureSearchPhrases CPTFavorite READONLY
)
AS
DECLARE @myIncludeMaster BIT;
SET @myIncludeMaster = @IncludeMaster;
CREATE TABLE #DistinctMatchingCpts
(
procedureCode VARCHAR(50),
description VARCHAR(100),
category VARCHAR(50),
rvu VARCHAR(50),
charge VARCHAR(15),
active VARCHAR(15),
sourceTable VARCHAR(50),
sequenceSet VARCHAR(2)
)
IF @myIncludeMaster = 0
BEGIN -- Excluding master from search
INSERT INTO #DistinctMatchingCpts (sourceTable, procedureCode, description , category ,charge, active, rvu, sequenceSet
)
SELECT DISTINCT sourceTable, procedureCode, description, category ,charge, active, rvu, sequenceSet
FROM (
SELECT TOP 1
LTRIM(RTRIM(CPT.[CODE])) AS procedureCode,
LTRIM(RTRIM(CPT.[LEVEL])) AS description,
LTRIM(RTRIM(CPT.[COMBO])) AS category,
LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
''True'' AS active,
LTRIM(RTRIM([RVU])) AS rvu,
''0CPTMore'' AS sourceTable,
''01'' AS sequenceSet
FROM
@ProcedureSearchPhrases PP
INNER JOIN [CPTMORE] AS CPT
ON CPT.[LEVEL] = PP.[LEVEL]
WHERE
(CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
AND CPT.[CODE] IS NOT NULL
AND CPT.[CODE] NOT IN (''0'', '''')
AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
ORDER BY PP.CODE
UNION ALL
SELECT
LTRIM(RTRIM(CPT.[CODE])) AS procedureCode,
LTRIM(RTRIM(CPT.[LEVEL])) AS description,
LTRIM(RTRIM(CPT.[COMBO])) AS category,
LTRIM(RTRIM([CHARGE])) AS charge,
''True'' AS active,
LTRIM(RTRIM([RVU])) AS rvu,
''0CPTMore'' AS sourceTable,
''02'' AS sequenceSet
FROM
@ProcedureSearchPhrases PP
INNER JOIN [CPTMORE] AS CPT
ON CPT.[LEVEL] LIKE PP.[LEVEL] + ''%''
WHERE
(CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
AND CPT.[CODE] IS NOT NULL
AND CPT.[CODE] NOT IN (''0'', '''')
AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
UNION ALL
SELECT
LTRIM(RTRIM(CPT.[CODE])) AS procedureCode,
LTRIM(RTRIM(CPT.[LEVEL])) AS description,
LTRIM(RTRIM(CPT.[COMBO])) AS category,
LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
''True'' AS active,
LTRIM(RTRIM([RVU])) AS rvu,
''0CPTMore'' AS sourceTable,
''03'' AS sequenceSet
FROM
@ProcedureSearchPhrases PP
INNER JOIN [CPTMORE] AS CPT
ON CPT.[LEVEL] LIKE ''%'' + PP.[LEVEL] + ''%''
WHERE
(CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
AND CPT.[CODE] IS NOT NULL
AND CPT.[CODE] NOT IN (''0'', '''')
AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
) AS CPTS
ORDER BY
procedureCode, sourceTable, [description]
END -- Excluded master from search
ELSE
BEGIN -- Including master in search, but present favorites before master for each code
-- Get matching procedures, ordered by code, source (favorites first), and description.
-- There probably will be procedures with duplicated code+description, so we will filter
-- duplicates shortly.
INSERT INTO #DistinctMatchingCpts (sourceTable, procedureCode, description , category ,charge, active, rvu, sequenceSet)
SELECT DISTINCT sourceTable, procedureCode, description, category ,charge, active, rvu, sequenceSet
FROM (
SELECT TOP 1
LTRIM(RTRIM(CPT.[CODE])) AS procedureCode,
LTRIM(RTRIM(CPT.[LEVEL])) AS description,
LTRIM(RTRIM(CPT.[COMBO])) AS category,
LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
''True'' AS active,
LTRIM(RTRIM([RVU])) AS rvu,
''0CPTMore'' AS sourceTable,
''00'' AS sequenceSet
FROM
@ProcedureSearchPhrases PP
INNER JOIN [CPTMORE] AS CPT
ON CPT.[LEVEL] = PP.[LEVEL]
WHERE
(CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
AND CPT.[CODE] IS NOT NULL
AND CPT.[CODE] NOT IN (''0'', '''')
AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
ORDER BY PP.CODE
UNION ALL
SELECT TOP 1
LTRIM(RTRIM(CPT.[CODE])) AS procedureCode,
LTRIM(RTRIM(CPT.[LEVEL])) AS description,
LTRIM(RTRIM(CPT.[CATEGORY])) AS category,
LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
COALESCE(CASE [ACTIVE] WHEN 1 THEN ''True'' WHEN 0 THEN ''False'' WHEN '''' THEN ''False'' ELSE ''False'' END,''True'') AS active,
LTRIM(RTRIM([RVU])) AS rvu,
''2MasterCPT'' AS sourceTable,
''00'' AS sequenceSet
FROM
@ProcedureSearchPhrases PP
INNER JOIN [MASTERCPT] AS CPT
ON CPT.[LEVEL] = PP.[LEVEL]
WHERE
CPT.[CODE] IS NOT NULL
AND CPT.[CODE] NOT IN (''0'', '''')
AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
ORDER BY PP.CODE
UNION ALL
SELECT
LTRIM(RTRIM(CPT.[CODE])) AS procedureCode,
LTRIM(RTRIM(CPT.[LEVEL])) AS description,
LTRIM(RTRIM(CPT.[COMBO])) AS category,
LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
''True'' AS active,
LTRIM(RTRIM([RVU])) AS rvu,
''0CPTMore'' AS sourceTable,
''01'' AS sequenceSet
FROM
@ProcedureSearchPhrases PP
INNER JOIN [CPTMORE] AS CPT
ON CPT.[LEVEL] = PP.[LEVEL]
WHERE
(CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
AND CPT.[CODE] IS NOT NULL
AND CPT.[CODE] NOT IN (''0'', '''')
AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
UNION ALL
SELECT
LTRIM(RTRIM(CPT.[CODE])) AS procedureCode,
LTRIM(RTRIM(CPT.[LEVEL])) AS description,
LTRIM(RTRIM(CPT.[CATEGORY])) AS category,
LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
COALESCE(CASE [ACTIVE] WHEN 1 THEN ''True'' WHEN 0 THEN ''False'' WHEN '''' THEN ''False'' ELSE ''False'' END,''True'') AS active,
LTRIM(RTRIM([RVU])) AS rvu,
''2MasterCPT'' AS sourceTable,
''01'' AS sequenceSet
FROM
@ProcedureSearchPhrases PP
INNER JOIN [MASTERCPT] AS CPT
ON CPT.[LEVEL] = PP.[LEVEL]
WHERE
CPT.[CODE] IS NOT NULL
AND CPT.[CODE] NOT IN (''0'', '''')
AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
UNION ALL
SELECT TOP 1
LTRIM(RTRIM(CPT.[CODE])) AS procedureCode,
LTRIM(RTRIM(CPT.[LEVEL])) AS description,
LTRIM(RTRIM(CPT.[COMBO])) AS category,
LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
''True'' AS active,
LTRIM(RTRIM([RVU])) AS rvu,
''0CPTMore'' AS sourceTable,
''02'' AS sequenceSet
FROM
@ProcedureSearchPhrases PP
INNER JOIN [CPTMORE] AS CPT
ON CPT.[LEVEL] LIKE PP.[LEVEL] + ''%''
WHERE
(CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
AND CPT.[CODE] IS NOT NULL
AND CPT.[CODE] NOT IN (''0'', '''')
AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
ORDER BY PP.CODE
UNION ALL
SELECT TOP 1
LTRIM(RTRIM(CPT.[CODE])) AS procedureCode,
LTRIM(RTRIM(CPT.[LEVEL])) AS description,
LTRIM(RTRIM(CPT.[CATEGORY])) AS category,
LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
COALESCE(CASE [ACTIVE] WHEN 1 THEN ''True'' WHEN 0 THEN ''False'' WHEN '''' THEN ''False'' ELSE ''False'' END,''True'') AS active,
LTRIM(RTRIM([RVU])) AS rvu,
''2MasterCPT'' AS sourceTable,
''02'' AS sequenceSet
FROM
@ProcedureSearchPhrases PP
INNER JOIN [MASTERCPT] AS CPT
ON CPT.[LEVEL] LIKE PP.[LEVEL] + ''%''
WHERE
CPT.[CODE] IS NOT NULL
AND CPT.[CODE] NOT IN (''0'', '''')
AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
ORDER BY PP.CODE
UNION ALL
SELECT
LTRIM(RTRIM(CPT.[CODE])) AS procedureCode,
LTRIM(RTRIM(CPT.[LEVEL])) AS description,
LTRIM(RTRIM(CPT.[COMBO])) AS category,
LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
''True'' AS active,
LTRIM(RTRIM([RVU])) AS rvu,
''0CPTMore'' AS sourceTable,
''03'' AS sequenceSet
FROM
@ProcedureSearchPhrases PP
INNER JOIN [CPTMORE] AS CPT
ON CPT.[LEVEL] LIKE PP.[LEVEL] + ''%''
WHERE
(CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
AND CPT.[CODE] IS NOT NULL
AND CPT.[CODE] NOT IN (''0'', '''')
AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
UNION ALL
SELECT
LTRIM(RTRIM(CPT.[CODE])) AS procedureCode,
LTRIM(RTRIM(CPT.[LEVEL])) AS description,
LTRIM(RTRIM(CPT.[CATEGORY])) AS category,
LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
COALESCE(CASE [ACTIVE] WHEN 1 THEN ''True'' WHEN 0 THEN ''False'' WHEN '''' THEN ''False'' ELSE ''False'' END,''True'') AS active,
LTRIM(RTRIM([RVU])) AS rvu,
''2MasterCPT'' AS sourceTable,
''03'' AS sequenceSet
FROM
@ProcedureSearchPhrases PP
INNER JOIN [MASTERCPT] AS CPT
ON CPT.[LEVEL] LIKE PP.[LEVEL] + ''%''
WHERE
CPT.[CODE] IS NOT NULL
AND CPT.[CODE] NOT IN (''0'', '''')
AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
UNION ALL
SELECT
LTRIM(RTRIM(CPT.[CODE])) AS procedureCode,
LTRIM(RTRIM(CPT.[LEVEL])) AS description,
LTRIM(RTRIM(CPT.[COMBO])) AS category,
LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
''True'' AS active,
LTRIM(RTRIM([RVU])) AS rvu,
''0CPTMore'' AS sourceTable,
''04'' AS sequenceSet
FROM
@ProcedureSearchPhrases PP
INNER JOIN [CPTMORE] AS CPT
ON CPT.[LEVEL] LIKE ''%'' + PP.[LEVEL] + ''%''
WHERE
(CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
AND CPT.[CODE] IS NOT NULL
AND CPT.[CODE] NOT IN (''0'', '''')
AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
UNION ALL
SELECT
LTRIM(RTRIM(CPT.[CODE])) AS procedureCode,
LTRIM(RTRIM(CPT.[LEVEL])) AS description,
LTRIM(RTRIM(CPT.[CATEGORY])) AS category,
LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
COALESCE(CASE [ACTIVE] WHEN 1 THEN ''True'' WHEN 0 THEN ''False'' WHEN '''' THEN ''False'' ELSE ''False'' END,''True'') AS active,
LTRIM(RTRIM([RVU])) AS rvu,
''2MasterCPT'' AS sourceTable,
''04'' AS sequenceSet
FROM
@ProcedureSearchPhrases PP
INNER JOIN [MASTERCPT] AS CPT
ON CPT.[LEVEL] LIKE ''%'' + PP.[LEVEL] + ''%''
WHERE
CPT.[CODE] IS NOT NULL
AND CPT.[CODE] NOT IN (''0'', '''')
AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
) AS CPTS
ORDER BY
sequenceSet, sourceTable, [description]
END
/* Final select - orders explicitly, because the insertion ORDER BY does not guarantee the order rows are read back */
SELECT TOP 500 procedureCode, description, category, rvu, charge, active
FROM #DistinctMatchingCpts
ORDER BY sequenceSet, sourceTable, description
DROP TABLE #DistinctMatchingCpts
However, this does NOT meet the criterion of best match by count of words (as in the row 1 value in the sample), which should rank highest the rows containing the best (largest) count of words found from that row.
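One possible direction, sketched here under assumptions rather than offered as a finished answer, is to join each description to the phrase rows and aggregate: single-word rows give the count of distinct words matched, and the number of '%' separators in a matching row gives the length of the longest run matched. @Phrases is a hypothetical stand-in for the table-valued parameter (only a few of its rows are shown), @Candidates is made-up data, and a case-insensitive collation is assumed.

```sql
-- Sketch only: @Phrases mimics a few rows of the table-valued parameter.
DECLARE @Phrases TABLE (CODE INT, [LEVEL] VARCHAR(100));
INSERT INTO @Phrases (CODE, [LEVEL]) VALUES
    (4, 'Gastric%Intubation'), (5, 'Gastric'), (9, 'Intubation'), (14, 'Lavage');

DECLARE @Candidates TABLE (description VARCHAR(100));
INSERT INTO @Candidates (description) VALUES
    ('Gastric Intubation'),
    ('Intubation, Gastric'),
    ('Gastric Lavage and Intubation'),
    ('Lavage');

SELECT c.description,
       -- rows without a '%' are single words, so this counts the words found
       SUM(CASE WHEN p.[LEVEL] NOT LIKE '%[%]%' THEN 1 ELSE 0 END) AS wordsMatched,
       -- number of words in the longest phrase row that matched
       MAX(LEN(p.[LEVEL]) - LEN(REPLACE(p.[LEVEL], '%', '')) + 1) AS longestRun
FROM @Candidates AS c
JOIN @Phrases AS p
  ON c.description LIKE '%' + p.[LEVEL] + '%'
GROUP BY c.description
ORDER BY wordsMatched DESC, longestRun DESC, c.description;
```

Note that '%' inside a phrase row matches any text, so a "run" here means the words appear in order, not necessarily adjacent. Using a space instead of '%' as the separator would require strict adjacency, at the cost of missing descriptions that separate the words with punctuation.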
I have full control over the form/format of the table-valued parameter, if that matters.
I am returning this result to a C# program, if that is useful.