Запрос возвращает неверные данные с таблицей в памяти и / или кластерным индексом хранилища столбцов - PullRequest
0 голосов
/ 28 октября 2019

У меня очень странное поведение в этот день. Я немного экспериментирую с таблицами в памяти SQL Server (2017) и индексами хранилищ столбцов. Но после создания очень маленькой базы данных (только 2 таблицы) с очень маленькими данными внутри я заметил, что один из моих запросов возвращает неправильные данные.

Некоторые сведения:

Первая таблица [Атрибут] содержит некоторые иерархические данные (Id, ParentId). В этой таблице есть триггер, который вычисляет некоторые столбцы для Модель вложенного набора .

На основании этого я создал представление [vw_Attribute_Hierarchy], которое возвращает список всех атрибутов с их дочерними узлами. Я хочу использовать это, чтобы суммировать мои данные для каждого иерархического узла (таким образом, включаются дочерние значения).

Вторая таблица [FactImpacts] содержит только некоторые данные для суммирования по столбцам [PlanValue] и [ActualValue].

Я попытался создать схему только с необходимыми столбцами / данными. Но похоже, что результаты запроса зависят от заданного набора данных.

В настоящее время это мой самый маленький скрипт (, пожалуйста, выполните в режиме SQL CMD ), чтобы воспроизвести мою проблему, и он должен вернутьсядва результата с ТРИ строки в каждом, но я получаю только один для первого запроса и три для второго запроса.

    SET ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, ARITHABORT, CONCAT_NULL_YIELDS_NULL, QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;

GO
:setvar DatabaseName "NEW-DB-823474692347"
:setvar DefaultFilePrefix "NEW-DB-823474692347"
:setvar DefaultDataPath "C:\Program Files\Microsoft SQL Server\MSSQL14.SQL2017\MSSQL\DATA\"
:setvar DefaultLogPath "C:\Program Files\Microsoft SQL Server\MSSQL14.SQL2017\MSSQL\DATA\"

GO
:on error exit
GO
/*
Detect SQLCMD mode and disable script execution if SQLCMD mode is not supported.
To re-enable the script after enabling SQLCMD mode, execute the following:
SET NOEXEC OFF; 
*/
:setvar __IsSqlCmdEnabled "True"
GO
IF N'$(__IsSqlCmdEnabled)' NOT LIKE N'True'
    BEGIN
        PRINT N'SQLCMD mode must be enabled to successfully execute this script.';
        SET NOEXEC ON;
    END


GO
USE [master];


GO

IF (DB_ID(N'$(DatabaseName)') IS NOT NULL) 
BEGIN
    ALTER DATABASE [$(DatabaseName)]
    SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE [$(DatabaseName)];
END

GO
PRINT N'Creating $(DatabaseName)...'
GO
CREATE DATABASE [$(DatabaseName)]
    ON 
    PRIMARY(NAME = [$(DatabaseName)], FILENAME = N'$(DefaultDataPath)$(DefaultFilePrefix)_Primary.mdf')
    LOG ON (NAME = [$(DatabaseName)_log], FILENAME = N'$(DefaultLogPath)$(DefaultFilePrefix)_Primary.ldf') COLLATE Latin1_General_CI_AS
GO
IF EXISTS (SELECT 1
           FROM   [master].[dbo].[sysdatabases]
           WHERE  [name] = N'$(DatabaseName)')
    BEGIN
        ALTER DATABASE [$(DatabaseName)]
            SET AUTO_CLOSE OFF 
            WITH ROLLBACK IMMEDIATE;
    END


GO
PRINT N'Creating [FG_MemoryOptimizedData]...';


GO
ALTER DATABASE [$(DatabaseName)]
    ADD FILEGROUP [FG_MemoryOptimizedData] CONTAINS MEMORY_OPTIMIZED_DATA;


GO
ALTER DATABASE [$(DatabaseName)]
    ADD FILE (NAME = [FG_MemoryOptimizedData_76BAE8A0], FILENAME = N'$(DefaultDataPath)$(DefaultFilePrefix)_FG_MemoryOptimizedData_76BAE8A0.mdf') TO FILEGROUP [FG_MemoryOptimizedData];


GO
USE [$(DatabaseName)];

GO
PRINT N'Creating [dbo].[Attribute]...';


GO
CREATE TABLE [dbo].[Attribute] (
    [TenantId]          UNIQUEIDENTIFIER NOT NULL,
    [AttributeId]       UNIQUEIDENTIFIER NOT NULL,
    [AttributeParentId] UNIQUEIDENTIFIER NULL,
    [AttributeName]     NVARCHAR (128)   NOT NULL,
    [LeftExtent]        INT              NOT NULL,
    [RightExtent]       INT              NOT NULL,
    [HierarchyLevel]    INT              NOT NULL,
    [NodeCount]         INT              NOT NULL,
    CONSTRAINT [PK_Attribute] PRIMARY KEY NONCLUSTERED ([AttributeId] ASC)
);


GO
PRINT N'Creating [dbo].[Attribute].[IX_Attribute_Rn]...';


GO
CREATE UNIQUE CLUSTERED INDEX [IX_Attribute_Rn]
    ON [dbo].[Attribute]([TenantId] ASC, [AttributeName] ASC);


GO
PRINT N'Creating [dbo].[Attribute].[IX_Attribute_Name]...';


GO
CREATE UNIQUE NONCLUSTERED INDEX [IX_Attribute_Name]
    ON [dbo].[Attribute]([AttributeName] ASC);


GO
PRINT N'Creating [dbo].[FactImpact]...';


GO
CREATE TABLE [dbo].[FactImpact] (
    [Rn]          BIGINT           IDENTITY (1, 1) NOT NULL,
    [TenantId]    UNIQUEIDENTIFIER CONSTRAINT [DF_FactImpact_TenantId] DEFAULT '10000000-0000-0000-0000-000000000000' NOT NULL,
    [ImpactId]    UNIQUEIDENTIFIER CONSTRAINT [DF_FactImpact_ImpactId] DEFAULT NEWID() NOT NULL,
    [AttributeId] UNIQUEIDENTIFIER NOT NULL,
    [Date]        DATE             NOT NULL,
    [PlanValue]   DECIMAL (38, 10) NULL,
    [ActualValue] DECIMAL (38, 10) NULL,
    CONSTRAINT [PK_FactImpact_Rn] PRIMARY KEY NONCLUSTERED HASH ([Rn]) WITH (BUCKET_COUNT = 100000),
    INDEX [COLIX_FactImpact] CLUSTERED COLUMNSTORE
)
WITH (MEMORY_OPTIMIZED = ON);


GO
PRINT N'Creating [dbo].[DF_Attribute_TenantId]...';


GO
ALTER TABLE [dbo].[Attribute]
    ADD CONSTRAINT [DF_Attribute_TenantId] DEFAULT '10000000-0000-0000-0000-000000000000' FOR [TenantId];


GO
PRINT N'Creating [dbo].[DF_Attribute_Id]...';


GO
ALTER TABLE [dbo].[Attribute]
    ADD CONSTRAINT [DF_Attribute_Id] DEFAULT NEWSEQUENTIALID() FOR [AttributeId];


GO
PRINT N'Creating [dbo].[DF_Attribute_LeftExtend]...';


GO
ALTER TABLE [dbo].[Attribute]
    ADD CONSTRAINT [DF_Attribute_LeftExtend] DEFAULT 0 FOR [LeftExtent];


GO
PRINT N'Creating [dbo].[DF_Attribute_RightExtend]...';


GO
ALTER TABLE [dbo].[Attribute]
    ADD CONSTRAINT [DF_Attribute_RightExtend] DEFAULT 0 FOR [RightExtent];


GO
PRINT N'Creating [dbo].[DF_Attribute_HierarchyLevel]...';


GO
ALTER TABLE [dbo].[Attribute]
    ADD CONSTRAINT [DF_Attribute_HierarchyLevel] DEFAULT 0 FOR [HierarchyLevel];


GO
PRINT N'Creating [dbo].[DF_Attribute_ChildCount]...';


GO
ALTER TABLE [dbo].[Attribute]
    ADD CONSTRAINT [DF_Attribute_ChildCount] DEFAULT 0 FOR [NodeCount];


GO
PRINT N'Creating [dbo].[FK_Attribute_ParentId]...';


GO
ALTER TABLE [dbo].[Attribute]
    ADD CONSTRAINT [FK_Attribute_ParentId] FOREIGN KEY ([AttributeParentId]) REFERENCES [dbo].[Attribute] ([AttributeId]);


GO
PRINT N'Creating [dbo].[vw_Attribute_Hierarchy]...';


GO
CREATE VIEW [dbo].[vw_Attribute_Hierarchy]
AS

  SELECT
        [Root].[TenantId],

        [Root].[AttributeId] AS [AttributeRootId],
        [Root].[AttributeName] AS [AttributeRootName],

        [Node].[AttributeId] AS [AttributeNodeId],
        [Node].[AttributeName] AS [AttributeNodeName]

    FROM [dbo].[Attribute] [Root]
        JOIN [dbo].[Attribute] [Node] ON [Node].[LeftExtent] BETWEEN [Root].[LeftExtent] AND [Root].[RightExtent]
GO
PRINT N'Creating [dbo].[iusp_CalculateNestedSets]...';


GO
CREATE PROCEDURE [dbo].[iusp_CalculateNestedSets]
(
    @SchemaName SYSNAME,
    @TableName SYSNAME,
    @GroupColumnName SYSNAME,
    @IdColumnName SYSNAME = 'Id',
    @ParentIdColumnName SYSNAME = 'ParentId'
)
AS
    -- http://www.sqlservercentral.com/articles/Hierarchy/94040/

    SET NOCOUNT ON


    CREATE TABLE [#HTally]([N] INT NOT NULL PRIMARY KEY)

    INSERT INTO [#HTally]([N])
        SELECT TOP 500 --(16 * 500 = VARBINARY(84000) in length)
                N = ISNULL(CAST(
                        (ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1)*16+1
                    AS INT),0)
            FROM sys.all_columns ac1
                CROSS JOIN sys.all_columns ac2


    --===== Build the new table on-the-fly including some place holders
    ;WITH cteBuildPath AS
    (
        --=== This is the "anchor" part of the recursive CTE.
        -- The only thing it does is load the Root Node.
        SELECT
            [ANCHOR].[TenantId],
            [ANCHOR].[AttributeId],
            [ANCHOR].[AttributeParentId],
            HLevel   = 1,
            [SortPath] =  CAST(
                            CAST([ANCHOR].[AttributeId] AS BINARY(16))
                        AS VARBINARY(8000)) --Up to 1000 levels deep.
      FROM [dbo].[Attribute] AS anchor
      WHERE [AttributeParentId] IS NULL --Only the Root Node has a NULL [ParentId]
      UNION ALL
     --==== This is the "recursive" part of the CTE that adds 1 for each level
         -- and concatenates each level of [Id]'s to the [SortPath] column.
     SELECT
            [RECUR].[TenantId],
            [RECUR].[AttributeId],
            [RECUR].[AttributeParentId],
            HLevel   =  cte.HLevel + 1,
            [SortPath] =  CAST( --This does the concatenation to build [SortPath]
                            cte.[SortPath] + CAST([RECUR].[AttributeId] AS BINARY(16))
                        AS VARBINARY(8000))
        FROM [dbo].[Attribute] AS recur WITH (TABLOCK)
            INNER JOIN cteBuildPath AS cte ON cte.[AttributeId] = [RECUR].[AttributeParentId]
    )
        --=== This final INSERT/SELECT creates the Node # in the same order as a
        -- push-stack would. It also creates the final table with some
        -- "reserved" columns on the fly. We'll leave the [SortPath] column in
        -- place because we're still going to need it later.
        -- The ISNULLs make NOT NULL columns
    SELECT
            [AttributeId] = ISNULL([SORTED].[AttributeId], '00000000-0000-0000-0000-000000000000'),
            [SORTED].[AttributeParentId],
            [HLevel]        = ISNULL([SORTED].HLevel,0),
            [LeftExtent]    = ISNULL(CAST(0 AS INT),0), --Place holder
            [RightExtent]   = ISNULL(CAST(0 AS INT),0), --Place holder
            [NodeNumber]    = ROW_NUMBER() OVER (PARTITION BY [SORTED].[TenantId] ORDER BY [SORTED].[SortPath]),
            [NodeCount]     = ISNULL(CAST(0 AS INT),0), --Place holder
            [SortPath]      = ISNULL([SORTED].[SortPath],[SORTED].[SortPath])
        INTO [#Hierarchy]
        FROM cteBuildPath AS sorted
        OPTION (MAXRECURSION 500) --Change this IF necessary
    ;


    --===== Declare a working variable to hold the result of the calculation
    -- of the LeftExtent so that it may be easily used to create the
    -- RightExtent in a single scan of the final table.
    DECLARE @LeftExtent INT

    --===== Create the Nested Sets from the information available in the table
    -- and in the following CTE. This uses the proprietary form of UPDATE
    -- available in SQL Serrver for extra performance.
    ;WITH cteCountDownlines AS
    (
        --=== Count each occurance of [Id] in the sort path
        SELECT
                [AttributeId] = CAST(SUBSTRING(h.[SortPath],[T].N,16) AS UNIQUEIDENTIFIER),
                [NodeCount]  = COUNT(*) --Includes current node
            FROM [#Hierarchy] h, [#HTally] t
            WHERE [T].N BETWEEN 1 AND DATALENGTH([SortPath])
            GROUP BY SUBSTRING(h.[SortPath],[T].N, 16)
    ) --=== Update the NodeCount and calculate both Bowers
    UPDATE [SourceTable]
         SET @LeftExtent    = [SourceTable].[LeftExtent] = 2 * h.[NodeNumber] - h.[HLevel],
              [SourceTable].[RightExtent] = ([downline].[NodeCount] - 1) * 2 + @LeftExtent + 1,
              [SourceTable].[NodeCount] = [downline].[NodeCount],
              [SourceTable].[HierarchyLevel] = h.[HLevel]
        FROM [dbo].[Attribute] AS [SourceTable]
            JOIN [#Hierarchy] h ON h.[AttributeId] = [SourceTable].[AttributeId]
            JOIN cteCountDownlines downline ON h.[AttributeId] = [downline].[AttributeId]
        WHERE NOT (
                    [SourceTable].[LeftExtent] = 2 * h.[NodeNumber] - h.[HLevel]
                AND [SourceTable].[RightExtent] = ([downline].[NodeCount] - 1) * 2 + @LeftExtent + 1
                AND [SourceTable].[NodeCount] = [downline].[NodeCount]
                AND [SourceTable].[HierarchyLevel] = h.[HLevel]
            )
        ;


RETURN 0
GO
PRINT N'Creating [dbo].[tr_Attribute_IUD]...';


GO

CREATE TRIGGER [dbo].[tr_Attribute_IUD]
     ON [dbo].[Attribute]
     FOR INSERT, UPDATE, DELETE
     AS
     BEGIN
            SET NOCOUNT ON

            IF UPDATE([AttributeId]) OR UPDATE([AttributeParentId]) BEGIN
                EXEC [dbo].[iusp_CalculateNestedSets] @SchemaName = 'dbo',  @TableName = 'Attribute', @GroupColumnName = 'TenantId', @IdColumnName = 'AttributeId', @ParentIdColumnName = 'AttributeParentId'
            END
     END
GO
/*
Post-Deployment Script Template
--------------------------------------------------------------------------------------
 This file contains SQL statements that will be appended to the build script.
 Use SQLCMD syntax to include a file in the post-deployment script.
 Example:      :r .\myfile.sql
 Use SQLCMD syntax to reference a variable in the post-deployment script.
 Example:      :setvar TableName MyTable
               SELECT * FROM [$(TableName)]
--------------------------------------------------------------------------------------
*/
DECLARE @ItemsPerHierarchyLevel INT = 2
DECLARE @HierarchyLevels INT = 3

-- ATTRIBUTE DATA
;WITH [Source]
AS
(
    SELECT TOP (@ItemsPerHierarchyLevel)
            ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS [RN]
        FROM [sys].[syscolumns]
),
[CTE]
AS
(
    SELECT
            NEWID() AS [AttributeId],
            CAST(NULL AS UNIQUEIDENTIFIER) AS [AttributeParentId],
            CAST('Attribute-' + CAST([Source].[RN] AS NVARCHAR(128)) AS NVARCHAR(128)) AS [AttributeName],
            1 AS [Level]
        FROM [Source]
    UNION ALL
    SELECT
            NEWID() AS [AttributeId],
            [CTE].[AttributeId] AS [AttributeParentId],
            CAST([CTE].[AttributeName] + '.' + CAST([Source].[RN] AS NVARCHAR(128)) AS NVARCHAR(128)) AS [AttributeName],
            [CTE].[Level] + 1 AS [Level]
        FROM [Source]
            JOIN [CTE] ON [CTE].[Level] < @HierarchyLevels
)
INSERT INTO [dbo].[Attribute]([AttributeId], [AttributeParentId], [AttributeName])
    SELECT
            [CTE].[AttributeId],
            [CTE].[AttributeParentId],
            [CTE].[AttributeName]
        FROM [CTE]
        ORDER BY [AttributeName]
    ;


-- IMPACT DATA
;WITH [RN]
AS
(
    SELECT TOP (24)
            ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS [RN]
        FROM [sys].[syscolumns]
),
[Date]
AS
(
    SELECT DATEADD(MONTH, [RN].[RN] - 1, DATEFROMPARTS(2019, 1, 1)) AS [Date]
        FROM [RN]
)
INSERT INTO [dbo].[FactImpact]([AttributeId], [Date], [PlanValue], [ActualValue])
SELECT
        [Attribute].[AttributeId],
        [Date].[Date],
        1 AS [PlanValue],
        1 AS [ActualValue]
    FROM [dbo].[Attribute],
        [dbo].[Attribute] [Attribute2], -- ONLY TO DUPLICATE DATA
        [dbo].[Attribute] [Attribute3], -- ONLY TO DUPLICATE DATA
        [Date]
;

GO

PRINT N'Update complete.';


GO

USE [NEW-DB-823474692347]
GO

SELECT 
        [vw_Attribute_Hierarchy].[AttributeRootId],
        [vw_Attribute_Hierarchy].[AttributeRootName],
        SUM([FactImpact].[PlanValue]) AS [SumFactImpacts]
    FROM 

        [dbo].[FactImpact] INNER JOIN [vw_Attribute_Hierarchy] [vw_Attribute_Hierarchy]  ON [vw_Attribute_Hierarchy].[AttributeNodeId] = [FactImpact].[AttributeId]

    WHERE [vw_Attribute_Hierarchy].[AttributeNodeName] = 'Attribute-1.1.1'
    GROUP BY [vw_Attribute_Hierarchy].[AttributeRootId]
        , [vw_Attribute_Hierarchy].[AttributeRootName]

GO


SELECT 
        [vw_Attribute_Hierarchy].[AttributeRootId],
        [vw_Attribute_Hierarchy].[AttributeRootName],
        SUM([FactImpact].[PlanValue]) AS [SumFactImpacts]
    FROM 

        [vw_Attribute_Hierarchy] [vw_Attribute_Hierarchy] INNER JOIN [dbo].[FactImpact]  ON [vw_Attribute_Hierarchy].[AttributeNodeId] = [FactImpact].[AttributeId]

    WHERE [vw_Attribute_Hierarchy].[AttributeNodeName] = 'Attribute-1.1.1'
    GROUP BY [vw_Attribute_Hierarchy].[AttributeRootId]
        , [vw_Attribute_Hierarchy].[AttributeRootName]


GO

Странно то, что если вы сделаете одно из следующих изменений, оба запроса вернутсявсегда одни и те же правильные данные:

1) Я переключил таблицы в INNER JOIN для последних двух операторов SELECT. Если [FactImpact] находится на втором месте, результат верный.

2) Удалите опцию In-Memory в [FactImpacts]

3) Удалите индекс хранилища кластеризованных столбцов в [FactImpacts]

4) Уменьшение или увеличение объема данных в [FactImpacts]

DELETE [dbo].[FactImpact] WHERE [AttributeId] IN (SELECT [AttributeId] FROM [Attribute] WHERE [Attribute].[AttributeName] LIKE 'Attribute-2%')

5) Добавление еще одного индекса в таблицу [Атрибуты]

CREATE INDEX [IX_Attribute_Nested_Set] ON [dbo].[Attribute] ([LeftExtent], [RightExtent])

Iнадеюсь, вы сможете следовать моим объяснениям!

Мне удалось воспроизвести это странное поведение локально в SQL Server 2016/2017.

...