Sort spills to tempdb due to varchar(max)
On a server with 32 GB of RAM, we run SQL Server 2014 SP2 with max server memory set to 25 GB. We have two tables; here is a simplified structure of both:
CREATE TABLE [dbo].[Settings](
[id] [int] IDENTITY(1,1) NOT NULL,
[resourceId] [int] NULL,
[typeID] [int] NULL,
[remark] [varchar](max) NULL,
CONSTRAINT [PK_Settings] PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[Resources](
[id] [int] IDENTITY(1,1) NOT NULL,
[resourceUID] [int] NULL,
CONSTRAINT [PK_Resources] PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY]
GO
with the following non-clustered indexes:
CREATE NONCLUSTERED INDEX [IX_UID] ON [dbo].[Resources]
(
[resourceUID] ASC
)
CREATE NONCLUSTERED INDEX [IX_Test] ON [dbo].[Settings]
(
[resourceId] ASC,
[typeID] ASC
)
The database is configured with compatibility level 120.
When I run this query there are spills to tempdb.
This is how I execute the query:
exec sp_executesql N'
select r.id,remark
FROM Resources r
inner join Settings on resourceid=r.id
where resourceUID=@UID
ORDER BY typeID',
N'@UID int',
@UID=38
If I don't select the [remark] field, no spills occur. My first reaction was that the spills occurred due to the low number of estimated rows on the nested-loop operator.
So I added 5 datetime and 5 integer columns to the Settings table and added them to my select statement. When I execute that query, no spills happen.
Why are the spills only happening when [remark] is selected? It probably has something to do with the fact that it is a varchar(max) column.
What can I do to avoid spilling to tempdb?
Adding OPTION (RECOMPILE) to the query makes no difference.
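For reference, spills like this show up as a warning on the Sort operator in the actual execution plan; on SQL Server 2012 and later they can also be traced with the sort_warning extended event. A minimal session sketch (the session name is illustrative, not part of the original question):

```sql
-- Hypothetical diagnostic session; sqlserver.sort_warning fires each time
-- a sort operation spills to tempdb.
CREATE EVENT SESSION [TrackSortSpills] ON SERVER
ADD EVENT sqlserver.sort_warning(
    ACTION (sqlserver.sql_text, sqlserver.session_id))
ADD TARGET package0.ring_buffer;
GO
ALTER EVENT SESSION [TrackSortSpills] ON SERVER STATE = START;
```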
Tags: sql-server, query-performance, tempdb, cardinality-estimates
Maybe you can try select r.id, LEFT(remark, 512) (or whatever sensible substring length might be). – mustaccio, 2 days ago
@Forrest: I'm trying to reproduce the data needed to simulate the problem. At first sight it has to do with the low estimate of the nested loop. In my dummy data the estimated number of rows is much higher and no spilling happens. – Frederik Vanderhaegen, 2 days ago
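A sketch of mustaccio's suggestion applied to the original query. The explicit CAST is my addition, on the assumption that the sort memory grant is sized from declared column widths, so shrinking the declared type (not just the data) is what matters:

```sql
-- Hedged rewrite: LEFT() over a varchar(max) input may still carry a max-typed
-- result, so the CAST pins the declared width the sort has to budget for.
exec sp_executesql N'
select r.id, CAST(LEFT(remark, 512) AS varchar(512)) AS remark
FROM Resources r
inner join Settings on resourceid = r.id
where resourceUID = @UID
ORDER BY typeID',
N'@UID int',
@UID = 38;
```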
asked 2 days ago by Frederik Vanderhaegen; edited yesterday by Paul White♦
3 Answers
There are going to be several possible workarounds here.
You can manually adjust the memory grant, though I probably wouldn't go that route.
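For completeness, SQL Server 2014 SP2 (the OP's build) supports the MIN_GRANT_PERCENT and MAX_GRANT_PERCENT query hints for adjusting the grant per query; a sketch, with an illustrative percentage:

```sql
-- Raise the minimum memory grant so the sort is less likely to spill;
-- the value 10 is illustrative, not tuned for the OP's workload.
select r.id, remark
FROM Resources r
inner join Settings on resourceid = r.id
where resourceUID = @UID
ORDER BY typeID
OPTION (MIN_GRANT_PERCENT = 10);
```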
You can also use a CTE and TOP to push the sort lower in the plan, before fetching the varchar(max) column. It will look something like this:
WITH CTE AS (
SELECT TOP 1000000000 r.ID, s.ID AS ID2, s.typeID
FROM Resources r
inner join Settings s on resourceid=r.id
where resourceUID=@UID
ORDER BY s.typeID
)
SELECT c.ID, ca.remark
FROM CTE c
CROSS APPLY (SELECT remark FROM dbo.Settings s WHERE s.id = c.ID2) ca(remark)
ORDER BY c.typeID
Proof-of-concept dbfiddle here.
Sample data would still be appreciated!
If you want to read an excellent analysis by Paul White, read here.
– Forrest, answered 2 days ago, edited 2 days ago
Why are the spills only happening when [remark] is selected?
The spill is happening when you include that column because you don't get a big enough memory grant for the large string data being sorted.
You don't get a big enough memory grant because the actual number of rows is 10x more than the estimated number of rows (1,302 actual vs 126 estimated).
Why is the estimate off? Why does SQL Server think there's only one row in dbo.Settings with a resourceid of 38?
It could be a statistics issue, which you can check by running DBCC SHOW_STATISTICS('dbo.Settings', 'IX_Test') and inspecting the counts for that histogram step. But the execution plan seems to indicate that the stats are as complete and up-to-date as they could be.
Since stats aren't helping, your best bet is probably a query rewrite - which Forrest has covered in his answer.
– jadarnel27, answered 2 days ago
To me it appears that the where clause in the query is causing the issue and the low estimates, even if OPTION(RECOMPILE) is used.
I created some test data and in the end came up with two solutions: storing the ID field from Resources in either a variable (if it is always unique) or a temp table (if there can be more than one ID).
Base test records
SET NOCOUNT ON
DECLARE @i int= 1;
WHILE @i <= 10000
BEGIN
INSERT INTO [dbo].[Settings]([resourceId],[typeID],remark)
VALUES(@i,@i,'KEPT THESE VALUES OUT BECAUSE IT WOULD CLUTTER THE EXAMPLES, VALUES OVER 8000 Chars entered here'); -- 23254 character length on each value
INSERT INTO [dbo].[Resources](resourceUID)
VALUES(@i);
SET @i += 1;
END
Insert the 'seek' values, to get approximately the same result set as the OP (1,300 records):
INSERT INTO [dbo].[Settings]([resourceId],[typeID],remark)
VALUES(38,38,'KEPT THESE VALUES OUT BECAUSE IT WOULD CLUTTER THE EXAMPLES, VALUES OVER 8000 Chars entered here')
GO 1300
Change the compatibility level and update statistics to match the OP:
ALTER DATABASE StackOverflow SET COMPATIBILITY_LEVEL = 120;
UPDATE STATISTICS settings WITH FULLSCAN;
UPDATE STATISTICS resources WITH FULLSCAN;
Original query
exec sp_executesql N'
select r.id
FROM Resources r
inner join Settings on resourceid=r.id
where resourceUID=@UID
ORDER BY typeID',
N'@UID int',
@UID=38
My estimates are even worse: one estimated row, while 1,300 are returned. And as the OP stated, it does not matter if I add OPTION(RECOMPILE). An important thing to note is that when we get rid of the where clause, the estimates are 100% correct, which is expected since we are using all data in both tables. I forced the indexes just to make sure we use the same ones as in the previous query, to prove the point:
exec sp_executesql N'
select r.id,remark
FROM Resources r with(index([IX_UID]))
inner join Settings WITH(INDEX([IX_Test]))
on resourceid=r.id
ORDER BY typeID',
N'@UID int',
@UID=38
As expected, good estimates.
So, what could we change to get better estimates while still seeking on our values? If @UID is unique, as in the OP's example, we could put the single id returned from Resources in a variable, then seek on that variable with OPTION(RECOMPILE):
DECLARE @UID int =38 , @RID int;
SELECT @RID=r.id from
Resources r where resourceUID = @UID;
SELECT @uid, remark
from Settings
where resourceId = @uid
Order by typeID
OPTION(RECOMPILE);
This gives 100% accurate estimates. But what if there are multiple resourceUID values in Resources? Add some test data:
INSERT INTO Resources(ResourceUID)
VALUES (38);
go 50
This can be resolved with a temp table:
CREATE TABLE #RID (id int)
DECLARE @UID int =38
INSERT INTO #RID
SELECT r.id
from
Resources r where resourceUID = @UID
SELECT @uid, remark
from Settings s
INNER JOIN #RID r
ON r.id =s.resourceId
Order by typeID
OPTION(RECOMPILE)
DROP TABLE #RID
Again, accurate estimates. This was done with my own data set; YMMV.
Written with sp_executesql
With a variable
exec sp_executesql N'
DECLARE @RID int;
SELECT @RID=r.id from
Resources r where resourceUID = @UID;
SELECT @uid, remark
from Settings
where resourceId = @uid
Order by typeID
OPTION(RECOMPILE);',
N'@UID int',
@UID=38
With a temp table
exec sp_executesql N'
CREATE TABLE #RID (id int)
INSERT INTO #RID
SELECT r.id
from
Resources r where resourceUID = @UID
SELECT @uid, remark
from Settings s
INNER JOIN #RID r
ON r.id =s.resourceId
Order by typeID
OPTION(RECOMPILE)
DROP TABLE #RID',
N'@UID int',
@UID=38
Still 100% correct estimates on my test.
You could also create filtered statistics for atypically distributed ResourceUID values to help the cardinality estimator, together with OPTION (RECOMPILE) or embedded literals, e.g. CREATE STATISTICS [id (resourceUID = 38)] ON dbo.Resources (id) WHERE resourceUID = 38 WITH FULLSCAN;. But the main problem is sorting large values unnecessarily, as addressed by Forrest's answer. – Paul White♦, yesterday
To me it appears that the where
clause in the query is giving the issue, and is the cause of the low estimates, even if OPTION(RECOMPILE)
is used.
I created some test data, and in the end came up with two solutions, storing the ID
field from resources
in either a variable (if it is always unique) or a temp table, if we can have more than one ID
's.
Base test records
SET NOCOUNT ON
DECLARE @i int= 1;
WHILE @i <= 10000
BEGIN
INSERT INTO [dbo].[Settings]([resourceId],[typeID],remark)
VALUES(@i,@i,'KEPT THESE VALUES OUT BECAUSE IT WOULD CLUTTER THE EXAMPLES, VALUES OVER 8000 Chars entered here'); -- 23254 character length on each value
INSERT INTO [dbo].[Resources](resourceUID)
VALUES(@i);
SET @i += 1;
END
Insert the 'Seek' values, as to get to the same approximate resultset as OP (1300 records)
INSERT INTO [dbo].[Settings]([resourceId],[typeID],remark)
VALUES(38,38,'KEPT THESE VALUES OUT BECAUSE IT WOULD CLUTTER THE EXAMPLES, VALUES OVER 8000 Chars entered here')
GO 1300
Change compat & Update statistics to match OP
ALTER DATABASE StackOverflow SET COMPATIBILITY_LEVEL = 120;
UPDATE STATISTICS settings WITH FULLSCAN;
UPDATE STATISTICS resources WITH FULLSCAN;
Original query
exec sp_executesql N'
select r.id
FROM Resources r
inner join Settings on resourceid=r.id
where resourceUID=@UID
ORDER BY typeID',
N'@UID int',
@UID=38
My estimates are even worse, with one estimated row, while 1300 are returned.
And like OP stated, it does not matter if I add OPTION(RECOMPILE)
An important thing to note, is that when we get rid of the where clause the estimates are 100% correct, which is expected since we are using all data in both tables.
I forced the indexes just to make sure we use the same ones as in the previous query, to prove the point
exec sp_executesql N'
select r.id,remark
FROM Resources r with(index([IX_UID]))
inner join Settings WITH(INDEX([IX_Test]))
on resourceid=r.id
ORDER BY typeID',
N'@UID int',
@UID=38
As expected, good estimates.
So, what could we change to get better estimates but still seek on our values?
IF @UID is unique, as in the example OP gave, we could put the single id
that was returned from resources
in a variable, then seek on that variable with an OPTION(RECOMPILE)
DECLARE @UID int =38 , @RID int;
SELECT @RID=r.id from
Resources r where resourceUID = @UID;
SELECT @uid, remark
from Settings
where resourceId = @uid
Order by typeID
OPTION(RECOMPILE);
Which gives 100% accurate estimates
But what if there are multiple resourceUID's in resources?
add some test data
INSERT INTO Resources(ResourceUID)
VALUES (38);
go 50
This could be resolved with a temp table
CREATE TABLE #RID (id int)
DECLARE @UID int =38
INSERT INTO #RID
SELECT r.id
from
Resources r where resourceUID = @UID
SELECT @uid, remark
from Settings s
INNER JOIN #RID r
ON r.id =s.resourceId
Order by typeID
OPTION(RECOMPILE)
DROP TABLE #RID
Again with accurate estimates.
This was done with my own dataset,YMMV.
Written with sp_executesql
With a variable
exec sp_executesql N'
DECLARE @RID int;
SELECT @RID=r.id from
Resources r where resourceUID = @UID;
SELECT @uid, remark
from Settings
where resourceId = @uid
Order by typeID
OPTION(RECOMPILE);',
N'@UID int',
@UID=38
With a temp table
exec sp_executesql N'
CREATE TABLE #RID (id int)
INSERT INTO #RID
SELECT r.id
from
Resources r where resourceUID = @UID
SELECT @uid, remark
from Settings s
INNER JOIN #RID r
ON r.id =s.resourceId
Order by typeID
OPTION(RECOMPILE)
DROP TABLE #RID',
N'@UID int',
@UID=38
Still 100% correct estimates on my test
1
You could also create filtered statistics for atypically distributedResourceUID
values to help the cardinality estimator, together withOPTION (RECOMPILE)
or embedded literals e.g.CREATE STATISTICS [id (resourceUID = 38)] ON dbo.Resources (id) WHERE resourceUID = 38 WITH FULLSCAN;
. But the main problem is sorting large values unnecessarily, as addressed by Forrest's answer.
– Paul White♦
yesterday
add a comment |
To me it appears that the where
clause in the query is giving the issue, and is the cause of the low estimates, even if OPTION(RECOMPILE)
is used.
I created some test data, and in the end came up with two solutions, storing the ID
field from resources
in either a variable (if it is always unique) or a temp table, if we can have more than one ID
's.
Base test records
SET NOCOUNT ON
DECLARE @i int= 1;
WHILE @i <= 10000
BEGIN
INSERT INTO [dbo].[Settings]([resourceId],[typeID],remark)
VALUES(@i,@i,'KEPT THESE VALUES OUT BECAUSE IT WOULD CLUTTER THE EXAMPLES, VALUES OVER 8000 Chars entered here'); -- 23254 character length on each value
INSERT INTO [dbo].[Resources](resourceUID)
VALUES(@i);
SET @i += 1;
END
Insert the 'Seek' values, as to get to the same approximate resultset as OP (1300 records)
INSERT INTO [dbo].[Settings]([resourceId],[typeID],remark)
VALUES(38,38,'KEPT THESE VALUES OUT BECAUSE IT WOULD CLUTTER THE EXAMPLES, VALUES OVER 8000 Chars entered here')
GO 1300
Change compat & Update statistics to match OP
ALTER DATABASE StackOverflow SET COMPATIBILITY_LEVEL = 120;
UPDATE STATISTICS settings WITH FULLSCAN;
UPDATE STATISTICS resources WITH FULLSCAN;
Original query
exec sp_executesql N'
select r.id
FROM Resources r
inner join Settings on resourceid=r.id
where resourceUID=@UID
ORDER BY typeID',
N'@UID int',
@UID=38
My estimates are even worse, with one estimated row, while 1300 are returned.
And like OP stated, it does not matter if I add OPTION(RECOMPILE)
An important thing to note, is that when we get rid of the where clause the estimates are 100% correct, which is expected since we are using all data in both tables.
I forced the indexes just to make sure we use the same ones as in the previous query, to prove the point
exec sp_executesql N'
select r.id,remark
FROM Resources r with(index([IX_UID]))
inner join Settings WITH(INDEX([IX_Test]))
on resourceid=r.id
ORDER BY typeID',
N'@UID int',
@UID=38
As expected, good estimates.
So, what could we change to get better estimates but still seek on our values?
IF @UID is unique, as in the example OP gave, we could put the single id
that was returned from resources
in a variable, then seek on that variable with an OPTION(RECOMPILE)
DECLARE @UID int =38 , @RID int;
SELECT @RID=r.id from
Resources r where resourceUID = @UID;
SELECT @uid, remark
from Settings
where resourceId = @uid
Order by typeID
OPTION(RECOMPILE);
Which gives 100% accurate estimates
But what if there are multiple resourceUID's in resources?
add some test data
INSERT INTO Resources(ResourceUID)
VALUES (38);
go 50
This could be resolved with a temp table
CREATE TABLE #RID (id int)
DECLARE @UID int =38
INSERT INTO #RID
SELECT r.id
from
Resources r where resourceUID = @UID
SELECT @uid, remark
from Settings s
INNER JOIN #RID r
ON r.id =s.resourceId
Order by typeID
OPTION(RECOMPILE)
DROP TABLE #RID
Again with accurate estimates.
This was done with my own dataset,YMMV.
Written with sp_executesql
With a variable
exec sp_executesql N'
DECLARE @RID int;
SELECT @RID=r.id from
Resources r where resourceUID = @UID;
SELECT @uid, remark
from Settings
where resourceId = @uid
Order by typeID
OPTION(RECOMPILE);',
N'@UID int',
@UID=38
With a temp table
exec sp_executesql N'
CREATE TABLE #RID (id int)
INSERT INTO #RID
SELECT r.id
from
Resources r where resourceUID = @UID
SELECT @uid, remark
from Settings s
INNER JOIN #RID r
ON r.id =s.resourceId
Order by typeID
OPTION(RECOMPILE)
DROP TABLE #RID',
N'@UID int',
@UID=38
Still 100% correct estimates on my test
1
You could also create filtered statistics for atypically distributedResourceUID
values to help the cardinality estimator, together withOPTION (RECOMPILE)
or embedded literals e.g.CREATE STATISTICS [id (resourceUID = 38)] ON dbo.Resources (id) WHERE resourceUID = 38 WITH FULLSCAN;
. But the main problem is sorting large values unnecessarily, as addressed by Forrest's answer.
– Paul White♦
yesterday
add a comment |
edited 2 days ago
answered 2 days ago
Randi Vertongen
Maybe you can try select r.id, LEFT(remark, 512) (or whatever sensible substring length might be).
– mustaccio
2 days ago
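A sketch of that suggestion against the question's query (512 is an arbitrary cutoff; one caveat is that string functions over MAX types can themselves return MAX-typed results, so an explicit CAST may be needed before the memory grant for the sort actually shrinks):

```sql
-- Truncating remark to a fixed-width varchar caps the assumed per-row
-- size at the sort, making a tempdb spill much less likely. Adjust 512
-- to whatever length the application actually needs.
SELECT r.id,
       CAST(LEFT(s.remark, 512) AS varchar(512)) AS remark
FROM dbo.Resources AS r
INNER JOIN dbo.Settings AS s
    ON s.resourceId = r.id
WHERE r.resourceUID = @UID
ORDER BY s.typeID;
```

This trades completeness of the remark value for a smaller sort; if the full text is required, sorting only the narrow key columns and joining back for remark afterwards (as Forrest's answer is described as doing) avoids the wide sort entirely.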
@Forrest: I'm trying to reproduce the data needed to simulate the problem. At first sight it has to do with the low estimate on the nested loops operator. In my dummy data the estimated number of rows is much higher and no spilling is happening.
– Frederik Vanderhaegen
2 days ago