PostgreSQL stable functions in query












1
















I'm have some data model which consists of couple tables and I need to filter them.



It is two functions funcFast and funcList. funcFast can return fast result is table need to be filtered by funcList or not. funcList return list of allowed ids. I marked functions as STABLE but they run not as fast as I expect:)



I create couple of example functions:



CREATE OR REPLACE FUNCTION funcFastPlPgSql(res boolean)
returns boolean as $$
begin return res; end
$$ language plpgsql stable;

CREATE OR REPLACE FUNCTION funcList(cnt int)
returns setof integer as $$
select generate_series(1, cnt)
$$ language sql stable;


And tests.



Case 1. Filter only by fast function work OK:



explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where funcFastPlPgSql(true)


Query plan is:



Aggregate  (cost=27.76..27.77 rows=1 width=8) (actual time=573.258..573.259 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.006..114.327 rows=1000000 loops=1)
-> Result (cost=0.25..20.25 rows=1000 width=0) (actual time=0.038..489.942 rows=1000000 loops=1)
One-Time Filter: funcfastplpgsql(true)
-> CTE Scan on obs (cost=0.25..20.25 rows=1000 width=0) (actual time=0.012..392.504 rows=1000000 loops=1)
Planning time: 0.184 ms
Execution time: 576.177 ms


Case 2. Filter only by slow function work OK too:



explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where id in (select funcList(1000))


Query plan is:



Aggregate  (cost=62.26..62.27 rows=1 width=8) (actual time=469.344..469.344 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.006..106.144 rows=1000000 loops=1)
-> Hash Join (cost=22.25..56.00 rows=500 width=0) (actual time=1.566..469.202 rows=1000 loops=1)
Hash Cond: (obs.id = (funclist(1000)))
-> CTE Scan on obs (cost=0.00..20.00 rows=1000 width=4) (actual time=0.009..359.580 rows=1000000 loops=1)
-> Hash (cost=19.75..19.75 rows=200 width=4) (actual time=1.548..1.548 rows=1000 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 44kB
-> HashAggregate (cost=17.75..19.75 rows=200 width=4) (actual time=1.101..1.312 rows=1000 loops=1)
Group Key: funclist(1000)
-> Result (cost=0.00..5.25 rows=1000 width=4) (actual time=0.058..0.706 rows=1000 loops=1)
Planning time: 0.141 ms
Execution time: 472.183 ms


Case 3. But then two function combined I expect what the best case should be close to [case 1] and worst case should be close to [case 2], but:



explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where funcFastPlPgSql(true) or id in (select funcList(1000))


Query plan is:



Aggregate  (cost=286.93..286.94 rows=1 width=8) (actual time=1575.775..1575.775 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.008..131.372 rows=1000000 loops=1)
-> CTE Scan on obs (cost=7.75..280.25 rows=667 width=0) (actual time=0.035..1468.007 rows=1000000 loops=1)
Filter: (funcfastplpgsql(true) OR (hashed SubPlan 2))
SubPlan 2
-> Result (cost=0.00..5.25 rows=1000 width=4) (never executed)
Planning time: 0.100 ms
Execution time: 1578.624 ms


What I am missing here? Why query with two together functions runs much longer and how to fix it?










share|improve this question

























  • I can't tell you the "why" of the slowness, but if you put the call to funcFastPlPgSql in case 3 in a subquery (i.e. WHERE (SELECT funcFastPlPgSql(true)) OR... you'll get your desired speed. Seems strange to me because PG should know it can only ever return a single value (based on it being STABLE and a hardcoded input of TRUE), so presumably should only call it once and use the result on every test. It does that for the subquery but doesn't appear to do that for a standard call.

    – 404
    Nov 22 '18 at 14:26


















1
















I'm have some data model which consists of couple tables and I need to filter them.



It is two functions funcFast and funcList. funcFast can return fast result is table need to be filtered by funcList or not. funcList return list of allowed ids. I marked functions as STABLE but they run not as fast as I expect:)



I create couple of example functions:



CREATE OR REPLACE FUNCTION funcFastPlPgSql(res boolean)
returns boolean as $$
begin return res; end
$$ language plpgsql stable;

CREATE OR REPLACE FUNCTION funcList(cnt int)
returns setof integer as $$
select generate_series(1, cnt)
$$ language sql stable;


And tests.



Case 1. Filter only by fast function work OK:



explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where funcFastPlPgSql(true)


Query plan is:



Aggregate  (cost=27.76..27.77 rows=1 width=8) (actual time=573.258..573.259 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.006..114.327 rows=1000000 loops=1)
-> Result (cost=0.25..20.25 rows=1000 width=0) (actual time=0.038..489.942 rows=1000000 loops=1)
One-Time Filter: funcfastplpgsql(true)
-> CTE Scan on obs (cost=0.25..20.25 rows=1000 width=0) (actual time=0.012..392.504 rows=1000000 loops=1)
Planning time: 0.184 ms
Execution time: 576.177 ms


Case 2. Filter only by slow function work OK too:



explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where id in (select funcList(1000))


Query plan is:



Aggregate  (cost=62.26..62.27 rows=1 width=8) (actual time=469.344..469.344 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.006..106.144 rows=1000000 loops=1)
-> Hash Join (cost=22.25..56.00 rows=500 width=0) (actual time=1.566..469.202 rows=1000 loops=1)
Hash Cond: (obs.id = (funclist(1000)))
-> CTE Scan on obs (cost=0.00..20.00 rows=1000 width=4) (actual time=0.009..359.580 rows=1000000 loops=1)
-> Hash (cost=19.75..19.75 rows=200 width=4) (actual time=1.548..1.548 rows=1000 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 44kB
-> HashAggregate (cost=17.75..19.75 rows=200 width=4) (actual time=1.101..1.312 rows=1000 loops=1)
Group Key: funclist(1000)
-> Result (cost=0.00..5.25 rows=1000 width=4) (actual time=0.058..0.706 rows=1000 loops=1)
Planning time: 0.141 ms
Execution time: 472.183 ms


Case 3. But then two function combined I expect what the best case should be close to [case 1] and worst case should be close to [case 2], but:



explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where funcFastPlPgSql(true) or id in (select funcList(1000))


Query plan is:



Aggregate  (cost=286.93..286.94 rows=1 width=8) (actual time=1575.775..1575.775 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.008..131.372 rows=1000000 loops=1)
-> CTE Scan on obs (cost=7.75..280.25 rows=667 width=0) (actual time=0.035..1468.007 rows=1000000 loops=1)
Filter: (funcfastplpgsql(true) OR (hashed SubPlan 2))
SubPlan 2
-> Result (cost=0.00..5.25 rows=1000 width=4) (never executed)
Planning time: 0.100 ms
Execution time: 1578.624 ms


What I am missing here? Why query with two together functions runs much longer and how to fix it?










share|improve this question

























  • I can't tell you the "why" of the slowness, but if you put the call to funcFastPlPgSql in case 3 in a subquery (i.e. WHERE (SELECT funcFastPlPgSql(true)) OR... you'll get your desired speed. Seems strange to me because PG should know it can only ever return a single value (based on it being STABLE and a hardcoded input of TRUE), so presumably should only call it once and use the result on every test. It does that for the subquery but doesn't appear to do that for a standard call.

    – 404
    Nov 22 '18 at 14:26
















1












1








1









I'm have some data model which consists of couple tables and I need to filter them.



It is two functions funcFast and funcList. funcFast can return fast result is table need to be filtered by funcList or not. funcList return list of allowed ids. I marked functions as STABLE but they run not as fast as I expect:)



I create couple of example functions:



CREATE OR REPLACE FUNCTION funcFastPlPgSql(res boolean)
returns boolean as $$
begin return res; end
$$ language plpgsql stable;

CREATE OR REPLACE FUNCTION funcList(cnt int)
returns setof integer as $$
select generate_series(1, cnt)
$$ language sql stable;


And tests.



Case 1. Filter only by fast function work OK:



explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where funcFastPlPgSql(true)


Query plan is:



Aggregate  (cost=27.76..27.77 rows=1 width=8) (actual time=573.258..573.259 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.006..114.327 rows=1000000 loops=1)
-> Result (cost=0.25..20.25 rows=1000 width=0) (actual time=0.038..489.942 rows=1000000 loops=1)
One-Time Filter: funcfastplpgsql(true)
-> CTE Scan on obs (cost=0.25..20.25 rows=1000 width=0) (actual time=0.012..392.504 rows=1000000 loops=1)
Planning time: 0.184 ms
Execution time: 576.177 ms


Case 2. Filter only by slow function work OK too:



explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where id in (select funcList(1000))


Query plan is:



Aggregate  (cost=62.26..62.27 rows=1 width=8) (actual time=469.344..469.344 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.006..106.144 rows=1000000 loops=1)
-> Hash Join (cost=22.25..56.00 rows=500 width=0) (actual time=1.566..469.202 rows=1000 loops=1)
Hash Cond: (obs.id = (funclist(1000)))
-> CTE Scan on obs (cost=0.00..20.00 rows=1000 width=4) (actual time=0.009..359.580 rows=1000000 loops=1)
-> Hash (cost=19.75..19.75 rows=200 width=4) (actual time=1.548..1.548 rows=1000 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 44kB
-> HashAggregate (cost=17.75..19.75 rows=200 width=4) (actual time=1.101..1.312 rows=1000 loops=1)
Group Key: funclist(1000)
-> Result (cost=0.00..5.25 rows=1000 width=4) (actual time=0.058..0.706 rows=1000 loops=1)
Planning time: 0.141 ms
Execution time: 472.183 ms


Case 3. But then two function combined I expect what the best case should be close to [case 1] and worst case should be close to [case 2], but:



explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where funcFastPlPgSql(true) or id in (select funcList(1000))


Query plan is:



Aggregate  (cost=286.93..286.94 rows=1 width=8) (actual time=1575.775..1575.775 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.008..131.372 rows=1000000 loops=1)
-> CTE Scan on obs (cost=7.75..280.25 rows=667 width=0) (actual time=0.035..1468.007 rows=1000000 loops=1)
Filter: (funcfastplpgsql(true) OR (hashed SubPlan 2))
SubPlan 2
-> Result (cost=0.00..5.25 rows=1000 width=4) (never executed)
Planning time: 0.100 ms
Execution time: 1578.624 ms


What I am missing here? Why query with two together functions runs much longer and how to fix it?










share|improve this question

















I'm have some data model which consists of couple tables and I need to filter them.



It is two functions funcFast and funcList. funcFast can return fast result is table need to be filtered by funcList or not. funcList return list of allowed ids. I marked functions as STABLE but they run not as fast as I expect:)



I create couple of example functions:



CREATE OR REPLACE FUNCTION funcFastPlPgSql(res boolean)
returns boolean as $$
begin return res; end
$$ language plpgsql stable;

CREATE OR REPLACE FUNCTION funcList(cnt int)
returns setof integer as $$
select generate_series(1, cnt)
$$ language sql stable;


And tests.



Case 1. Filter only by fast function work OK:



explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where funcFastPlPgSql(true)


Query plan is:



Aggregate  (cost=27.76..27.77 rows=1 width=8) (actual time=573.258..573.259 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.006..114.327 rows=1000000 loops=1)
-> Result (cost=0.25..20.25 rows=1000 width=0) (actual time=0.038..489.942 rows=1000000 loops=1)
One-Time Filter: funcfastplpgsql(true)
-> CTE Scan on obs (cost=0.25..20.25 rows=1000 width=0) (actual time=0.012..392.504 rows=1000000 loops=1)
Planning time: 0.184 ms
Execution time: 576.177 ms


Case 2. Filter only by slow function work OK too:



explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where id in (select funcList(1000))


Query plan is:



Aggregate  (cost=62.26..62.27 rows=1 width=8) (actual time=469.344..469.344 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.006..106.144 rows=1000000 loops=1)
-> Hash Join (cost=22.25..56.00 rows=500 width=0) (actual time=1.566..469.202 rows=1000 loops=1)
Hash Cond: (obs.id = (funclist(1000)))
-> CTE Scan on obs (cost=0.00..20.00 rows=1000 width=4) (actual time=0.009..359.580 rows=1000000 loops=1)
-> Hash (cost=19.75..19.75 rows=200 width=4) (actual time=1.548..1.548 rows=1000 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 44kB
-> HashAggregate (cost=17.75..19.75 rows=200 width=4) (actual time=1.101..1.312 rows=1000 loops=1)
Group Key: funclist(1000)
-> Result (cost=0.00..5.25 rows=1000 width=4) (actual time=0.058..0.706 rows=1000 loops=1)
Planning time: 0.141 ms
Execution time: 472.183 ms


Case 3. But then two function combined I expect what the best case should be close to [case 1] and worst case should be close to [case 2], but:



explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where funcFastPlPgSql(true) or id in (select funcList(1000))


Query plan is:



Aggregate  (cost=286.93..286.94 rows=1 width=8) (actual time=1575.775..1575.775 rows=1 loops=1)
CTE obs
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.008..131.372 rows=1000000 loops=1)
-> CTE Scan on obs (cost=7.75..280.25 rows=667 width=0) (actual time=0.035..1468.007 rows=1000000 loops=1)
Filter: (funcfastplpgsql(true) OR (hashed SubPlan 2))
SubPlan 2
-> Result (cost=0.00..5.25 rows=1000 width=4) (never executed)
Planning time: 0.100 ms
Execution time: 1578.624 ms


What I am missing here? Why query with two together functions runs much longer and how to fix it?







postgresql postgres-9.6






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 22 '18 at 13:48







abobov

















asked Nov 22 '18 at 13:32









abobovabobov

62




62













  • I can't tell you the "why" of the slowness, but if you put the call to funcFastPlPgSql in case 3 in a subquery (i.e. WHERE (SELECT funcFastPlPgSql(true)) OR... you'll get your desired speed. Seems strange to me because PG should know it can only ever return a single value (based on it being STABLE and a hardcoded input of TRUE), so presumably should only call it once and use the result on every test. It does that for the subquery but doesn't appear to do that for a standard call.

    – 404
    Nov 22 '18 at 14:26





















  • I can't tell you the "why" of the slowness, but if you put the call to funcFastPlPgSql in case 3 in a subquery (i.e. WHERE (SELECT funcFastPlPgSql(true)) OR... you'll get your desired speed. Seems strange to me because PG should know it can only ever return a single value (based on it being STABLE and a hardcoded input of TRUE), so presumably should only call it once and use the result on every test. It does that for the subquery but doesn't appear to do that for a standard call.

    – 404
    Nov 22 '18 at 14:26



















I can't tell you the "why" of the slowness, but if you put the call to funcFastPlPgSql in case 3 in a subquery (i.e. WHERE (SELECT funcFastPlPgSql(true)) OR... you'll get your desired speed. Seems strange to me because PG should know it can only ever return a single value (based on it being STABLE and a hardcoded input of TRUE), so presumably should only call it once and use the result on every test. It does that for the subquery but doesn't appear to do that for a standard call.

– 404
Nov 22 '18 at 14:26







I can't tell you the "why" of the slowness, but if you put the call to funcFastPlPgSql in case 3 in a subquery (i.e. WHERE (SELECT funcFastPlPgSql(true)) OR... you'll get your desired speed. Seems strange to me because PG should know it can only ever return a single value (based on it being STABLE and a hardcoded input of TRUE), so presumably should only call it once and use the result on every test. It does that for the subquery but doesn't appear to do that for a standard call.

– 404
Nov 22 '18 at 14:26














0






active

oldest

votes











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53432134%2fpostgresql-stable-functions-in-query%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























0






active

oldest

votes








0






active

oldest

votes









active

oldest

votes






active

oldest

votes
















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53432134%2fpostgresql-stable-functions-in-query%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

If I really need a card on my start hand, how many mulligans make sense? [duplicate]

Alcedinidae

Can an atomic nucleus contain both particles and antiparticles? [duplicate]