Nested-Loop-Join: How many comparisons and how many pages-accesses?Subqueries run very fast individually, but when joined are very slowHow to INNER JOIN and OUTER JOIN within the same table?Outer Apply vs Left Join PerformanceHow to interpret correlated subqueries?Do Large Hash Joins Contribute to Stolen Pages?How are block nested loop joins optimised for disk reads?Why does the UNION operation require that the relations on which they are applied be union compatible?Equi Join On Union- and Intersection-Compatible RelationsWhat is the relation between Tablespaces, .frm & .ibd files and database pages in MySQL?Block nested loop join cost

Why is DC so, so, so Democratic?

Where is this photo of a group of hikers taken? Is it really in the Ural?

Are there any English words pronounced with sounds/syllables that aren't part of the spelling?

How could Barty Crouch Jr. have run out of Polyjuice Potion at the end of the Goblet of Fire movie?

What is "It is x o'clock" in Japanese with subject

How can I indicate that what I'm saying is not sarcastic online?

My current job follows "worst practices". How can I talk about my experience in an interview without giving off red flags?

What kind of vegetable has pink and white concentric rings?

High income and difficulty during interviews

What is a "staved" town, like in "Staverton"?

How am I supposed to put out fires?

What kind of curve (or model) should I fit to my percentage data?

On the history of Haar measure

Why do people say "I am broke" instead of "I am broken"?

What is the best word describing the nature of expiring in a short amount of time, connoting "losing public attention"?

Why does the salt in the oceans not sink to the bottom?

Ultraproduct of Dividing Lines

How to run a substitute command on only a certain part of the line

How can I show that the speed of light in vacuum is the same in all reference frames?

What gave NASA the confidence for a translunar injection in Apollo 8?

Would using carbon dioxide as fuel work to reduce the greenhouse effect?

How to Sow[] until I've Reap[]'d enough?

Is an easily guessed plot twist a good plot twist?

Inverse Colombian Function



Nested-Loop-Join: How many comparisons and how many pages-accesses?


Subqueries run very fast individually, but when joined are very slowHow to INNER JOIN and OUTER JOIN within the same table?Outer Apply vs Left Join PerformanceHow to interpret correlated subqueries?Do Large Hash Joins Contribute to Stolen Pages?How are block nested loop joins optimised for disk reads?Why does the UNION operation require that the relations on which they are applied be union compatible?Equi Join On Union- and Intersection-Compatible RelationsWhat is the relation between Tablespaces, .frm & .ibd files and database pages in MySQL?Block nested loop join cost






.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty margin-bottom:0;








7















Say you have two relations R and S , where R has 1000 tuples and 100 page-accesses, and S has 50 tuples and 25 page-accesses.



Assuming R is the outer relation, then how many tuple-comparisons and page-accesses are done?



And how many page-accesses if R is the inner relation?



for each tuple r in R do
for each tuple s in S do
if r and s satisfy the join condition
then output the tuple (r,s)


So in order to find out how many tuple-comparisons are done, I need to do 1000 * 50 = 50000 because the algorithm is doing this "for each" tuple and we have in total 1000 tuples for R and 50 tuples for S, thus 50000 comparisons in total.



But how to know the page-accesses now? If R is outside, we have (1000 tuples) * (25 page-accesses for S) + (100 page-accesses for R) = 25100 page accesses?



And if R is inside, then: 50 * 100 + 25 = 5025 page accesses



I'm not sure if it is correct like that.. or how is this done correctly pleaseee? :/










share|improve this question



















  • 1





    What is a "Page"? Is Caching involved, or do you assume all pages are equally accessible? Why is "page accesses a useful metric? Is this intentionally a "cross join", as there seems to be no relation to limit the accesses to the second table in the NLJ.

    – Rick James
    Jul 14 at 4:04

















7















Say you have two relations R and S , where R has 1000 tuples and 100 page-accesses, and S has 50 tuples and 25 page-accesses.



Assuming R is the outer relation, then how many tuple-comparisons and page-accesses are done?



And how many page-accesses if R is the inner relation?



for each tuple r in R do
for each tuple s in S do
if r and s satisfy the join condition
then output the tuple (r,s)


So in order to find out how many tuple-comparisons are done, I need to do 1000 * 50 = 50000 because the algorithm is doing this "for each" tuple and we have in total 1000 tuples for R and 50 tuples for S, thus 50000 comparisons in total.



But how to know the page-accesses now? If R is outside, we have (1000 tuples) * (25 page-accesses for S) + (100 page-accesses for R) = 25100 page accesses?



And if R is inside, then: 50 * 100 + 25 = 5025 page accesses



I'm not sure if it is correct like that.. or how is this done correctly pleaseee? :/










share|improve this question



















  • 1





    What is a "Page"? Is Caching involved, or do you assume all pages are equally accessible? Why is "page accesses a useful metric? Is this intentionally a "cross join", as there seems to be no relation to limit the accesses to the second table in the NLJ.

    – Rick James
    Jul 14 at 4:04













7












7








7








Say you have two relations R and S , where R has 1000 tuples and 100 page-accesses, and S has 50 tuples and 25 page-accesses.



Assuming R is the outer relation, then how many tuple-comparisons and page-accesses are done?



And how many page-accesses if R is the inner relation?



for each tuple r in R do
for each tuple s in S do
if r and s satisfy the join condition
then output the tuple (r,s)


So in order to find out how many tuple-comparisons are done, I need to do 1000 * 50 = 50000 because the algorithm is doing this "for each" tuple and we have in total 1000 tuples for R and 50 tuples for S, thus 50000 comparisons in total.



But how to know the page-accesses now? If R is outside, we have (1000 tuples) * (25 page-accesses for S) + (100 page-accesses for R) = 25100 page accesses?



And if R is inside, then: 50 * 100 + 25 = 5025 page accesses



I'm not sure if it is correct like that.. or how is this done correctly pleaseee? :/










share|improve this question
















Say you have two relations R and S , where R has 1000 tuples and 100 page-accesses, and S has 50 tuples and 25 page-accesses.



Assuming R is the outer relation, then how many tuple-comparisons and page-accesses are done?



And how many page-accesses if R is the inner relation?



for each tuple r in R do
for each tuple s in S do
if r and s satisfy the join condition
then output the tuple (r,s)


So in order to find out how many tuple-comparisons are done, I need to do 1000 * 50 = 50000 because the algorithm is doing this "for each" tuple and we have in total 1000 tuples for R and 50 tuples for S, thus 50000 comparisons in total.



But how to know the page-accesses now? If R is outside, we have (1000 tuples) * (25 page-accesses for S) + (100 page-accesses for R) = 25100 page accesses?



And if R is inside, then: 50 * 100 + 25 = 5025 page accesses



I'm not sure if it is correct like that.. or how is this done correctly pleaseee? :/







sql-server mysql join






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Jul 15 at 19:33









MDCCL

7,1243 gold badges18 silver badges47 bronze badges




7,1243 gold badges18 silver badges47 bronze badges










asked Jul 14 at 1:04









cnmesrcnmesr

1384 bronze badges




1384 bronze badges







  • 1





    What is a "Page"? Is Caching involved, or do you assume all pages are equally accessible? Why is "page accesses a useful metric? Is this intentionally a "cross join", as there seems to be no relation to limit the accesses to the second table in the NLJ.

    – Rick James
    Jul 14 at 4:04












  • 1





    What is a "Page"? Is Caching involved, or do you assume all pages are equally accessible? Why is "page accesses a useful metric? Is this intentionally a "cross join", as there seems to be no relation to limit the accesses to the second table in the NLJ.

    – Rick James
    Jul 14 at 4:04







1




1





What is a "Page"? Is Caching involved, or do you assume all pages are equally accessible? Why is "page accesses a useful metric? Is this intentionally a "cross join", as there seems to be no relation to limit the accesses to the second table in the NLJ.

– Rick James
Jul 14 at 4:04





What is a "Page"? Is Caching involved, or do you assume all pages are equally accessible? Why is "page accesses a useful metric? Is this intentionally a "cross join", as there seems to be no relation to limit the accesses to the second table in the NLJ.

– Rick James
Jul 14 at 4:04










2 Answers
2






active

oldest

votes


















11














We can force SQL Server to do exactly this and see what actually happens.



R has 1000 tuples and 100 page-accesses = 10 tuples/page = 806 bytes/tuple.

S has 50 tuples and 25 page-accesses = 2 tuples/page = 4030 bytes/tuple.



These are the tables:



drop table if exists dbo.R;
drop table if exists dbo.S;
go

create table dbo.R(n int, filler char(785) not null default '');
create table dbo.S(n int, filler char(3990) not null default '');


Filler columns sizes have been rounded down to allow for row overhead. Note that there are no indexes. I have a "numbers" table which I'll use to populate R and S:



insert dbo.R(n) select Number from dbo.Numbers where Number between 1 and 1000;
insert dbo.S(n) select Number from dbo.Numbers where Number between 1 and 50;


We can check just how many pages are involved:



set statistics io on;

select * from R
select * from S


The messages tab of SSMS shows



Table 'R'. Scan count 1, logical reads 100, ...
Table 'S'. Scan count 1, logical reads 25, ...


We have just the right number of pages. A bit of jiggery-pokery will get the behaviour you want to examine



select *
from dbo.R -- R will be outer
inner loop join dbo.S
on r.N = s.N
option
(
force order, -- dictate which table is outer and which inner
NO_PERFORMANCE_SPOOL -- stop the QO from doing something clever but distracting
);

select *
from dbo.S -- S will be outer
inner loop join dbo.R
on r.N = s.N
option (force order, NO_PERFORMANCE_SPOOL);


Which gives this in the messages tab (inner table is listed before the outer table)



Table 'S'. Scan count 1, logical reads 25000, ...
Table 'R'. Scan count 1, logical reads 100, ...

Table 'R'. Scan count 1, logical reads 5000, ..
Table 'S'. Scan count 1, logical reads 25, ...


In SQL Server query execution proceeds row-wise. For each row in the outer table the corresponding row(s) in the inner table will be referenced. Since there are no indexes the only option is to read all the rows (i.e. all the pages) from the inner table every time. For R-join-S we have 1,000 outer rows times 25 inner pages giving 25,000 inner page references plus, of course, 100 outer page references. For S-join-R there are 50 rows times 100 pages giving 5,000 inner page references plus 25 outer page references.



In terms of tuple comparisons you are correct - there will be O(R)xO(S) comparisons - 50,000. This is supported by looking at the query plan. For both queries the "Number of Rows Read" for the inner table references are both 50,000.



If there are indexes the query optimizer (QO) has choices other than a table scan. Rewinds may be used for duplicate outer keys. No pages may be read for non-matching keys. In the extreme case where a constraint says there cannot be any matches the inner table may not even be referenced.






share|improve this answer

























  • I think there's a typo here "comparisons - 50,000" & "references are both 50,000". It should be 5000.

    – Rajesh Ranjan
    Jul 14 at 8:29






  • 1





    @RajeshRanjan the number of comparisons will be count(R) times count(S) = 1,000 x 50 = 50,000. Run the DDL & DML then look at the query plan properties for the inner table reference.

    – Michael Green
    Jul 14 at 9:02












  • Oh! yes, I skipped "Number of Rows Read". You're right.

    – Rajesh Ranjan
    Jul 14 at 10:22


















6














The truth is more involved than what you realize. It is true that the outer input of the join will require 1000 logical reads but only if the join key is unique. If it is not, the optimizer can pre sort it and fetch multiple rows at and match all of them at once.
As for the inner loop, you are assuming a full scan per iteration. The optimizer will typically prefer nested loops if the inner input is indexed, in which case the number of page fetch will be determined by the cardinality of that set.



My 5 cents - don’t worry about physical implementation details. Invest your resources in perfecting the data model, schema, and code. Let the engine worry about nested loops.



HTH






share|improve this answer



























    Your Answer








    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "182"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader:
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    ,
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );













    draft saved

    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f242813%2fnested-loop-join-how-many-comparisons-and-how-many-pages-accesses%23new-answer', 'question_page');

    );

    Post as a guest















    Required, but never shown

























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    11














    We can force SQL Server to do exactly this and see what actually happens.



    R has 1000 tuples and 100 page-accesses = 10 tuples/page = 806 bytes/tuple.

    S has 50 tuples and 25 page-accesses = 2 tuples/page = 4030 bytes/tuple.



    These are the tables:



    drop table if exists dbo.R;
    drop table if exists dbo.S;
    go

    create table dbo.R(n int, filler char(785) not null default '');
    create table dbo.S(n int, filler char(3990) not null default '');


    Filler columns sizes have been rounded down to allow for row overhead. Note that there are no indexes. I have a "numbers" table which I'll use to populate R and S:



    insert dbo.R(n) select Number from dbo.Numbers where Number between 1 and 1000;
    insert dbo.S(n) select Number from dbo.Numbers where Number between 1 and 50;


    We can check just how many pages are involved:



    set statistics io on;

    select * from R
    select * from S


    The messages tab of SSMS shows



    Table 'R'. Scan count 1, logical reads 100, ...
    Table 'S'. Scan count 1, logical reads 25, ...


    We have just the right number of pages. A bit of jiggery-pokery will get the behaviour you want to examine



    select *
    from dbo.R -- R will be outer
    inner loop join dbo.S
    on r.N = s.N
    option
    (
    force order, -- dictate which table is outer and which inner
    NO_PERFORMANCE_SPOOL -- stop the QO from doing something clever but distracting
    );

    select *
    from dbo.S -- S will be outer
    inner loop join dbo.R
    on r.N = s.N
    option (force order, NO_PERFORMANCE_SPOOL);


    Which gives this in the messages tab (inner table is listed before the outer table)



    Table 'S'. Scan count 1, logical reads 25000, ...
    Table 'R'. Scan count 1, logical reads 100, ...

    Table 'R'. Scan count 1, logical reads 5000, ..
    Table 'S'. Scan count 1, logical reads 25, ...


    In SQL Server query execution proceeds row-wise. For each row in the outer table the corresponding row(s) in the inner table will be referenced. Since there are no indexes the only option is to read all the rows (i.e. all the pages) from the inner table every time. For R-join-S we have 1,000 outer rows times 25 inner pages giving 25,000 inner page references plus, of course, 100 outer page references. For S-join-R there are 50 rows times 100 pages giving 5,000 inner page references plus 25 outer page references.



    In terms of tuple comparisons you are correct - there will be O(R)xO(S) comparisons - 50,000. This is supported by looking at the query plan. For both queries the "Number of Rows Read" for the inner table references are both 50,000.



    If there are indexes the query optimizer (QO) has choices other than a table scan. Rewinds may be used for duplicate outer keys. No pages may be read for non-matching keys. In the extreme case where a constraint says there cannot be any matches the inner table may not even be referenced.






    share|improve this answer

























    • I think there's a typo here "comparisons - 50,000" & "references are both 50,000". It should be 5000.

      – Rajesh Ranjan
      Jul 14 at 8:29






    • 1





      @RajeshRanjan the number of comparisons will be count(R) times count(S) = 1,000 x 50 = 50,000. Run the DDL & DML then look at the query plan properties for the inner table reference.

      – Michael Green
      Jul 14 at 9:02












    • Oh! yes, I skipped "Number of Rows Read". You're right.

      – Rajesh Ranjan
      Jul 14 at 10:22















    11














    We can force SQL Server to do exactly this and see what actually happens.



    R has 1000 tuples and 100 page-accesses = 10 tuples/page = 806 bytes/tuple.

    S has 50 tuples and 25 page-accesses = 2 tuples/page = 4030 bytes/tuple.



    These are the tables:



    drop table if exists dbo.R;
    drop table if exists dbo.S;
    go

    create table dbo.R(n int, filler char(785) not null default '');
    create table dbo.S(n int, filler char(3990) not null default '');


    Filler columns sizes have been rounded down to allow for row overhead. Note that there are no indexes. I have a "numbers" table which I'll use to populate R and S:



    insert dbo.R(n) select Number from dbo.Numbers where Number between 1 and 1000;
    insert dbo.S(n) select Number from dbo.Numbers where Number between 1 and 50;


    We can check just how many pages are involved:



    set statistics io on;

    select * from R
    select * from S


    The messages tab of SSMS shows



    Table 'R'. Scan count 1, logical reads 100, ...
    Table 'S'. Scan count 1, logical reads 25, ...


    We have just the right number of pages. A bit of jiggery-pokery will get the behaviour you want to examine



    select *
    from dbo.R -- R will be outer
    inner loop join dbo.S
    on r.N = s.N
    option
    (
    force order, -- dictate which table is outer and which inner
    NO_PERFORMANCE_SPOOL -- stop the QO from doing something clever but distracting
    );

    select *
    from dbo.S -- S will be outer
    inner loop join dbo.R
    on r.N = s.N
    option (force order, NO_PERFORMANCE_SPOOL);


    Which gives this in the messages tab (inner table is listed before the outer table)



    Table 'S'. Scan count 1, logical reads 25000, ...
    Table 'R'. Scan count 1, logical reads 100, ...

    Table 'R'. Scan count 1, logical reads 5000, ..
    Table 'S'. Scan count 1, logical reads 25, ...


    In SQL Server query execution proceeds row-wise. For each row in the outer table the corresponding row(s) in the inner table will be referenced. Since there are no indexes the only option is to read all the rows (i.e. all the pages) from the inner table every time. For R-join-S we have 1,000 outer rows times 25 inner pages giving 25,000 inner page references plus, of course, 100 outer page references. For S-join-R there are 50 rows times 100 pages giving 5,000 inner page references plus 25 outer page references.



    In terms of tuple comparisons you are correct - there will be O(R)xO(S) comparisons - 50,000. This is supported by looking at the query plan. For both queries the "Number of Rows Read" for the inner table references are both 50,000.



    If there are indexes the query optimizer (QO) has choices other than a table scan. Rewinds may be used for duplicate outer keys. No pages may be read for non-matching keys. In the extreme case where a constraint says there cannot be any matches the inner table may not even be referenced.






    share|improve this answer

























    • I think there's a typo here "comparisons - 50,000" & "references are both 50,000". It should be 5000.

      – Rajesh Ranjan
      Jul 14 at 8:29






    • 1





      @RajeshRanjan the number of comparisons will be count(R) times count(S) = 1,000 x 50 = 50,000. Run the DDL & DML then look at the query plan properties for the inner table reference.

      – Michael Green
      Jul 14 at 9:02












    • Oh! yes, I skipped "Number of Rows Read". You're right.

      – Rajesh Ranjan
      Jul 14 at 10:22













    11












    11








    11







    We can force SQL Server to do exactly this and see what actually happens.



    R has 1000 tuples and 100 page-accesses = 10 tuples/page = 806 bytes/tuple.

    S has 50 tuples and 25 page-accesses = 2 tuples/page = 4030 bytes/tuple.



    These are the tables:



    drop table if exists dbo.R;
    drop table if exists dbo.S;
    go

    create table dbo.R(n int, filler char(785) not null default '');
    create table dbo.S(n int, filler char(3990) not null default '');


    Filler columns sizes have been rounded down to allow for row overhead. Note that there are no indexes. I have a "numbers" table which I'll use to populate R and S:



    insert dbo.R(n) select Number from dbo.Numbers where Number between 1 and 1000;
    insert dbo.S(n) select Number from dbo.Numbers where Number between 1 and 50;


    We can check just how many pages are involved:



    set statistics io on;

    select * from R
    select * from S


    The messages tab of SSMS shows



    Table 'R'. Scan count 1, logical reads 100, ...
    Table 'S'. Scan count 1, logical reads 25, ...


    We have just the right number of pages. A bit of jiggery-pokery will get the behaviour you want to examine



    select *
    from dbo.R -- R will be outer
    inner loop join dbo.S
    on r.N = s.N
    option
    (
    force order, -- dictate which table is outer and which inner
    NO_PERFORMANCE_SPOOL -- stop the QO from doing something clever but distracting
    );

    select *
    from dbo.S -- S will be outer
    inner loop join dbo.R
    on r.N = s.N
    option (force order, NO_PERFORMANCE_SPOOL);


    Which gives this in the messages tab (inner table is listed before the outer table)



    Table 'S'. Scan count 1, logical reads 25000, ...
    Table 'R'. Scan count 1, logical reads 100, ...

    Table 'R'. Scan count 1, logical reads 5000, ..
    Table 'S'. Scan count 1, logical reads 25, ...


    In SQL Server query execution proceeds row-wise. For each row in the outer table the corresponding row(s) in the inner table will be referenced. Since there are no indexes the only option is to read all the rows (i.e. all the pages) from the inner table every time. For R-join-S we have 1,000 outer rows times 25 inner pages giving 25,000 inner page references plus, of course, 100 outer page references. For S-join-R there are 50 rows times 100 pages giving 5,000 inner page references plus 25 outer page references.



    In terms of tuple comparisons you are correct - there will be O(R)xO(S) comparisons - 50,000. This is supported by looking at the query plan. For both queries the "Number of Rows Read" for the inner table references are both 50,000.



    If there are indexes the query optimizer (QO) has choices other than a table scan. Rewinds may be used for duplicate outer keys. No pages may be read for non-matching keys. In the extreme case where a constraint says there cannot be any matches the inner table may not even be referenced.






    share|improve this answer















    We can force SQL Server to do exactly this and see what actually happens.



    R has 1000 tuples and 100 page-accesses = 10 tuples/page = 806 bytes/tuple.

    S has 50 tuples and 25 page-accesses = 2 tuples/page = 4030 bytes/tuple.



    These are the tables:



    drop table if exists dbo.R;
    drop table if exists dbo.S;
    go

    create table dbo.R(n int, filler char(785) not null default '');
    create table dbo.S(n int, filler char(3990) not null default '');


    Filler columns sizes have been rounded down to allow for row overhead. Note that there are no indexes. I have a "numbers" table which I'll use to populate R and S:



    insert dbo.R(n) select Number from dbo.Numbers where Number between 1 and 1000;
    insert dbo.S(n) select Number from dbo.Numbers where Number between 1 and 50;


    We can check just how many pages are involved:



    set statistics io on;

    select * from R
    select * from S


    The messages tab of SSMS shows



    Table 'R'. Scan count 1, logical reads 100, ...
    Table 'S'. Scan count 1, logical reads 25, ...


    We have just the right number of pages. A bit of jiggery-pokery will get the behaviour you want to examine



    select *
    from dbo.R -- R will be outer
    inner loop join dbo.S
    on r.N = s.N
    option
    (
    force order, -- dictate which table is outer and which inner
    NO_PERFORMANCE_SPOOL -- stop the QO from doing something clever but distracting
    );

    select *
    from dbo.S -- S will be outer
    inner loop join dbo.R
    on r.N = s.N
    option (force order, NO_PERFORMANCE_SPOOL);


    Which gives this in the messages tab (inner table is listed before the outer table)



    Table 'S'. Scan count 1, logical reads 25000, ...
    Table 'R'. Scan count 1, logical reads 100, ...

    Table 'R'. Scan count 1, logical reads 5000, ..
    Table 'S'. Scan count 1, logical reads 25, ...


    In SQL Server query execution proceeds row-wise. For each row in the outer table the corresponding row(s) in the inner table will be referenced. Since there are no indexes the only option is to read all the rows (i.e. all the pages) from the inner table every time. For R-join-S we have 1,000 outer rows times 25 inner pages giving 25,000 inner page references plus, of course, 100 outer page references. For S-join-R there are 50 rows times 100 pages giving 5,000 inner page references plus 25 outer page references.



    In terms of tuple comparisons you are correct - there will be O(R)xO(S) comparisons - 50,000. This is supported by looking at the query plan. For both queries the "Number of Rows Read" for the inner table references are both 50,000.



    If there are indexes the query optimizer (QO) has choices other than a table scan. Rewinds may be used for duplicate outer keys. No pages may be read for non-matching keys. In the extreme case where a constraint says there cannot be any matches the inner table may not even be referenced.







    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited Jul 14 at 5:15

























    answered Jul 14 at 4:47









    Michael GreenMichael Green

    15.5k8 gold badges32 silver badges68 bronze badges




    15.5k8 gold badges32 silver badges68 bronze badges












    • I think there's a typo here "comparisons - 50,000" & "references are both 50,000". It should be 5000.

      – Rajesh Ranjan
      Jul 14 at 8:29






    • 1





      @RajeshRanjan the number of comparisons will be count(R) times count(S) = 1,000 x 50 = 50,000. Run the DDL & DML then look at the query plan properties for the inner table reference.

      – Michael Green
      Jul 14 at 9:02












    • Oh! yes, I skipped "Number of Rows Read". You're right.

      – Rajesh Ranjan
      Jul 14 at 10:22

















    • I think there's a typo here "comparisons - 50,000" & "references are both 50,000". It should be 5000.

      – Rajesh Ranjan
      Jul 14 at 8:29






    • 1





      @RajeshRanjan the number of comparisons will be count(R) times count(S) = 1,000 x 50 = 50,000. Run the DDL & DML then look at the query plan properties for the inner table reference.

      – Michael Green
      Jul 14 at 9:02












    • Oh! yes, I skipped "Number of Rows Read". You're right.

      – Rajesh Ranjan
      Jul 14 at 10:22
















    I think there's a typo here "comparisons - 50,000" & "references are both 50,000". It should be 5000.

    – Rajesh Ranjan
    Jul 14 at 8:29





    I think there's a typo here "comparisons - 50,000" & "references are both 50,000". It should be 5000.

    – Rajesh Ranjan
    Jul 14 at 8:29




    1




    1





    @RajeshRanjan the number of comparisons will be count(R) times count(S) = 1,000 x 50 = 50,000. Run the DDL & DML then look at the query plan properties for the inner table reference.

    – Michael Green
    Jul 14 at 9:02






    @RajeshRanjan the number of comparisons will be count(R) times count(S) = 1,000 x 50 = 50,000. Run the DDL & DML then look at the query plan properties for the inner table reference.

    – Michael Green
    Jul 14 at 9:02














    Oh! yes, I skipped "Number of Rows Read". You're right.

    – Rajesh Ranjan
    Jul 14 at 10:22





    Oh! yes, I skipped "Number of Rows Read". You're right.

    – Rajesh Ranjan
    Jul 14 at 10:22













    6














    The truth is more involved than what you realize. It is true that the outer input of the join will require 1000 logical reads but only if the join key is unique. If it is not, the optimizer can pre sort it and fetch multiple rows at and match all of them at once.
    As for the inner loop, you are assuming a full scan per iteration. The optimizer will typically prefer nested loops if the inner input is indexed, in which case the number of page fetch will be determined by the cardinality of that set.



    My 5 cents - don’t worry about physical implementation details. Invest your resources in perfecting the data model, schema, and code. Let the engine worry about nested loops.



    HTH






    share|improve this answer





























      6














      The truth is more involved than what you realize. It is true that the outer input of the join will require 1000 logical reads but only if the join key is unique. If it is not, the optimizer can pre sort it and fetch multiple rows at and match all of them at once.
      As for the inner loop, you are assuming a full scan per iteration. The optimizer will typically prefer nested loops if the inner input is indexed, in which case the number of page fetch will be determined by the cardinality of that set.



      My 5 cents - don’t worry about physical implementation details. Invest your resources in perfecting the data model, schema, and code. Let the engine worry about nested loops.



      HTH






      share|improve this answer



























        6












        6








        6







        The truth is more involved than what you realize. It is true that the outer input of the join will require 1000 logical reads but only if the join key is unique. If it is not, the optimizer can pre sort it and fetch multiple rows at and match all of them at once.
        As for the inner loop, you are assuming a full scan per iteration. The optimizer will typically prefer nested loops if the inner input is indexed, in which case the number of page fetch will be determined by the cardinality of that set.



        My 5 cents - don’t worry about physical implementation details. Invest your resources in perfecting the data model, schema, and code. Let the engine worry about nested loops.



        HTH






        share|improve this answer















        The truth is more involved than what you realize. It is true that the outer input of the join will require 1000 logical reads but only if the join key is unique. If it is not, the optimizer can pre sort it and fetch multiple rows at and match all of them at once.
        As for the inner loop, you are assuming a full scan per iteration. The optimizer will typically prefer nested loops if the inner input is indexed, in which case the number of page fetch will be determined by the cardinality of that set.



        My 5 cents - don’t worry about physical implementation details. Invest your resources in perfecting the data model, schema, and code. Let the engine worry about nested loops.



        HTH







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited Jul 14 at 22:16

























        answered Jul 14 at 4:29









        SQLRaptorSQLRaptor

        3,1241 gold badge4 silver badges20 bronze badges




        3,1241 gold badge4 silver badges20 bronze badges



























            draft saved

            draft discarded
















































            Thanks for contributing an answer to Database Administrators Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid


            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.

            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f242813%2fnested-loop-join-how-many-comparisons-and-how-many-pages-accesses%23new-answer', 'question_page');

            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Category:9 (number) SubcategoriesMedia in category "9 (number)"Navigation menuUpload mediaGND ID: 4485639-8Library of Congress authority ID: sh85091979ReasonatorScholiaStatistics

            Circuit construction for execution of conditional statements using least significant bitHow are two different registers being used as “control”?How exactly is the stated composite state of the two registers being produced using the $R_zz$ controlled rotations?Efficiently performing controlled rotations in HHLWould this quantum algorithm implementation work?How to prepare a superposed states of odd integers from $1$ to $sqrtN$?Why is this implementation of the order finding algorithm not working?Circuit construction for Hamiltonian simulationHow can I invert the least significant bit of a certain term of a superposed state?Implementing an oracleImplementing a controlled sum operation

            Magento 2 “No Payment Methods” in Admin New OrderHow to integrate Paypal Express Checkout with the Magento APIMagento 1.5 - Sales > Order > edit order and shipping methods disappearAuto Invoice Check/Money Order Payment methodAdd more simple payment methods?Shipping methods not showingWhat should I do to change payment methods if changing the configuration has no effects?1.9 - No Payment Methods showing upMy Payment Methods not Showing for downloadable/virtual product when checkout?Magento2 API to access internal payment methodHow to call an existing payment methods in the registration form?