New boolean column in big table OR - bad idea ? - new related 0-1 table
I have a big table (200 billion rows) like this:
CREATE TABLE A (
A_Id INT NOT NULL PRIMARY KEY,
A_Attribute1 VARCHAR(100) NOT NULL,
... )
I must add a new NOT NULL boolean column with false as the default value, like this:
ALTER TABLE A ADD A_BitAttribute BIT NOT NULL DEFAULT 0
Moreover, in the future, this column will be 0 most of the time.
To avoid the cost of adding this column to table A, I thought about creating a new related table, like this:
CREATE TABLE ABitAttribute (
A_Id INT NOT NULL PRIMARY KEY
, CONSTRAINT FK_ABitAttribute_A FOREIGN KEY (A_Id) REFERENCES A (A_Id) ON DELETE CASCADE
)
A row in this table indicates True for A_Id; the absence of a row indicates False. The outer join needed to retrieve the data doesn't bother me too much.
Are there any major drawbacks or theory violations in using this kind of technique?
Edit:
- it has to be SQL Server 2005 compatible
- the interruption of service has to be minimal (we usually guarantee that the database is unavailable for no more than 1 hour)
sql-server database-design
asked Jul 30 at 8:22 by Masure
edited Jul 31 at 10:07
I would expect additional attributes in the future and therefore add the boolean as an additional row. You need to be aware of the additional FK lookups on delete (I am not sure about insert/update)
– eckes
Jul 30 at 9:37
Creating a new table sounds like the “Closed World Interpretation” as discussed here.
– MDCCL
Jul 30 at 19:23
2 Answers
From a theoretical point of view, booleans are a bit suspicious (though not necessarily wrong). The idea is that we store true propositions, and the absence of something is considered false. So there is definitely nothing wrong with your suggestion from a theoretical viewpoint. Without knowing what your attribute represents, it's difficult to be more specific than that.
From a practical point of view, since most of the rows are false, you will save some space by not storing false for almost all rows.
The downside is that you will have to do an outer join, which may affect performance.
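A minimal sketch of that outer join, using the table definitions from the question. The view name `A_WithBitAttribute` is hypothetical, and `A_Attribute1` stands in for whatever columns A really has. The syntax is meant to stay within what SQL Server 2005 accepts.

```sql
-- Hypothetical view that exposes the flag as if it were a column of A.
CREATE VIEW A_WithBitAttribute
AS
SELECT a.A_Id,
       a.A_Attribute1,
       -- A matching row in ABitAttribute means true; no row means false.
       CAST(CASE WHEN b.A_Id IS NULL THEN 0 ELSE 1 END AS BIT) AS A_BitAttribute
FROM A AS a
LEFT OUTER JOIN ABitAttribute AS b
    ON b.A_Id = a.A_Id;
```

Because `ABitAttribute.A_Id` is a primary key, the join can match at most one row per `A_Id`, so the view returns exactly one row for each row of A.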
EDIT: As pointed out by @eckes in his comment, deletes in A will have to check ABitAttribute for the presence of the id being deleted. The same goes for updates (if the id column is being updated). For inserts into A, there should be no effect.
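To show where the foreign key comes into play, here is a sketch of how the flag might be set and cleared, again assuming the question's definitions. The id value 42 is made up for illustration.

```sql
DECLARE @A_Id INT;
SET @A_Id = 42;  -- hypothetical id, assumed to exist in A

-- Set the flag to true: this insert is checked against A by the foreign key.
IF NOT EXISTS (SELECT 1 FROM ABitAttribute WHERE A_Id = @A_Id)
    INSERT INTO ABitAttribute (A_Id) VALUES (@A_Id);

-- Set the flag back to false: remove the row.
DELETE FROM ABitAttribute WHERE A_Id = @A_Id;

-- Deleting from A makes SQL Server look up ABitAttribute for the same id;
-- with ON DELETE CASCADE any matching row is removed as well.
DELETE FROM A WHERE A_Id = @A_Id;
```

The existence check followed by an insert is not safe under concurrent writers on its own; a real implementation would need appropriate locking or handling of a primary key violation.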
answered Jul 30 at 9:20 by Lennart
edited Jul 30 at 9:59
It may also be worth adding that adding a new column with a default non-null value is a very fast operation, but only in recent versions/editions of SQL Server.
– ypercubeᵀᴹ
Jul 30 at 10:10
> Starting with SQL Server 2012 (11.x) Enterprise Edition, adding a NOT NULL column with a default value is an online operation when the default value is a runtime constant.
– ypercubeᵀᴹ
Jul 30 at 10:22
But if you are on an older version (2008R2 or older) or on a non-Enterprise edition, adding a not-null column to a 200-billion-row table will certainly take some time (to rebuild the table).
– ypercubeᵀᴹ
Jul 30 at 10:24
Sadly, it is targeting non-Enterprise editions and SQL Server 2005 :(
– Masure
Jul 31 at 10:16
> Are there any major drawbacks or theory violations in using this kind of technique?
You should try not to change the logical database design because of storage costs or the operational costs of modifying tables. This is not always possible, but you should try.
So the default choice here should be to add the column. You might want to store this large table as a Clustered Columnstore, which provides column-wise storage and excellent compression of low-cardinality columns.
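A rough sketch of what the columnstore suggestion could look like. It assumes SQL Server 2016 or later (so it does not meet the asker's 2005 constraint), that the primary key constraint is named `PK_A` (a hypothetical name; the question's definition leaves it system-generated), and that no foreign keys reference A while the key is dropped. Converting a table this size rebuilds all of its data, so it is not a quick operation.

```sql
-- PK_A is a hypothetical constraint name; the real one can be found in sys.key_constraints.
ALTER TABLE A DROP CONSTRAINT PK_A;

-- Store the table column-wise; CCI_A is a hypothetical index name.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_A ON A;

-- Re-create the key as a nonclustered constraint on top of the columnstore.
ALTER TABLE A ADD CONSTRAINT PK_A PRIMARY KEY NONCLUSTERED (A_Id);

-- The flag then lives in A itself, as in the question.
ALTER TABLE A ADD A_BitAttribute BIT NOT NULL DEFAULT 0;
```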
answered Jul 30 at 14:44 by David Browne - Microsoft
Clustered Columnstore is an interesting feature, but I have to stick with SQL Server 2005 compatibility. I would rather have added the new column, but database availability is very time constrained.
– Masure
Jul 31 at 10:12